diff --git a/README_SETUP.md b/README_SETUP.md index bf4f25d92106c19cccc276389b6c97aa22904923..3a7f0197a4b06694c7ae787d0baa6e8a89de0e5e 100644 --- a/README_SETUP.md +++ b/README_SETUP.md @@ -64,6 +64,7 @@ Build documentation in `build/` with `make doc`. - `sphinx` - `sphinx-autoapi` +- `sphinx-rtd-theme` - `recommonmark >= 0.6.0` ### How to contribute ### diff --git a/src/caosadvancedtools/table_json_conversion/convert.py b/src/caosadvancedtools/table_json_conversion/convert.py index c0601a17e03c9328341698c28b0d00c9c7b58b40..9485616705e0131504d6a215b628073de9f43243 100644 --- a/src/caosadvancedtools/table_json_conversion/convert.py +++ b/src/caosadvancedtools/table_json_conversion/convert.py @@ -103,8 +103,6 @@ Look at ``xlsx_utils.get_path_position`` for the specification of the "proper na data_column_paths = {col.index: col.path for col in data_columns.values()} # Parent path, insert in correct order. parent, proper_name = xlsx_utils.get_path_position(sheet) - # print(parent, proper_name, sheet.title) - # breakpoint() if parent: parent_sheetname = xlsx_utils.get_worksheet_for_path(parent, self._defining_path_index) if parent_sheetname not in self._handled_sheets: @@ -148,7 +146,6 @@ Look at ``xlsx_utils.get_path_position`` for the specification of the "proper na value = self._validate_and_convert(value, path) _set_in_nested(mydict=data, path=path, value=value, prefix=parent, skip=1) continue - continue # Find current position in tree parent_dict = self._get_parent_dict(parent_path=parent, foreign=foreign) @@ -157,11 +154,7 @@ Look at ``xlsx_utils.get_path_position`` for the specification of the "proper na if proper_name not in parent_dict: parent_dict[proper_name] = [] parent_dict[proper_name].append(data) - # breakpoint() - # if sheet.title == "Training.Organisation": - # breakpoint() self._handled_sheets.add(sheet.title) - # print(f"Added sheet: {sheet.title}") def _is_multiple_choice(self, path: list[str]) -> bool: """Test if the path belongs to a multiple 
choice section.""" @@ -309,7 +302,7 @@ mydict: dict path: list A list of keys, denoting the location of the value. value - The value inside the dict. + The value which shall be set inside the dict. prefix: list A list of keys which shall be removed from ``path``. A KeyError is raised if ``path`` does not start with the elements of ``prefix``. diff --git a/src/doc/table-json-conversion/specs.md b/src/doc/table-json-conversion/specs.md deleted file mode 100644 index 5a5197473d82886fcb3ee54f8ac9c5865c456710..0000000000000000000000000000000000000000 --- a/src/doc/table-json-conversion/specs.md +++ /dev/null @@ -1,341 +0,0 @@ -# Conversion between LinkAhead data models, JSON schema, and XLSX (and vice versa) # - -This file describes the conversion between JSON schema files and XLSX templates, and between JSON -data files following a given schema and XLSX files with data. This conversion is handled by the -Python modules in the `table_json_conversion` library. - -Requirements: When converting from a json schema, the top level of the json schema must be a -dict. The keys of the dict are RecordType names. - -## Data models in JSON Schema and JSON data ## - -The data model in LinkAhead defines the types of records present in a LinkAhead instance and their -structure. This data model can also be represented in a JSON Schema, which defines the structure of -JSON files containing records pertaining to the data model. - -For example, the following JSON can describe a singe "Person" Record: - -```JSON -{ - "Person": [ - { - "family_name": "Steve", - "given_name": "Stevie" - } - ] -} -``` - -A *JSON Schema* specifies a concrete structure, and the associated JSON files can be used to -represent data for specific record structures. For instance, one could create a JSON Schema allowing -the storage of "Training" Records containing information about conducted trainings. This is -particularly valuable for data import and export. 
One could generate web forms from the JSON Schema -or use it to export objects stored in LinkAhead as JSON. - -### Note: Data models and data arrays ### - -The schema as created by ``json_schema_exporter.recordtype_to_json_schema(...)`` is, from a broad -view, a dict with all the top level recordtypes (the recordtype names are the keys). While this is -appropriate for the generation of user input forms, data often consists of multiple entries of the -same type. XLSX files are no exception, users expect that they may enter multiple rows of data. - -Since the data model schema does not match multiple data sets, there is a utility function which -create a *data array* schema out of the *data model* schema: It basically replaces the top-level -entries of the data model by lists which may contain data. - -A **short example** illustrates this well. Consider a *data model* schema which fits to this data -content: - -```JSON -{ - "Person": { - "name": "Charly" - } -} -``` - -Now the automatically generated *data array* schema would accept the following data: - -```JSON -{ - "Person": [ - { - "name": "Charly" - }, - { - "name": "Sam" - } - ] -} -``` - -## From JSON to XLSX: Data Representation ## - -The following describes how JSON files representing LinkAhead records are converted into XLSX files, -or how JSON files with records are created from XLSX files. - -The attribute name (e.g., "Person" above) determines the RecordType, and the value of this attribute -can either be an object or a list. If it is an object (as in the example above), a single record is -represented. In the case of a list, multiple records sharing the same RecordType as the parent are -represented. - -The *Properties* of the record (e.g., `family_name` and `given_name` above) become *columns* in the -XLSX file. These properties have an attribute name and a value. The value can be: - -a. A primitive (text, number, boolean, ...) -b. A record -c. A list of primitive types -d. 
A list of unique enums (multiple choice) -e. A list of records - -In cases *a.* and *c.*, a cell is created in the column corresponding to the property in the XLSX -file. In case *b.*, columns are created for the Properties of the record, where for each of the -Properties the cases *a.* - *e.* are considered recursively. Case *d.* leads to a number of -columns, one for each of the possible choices. - -For case *e.* however, the two-dimensional structure of an XLSX sheet is not sufficient. Therefore, -for such cases, *new* XLSX sheets/tables are created. - -In these sheets/tables, the referenced records are treated as described above (new columns for the -Properties). However, there are now additional columns that indicate from which "external" record -these records are referenced. - -Let's now consider these four cases in detail and with examples: - -### a. Properties with primitive data types ### - -```JSON -{ - "Training": [ - { - "date": "2023-01-01", - "url": "www.indiscale.com", - "duration": 1.0, - "participants": 1, - "remote": false - }, - { - "date": "2023-06-15", - "url": "www.indiscale.com/next", - "duration": 2.5, - "participants": None, - "remote": true - } - ] -} -``` - -This entry will be represented in an XLSX sheet with the following content: - -| date | url | duration | participants | remote | -|------------|------------------------|----------|--------------|--------| -| 2023-01-01 | www.indiscale.com | 1.0 | 1 | false | -| 2023-06-15 | www.indiscale.com/next | 2.5 | | true | - -### b. 
Property referencing a record ### - -```JSON -{ - "Training": [ - { - "date": "2023-01-01", - "supervisor": { - "family_name": "Stevenson", - "given_name": "Stevie", - } - } - ] -} -``` - -This entry will be represented in an XLSX sheet with the following content: - -| date | `supervisor.family_name` | `supervisor.given_name` | -|------------|--------------------------|-------------------------| -| 2023-01-01 | Stevenson | Stevie | - -Note that column names may be renamed. The mapping of columns to properties of records is ensured -through the content of hidden rows. (See below for the definition of hidden rows.) - -### c. Properties containing lists of primitive data types ### - -```JSON -{ - "Training": [ - { - "url": "www.indiscale.com", - "subjects": ["Math", "Physics"], - } - ] -} -``` - -This entry would be represented in an XLSX sheet with the following content: - -| url | subjects | -|-------------------|--------------| -| www.indiscale.com | Math;Physics | - -The list elements are written into the cell separated by `;` (semicolon). If the elements contain -the separator `;`, it is escaped with `\\`. - -### d. Multiple choice properties ### - -```JSON -{ - "Training": [ - { - "date": "2024-04-17", - "skills": [ - "Planning", - "Evaluation" - ] - } - ] -} -``` - -If the `skills` list is denoted as an `enum` array with `"uniqueItems": true` in the json schema, -this entry would be represented like this in an XLSX: - -| date | skills.Planning | skills.Communication | skills.Evaluation | -|------------|-----------------|----------------------|-------------------| -| 2024-04-17 | x | | x | - -Note that this example assumes that the list of possible choices, as given in the json schema, was -"Planning, Communication, Evaluation". - -### e. 
Properties containing lists with references ### - -```JSON -{ - "Training": [ - { - "date": "2023-01-01", - "coach": [ - { - "family_name": "Sky", - "given_name": "Max", - }, - { - "family_name": "Sky", - "given_name": "Min", - } - ] - } - ] -} -``` - -Since the two coaches cannot be represented properly in a single cell, another worksheet is needed -to contain the properties of the coaches. - -The sheet for the Trainings in this example only contains the "date" column - -| date | -|------------| -| 2023-01-01 | - -Additionally, there is *another* sheet where the coaches are stored. Here, it is crucial to define -how the correct element is chosen from potentially multiple "Trainings". In this case, it means that -the "date" must be unique. - -Note: This uniqueness requirement is not strictly checked right now, it is your responsibility as a -user that such "foreign properties" are truly unique. - -The second sheet looks like this: - -| date | `coach.family_name` | `coach.given_name` | -|------------|---------------------|--------------------| -| 2023-01-01 | Sky | Max | -| 2023-01-01 | Sky | Min | - -## Data in XLSX: Hidden automation logic ## - -### First column: Marker for row types ### - -The first column in each sheet will be hidden and it will contain an entry in each row that needs -special treatment. The following values are used: - -- ``IGNORE``: This row is ignored. It can be used for explanatory texts or layout. -- ``COL_TYPE``: Typically the first row that is not `IGNORE`. It indicates the row that defines the - type of columns (`FOREIGN`, `SCALAR`, `LIST`, `MULTIPLE_CHOICE`, `IGNORE`). This row must occur - exactly once per sheet. -- ``PATH``: Indicates that the row is used to define the path within the JSON. These rows are - typically hidden for users. - -An example table could look like this: - -| `IGNORE` | | Welcome | to this | file! 
| | -| `IGNORE` | | Please | enter your | data here: | | -| `COL_TYPE` | `IGNORE` | `SCALAR` | `SCALAR` | `LIST` | `SCALAR` | -| `PATH` | | `Training` | `Training` | `Training` | `Training` | -| `PATH` | | `url` | `date` | `subjects` | `supervisor` | -| `PATH` | | | | | `email` | -| `IGNORE` | Please enter one training per line. | Training URL | Training date | Subjects | Supervisor's email | -|------------|-------------------------------------|----------------|---------------|--------------|--------------------| -| | | example.com/mp | 2024-02-27 | Math;Physics | steve@example.com | -| | | example.com/m | 2024-02-27 | Math | stella@example.com | - -### Parsing XLSX data ### - -To extract the value of a given cell, we traverse all path elements (in ``PATH`` rows) from top to -bottom. The final element of the path is the name of the Property to which the value belongs. In -the example above, `steve@example.com` is the value of the `email` Property in the path -`["Training", "supervisor", "email"]`. - -The path elements are sufficient to identify the object within a JSON, at least if the corresponding -JSON element is a single object. If the JSON element is an array, the appropriate object within the -array needs to be selected. - -For this selection additional ``FOREIGN`` columns are used. The paths in these columns must all have -the same *base* and one additional *unique key* component. For example, two `FOREIGN` columns could -be `["Training", "date"]` and `["Training", "url"]`, where `["Training"]` is the *base path* and -`"date"` and `"url"` are the *unique keys*. - -The base path defines the table (or recordtype) to which the entries belong, and the values of the -unique keys define the actual rows to which data belongs. 
- -For example, this table defines three coaches for the two trainings from the last table: - -| `COL_TYPE` | `FOREIGN` | `FOREIGN` | `SCALAR` | -| `PATH` | `Training` | `Training` | `Training` | -| `PATH` | `date` | `url` | `coach` | -| `PATH` | | | `given_name` | -| `IGNORE` | Date of training | URL of training | The coach's given name | -| `IGNORE` | from sheet 'Training' | from sheet 'Training' | | -|------------|-----------------------|-----------------------|------------------------| -| | 2024-02-27 | example.com/mp | Ada | -| | 2024-02-27 | example.com/mp | Berta | -| | 2024-02-27 | example.com/m | Chris | - -#### Sepcial case: multiple choice "checkboxes" #### - -As a special case, enum arrays with `"uniqueItems": true` can be represented as multiple columns, -with one column per choice. The choices are denoted as the last `PATH` component, the column type -must be `MULTIPLE_CHOICE`. - -Stored data is denoted as an "x" character in the respective cell, empty cells denote that the item -was not selected. Additionally, the implementation also allows `TRUE` or `1` for selected items, -and `FALSE`, `0` or cells with only whitespace characters for deselected items: - -| `COL_TYPE` | `MULTIPLE_CHOICE` | `MULTIPLE_CHOICE` | `MULTIPLE_CHOICE` | -| `PATH` | `skills` | `skills` | `skills` | -| `PATH` | `Planning` | `Communication` | `Evaluation` | -| `IGNORE` | skills.Planning | skills.Communication | skills.Evaluation | -|------------|-------------------|----------------------|-------------------| -| | x | | X | -| | `" "` | `TRUE` | `FALSE` | -| | 0 | x | 1 | - -These rows correspond to: - -1. Planning, Evaluation -2. Communication -3. Communication, Evaluation - -## Current limitations ## - -The current implementation still lacks the following: - -- Files handling is not implemented yet. 
diff --git a/src/doc/table-json-conversion/specs.rst b/src/doc/table-json-conversion/specs.rst new file mode 100644 index 0000000000000000000000000000000000000000..c98eddc1180f552f1d2389b1bb57979e93550ab8 --- /dev/null +++ b/src/doc/table-json-conversion/specs.rst @@ -0,0 +1,527 @@ +Conversion between LinkAhead data models, JSON schema, and XLSX (and vice versa) +================================================================================ + +This file describes the conversion between JSON schema files and XLSX +templates, and between JSON data files following a given schema and XLSX +files with data. This conversion is handled by the Python modules in the +``table_json_conversion`` library. + +Data models in JSON Schema and JSON data +---------------------------------------- + +Let’s start simple! If you would describe a ``Person`` Record with the +Properties ``family_name`` and ``given_name`` in JSON, it would probably +look like this: + +.. code:: json + + { + "Person": + { + "family_name": "Steve", + "given_name": "Stevie" + } + } + +The data model in LinkAhead defines the types of records present in a +LinkAhead instance and their structure. This data model can also be +represented in a JSON Schema, which defines the structure of JSON files +containing records pertaining to the data model. + +You can define this kind of structure with the following JSON schema: + +.. code:: json + + { + "type": "object", + "properties": { + "Person": { + "type": "object", + "properties": { + "family_name": { + "type": "string" + }, + "given_name": { + "type": "string" + } + } + } + }, + "$schema": "https://json-schema.org/draft/2020-12/schema" + } + +The above schema (and schemas created by +``json_schema_exporter.merge_schemas(...)``) is, from a broad view, a +dict with all the top level recordtypes (the recordtype names are the +keys). This is sufficient to describe the data model. However, actual +data often consists of multiple entries of the same type (e.g. 
multiple +Persons). + +Since the data model schema does not match multiple data sets, there is +a utility function which creates a *data array* schema out of the *data +model* schema: It basically replaces the top-level entries of the data +model by lists which may contain data. + +For example, the following JSON describes two “Person” Records: + +.. code:: json + + { + "Person": [ + { + "family_name": "Steve", + "given_name": "Stevie" + }, + { + "family_name": "Star", + "given_name": "Stella" + } + ] + } + +The *JSON Schema* for a JSON like the above one could look like the +following: + +.. code:: json + + { + "type": "object", + "properties": { + "Person": { + "type": "array", + "items": { + "type": "object", + "properties": { + "family_name": { + "type": "string" + }, + "given_name": { + "type": "string" + } + } + } + } + }, + "$schema": "https://json-schema.org/draft/2020-12/schema" + } + +This would define that the top level object/dict may have a key +``Person`` which has as value an array of objects that in turn have the +properties ``family_name`` and ``given_name``. + +You can create a data array schema from a data model schema using +``xlsx_utils.array_schema_from_model_schema``. + +From JSON to XLSX: Data Representation +-------------------------------------- + +The following describes how JSON files representing LinkAhead records +are converted into XLSX files, or how JSON files with records are +created from XLSX files. + +The attribute name (e.g., “Person” above) determines the RecordType, and +the value of this attribute can either be an object or a list. If it is +an object (as in the example above), a single record is represented. In +the case of a list, multiple records sharing the same RecordType as the +parent are represented. + +The *Properties* of the record (e.g., ``family_name`` and ``given_name`` +above) become *columns* in the XLSX file. 
Thus the XLSX file created +from the above example would have a sheet “Person” with the following +table: + +========== =========== +given_name family_name +========== =========== +Stevie Steve +Stella Star +========== =========== + +The properties of objects (Records) in the JSON have an attribute name +and a value. The value can be: + +a. A primitive (text, number, boolean, …) +b. A record +c. A list of primitive types +d. A list of unique enums (multiple choice) +e. A list of records + +In cases *a.* and *c.*, a cell is created in the column corresponding to +the property in the XLSX file. In case *b.*, columns are created for the +Properties of the record, where for each of the Properties the cases +*a.* - *e.* are considered recursively. Case *d.* leads to a number of +columns, one for each of the possible choices. + +For case *e.* however, the two-dimensional structure of an XLSX sheet is +not sufficient. Therefore, for such cases, *new* XLSX sheets/tables are +created. + +In these sheets/tables, the referenced records are treated as described +above (new columns for the Properties). However, there are now +additional columns that indicate from which “external” record these +records are referenced. + +Let’s now consider these five cases in detail and with examples: + +a. Properties with primitive data types +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +.. 
code:: json + + { + "Training": [ + { + "date": "2023-01-01", + "url": "www.indiscale.com", + "duration": 1.0, + "participants": 1, + "remote": false + }, + { + "date": "2023-06-15", + "url": "www.indiscale.com/next", + "duration": 2.5, + "participants": null, + "remote": true + } + ] + } + +This entry will be represented in an XLSX sheet with the following +content: + ++------------+------------------------+----------+--------------+--------+ +| date | url | duration | participants | remote | ++============+========================+==========+==============+========+ +| 2023-01-01 | www.indiscale.com | 1.0 | 1 | false | ++------------+------------------------+----------+--------------+--------+ +| 2023-06-15 | www.indiscale.com/next | 2.5 | | true | ++------------+------------------------+----------+--------------+--------+ + +b. Property referencing a record +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +.. code:: json + + { + "Training": [ + { + "date": "2023-01-01", + "supervisor": { + "family_name": "Stevenson", + "given_name": "Stevie" + } + } + ] + } + +This entry will be represented in an XLSX sheet named "Training" with the following +content: + +========== ========================== ========================= +date supervisor.family_name supervisor.given_name +========== ========================== ========================= +2023-01-01 Stevenson Stevie +========== ========================== ========================= + + +c. Properties containing lists of primitive data types +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +.. code:: json + + { + "Training": [ + { + "url": "www.indiscale.com", + "subjects": ["Math", "Physics"] + } + ] + } + +This entry would be represented in an XLSX sheet with the following +content: + +================= ============ +url subjects +================= ============ +www.indiscale.com Math;Physics +================= ============ + +The list elements are written into the cell separated by ``;`` +(semicolon). 
If the elements contain the separator ``;``, it is escaped +with ``\``. + +d. Multiple choice properties +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +.. code:: json + + { + "Training": [ + { + "date": "2024-04-17", + "skills": [ + "Planning", + "Evaluation" + ] + } + ] + } + +If the ``skills`` list is denoted as an ``enum`` array with +``"uniqueItems": true`` in the json schema, this entry would be +represented like this in an XLSX sheet: + ++------------+-----------------+----------------------+-------------------+ +| date | skills.Planning | skills.Communication | skills.Evaluation | ++============+=================+======================+===================+ +| 2024-04-17 | x | | x | ++------------+-----------------+----------------------+-------------------+ + +Note that this example assumes that the list of possible choices, as +given in the json schema, was “Planning, Communication, Evaluation”. + +e. Properties containing lists with references +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +.. code:: json + + { + "Training": [ + { + "date": "2023-01-01", + "coach": [ + { + "family_name": "Sky", + "given_name": "Max" + }, + { + "family_name": "Sky", + "given_name": "Min" + } + ] + } + ] + } + +Since the two coaches cannot be represented properly in a single cell, +another worksheet is needed to contain the properties of the coaches. + +The sheet for the Trainings in this example only contains the “date” +column: + ++------------+ +| date | ++============+ +| 2023-01-01 | ++------------+ + +Additionally, there is *another* sheet where the coaches are stored. +Here, it is crucial to define how the correct element is chosen from +potentially multiple “Trainings”. In this case, it means that the “date” +must be unique. 
+ + +The second sheet looks like this: + +========== ===================== ==================== +date ``coach.family_name`` ``coach.given_name`` +========== ===================== ==================== +2023-01-01 Sky Max +2023-01-01 Sky Min +========== ===================== ==================== + +Note: This uniqueness requirement is not strictly checked right now; it +is your responsibility as a user to ensure that such “foreign properties” are +truly unique. + +When converting JSON files that contain Records that were exported from LinkAhead, +it might be a good idea to use the LinkAhead ID as a unique identifier for Records. However, if +your Records do not yet have LinkAhead IDs, you need to find some other identifying +properties/foreign keys. Note that those properties only need to identify a Record uniquely within +the list of Records: In the above example the "coach" Record needs to be identified in the list of +coaches. + + +Data in XLSX: Hidden automation logic +------------------------------------- + +First column: Marker for row types +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The first column in each sheet will be hidden and it will contain an +entry in each row that needs special treatment. The following values are +used: + +- ``IGNORE``: This row is ignored. It can be used for explanatory texts + or layout. +- ``COL_TYPE``: Typically the first row that is not ``IGNORE``. It + indicates the row that defines the type of columns (``FOREIGN``, + ``SCALAR``, ``LIST``, ``MULTIPLE_CHOICE``, ``IGNORE``). This row must + occur exactly once per sheet. +- ``PATH``: Indicates that the row is used to define the path within + the JSON. These rows are typically hidden from users. 
+ +An example table could look like this: + ++----------+-------------------------------------+----------------+---------------+--------------+---------------------+ +| IGNORE | | Welcome | to | this | file | ++==========+=====================================+================+===============+==============+=====================+ +| IGNORE | | Please | enter your | data here: | | ++----------+-------------------------------------+----------------+---------------+--------------+---------------------+ +| COL_TYPE | IGNORE | SCALAR | SCALAR | LIST | SCALAR | ++----------+-------------------------------------+----------------+---------------+--------------+---------------------+ +| PATH | | Training | Training | Training | Training | ++----------+-------------------------------------+----------------+---------------+--------------+---------------------+ +| PATH | | url | date | subjects | supervisor | ++----------+-------------------------------------+----------------+---------------+--------------+---------------------+ +| PATH | | | | | email | ++----------+-------------------------------------+----------------+---------------+--------------+---------------------+ +| IGNORE | Please enter one training per line. | Training URL | Training date | Subjects | Supervisor's email | ++----------+-------------------------------------+----------------+---------------+--------------+---------------------+ +| | | example.com/mp | 2024-02-27 | Math;Physics | steve@example.com | ++----------+-------------------------------------+----------------+---------------+--------------+---------------------+ +| | | example.com/m | 2024-02-28 | Math | stella@example.com | ++----------+-------------------------------------+----------------+---------------+--------------+---------------------+ + + +Parsing XLSX data +~~~~~~~~~~~~~~~~~ + +To extract the value of a given cell, we traverse all path elements (in +``PATH`` rows) from top to bottom. 
The final element of the path is the +name of the Property to which the value belongs. In the example above, +``steve@example.com`` is the value of the ``email`` Property in the path +``["Training", "supervisor", "email"]``. + +The path elements are sufficient to identify the object within a JSON, +at least if the corresponding JSON element is a single object. If the +JSON element is an array, the appropriate object within the array needs +to be selected. + +For this selection additional ``FOREIGN`` columns are used. The paths in +these columns must all have the same *base* and one additional *unique +key* component. For example, two ``FOREIGN`` columns could be +``["Training", "date"]`` and ``["Training", "url"]``, where +``["Training"]`` is the *base path* and ``"date"`` and ``"url"`` are the +*unique keys*. + +The base path defines the table (or recordtype) to which the entries +belong, and the values of the unique keys define the actual rows to +which data belongs. + +For example, this table defines three coaches for the two trainings from +the last table: + ++----------+-----------------------+-----------------------+------------------------+ +| COL_TYPE | FOREIGN | FOREIGN | SCALAR | ++----------+-----------------------+-----------------------+------------------------+ +| PATH | Training | Training | Training | ++----------+-----------------------+-----------------------+------------------------+ +| PATH | date | url | coach | ++----------+-----------------------+-----------------------+------------------------+ +| PATH | | | given_name | ++----------+-----------------------+-----------------------+------------------------+ +| IGNORE | Date of training | URL of training | The coach’s given name | ++----------+-----------------------+-----------------------+------------------------+ +| IGNORE | from sheet ‘Training’ | from sheet ‘Training’ | | ++----------+-----------------------+-----------------------+------------------------+ +| | 2024-02-27 | 
example.com/mp | Ada | ++----------+-----------------------+-----------------------+------------------------+ +| | 2024-02-27 | example.com/mp | Berta | ++----------+-----------------------+-----------------------+------------------------+ +| | 2024-02-28 | example.com/m | Chris | ++----------+-----------------------+-----------------------+------------------------+ + +Special case: multiple choice “checkboxes” +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +As a special case, enum arrays with ``"uniqueItems": true`` can be +represented as multiple columns, with one column per choice. The choices +are denoted as the last ``PATH`` component; the column type must be +``MULTIPLE_CHOICE``. + +Stored data is denoted as an “x” character in the respective cell; empty +cells denote that the item was not selected. Additionally, the +implementation allows ``TRUE`` or ``1`` for selected items, and +``FALSE``, ``0`` or cells with only whitespace characters for deselected +items: + ++----------+-----------------+----------------------+-------------------+ +| COL_TYPE | MULTIPLE_CHOICE | MULTIPLE_CHOICE | MULTIPLE_CHOICE | ++----------+-----------------+----------------------+-------------------+ +| PATH | skills | skills | skills | ++----------+-----------------+----------------------+-------------------+ +| PATH | Planning | Communication | Evaluation | ++----------+-----------------+----------------------+-------------------+ +| IGNORE | skills.Planning | skills.Communication | skills.Evaluation | ++----------+-----------------+----------------------+-------------------+ +| | x | | X | ++----------+-----------------+----------------------+-------------------+ +| | " " | TRUE | FALSE | ++----------+-----------------+----------------------+-------------------+ +| | 0 | x | 1 | ++----------+-----------------+----------------------+-------------------+ + +These rows correspond to: + +1. Planning, Evaluation +2. Communication +3. 
Communication, Evaluation + + +User Interaction +---------------- +The primary and most straightforward use case of this utility is to export +LinkAhead data as JSON and then as XLSX tables. This can be done fully +automatically. + +TODO show how! + +The hidden cells for automation are designed so that the generated XLSX +template can be customized into a nicely formatted table. The +hidden content must remain. See below for tips on how to manipulate the table. + +The second use case is to use XLSX to collect data and then import it into +LinkAhead. Here, it may be necessary to define foreign keys in order to +identify Records in lists. + +Table Manipulation +~~~~~~~~~~~~~~~~~~ + +- All formatting is ignored. +- Nothing has to be observed when adding new data rows. +- When adding new descriptive rows (for example one for descriptions of the + columns), the ``COL_TYPE`` must be set to ``IGNORE``. +- You can freely rename sheets. +- You can freely rename columns (since the row containing the column names is + set to ``IGNORE``; the Property name is taken from the last path element). +- You can change the order of columns. However, you have to make sure to move + the full column including hidden elements. Thus you should not select a range + of cells, but click on the column index in your spreadsheet program. + +Note: Requirements +------------------ + +This conversion does not allow arbitrary JSON schema files, nor does it +support arbitrary JSON files, since conversion to XLSX files would not +make sense. Instead, this utility is tailored to support conversion of +data (models) that are structured like data (models) in LinkAhead: + +- The JSON schema describes a data model of RecordTypes and Properties as it would be generated by the ``caosadvancedtools.json_schema_exporter`` module. +- The JSON files must contain arrays of Records complying with such a data model. 
+ +Thus, when converting from a JSON schema, the top level of the JSON +schema must be a dict. The keys of the dict are RecordType names. + + + + +Current limitations +------------------- + +The current implementation still lacks the following: + +- Files handling is not implemented yet. +