caosdb / Software / caosdb-advanced-user-tools
Commit 26079795, authored 4 months ago by I. Nüske
STY: Docstring indentation
Parent: ba2c56a8
2 merge requests:
!128 MNT: Added a warning when column metadata is not configured, and a better...
!120 XLSX converter: Better error message for incorrect type in column, additional column
src/caosadvancedtools/table_json_conversion/xlsx_utils.py (79 additions, 79 deletions)
@@ -68,22 +68,22 @@ class RowType(Enum):
 def array_schema_from_model_schema(model_schema: dict) -> dict:
     """Convert a *data model* schema to a *data array* schema.

-Practically, this means that the top level properties are converted into lists. In a simplified
-notation, this can be expressed as:
-
-``array_schema = { elem: [elem typed data...] for elem in model_schema }``
-
-Parameters
-----------
-model_schema: dict
-    The schema description of the data model. Must be a json schema *object*, with a number of
-    *object* typed properties.
-
-Returns
--------
-array_schema: dict
-    A corresponding json schema, where the properties are arrays with the types of the input's
-    top-level properties.
+    Practically, this means that the top level properties are converted into lists. In a simplified
+    notation, this can be expressed as:
+
+    ``array_schema = { elem: [elem typed data...] for elem in model_schema }``
+
+    Parameters
+    ----------
+    model_schema: dict
+        The schema description of the data model. Must be a json schema *object*, with a number of
+        *object* typed properties.
+
+    Returns
+    -------
+    array_schema: dict
+        A corresponding json schema, where the properties are arrays with the types of the input's
+        top-level properties.
     """
     assert model_schema["type"] == "object"
     result = deepcopy(model_schema)
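The conversion described in this docstring can be sketched as follows. This is only an illustration under stated assumptions: the rest of the function body is not visible in this hunk, so `to_array_schema` is a hypothetical name, and wrapping each property as `{"type": "array", "items": ...}` is an assumed reading of "converted into lists", not the project's confirmed implementation.

```python
from copy import deepcopy


def to_array_schema(model_schema: dict) -> dict:
    """Hypothetical sketch: wrap each top-level property in a JSON Schema array."""
    assert model_schema["type"] == "object"
    result = deepcopy(model_schema)  # leave the caller's schema untouched
    result["properties"] = {
        name: {"type": "array", "items": prop}  # assumed array wrapping
        for name, prop in result.get("properties", {}).items()
    }
    return result


model = {
    "type": "object",
    "properties": {
        "Training": {"type": "object", "properties": {"url": {"type": "string"}}},
    },
}
array_schema = to_array_schema(model)
```

In this sketch the original `Training` schema ends up under `items`, and the input dict is not modified because of the `deepcopy`.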
@@ -100,30 +100,30 @@ array_schema: dict
 def get_defining_paths(workbook: Workbook) -> dict[str, list[list[str]]]:
     """For all sheets in ``workbook``, list the paths which they define.

-A sheet is said to define a path, if it has data columns for properties inside that path. For
-example, consider the following worksheet:
-
-| `COL_TYPE` | `SCALAR` | `SCALAR` | `LIST` | `SCALAR` |
-| `PATH` | `Training` | `Training` | `Training` | `Training` |
-| `PATH` | `url` | `date` | `subjects` | `supervisor` |
-| `PATH` | | | | `email` |
-|------------|----------------|---------------|--------------|--------------------|
-| | example.com/mp | 2024-02-27 | Math;Physics | steve@example.com |
-| | example.com/m | 2024-02-27 | Math | stella@example.com |
-
-This worksheet defines properties for the paths `["Training"]` and `["Training", "supervisor"]`, and
-thus these two path lists would be returned for the key with this sheet's sheetname.
-
-Parameters
-----------
-workbook: Workbook
-    The workbook to analyze.
-
-Returns
--------
-out: dict[str, list[list[str]]]
-    A dict with worksheet names as keys and lists of paths (represented as string lists) as values.
-"""
+    A sheet is said to define a path, if it has data columns for properties inside that path. For
+    example, consider the following worksheet:
+
+    | `COL_TYPE` | `SCALAR` | `SCALAR` | `LIST` | `SCALAR` |
+    | `PATH` | `Training` | `Training` | `Training` | `Training` |
+    | `PATH` | `url` | `date` | `subjects` | `supervisor` |
+    | `PATH` | | | | `email` |
+    |------------|----------------|---------------|--------------|--------------------|
+    | | example.com/mp | 2024-02-27 | Math;Physics | steve@example.com |
+    | | example.com/m | 2024-02-27 | Math | stella@example.com |
+
+    This worksheet defines properties for the paths `["Training"]` and `["Training", "supervisor"]`, and
+    thus these two path lists would be returned for the key with this sheet's sheetname.
+
+    Parameters
+    ----------
+    workbook: Workbook
+        The workbook to analyze.
+
+    Returns
+    -------
+    out: dict[str, list[list[str]]]
+        A dict with worksheet names as keys and lists of paths (represented as string lists) as values.
+    """
     result: dict[str, list[list[str]]] = {}
     for sheet in workbook.worksheets:
         paths = []
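The worked example in this docstring can be reproduced without a workbook. The sketch below assumes that a column's "defining path" is its full path minus the final property name; `defining_paths` is a hypothetical helper operating on plain lists, not the project's function, which works on worksheets.

```python
def defining_paths(column_paths: list[list[str]]) -> list[list[str]]:
    """Hypothetical helper: collect the unique parent paths of all data columns."""
    paths: list[list[str]] = []
    for col_path in column_paths:
        parent = col_path[:-1]  # drop the property name, keep the enclosing path
        if parent and parent not in paths:
            paths.append(parent)
    return paths


# Column paths read top to bottom from the PATH rows of the example worksheet.
columns = [
    ["Training", "url"],
    ["Training", "date"],
    ["Training", "subjects"],
    ["Training", "supervisor", "email"],
]
result = defining_paths(columns)
```

For the example columns this yields the two paths named in the docstring, `["Training"]` and `["Training", "supervisor"]`.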
@@ -140,11 +140,11 @@ out: dict[str, list[list[str]]
 def get_data_columns(sheet: Worksheet) -> dict[str, SimpleNamespace]:
     """Return the data paths of the worksheet.

-Returns
--------
-out: dict[str, SimpleNamespace]
-    The keys are the stringified paths. The values are SimpleNamespace objects with ``index``,
-    ``path`` and ``column`` attributes.
+    Returns
+    -------
+    out: dict[str, SimpleNamespace]
+        The keys are the stringified paths. The values are SimpleNamespace objects with ``index``,
+        ``path`` and ``column`` attributes.
     """
     column_types = _get_column_types(sheet)
     path_rows = get_path_rows(sheet)
@@ -171,11 +171,11 @@ out: dict[str, SimpleNamespace]
 def get_foreign_key_columns(sheet: Worksheet) -> dict[str, SimpleNamespace]:
     """Return the foreign keys of the worksheet.

-Returns
--------
-out: dict[str, SimpleNamespace]
-    The keys are the stringified paths. The values are SimpleNamespace objects with ``index``,
-    ``path`` and ``column`` attributes.
+    Returns
+    -------
+    out: dict[str, SimpleNamespace]
+        The keys are the stringified paths. The values are SimpleNamespace objects with ``index``,
+        ``path`` and ``column`` attributes.
     """
     column_types = _get_column_types(sheet)
     path_rows = get_path_rows(sheet)
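Both `get_data_columns` and `get_foreign_key_columns` return a dict from stringified path to a `SimpleNamespace`. The shape of that return value might look like the sketch below. Everything here is an assumption for illustration: `split_columns` is a hypothetical helper, the inputs are plain dicts instead of a worksheet, the `"."` separator for stringified paths is a guess, only `SCALAR` and `LIST` are treated as data column types, and the ``column`` attribute is omitted because no worksheet object is involved.

```python
from types import SimpleNamespace


def split_columns(col_types: dict[int, str], col_paths: dict[int, list[str]]):
    """Hypothetical sketch: sort columns into data columns and foreign key columns."""
    data: dict[str, SimpleNamespace] = {}
    foreign: dict[str, SimpleNamespace] = {}
    for index, col_type in col_types.items():
        path = col_paths[index]
        entry = SimpleNamespace(index=index, path=path)
        key = ".".join(path)  # assumed stringification of a path
        if col_type == "FOREIGN":
            foreign[key] = entry
        elif col_type in ("SCALAR", "LIST"):  # assumed data column types
            data[key] = entry
    return data, foreign


types = {1: "FOREIGN", 2: "SCALAR", 3: "LIST"}
paths = {
    1: ["Training", "date"],
    2: ["Training", "coach", "name"],
    3: ["Training", "coach", "topics"],
}
data_cols, foreign_cols = split_columns(types, paths)
```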
@@ -198,20 +198,20 @@ out: dict[str, SimpleNamespace]
 def get_path_position(sheet: Worksheet) -> tuple[list[str], str]:
     """Return a path which represents the parent element, and the sheet's "proper name".

-For top-level sheets / entries (those without foreign columns), the path is an empty list.
+    For top-level sheets / entries (those without foreign columns), the path is an empty list.

-A sheet's "proper name" is detected from the data column paths: it is the first component after the
-parent components.
+    A sheet's "proper name" is detected from the data column paths: it is the first component after the
+    parent components.

-Returns
--------
-parent: list[str]
-    Path to the parent element. Note that there may be list elements on the path which are **not**
-    represented in this return value.
+    Returns
+    -------
+    parent: list[str]
+        Path to the parent element. Note that there may be list elements on the path which are **not**
+        represented in this return value.

-proper_name: str
-    The "proper name" of this sheet. This defines an array where all the data lives, relative to the
-    parent path.
+    proper_name: str
+        The "proper name" of this sheet. This defines an array where all the data lives, relative to the
+        parent path.
     """
     # Parent element: longest common path shared among any foreign column and all the data columns
     parent: list[str] = []
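The rule stated in the docstring and in the code comment (parent is the longest common path shared among any foreign column and all the data columns, proper name is the first component after it) can be sketched on plain path lists. `common_prefix` and `path_position` are hypothetical names; the sketch assumes at least one data column whose path is longer than the parent, and it is not the project's implementation, whose body is elided in this hunk.

```python
def common_prefix(a: list[str], b: list[str]) -> list[str]:
    """Return the leading components that two paths share."""
    prefix = []
    for x, y in zip(a, b):
        if x != y:
            break
        prefix.append(x)
    return prefix


def path_position(foreign_paths: list[list[str]],
                  data_paths: list[list[str]]) -> tuple[list[str], str]:
    """Hypothetical sketch of the parent / proper-name rule."""
    parent: list[str] = []
    for fpath in foreign_paths:
        candidate = fpath
        for dpath in data_paths:
            candidate = common_prefix(candidate, dpath)
        if len(candidate) > len(parent):
            parent = candidate  # keep the longest prefix found so far
    proper_name = data_paths[0][len(parent)]  # first component after the parent
    return parent, proper_name


# Top-level sheet: no foreign columns, so the parent path stays empty.
top = path_position([], [["Training", "url"], ["Training", "date"]])
# Exploded sheet: a foreign column points back into the parent entry.
exploded = path_position(
    [["Training", "date"]],
    [["Training", "coach", "name"], ["Training", "coach", "email"]],
)
```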
@@ -285,7 +285,7 @@ def is_exploded_sheet(sheet: Worksheet) -> bool:
     """Return True if this is an "exploded" sheet.

     An exploded sheet is a sheet whose data entries are LIST valued properties of entries in another
-    sheet. A sheet is detected as exploded iff it has FOREIGN columns.
+    sheet. A sheet is detected as exploded if it has FOREIGN columns.
     """
     column_types = _get_column_types(sheet)
     return ColumnType.FOREIGN.name in column_types.values()
@@ -308,22 +308,22 @@ def p2s(path: list[str]) -> str:
 def parse_multiple_choice(value: Any) -> bool:
     """Interpret ``value`` as a multiple choice input.

-*Truthy* values are:
-- The boolean ``True``.
-- The number "1".
-- The (case-insensitive) strings ``true``, ``wahr``, ``x``, ``√``, ``yes``, ``ja``, ``y``, ``j``.
-
-*Falsy* values are:
-- The boolean ``False``.
-- ``None``, empty strings, lists, dicts.
-- The number "0".
-- The (case-insensitive) strings ``false``, ``falsch``, ``-``, ``no``, ``nein``, ``n``.
-- Everything else.
-
-Returns
--------
-out: bool
-    The interpretation result of ``value``.
+    *Truthy* values are:
+    - The boolean ``True``.
+    - The number "1".
+    - The (case-insensitive) strings ``true``, ``wahr``, ``x``, ``√``, ``yes``, ``ja``, ``y``, ``j``.
+
+    *Falsy* values are:
+    - The boolean ``False``.
+    - ``None``, empty strings, lists, dicts.
+    - The number "0".
+    - The (case-insensitive) strings ``false``, ``falsch``, ``-``, ``no``, ``nein``, ``n``.
+    - Everything else.
+
+    Returns
+    -------
+    out: bool
+        The interpretation result of ``value``.
     """
     # Non-string cases first:
     # pylint: disable-next=too-many-boolean-expressions
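The truthy/falsy table in this docstring maps to a small decision function. The sketch below follows only what the docstring states; `parse_choice` and `TRUTHY_STRINGS` are hypothetical names, and the function body shown in the commit is elided, so the real implementation may differ (for example in how it treats surrounding whitespace, which this sketch does not strip).

```python
from typing import Any

# Truthy strings as listed in the docstring, compared case-insensitively.
TRUTHY_STRINGS = {"true", "wahr", "x", "√", "yes", "ja", "y", "j"}


def parse_choice(value: Any) -> bool:
    """Hypothetical sketch of the documented multiple-choice rules."""
    if isinstance(value, bool):  # check bool first, since bool is a subclass of int
        return value
    if isinstance(value, (int, float)):
        return value == 1  # only the number 1 counts as truthy
    if isinstance(value, str):
        return value.lower() in TRUTHY_STRINGS
    return False  # None, lists, dicts and everything else


checked = [parse_choice(v) for v in (True, 1, "JA", "x", "√")]
unchecked = [parse_choice(v) for v in (False, 0, None, "", [], {}, "nein", "-", 2)]
```

Because "everything else" is falsy, only the truthy set needs to be spelled out; the listed falsy strings fall through to `False` on their own.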
@@ -349,7 +349,7 @@ out: bool
 def read_or_dict(data: Union[dict, str, TextIO]) -> dict:
     """If data is a json file name or input stream, read data from there.
-If it is a dict already, just return it.
-"""
+    If it is a dict already, just return it.
+    """
     if isinstance(data, dict):
         return data
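Only the first branch of `read_or_dict` is visible in this hunk. A minimal sketch of the three cases named in the signature and docstring (dict, file name, input stream) could look like this; `load_json_or_dict` is a hypothetical name, and the UTF-8 encoding is an assumption that the source does not state.

```python
import io
import json
from typing import TextIO, Union


def load_json_or_dict(data: Union[dict, str, TextIO]) -> dict:
    """Hypothetical sketch: accept a dict, a JSON file name, or an open text stream."""
    if isinstance(data, dict):
        return data  # already parsed, return as is
    if isinstance(data, str):
        # Treat strings as file names; the encoding is an assumption.
        with open(data, encoding="utf-8") as infile:
            return json.load(infile)
    return json.load(data)  # anything else is treated as a readable stream


from_dict = load_json_or_dict({"Training": []})
from_stream = load_json_or_dict(io.StringIO('{"Training": [{"url": "example.com/m"}]}'))
```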