Project: caosdb / Software / caosdb-advanced-user-tools

Commit ba2c56a8
Authored 4 months ago by I. Nüske

STY: Docstring indentation

Parent: c6b1da47

2 merge requests:
!128 MNT: Added a warning when column metadata is not configured, and a better...
!120 XLSX converter: Better error message for incorrect type in a column, additional column

Showing 1 changed file with 99 additions and 100 deletions

src/caosadvancedtools/table_json_conversion/convert.py (+99 −100)
View file @ ba2c56a8

@@ -55,10 +55,9 @@ class ForeignError(KeyError):
class XLSXConverter:
    """Class for conversion from XLSX to JSON.

    For a detailed description of the required formatting of the XLSX files, see ``specs.md`` in the
    documentation.
    """
    PARSER: dict[str, Callable] = {
        "string": str,
        "number": float,

@@ -69,17 +68,17 @@ documentation.
    def __init__(self, xlsx: Union[str, BinaryIO], schema: Union[dict, str, TextIO],
                 strict: bool = False):
        """
        Parameters
        ----------
        xlsx: Union[str, BinaryIO]
          Path to the XLSX file or opened file object.

        schema: Union[dict, str, TextIO]
          Schema for validation of XLSX content.

        strict: bool, optional
          If True, fail faster.
        """
        self._workbook = load_workbook(xlsx)
        self._schema = read_or_dict(schema)
        self._defining_path_index = xlsx_utils.get_defining_paths(self._workbook)

@@ -91,20 +90,20 @@ strict: bool, optional
    def to_dict(self, validate: bool = False, collect_errors: bool = True) -> dict:
        """Convert the xlsx contents to a dict.

        Parameters
        ----------
        validate: bool, optional
          If True, validate the result against the schema.

        collect_errors: bool, optional
          If True, do not fail at the first error, but try to collect as many errors as possible. After an
          Exception is raised, the errors can be collected with ``get_errors()`` and printed with
          ``get_error_str()``.

        Returns
        -------
        out: dict
          A dict representing the JSON with the extracted data.
        """
        self._handled_sheets = set()
        self._result = {}
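
The `collect_errors` behaviour described in the docstring above (keep going after the first error, raise at the end, then inspect what was collected) can be illustrated with a toy example. This is only a sketch of the general pattern: `ToyConverter` and its `convert` method are hypothetical and not part of `caosadvancedtools`; only the method names `get_errors()` and `get_error_str()` are borrowed from the docstring, and their real signatures and return types may differ.

```python
class ToyConverter:
    """Hypothetical stand-in, NOT the real XLSXConverter."""

    def __init__(self) -> None:
        # Each entry is (cell index, exception raised for that cell).
        self._errors: list[tuple[int, Exception]] = []

    def convert(self, cells: list[str], collect_errors: bool = True) -> list[float]:
        """Convert all cells to float, optionally collecting errors first."""
        self._errors = []
        result = []
        for idx, cell in enumerate(cells):
            try:
                result.append(float(cell))
            except ValueError as exc:
                if not collect_errors:
                    raise  # Fail at the first error.
                self._errors.append((idx, exc))
        if self._errors:
            # Raise only after every cell has been looked at.
            raise RuntimeError(f"{len(self._errors)} cell(s) could not be converted")
        return result

    def get_errors(self) -> list[tuple[int, Exception]]:
        return list(self._errors)

    def get_error_str(self) -> str:
        return "\n".join(f"cell {idx}: {exc}" for idx, exc in self._errors)
```

A caller would wrap `convert()` in `try`/`except` and, in the `except` branch, print `get_error_str()` to see every failing cell in one go.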

@@ -177,17 +176,17 @@ out: dict
    def _handle_sheet(self, sheet: Worksheet, fail_later: bool = False) -> None:
        """Add the contents of the sheet to the result (stored in ``self._result``).

        Each row in the sheet corresponds to one entry in an array in the result. Which array exactly is
        defined by the sheet's "proper name" and the content of the foreign columns.

        Look at ``xlsx_utils.get_path_position`` for the specification of the "proper name".

        Parameters
        ----------
        fail_later: bool, optional
          If True, do not fail with unresolvable foreign definitions, but collect all errors.
        """
        row_type_column = xlsx_utils.get_row_type_column_index(sheet)
        foreign_columns = xlsx_utils.get_foreign_key_columns(sheet)
        foreign_column_paths = {col.index: col.path for col in foreign_columns.values()}

@@ -267,9 +266,9 @@ fail_later: bool, optional
    def _get_parent_dict(self, parent_path: list[str], foreign: list[list]) -> dict:
        """Return the dict into which values can be inserted.

        This method returns, from the current result-in-making, the entry at ``parent_path`` which matches
        the values given in the ``foreign`` specification.
        """
        foreign_groups = _group_foreign_paths(foreign, common=parent_path)

        current_object = self._result

@@ -296,9 +295,9 @@ the values given in the ``foreign`` specification.
    def _validate_and_convert(self, value: Any, path: list[str]):
        """Apply some basic validation and conversion steps.

        This includes:
        - Validation against the type given in the schema
        - List typed values are split at semicolons and validated individually
        """
        if value is None:
            return value
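
The second bullet of the docstring above (list-typed values are split at semicolons and handled item by item) might look roughly like the following. This is a sketch under assumptions, not the code from `convert.py`: `split_and_convert` is a hypothetical helper, the `PARSER` table here only copies the two entries visible in this diff, and stripping whitespace around each item is an assumption.

```python
from typing import Any, Callable, Optional

# Hypothetical copy of the two PARSER entries visible in the diff.
PARSER: dict[str, Callable] = {
    "string": str,
    "number": float,
}


def split_and_convert(value: Any, item_type: str) -> Optional[list]:
    """Split a semicolon-separated cell value and convert each item on its own."""
    if value is None:
        return value
    # Assumption: whitespace around each item is not significant.
    parts = [part.strip() for part in str(value).split(";")]
    # A failing item raises here, so each item is checked individually.
    return [PARSER[item_type](part) for part in parts]
```

With this sketch, a cell containing `1; 2.5;3` and item type `"number"` yields a list of three floats, while a cell such as `1;x` raises a `ValueError` for the second item.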

@@ -340,29 +339,29 @@ This includes:
def _group_foreign_paths(foreign: list[list], common: list[str]) -> list[SimpleNamespace]:
    """Group the foreign keys by their base paths.

    Parameters
    ----------
    foreign: list[list]
      A list of foreign definitions, consisting of path components, property and possibly value.

    common: list[list[str]]
      A common path which defines the final target of the foreign definitions. This helps to understand
      where the ``foreign`` paths shall be split.

    Returns
    -------
    out: list[dict[str, list[list]]]
      A list of foreign path segments, grouped by their common segments. Each element is a namespace
      with detailed information of all those elements which form the group. The namespace has the
      following attributes:

      - ``path``: The full path to this path segment. This is always the previous segment's ``path``
        plus this segment's ``subpath``.
      - ``stringpath``: The stringified ``path``, might be useful for comparison or sorting.
      - ``subpath``: The path, relative from the previous segment.
      - ``definitions``: A list of the foreign definitions for this segment, but stripped of the
        ``path`` components.
    """
    # Build a simple dict first, without subpath.
    results = {}

@@ -405,31 +404,31 @@ def _set_in_nested(mydict: dict, path: list, value: Any, prefix: list = [], skip
                   overwrite: bool = False, append_to_list: bool = False) -> dict:
    """Set a value in a nested dict.

    Parameters
    ----------
    mydict: dict
      The dict into which the ``value`` shall be inserted.
    path: list
      A list of keys, denoting the location of the value.
    value
      The value which shall be set inside the dict.
    prefix: list
      A list of keys which shall be removed from ``path``. A KeyError is raised if ``path`` does not
      start with the elements of ``prefix``.
    skip: int = 0
      Remove this many additional levels from the path, *after* removing the prefix.
    overwrite: bool = False
      If True, allow overwriting existing content. Otherwise, attempting to overwrite existing values
      leads to an exception.
    append_to_list: bool = False
      If True, assume that the element at ``path`` is a list and append the value to it. If the list
      does not exist, create it. If there is a non-list at ``path`` already, overwrite it with a new
      list, if ``overwrite`` is True, otherwise raise a ValueError.

    Returns
    -------
    mydict: dict
      The same dictionary that was given as a parameter, but modified.
    """
    for idx, el in enumerate(prefix):
        if path[idx] != el:
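
To make the parameter semantics in the docstring above concrete, here is a minimal re-implementation sketch written only from that docstring. It is not the body of `_set_in_nested` (the diff shows just its first two lines): the name `set_in_nested_sketch` is hypothetical, the use of `ValueError` for a refused overwrite is an assumption (the docstring only says "an exception"), and intermediate levels are assumed to be dicts.

```python
from typing import Any, Optional


def set_in_nested_sketch(mydict: dict, path: list, value: Any,
                         prefix: Optional[list] = None, skip: int = 0,
                         overwrite: bool = False, append_to_list: bool = False) -> dict:
    """Set ``value`` at ``path`` inside ``mydict``, following the documented rules."""
    prefix = prefix or []
    # The path must start with the prefix, which is then removed.
    if path[:len(prefix)] != prefix:
        raise KeyError(f"Path {path} does not start with prefix {prefix}")
    # Drop ``skip`` additional levels after the prefix.
    keys = path[len(prefix) + skip:]
    if not keys:
        raise ValueError("Nothing left of the path after removing prefix and skip")

    # Walk down to the parent of the target, creating dicts as needed.
    current = mydict
    for key in keys[:-1]:
        current = current.setdefault(key, {})

    last = keys[-1]
    if append_to_list:
        existing = current.get(last)
        if isinstance(existing, list):
            existing.append(value)
        elif last not in current or overwrite:
            current[last] = [value]  # Create the list, or replace a non-list.
        else:
            raise ValueError(f"Non-list already present at {keys}")
    else:
        if last in current and not overwrite:
            # Assumption: the exception type for this case is not documented.
            raise ValueError(f"Value already present at {keys}")
        current[last] = value
    return mydict
```

For instance, with `path=["a", "b", "c"]` and `prefix=["a"]` the value ends up at `mydict["b"]["c"]`; adding `skip=1` places it at `mydict["c"]` instead.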

@@ -473,25 +472,25 @@ def to_dict(xlsx: Union[str, BinaryIO], schema: Union[dict, str, TextIO],
            validate: bool = None, strict: bool = False) -> dict:
    """Convert the xlsx contents to a dict, it must follow a schema.

    Parameters
    ----------
    xlsx: Union[str, BinaryIO]
      Path to the XLSX file or opened file object.

    schema: Union[dict, str, TextIO]
      Schema for validation of XLSX content.

    validate: bool, optional
      If True, validate the result against the schema.

    strict: bool, optional
      If True, fail faster.

    Returns
    -------
    out: dict
      A dict representing the JSON with the extracted data.
    """
    converter = XLSXConverter(xlsx, schema, strict=strict)
    return converter.to_dict()