Commit 3c32b1f0 authored by Florian Spreckelsen

FEAT(scanner): Auto-generate converter_registry and registered_transformer_functions

parent 4e021ace
2 merge requests: !222 Release 0.12.0, !221 F auto converter and transformer registry
Pipeline #62215 passed
@@ -11,11 +11,17 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ### Changed ###
+- `scanner.scan_structure_elements` now auto-generates the
+  `converter_registry` and the `registered_transformer_functions` from
+  the `crawler_definition` if none are given. Therefore, the
+  `converter_registry` argument is now optional.
 ### Deprecated ###
 ### Removed ###
 ### Fixed ###
 - A RecordType with multiple Parents no longer causes an error during
   collection of identifiables
......
@@ -486,7 +486,7 @@ def scan_directory(dirname: Union[str, list[str]], crawler_definition_path: str,
 def scan_structure_elements(items: Union[list[StructureElement], StructureElement],
                             crawler_definition: dict,
-                            converter_registry: dict,
+                            converter_registry: Optional[dict] = None,
                             restricted_path: Optional[list[str]] = None,
                             debug_tree: Optional[DebugTree] = None,
                             registered_transformer_functions: Optional[dict] = None) -> (
@@ -508,6 +508,15 @@ def scan_structure_elements(items: Union[list[StructureElement], StructureElemen
         Traverse the data tree only along the given path. When the end of the
         given path is reached, traverse the full tree as normal. See docstring
         of 'scanner' for more details.
+    converter_registry: dict, optional
+        Optional dictionary containing the converter definitions
+        needed for the crawler definition. If none is given, it will
+        be generated from the `crawler_definition`. Default is None.
+    registered_transformer_functions: dict, optional
+        Optional dictionary containing the transformer function
+        definitions needed for the crawler definition. If none is
+        given, it will be generated from the
+        `crawler_definition`. Default is None.
     Returns
     -------
@@ -519,6 +528,10 @@ def scan_structure_elements(items: Union[list[StructureElement], StructureElemen
     if not isinstance(items, list):
         items = [items]
+    if converter_registry is None:
+        converter_registry = create_converter_registry(crawler_definition)
+    if registered_transformer_functions is None:
+        registered_transformer_functions = create_transformer_registry(crawler_definition)
     # TODO: needs to be covered somewhere else
     # self.run_id = uuid.uuid1()
     converters = initialize_converters(crawler_definition, converter_registry)
......
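The fallback added in this commit can be illustrated with a small, self-contained sketch. It does not import the crawler package. `build_converter_registry`, `build_transformer_registry` and `scan_sketch` are hypothetical stand-ins for `create_converter_registry`, `create_transformer_registry` and `scan_structure_elements`, and the `"Converters"` / `"Transformers"` keys are an assumed layout of the definition dict, used here only for illustration. The point is the `is None` check: a registry is generated only when the caller passes nothing, so an explicitly passed registry (even an empty one) is left untouched.

```python
from typing import Optional


def build_converter_registry(definition: dict) -> dict:
    """Hypothetical stand-in for create_converter_registry."""
    # Assumption: converter entries live under a "Converters" key.
    return dict(definition.get("Converters", {}))


def build_transformer_registry(definition: dict) -> dict:
    """Hypothetical stand-in for create_transformer_registry."""
    # Assumption: transformer functions live under a "Transformers" key.
    return dict(definition.get("Transformers", {}))


def scan_sketch(definition: dict,
                converter_registry: Optional[dict] = None,
                registered_transformer_functions: Optional[dict] = None
                ) -> tuple[dict, dict]:
    """Mirror the None-default fallback from the diff and return both registries."""
    # Generate a registry only if the caller did not supply one.
    if converter_registry is None:
        converter_registry = build_converter_registry(definition)
    if registered_transformer_functions is None:
        registered_transformer_functions = build_transformer_registry(definition)
    return converter_registry, registered_transformer_functions


# Illustrative definition; the entries are made up for this sketch.
definition = {
    "Converters": {"MyConverter": "my_package.MyConverter"},
    "Transformers": {"upper": str.upper},
}

# No registries passed: both are derived from the definition.
auto_conv, auto_trans = scan_sketch(definition)

# An explicit (here: empty) registry is respected and not regenerated.
explicit_conv, _ = scan_sketch(definition, converter_registry={})
```

Testing against `None` rather than truthiness keeps the old call style working unchanged: callers that already build and pass their own registry get exactly the same behaviour as before.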