Commit 3c836b10 authored by Alexander Schlemmer

Merge branch 'f-rocrate-documentation' into 'dev'

Documentation for ROCrateConverter

See merge request !209
parents 0970cc2b ca9c9d25
2 merge requests: !217 TST: Make NamedTemporaryFiles Windows-compatible, !209 Documentation for ROCrateConverter
Pipeline #60625 passed
@@ -52,6 +52,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
### Security ###
### Documentation ###
- Added documentation for ROCrateConverter, ELNFileConverter, and ROCrateEntityConverter
## [0.10.1] - 2024-11-13 ##
......
@@ -98,3 +98,90 @@ given ``recordname``, this record can be used within the cfood. Most
importantly, this record stores the internal path of this array within the HDF5
file in a text property, the name of which can be configured with the
``internal_path_property_name`` option which defaults to ``internal_hdf5_path``.

ROCrateConverter
================

The ROCrateConverter unpacks RO-Crate files and creates one instance of the
``ROCrateEntity`` structure element for each contained object. Currently, only
zipped RO-Crate files are supported. Each created ROCrateEntity wraps a
``rocrate.model.entity.Entity`` together with the path to the folder in which
the RO-Crate data is saved. The entities are appended as children and can then
be accessed via the subtree and processed with the
:ref:`ROCrateEntityConverter`.

To use the ROCrateConverter, you need to install the LinkAhead crawler with its
optional ``rocrate`` dependency.

ELNFileConverter
----------------

As .eln files are zipped RO-Crate files, the ELNFileConverter works analogously
to the ROCrateConverter and also creates ROCrateEntities for the contained
objects.
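
To make the "zipped RO-Crate" statement concrete, the following sketch opens a zip archive with the Python standard library and lists the entities in the ``@graph`` of its ``ro-crate-metadata.json``. It does not use the crawler or the ``rocrate`` package; the helper ``list_crate_entities`` and the sample archive content are hypothetical and only serve as an illustration.

```python
import io
import json
import zipfile


def list_crate_entities(archive):
    """Return (@id, @type) pairs from the first ro-crate-metadata.json in a zip.

    Hypothetical helper for illustration; not part of the crawler.
    """
    with zipfile.ZipFile(archive) as zf:
        # In .eln files the metadata file may sit inside a top-level folder,
        # so search by file name instead of assuming a fixed path.
        name = next(n for n in zf.namelist()
                    if n.endswith("ro-crate-metadata.json"))
        with zf.open(name) as fh:
            metadata = json.load(fh)
    return [(e.get("@id"), e.get("@type")) for e in metadata.get("@graph", [])]


# Build a tiny in-memory archive with made-up content.
sample = {
    "@context": "https://w3id.org/ro/crate/1.1/context",
    "@graph": [
        {"@id": "ro-crate-metadata.json", "@type": "CreativeWork"},
        {"@id": "./", "@type": "Dataset"},
    ],
}
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("example/ro-crate-metadata.json", json.dumps(sample))
buffer.seek(0)

entities = list_crate_entities(buffer)
print(entities)
```

Each pair printed here corresponds roughly to one object for which the converters would create a ROCrateEntity.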

ROCrateEntityConverter
----------------------

The ROCrateEntityConverter unpacks the ``rocrate.model.entity.Entity`` wrapped
within a ROCrateEntity and appends all properties, contained files, and parts
as children. Properties are converted to a basic element matching their value
(``BooleanElement``, ``IntegerElement``, etc.) and can be matched using
``match_properties``. Each ``rocrate.model.file.File`` is converted to a crawler
File object, which can be matched with ``SimpleFile``. Each subpart of the
ROCrateEntity is in turn converted to a ROCrateEntity, which can then be
processed by this converter again.
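
The idea of choosing a basic element by the type of a property value can be sketched as follows. This is not the crawler's implementation: ``element_kind`` is a hypothetical helper that only returns names as strings, and ``FloatElement`` and ``TextElement`` are assumed names that the text above does not mention.

```python
def element_kind(value):
    """Return the name of a basic element kind for a property value.

    Hypothetical illustration; the crawler's real dispatch may differ.
    """
    # bool must be tested before int, because bool is a subclass of int.
    if isinstance(value, bool):
        return "BooleanElement"
    if isinstance(value, int):
        return "IntegerElement"
    if isinstance(value, float):
        return "FloatElement"  # assumed name, not taken from the text above
    if isinstance(value, str):
        return "TextElement"  # assumed name, not taken from the text above
    return "other"


for sample in (True, 3, 2.5, "2024-11-13"):
    print(repr(sample), "->", element_kind(sample))
```

The ordering of the checks matters: testing ``int`` first would classify ``True`` as an integer.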

The ``match_entity_type`` keyword can be used to match a ROCrateEntity by its
``entity_type``. With the ``match_properties`` keyword, properties of a
ROCrateEntity can be either matched or extracted, as seen in the cfood example
below:

* With ``match_properties: "@id": ro-crate-metadata.json``, the ROCrateEntities
  are filtered so that only the metadata JSON files match.
* With ``match_properties: dateCreated: (?P<dateCreated>.*)``, the ``dateCreated``
  entry of that metadata JSON file is extracted and made accessible through the
  ``dateCreated`` variable.
* The example could then be extended to use any other entry present in the
  metadata JSON to filter the results, or to insert the extracted information
  into generated records.
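
The filter-and-extract behaviour described in the list can be imitated with plain regular expressions. The sketch below is not the crawler's code: ``match_properties`` here is a hypothetical stand-alone function, the entities are made-up dictionaries, and the use of ``re.fullmatch`` on the string form of each value is an assumption about the matching semantics.

```python
import re


def match_properties(entity, patterns):
    """Return captured variables if every pattern matches, else None.

    Hypothetical helper for illustration; not part of the crawler API.
    """
    variables = {}
    for key, pattern in patterns.items():
        if key not in entity:
            return None
        # Assumption: the whole value must match the pattern.
        m = re.fullmatch(pattern, str(entity[key]))
        if m is None:
            return None
        # Named groups such as (?P<dateCreated>...) become variables.
        variables.update(m.groupdict())
    return variables


# Made-up sample entities.
metadata_entity = {
    "@id": "ro-crate-metadata.json",
    "@type": "CreativeWork",
    "dateCreated": "2024-11-13T10:00:00+00:00",
}
other_entity = {"@id": "./", "@type": "Dataset"}

patterns = {
    "@id": "ro-crate-metadata.json",
    "dateCreated": r"(?P<dateCreated>.*)",
}

print(match_properties(metadata_entity, patterns))
print(match_properties(other_entity, patterns))
```

A pattern without a named group, such as the one for ``@id``, only filters. A pattern with a named group also contributes a variable.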

Example cfood
-------------

A short cfood that generates records for each .eln file in a directory and
for their metadata files could be:

.. code-block:: yaml

   ---
   metadata:
     crawler-version: 0.9.0
   ---
   Converters:
     ELNFile:
       converter: ELNFileConverter
       package: caoscrawler.converters.rocrate
     ROCrateEntity:
       converter: ROCrateEntityConverter
       package: caoscrawler.converters.rocrate

   ParentDirectory:
     type: Directory
     match: (.*)
     subtree:
       ELNFile:
         type: ELNFile
         match: (?P<filename>.*)\.eln
         records:
           ELNExampleRecord:
             filename: $filename
         subtree:
           ROCrateEntity:
             type: ROCrateEntity
             match_properties:
               "@id": ro-crate-metadata.json
               dateCreated: (?P<dateCreated>.*)
             records:
               MDExampleRecord:
                 parent: $ELNFile
                 filename: ro-crate-metadata.json
                 time: $dateCreated