CaosDB Crawler
Commit 0186c0ee, authored 3 months ago by Florian Spreckelsen
Merge branch 'dev' into f-fix-rocrate
Parents: dd3f75bf, 3c836b10
2 merge requests:
!217 TST: Make NamedTemporaryFiles Windows-compatible
!215 Fix issues in rocrate support
Pipeline #60539 passed 3 months ago (stages: info, setup, cert, style, test)
Showing 2 changed files with 88 additions and 0 deletions:
CHANGELOG.md (1 addition, 0 deletions)
src/doc/converters/further_converters.rst (87 additions, 0 deletions)
CHANGELOG.md (+1, -0)
@@ -55,6 +55,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
### Security ###
### Documentation ###
- Added documentation for ROCrateConverter, ELNFileConverter, and ROCrateEntityConverter
## [0.10.1] - 2024-11-13 ##
src/doc/converters/further_converters.rst (+87, -0)
@@ -98,3 +98,90 @@ given ``recordname``, this record can be used within the cfood. Most
importantly, this record stores the internal path of this array within the HDF5
file in a text property, the name of which can be configured with the
``internal_path_property_name`` option which defaults to ``internal_hdf5_path``.

ROCrateConverter
================

The ROCrateConverter unpacks RO-Crate files and creates one instance of the
``ROCrateEntity`` structure element for each contained object. Currently, only
zipped RO-Crate files are supported. Each created ROCrateEntity wraps a
``rocrate.model.entity.Entity`` together with the path to the folder in which
the RO-Crate data is saved. The ROCrateEntities are appended as children, so
they can be accessed via the subtree and treated using the
:ref:`ROCrateEntityConverter`.

To use the ROCrateConverter, you need to install the LinkAhead crawler with its
optional ``rocrate`` dependency.

ELNFileConverter
----------------

As .eln files are zipped ro-crate files, the ELNFileConverter works analogously
to the ROCrateConverter and also creates ROCrateEntities for contained objects.
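
To get a feeling for what such a zipped archive contains, the following is a
minimal sketch that uses only the Python standard library, not the crawler or
the ``rocrate`` package. It assumes that the archive holds a
``ro-crate-metadata.json`` file whose ``@graph`` list describes the contained
entities. The helper names ``list_entities`` and ``build_example_archive`` and
the example content are made up for illustration.

```python
import io
import json
import zipfile

METADATA_NAME = "ro-crate-metadata.json"


def list_entities(eln_bytes: bytes) -> list:
    """Return the "@graph" entries of the first metadata file in a zipped crate."""
    with zipfile.ZipFile(io.BytesIO(eln_bytes)) as archive:
        # Assumption: the metadata file may sit inside a folder, so match by suffix.
        candidates = [
            name for name in archive.namelist()
            if name == METADATA_NAME or name.endswith("/" + METADATA_NAME)
        ]
        if not candidates:
            raise ValueError("no ro-crate-metadata.json found in archive")
        with archive.open(candidates[0]) as handle:
            metadata = json.load(handle)
    return metadata.get("@graph", [])


def build_example_archive() -> bytes:
    """Create a tiny in-memory archive with made-up content for demonstration."""
    graph = {"@graph": [
        {"@id": "ro-crate-metadata.json", "@type": "CreativeWork",
         "dateCreated": "2024-11-13"},
        {"@id": "./", "@type": "Dataset"},
    ]}
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("example/ro-crate-metadata.json", json.dumps(graph))
    return buffer.getvalue()


if __name__ == "__main__":
    for entity in list_entities(build_example_archive()):
        print(entity["@id"], entity["@type"])
```

Each dictionary returned here roughly corresponds to one object for which the
converters would create a ROCrateEntity.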

ROCrateEntityConverter
----------------------

The ROCrateEntityConverter unpacks the ``rocrate.model.entity.Entity`` wrapped
within a ROCrateEntity and appends all of its properties, contained files, and
parts as children. Properties are converted to a basic element matching their
value (``BooleanElement``, ``IntegerElement``, etc.) and can be matched using
``match_properties``. Each ``rocrate.model.file.File`` is converted to a crawler
File object, which can be matched with ``SimpleFile``. Each subpart of the
ROCrateEntity is in turn converted to a ROCrateEntity, which can again be
treated using this converter.
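
The choice of a basic element by value type can be pictured with the following
sketch. It is not the crawler's implementation: the function
``basic_element_name`` is hypothetical and returns plain strings, and only
``BooleanElement`` and ``IntegerElement`` are named in the text above. The
names ``FloatElement`` and ``TextElement`` are assumptions standing in for the
"etc.".

```python
def basic_element_name(value) -> str:
    """Pick an element name for a property value (illustrative only)."""
    # bool must be tested before int because bool is a subclass of int.
    if isinstance(value, bool):
        return "BooleanElement"
    if isinstance(value, int):
        return "IntegerElement"
    # Assumed names, extending the "etc." in the description above.
    if isinstance(value, float):
        return "FloatElement"
    if isinstance(value, str):
        return "TextElement"
    raise TypeError(f"no basic element assumed for {type(value).__name__}")


if __name__ == "__main__":
    for example in (True, 3, 2.5, "2024-11-13"):
        print(repr(example), "->", basic_element_name(example))
```

The ordering of the checks matters in Python: testing ``int`` first would
classify ``True`` and ``False`` as integers.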

The ``match_entity_type`` keyword can be used to match a ROCrateEntity using its
``entity_type``. With the ``match_properties`` keyword, properties of a
ROCrateEntity can be either matched or extracted, as seen in the cfood example
below:

* With ``match_properties: "@id": ro-crate-metadata.json``, the ROCrateEntities
  can be filtered to match only the metadata JSON files.
* With ``match_properties: dateCreated: (?P<dateCreated>.*)``, the
  ``dateCreated`` entry of that metadata JSON file is extracted and made
  accessible through the ``dateCreated`` variable.
* The example could then be extended to use any other entry present in the
  metadata JSON to filter the results, or to insert the extracted information
  into generated records.
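
The filtering and extraction described in the list above rely on regular
expressions with named groups. The following sketch shows the idea with
Python's ``re`` module only. The function below is a hypothetical stand-in, not
the crawler's code, and it assumes that values are compared as strings and that
a pattern must match the whole value. The crawler's own matching may differ in
such details.

```python
import re
from typing import Optional


def match_properties(properties: dict, patterns: dict) -> Optional[dict]:
    """Return captured variables if every pattern matches, else None."""
    variables = {}
    for key, pattern in patterns.items():
        if key not in properties:
            return None
        # Assumption: compare as strings and require the whole value to match.
        found = re.fullmatch(pattern, str(properties[key]))
        if found is None:
            return None
        # Named groups such as (?P<dateCreated>...) become variables.
        variables.update(found.groupdict())
    return variables


# Made-up example entities for illustration.
metadata_entity = {"@id": "ro-crate-metadata.json", "dateCreated": "2024-11-13"}
other_entity = {"@id": "./", "dateCreated": "2024-11-13"}
patterns = {"@id": "ro-crate-metadata.json", "dateCreated": "(?P<dateCreated>.*)"}

print(match_properties(metadata_entity, patterns))  # {'dateCreated': '2024-11-13'}
print(match_properties(other_entity, patterns))     # None
```

Only the entity whose ``@id`` matches passes the filter, and the named group
yields the value that a cfood could then reference as ``$dateCreated``.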

Example cfood
-------------

A short cfood that generates records for each .eln file in a directory and for
its metadata file could be:

.. code-block:: yaml

   ---
   metadata:
     crawler-version: 0.9.0
   ---
   # Register the converters from the rocrate module under short names.
   Converters:
     ELNFile:
       converter: ELNFileConverter
       package: caoscrawler.converters.rocrate
     ROCrateEntity:
       converter: ROCrateEntityConverter
       package: caoscrawler.converters.rocrate

   ParentDirectory:
     type: Directory
     match: (.*)
     subtree:
       # Match every .eln file and capture its base name as $filename.
       ELNFile:
         type: ELNFile
         match: (?P<filename>.*)\.eln
         records:
           ELNExampleRecord:
             filename: $filename
         subtree:
           # Keep only the metadata entity and extract its dateCreated entry.
           ROCrateEntity:
             type: ROCrateEntity
             match_properties:
               "@id": ro-crate-metadata.json
               dateCreated: (?P<dateCreated>.*)
             records:
               MDExampleRecord:
                 parent: $ELNFile
                 filename: ro-crate-metadata.json
                 time: $dateCreated