CaosDB Crawler: commit 56b73398, authored 1 year ago by Florian Spreckelsen

DOC: Explain datamodel requirements

parent f3e8ec1e

2 merge requests: !160 STY: styling, !143 ENH: HDF5 Converter

Pipeline #47453 passed 1 year ago (stages: info, setup, cert, style, test)

Showing 1 changed file: src/doc/converters.rst, with 28 additions and 1 deletion

@@ -243,7 +243,34 @@ arrays that are in turn treated by the :ref:`H5GroupConverter`, the
need to install the LinkAhead crawler with its optional ``h5crawler`` dependency
for using these converters.

The basic idea when crawling HDF5 files is to treat them very similarly to
:ref:`dictionaries <DictElement Converter>` in which the attributes on root,
group, or dataset level are essentially treated like ``BooleanElement``,
``TextElement``, ``FloatElement``, and ``IntegerElement`` in a dictionary: They
are appended as children and can be accessed via the ``subtree``. The file
itself and the groups within may contain further groups and datasets, which can
have their own attributes, subgroups, and datasets, very much like
``DictElements`` within a dictionary. The main difference from any other
dictionary type is the presence of multi-dimensional arrays within HDF5
datasets. Since LinkAhead doesn't have any datatype corresponding to these, and
since it isn't desirable to store these arrays directly within LinkAhead for
reasons of performance and of searchability, we wrap them within a specific
Record as explained :ref:`below <H5NdarrayConverter>`, together with more
metadata and their internal path within the HDF5 file. Users can thus query for
datasets and their arrays according to their metadata within LinkAhead and then
use the internal path information to access the dataset within the file
directly. The type of this record and the property for storing the internal path
need to be reflected in the datamodel. Using the default names, you would need a
datamodel like

.. code-block:: yaml

   H5Ndarray:
     obligatory_properties:
       internal_hdf5-path:
         datatype: TEXT

although the names of both property and record type can be configured within the
cfood definition.
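
The wrapping idea described above can be illustrated with a small, library-free
sketch. This is not the crawler's implementation: a nested Python dict stands in
for an HDF5 file, plain lists stand in for array data, and the helper names
``collect_arrays`` and ``resolve`` as well as the ``attrs``/``children``/``array``
keys are hypothetical. Only the record type name ``H5Ndarray`` and the property
name ``internal_hdf5-path`` are taken from the default datamodel.

```python
# Conceptual sketch only. A nested dict stands in for an HDF5 file; the keys
# "attrs", "children" and "array" and both function names are assumptions
# made for illustration, not part of the LinkAhead crawler API.

def collect_arrays(node, path="/"):
    """Return one record-like dict per dataset that carries array data."""
    records = []
    if "array" in node:
        # The array itself is not copied into the record; only the record
        # type and the internal path needed to find it again are kept.
        records.append({
            "record_type": "H5Ndarray",
            "internal_hdf5-path": path,
        })
    for name, child in node.get("children", {}).items():
        # Groups and datasets are traversed like nested dictionaries.
        child_path = path.rstrip("/") + "/" + name
        records.extend(collect_arrays(child, child_path))
    return records


def resolve(tree, internal_path):
    """Follow an internal path back to the array it points to."""
    node = tree
    for part in internal_path.strip("/").split("/"):
        if part:
            node = node["children"][part]
    return node["array"]


fake_file = {
    "attrs": {"title": "example"},                  # root-level attribute
    "children": {
        "measurements": {                           # group
            "attrs": {"run": 3},                    # group-level attribute
            "children": {
                "temperature": {                    # dataset
                    "attrs": {"unit": "K"},         # dataset-level attribute
                    "array": [[290.1, 291.4], [289.8, 290.0]],
                },
            },
        },
    },
}

records = collect_arrays(fake_file)
array = resolve(fake_file, records[0]["internal_hdf5-path"])
```

In this sketch a query result only needs to carry the stored path; the array
data stays in the file and is looked up again on demand, which mirrors the
reasoning given above for not storing arrays directly.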

H5FileConverter
---------------

@@ -267,7 +294,7 @@ H5DatasetConverter
This is an extension of the
:py:class:`~caoscrawler.converters.DictElementConverter` class. Most
importantly, it stores the array data in HDF5 dataset into
-:py:class:`~caoscrawler.hdf5_converters.H5NdarrayElement` which is added to its
+:py:class:`~caoscrawler.hdf5_converter.H5NdarrayElement` which is added to its
children, as well as the dataset attributes.

H5NdarrayConverter