@@ -8,10 +8,6 @@ existing StructureElements, Converters create a tree of StructureElements.
.. image:: img/converter.png
   :height: 170
-The ``cfood.yml`` definition also describes which
-Converters shall be used to treat the generated child StructureElements. The
-definition therefore itself also defines a tree.
Each StructureElement in the tree has a set of properties, organized as
key-value pairs.
Some of those properties are specified by the type of StructureElement. For example,
...
@@ -19,15 +15,18 @@ a file could have the file name as property: ``'filename': myfile.dat``.
Converters may define additional functions that create further values. For
example, a regular expression could be used to get a date from a file name.
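
To illustrate the last point, here is a minimal Python sketch of getting a date from a file name with a regular expression. The file name layout (an ISO date such as ``2023-04-17`` somewhere in the name) and the helper name ``extract_date`` are assumptions made for this example; they are not part of caoscrawler.

```python
import re
from datetime import date

# Assumed layout: an ISO date (YYYY-MM-DD) somewhere in the file name.
DATE_PATTERN = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")


def extract_date(filename):
    """Return the first ISO date found in ``filename``, or None.

    Hypothetical helper for illustration only.
    """
    match = DATE_PATTERN.search(filename)
    if match is None:
        return None
    try:
        return date(int(match["year"]), int(match["month"]), int(match["day"]))
    except ValueError:
        # The digits matched the pattern but do not form a valid calendar date.
        return None


print(extract_date("2023-04-17_measurement.dat"))
print(extract_date("myfile.dat"))
```

The extracted value could then be stored next to properties such as ``filename`` as one more key-value pair.
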
CFood definition
++++++++++++++++
-A converter is defined via a yml file or part of it. The definition states
-what kind of StructureElement it treats (typically one).
-Also, it defines how children of the current StructureElement are
-created and what Converters shall be used to treat those.
+Converter application to data is specified via a tree-like yml file (called ``cfood.yml``, by
+convention). The yml file specifies which Converters shall be used on which StructureElements, and
+how to treat the generated *child* StructureElements.
The yaml definition may look like this:
-TODO: outdated, see cfood-schema.yml
+.. todo::
+
+   This is outdated, see ``cfood-schema.yml`` for the current specification of a ``cfood.yml``.
.. code-block:: yaml
...
@@ -47,13 +46,16 @@ TODO: outdated, see cfood-schema.yml
subtree:
(...)
-The **<NodeName>** is a description of what it represents (e.g.
-'experiment-folder') and is used as identifier.
+The **<NodeName>** is a description of what the current block represents (e.g.
+``experiment-folder``) and is used as an identifier.
**<type>** selects the converter that is going to be matched against the current structure
element. If the structure element matches (this is a combination of a typecheck and a detailed
-match, see :py:class:`~caoscrawler.converters.Converter` for details) the converter is used
-to generate records (see :py:meth:`~caoscrawler.converters.Converter.create_records`) and to possibly process a subtree, as defined by the function :func:`caoscrawler.converters.create_children`.
+match, see the :py:class:`~caoscrawler.converters.Converter` source documentation for details), the
+converter will:
+
+- generate records (with :py:meth:`~caoscrawler.converters.Converter.create_records`)
+- possibly process a subtree (with :py:meth:`caoscrawler.converters.Converter.create_children`)
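
The match-then-process flow described here can be sketched in plain Python. This is a toy model, not the caoscrawler implementation: ``ToyElement``, ``ToyConverter`` and ``crawl`` are hypothetical names, records are plain dicts, and the method signatures only borrow the names ``create_records`` and ``create_children`` from the text.

```python
import re


class ToyElement:
    """Hypothetical stand-in for a StructureElement: a name plus child elements."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)


class ToyConverter:
    """Hypothetical stand-in for one converter node of a cfood definition."""

    def __init__(self, node_name, element_type, pattern, subtree=()):
        self.node_name = node_name          # plays the role of <NodeName>
        self.element_type = element_type    # used for the typecheck
        self.pattern = re.compile(pattern)  # used for the detailed match
        self.subtree = list(subtree)        # converters applied to the children

    def match(self, element):
        # Typecheck first, then the detailed match on the element name.
        if not isinstance(element, self.element_type):
            return None
        return self.pattern.fullmatch(element.name)

    def create_records(self, element, match):
        # A plain dict stands in for a record in this sketch.
        return [{"node": self.node_name, "name": element.name, **match.groupdict()}]

    def create_children(self, element):
        return element.children


def crawl(element, converters):
    """Apply every matching converter, then descend into its subtree."""
    records = []
    for converter in converters:
        match = converter.match(element)
        if match is None:
            continue
        records.extend(converter.create_records(element, match))
        for child in converter.create_children(element):
            records.extend(crawl(child, converter.subtree))
    return records


data_file = ToyConverter("data-file", ToyElement, r"(?P<run>\d+)\.dat")
folder = ToyConverter("experiment-folder", ToyElement, r"exp_(?P<label>\w+)", [data_file])

tree = ToyElement("exp_alpha", [ToyElement("001.dat"), ToyElement("notes.txt")])
records = crawl(tree, [folder])
for record in records:
    print(record)
```

Only elements that pass both checks produce records, and their children are handed to the converters listed in that node's subtree, which is why the definition itself forms a tree.
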