F dict heuristic

Merged Florian Spreckelsen requested to merge f-dict-heuristic into dev
@@ -245,6 +245,110 @@ CSVTableConverter
CSV File → DictElement

PropertiesFromDictConverter
===========================

The :py:class:`~caoscrawler.converters.PropertiesFromDictConverter` is
a specialization of the
:py:class:`~caoscrawler.converters.DictElementConverter` and offers
all its functionality. It is meant to operate on dictionaries (e.g.,
from reading in a JSON or a table file) whose keys correspond
closely to LinkAhead properties. This is especially handy in cases
where properties may be added to the data model and data sources after
the CFood definition has been written.

The converter definition of the
:py:class:`~caoscrawler.converters.PropertiesFromDictConverter` has an
additional required entry ``record_from_dict`` which specifies the
Record to which the properties extracted from the dict are
attached. This Record is identified by its ``variable_name``, by which it can
be referred to further down the subtree. You can also use the name of
a Record that was specified earlier in the CFood definition in order
to extend it by the properties extracted from a dict. Let's have a
look at a simple example. A CFood definition

.. code-block:: yaml

   PropertiesFromDictElement:
     type: PropertiesFromDictElement
     match: ".*"
     record_from_dict:
       variable_name: MyRec
       parents:
       - MyType1
       - MyType2

applied on a dictionary

.. code-block:: json

   {
     "name": "New name",
     "a": 5,
     "b": ["a", "b", "c"],
     "author": {
       "full_name": "Silvia Scientist"
     }
   }

will create a Record ``New name`` with parents ``MyType1`` and
``MyType2``. It has a scalar property ``a`` with value 5, a list
property ``b`` with values "a", "b" and "c", and an ``author``
property which references an ``author`` with a ``full_name`` property
with value "Silvia Scientist". Note how the different dictionary keys
are handled differently depending on their types: scalar and list
values are understood automatically, and a dictionary-valued entry
like ``author`` is translated into a reference to an ``author`` Record
automatically.
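
The heuristic described above can be pictured with a small plain-Python
sketch. It is not the converter's implementation: the function
``sketch_record_from_dict`` is hypothetical, records are modelled as
ordinary dictionaries instead of LinkAhead Records, and the rule that a
referenced record gets the dictionary key as its parent is only our
reading of the example above.

.. code-block:: python

   from typing import Any, Dict, List


   def sketch_record_from_dict(data: Dict[str, Any],
                               parents: List[str]) -> Dict[str, Any]:
       """Hypothetical helper: mimic the heuristic with plain dicts."""
       record: Dict[str, Any] = {
           "name": data.get("name"),  # the "name" key becomes the record name
           "parents": list(parents),
           "properties": {},
       }
       for key, value in data.items():
           if key == "name":
               continue
           if isinstance(value, dict):
               # Dictionary-valued entry: becomes a referenced record.
               # Assumption: its parent defaults to the key itself.
               record["properties"][key] = sketch_record_from_dict(value, [key])
           else:
               # Scalars and lists are stored unchanged.
               record["properties"][key] = value
       return record


   example = {
       "name": "New name",
       "a": 5,
       "b": ["a", "b", "c"],
       "author": {"full_name": "Silvia Scientist"},
   }
   rec = sketch_record_from_dict(example, ["MyType1", "MyType2"])

Here ``rec["properties"]["author"]`` is itself a nested record-like
dictionary, which stands in for the reference created by the converter.
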

You can further specify how references are treated with an optional
``references`` key in ``record_from_dict``. Let's assume that in the
above example, we have an ``author`` **Property** with datatype
``Person`` in our data model. We could add this information by
extending the above example definition by

.. code-block:: yaml

   PropertiesFromDictElement:
     type: PropertiesFromDictElement
     match: ".*"
     record_from_dict:
       variable_name: MyRec
       parents:
       - MyType1
       - MyType2
       references:
         author:
           parents:
           - Person

so that now, a ``Person`` record with a ``full_name`` property with
value "Silvia Scientist" is created as the value of the ``author``
property.

Properties can be blacklisted with the ``properties_blacklist``
keyword. Since the
:py:class:`~caoscrawler.converters.PropertiesFromDictConverter` has
all the functionality of the
:py:class:`~caoscrawler.converters.DictElementConverter`, individual
properties can still be used in a subtree. Together with
``properties_blacklist``, this allows custom treatment of
specific properties: blacklist them in ``record_from_dict`` and
then handle them in the subtree just as you would with the
standard :py:class:`~caoscrawler.converters.DictElementConverter`.
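
To make the effect of ``references`` and ``properties_blacklist`` more
concrete, here is a plain-Python sketch under stated assumptions. The
function ``sketch_with_options`` is hypothetical and works on ordinary
dictionaries, not on LinkAhead Records. In particular, we assume that
``properties_blacklist`` is a list of property names and that a
``references`` entry only overrides the parents of the referenced
record; the actual converter may support more than that.

.. code-block:: python

   from typing import Any, Dict, List, Optional


   def sketch_with_options(
       data: Dict[str, Any],
       parents: List[str],
       references: Optional[Dict[str, Dict[str, Any]]] = None,
       properties_blacklist: Optional[List[str]] = None,
   ) -> Dict[str, Any]:
       """Hypothetical helper: apply the heuristic with both options."""
       references = references or {}
       blacklist = set(properties_blacklist or [])
       record: Dict[str, Any] = {
           "name": data.get("name"),
           "parents": list(parents),
           "properties": {},
       }
       for key, value in data.items():
           if key == "name" or key in blacklist:
               # Blacklisted keys are skipped here and left to the subtree.
               continue
           if isinstance(value, dict):
               # Use the parents given under ``references`` if present,
               # otherwise fall back to the key (assumed default).
               ref_parents = references.get(key, {}).get("parents", [key])
               record["properties"][key] = sketch_with_options(value, ref_parents)
           else:
               record["properties"][key] = value
       return record


   example = {
       "name": "New name",
       "a": 5,
       "b": ["a", "b", "c"],
       "author": {"full_name": "Silvia Scientist"},
   }
   rec = sketch_with_options(
       example,
       ["MyType1", "MyType2"],
       references={"author": {"parents": ["Person"]}},
       properties_blacklist=["b"],
   )

In this sketch the ``author`` entry ends up with parent ``Person``, and
``b`` is left out so that it could be handled by a converter further down
the subtree.
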

For further customization, the
:py:class:`~caoscrawler.converters.PropertiesFromDictConverter` can be
used as a basis for :ref:`custom converters<Custom Converters>` which
can make use of its ``referenced_record_callback`` argument. The
``referenced_record_callback`` can be a callable object which takes
exactly one Record, the one specified via ``record_from_dict``, as an
argument and needs to return that Record after doing whatever custom
treatment is needed. ``referenced_record_callback`` is applied
**after** the properties from the dictionary have been applied as
explained above.
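
The contract of such a callback (one Record in, the same Record out,
called after the dictionary properties have been set) can be sketched
as follows. Everything here is illustrative: ``SketchRecord`` is a
hypothetical stand-in for a LinkAhead Record, and ``fill_record`` only
imitates the order of operations described above. It does not show how
the callback is passed to the real converter.

.. code-block:: python

   from dataclasses import dataclass, field
   from typing import Any, Callable, Dict, Optional


   @dataclass
   class SketchRecord:
       """Hypothetical stand-in for a LinkAhead Record."""
       name: Optional[str] = None
       properties: Dict[str, Any] = field(default_factory=dict)


   def count_b_values(record: SketchRecord) -> SketchRecord:
       """Example callback: takes exactly one record and returns it."""
       record.properties["n_b_values"] = len(record.properties.get("b", []))
       return record


   def fill_record(
       data: Dict[str, Any],
       callback: Optional[Callable[[SketchRecord], SketchRecord]] = None,
   ) -> SketchRecord:
       record = SketchRecord(name=data.get("name"))
       # First, the properties from the dictionary are applied ...
       for key, value in data.items():
           if key != "name":
               record.properties[key] = value
       # ... and only afterwards the callback sees the record.
       if callback is not None:
           record = callback(record)
       return record


   rec = fill_record({"name": "New name", "b": ["a", "b", "c"]}, count_b_values)

Because the callback runs last, it can rely on the properties from the
dictionary already being present, as ``count_b_values`` does with ``b``.
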

Further converters
++++++++++++++++++