From aed256c98ce1ce992c1270f3716f4a59420d01f3 Mon Sep 17 00:00:00 2001
From: Florian Spreckelsen <f.spreckelsen@indiscale.com>
Date: Tue, 13 Aug 2024 11:13:50 +0200
Subject: [PATCH] DOC: Add documentation for the PropertiesFromDictConverter

---
 src/doc/converters.rst | 104 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 104 insertions(+)

diff --git a/src/doc/converters.rst b/src/doc/converters.rst
index d7e11c23..df4a258f 100644
--- a/src/doc/converters.rst
+++ b/src/doc/converters.rst
@@ -245,6 +245,110 @@ CSVTableConverter
 
 CSV File → DictElement
 
+PropertiesFromDictConverter
+===========================
+
+The :py:class:`~caoscrawler.converters.PropertiesFromDictConverter` is
+a specialization of the
+:py:class:`~caoscrawler.converters.DictElementConverter` and offers
+all its functionality. It is meant to operate on dictionaries (e.g.,
+from reading in a JSON or a table file) the keys of which correspond
+closely to LinkAhead properties. This is especially handy in cases
+where properties may be added to the data model and data sources after
+the writing of the CFood definition.
+
+The converter definition of the
+:py:class:`~caoscrawler.converters.PropertiesFromDictConverter` has an
+additional required entry ``record_from_dict`` which specifies the
+Record to which the properties extracted from the dict are attached.
+This Record is identified by its ``variable_name``, by which it can
+be referred to further down the subtree. You can also use the name of
+a Record that was specified earlier in the CFood definition in order
+to extend it by the properties extracted from a dict. Let's have a
+look at a simple example. A CFood definition
+
+.. code-block:: yaml
+
+   PropertiesFromDictElement:
+     type: PropertiesFromDictElement
+     match: ".*"
+     record_from_dict:
+       variable_name: MyRec
+       parents:
+       - MyType1
+       - MyType2
+
+applied to a dictionary
+
+.. code-block:: json
+
+   {
+     "name": "New name",
+     "a": 5,
+     "b": ["a", "b", "c"],
+     "author": {
+       "full_name": "Silvia Scientist"
+     }
+   }
+
+will create a Record ``New name`` with parents ``MyType1`` and
+``MyType2``. It has a scalar property ``a`` with value 5, a list
+property ``b`` with values "a", "b" and "c", and an ``author``
+property which references an ``author`` Record with a ``full_name``
+property with value "Silvia Scientist". Note that dictionary entries
+are handled according to the type of their values: scalar and list
+values become properties directly, and a dictionary-valued entry
+like ``author`` is translated into a reference to an ``author``
+Record.
+
+You can further specify how references are treated with an optional
+``references`` key in ``record_from_dict``. Let's assume that in the
+above example, we have an ``author`` **Property** with datatype
+``Person`` in our data model. We could add this information by
+extending the above example definition by
+
+
+.. code-block:: yaml
+
+   PropertiesFromDictElement:
+     type: PropertiesFromDictElement
+     match: ".*"
+     record_from_dict:
+       variable_name: MyRec
+       parents:
+       - MyType1
+       - MyType2
+       references:
+         author:
+           parents:
+           - Person
+
+so that now, a ``Person`` Record with a ``full_name`` property with
+value "Silvia Scientist" is created as the value of the ``author``
+property.
+
+Properties can be excluded from the automatic treatment with the
+``properties_blacklist`` keyword. Since the
+:py:class:`~caoscrawler.converters.PropertiesFromDictConverter` has
+all the functionality of the
+:py:class:`~caoscrawler.converters.DictElementConverter`, individual
+properties can still be used in a subtree. Together with
+``properties_blacklist`` this can be used to add custom treatment to
+specific properties by blacklisting them in ``record_from_dict`` and
+then treating them in the subtree the same way as you would with the
+standard :py:class:`~caoscrawler.converters.DictElementConverter`.
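+
+As an illustration, a definition along the following lines would
+exclude ``author`` from the automatic treatment and handle it in the
+subtree instead. This is only a sketch: it assumes that
+``properties_blacklist`` takes a list of dictionary keys, and the
+subtree element names (``AuthorElement``, ``FullNameElement``) are
+hypothetical.
+
+.. code-block:: yaml
+
+   PropertiesFromDictElement:
+     type: PropertiesFromDictElement
+     match: ".*"
+     record_from_dict:
+       variable_name: MyRec
+       parents:
+       - MyType1
+       - MyType2
+       # Assumption: a list of keys to skip during automatic treatment
+       properties_blacklist:
+       - author
+     subtree:
+       # Hypothetical element names; standard DictElement treatment
+       AuthorElement:
+         type: DictElement
+         match_name: author
+         subtree:
+           FullNameElement:
+             type: TextElement
+             match_name: full_name
+             match_value: (?P<full_name>.*)
+             records:
+               Person:
+                 full_name: $full_name
+               MyRec:
+                 author: $Person
+
+With the example dictionary from above, ``a`` and ``b`` would still be
+attached to ``MyRec`` automatically, while ``author`` is only set by
+the rules in the subtree.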
+
+For further customization, the
+:py:class:`~caoscrawler.converters.PropertiesFromDictConverter` can be
+used as a basis for :ref:`custom converters<Custom Converters>` which
+can make use of its ``referenced_record_callback`` argument. The
+``referenced_record_callback`` can be a callable object which takes
+exactly one Record, the one specified via ``record_from_dict``, as an
+argument and must return that Record after doing whatever custom
+treatment is needed. ``referenced_record_callback`` is applied
+**after** the properties from the dictionary have been applied as
+explained above.
+
 Further converters
 ++++++++++++++++++
 
-- 
GitLab