diff --git a/src/doc/converters/transform_functions.rst b/src/doc/converters/transform_functions.rst
index ecd47d2dc004c6f1382279901dfec2d96e0e4a2d..35c11093714f59cf3139a2544b5eae2f5a9c17f2 100644
--- a/src/doc/converters/transform_functions.rst
+++ b/src/doc/converters/transform_functions.rst
@@ -70,3 +70,90 @@ the usual ``$`` notation:
 There are a number of transform functions that are defined by default (see
 ``src/caoscrawler/default_transformers.yml``). You can define custom transform functions by adding
 them to the cfood definition (see :doc:`CFood Documentation<../cfood>`).
+
+
+Custom Transformers
+===================
+
+Custom transformers are Python functions with a specific signature. They need to be
+registered in the cfood definition in order to be available during the scanning process.
+
+Let's assume we want to implement a transformer that replaces each occurrence of certain letters
+in the value of a variable with a corresponding replacement letter. Passing "abc" as
+``in_letters`` and "xyz" as ``out_letters`` would turn the value "scan started" into
+"szxn stxrted". We could implement this in Python using the
+following code:
+
+.. code-block:: python
+
+    from typing import Any
+
+    def replace_letters(in_value: Any, in_parameters: dict) -> Any:
+        """
+        Replace letters in variables
+        """
+
+        # The arguments to the transformer (as given by the definition in the cfood)
+        # are contained in `in_parameters`. We need to make sure they are set or
+        # set their defaults otherwise:
+
+        if "in_letters" not in in_parameters:
+            raise RuntimeError("Parameter `in_letters` missing.")
+
+        if "out_letters" not in in_parameters:
+            raise RuntimeError("Parameter `out_letters` missing.")
+
+        l_in = in_parameters["in_letters"]
+        l_out = in_parameters["out_letters"]
+
+
+        if len(l_in) != len(l_out):
+            raise RuntimeError("`in_letters` and `out_letters` must have the same length.")
+
+        for l1, l2 in zip(l_in, l_out):
+            in_value = in_value.replace(l1, l2)
+
+        return in_value
+
+
+This code needs to be placed in a module that the crawler can import at runtime.
+One possibility is to install the package containing it into the same virtual environment that is
+used to run the crawler.
+
+The transformer then needs to be registered in the cfood:
+
+.. code-block:: yaml
+
+    ---
+    metadata:
+      crawler-version: 0.10.2
+      macros:
+    ---
+    #Converters:  # put custom converters here
+    Transformers:
+      replace_letters:  # This name will be made available in the cfood
+        function: replace_letters
+        package: utilities.replace_letters
+
+This assumes that the function ``replace_letters`` is defined in a file
+called ``replace_letters.py`` inside a package called ``utilities``.
+
+The transformer can then be used in a converter, e.g.:
+
+
+.. code-block:: yaml
+
+    Experiment:
+      type: Dict
+      match: ".*"
+      transform:
+        replace_letters:
+          in: $a
+          out: $b
+          functions:
+          - replace_letters:  # This is the name of our custom transformer
+              in_letters: "abc"
+              out_letters: "xyz"
+      records:
+        Report:
+          tags: $b
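
One caveat of the loop-based ``replace_letters`` shown in the diff: because the replacements are applied one after another, a letter that has already been replaced can be replaced again when ``in_letters`` and ``out_letters`` overlap (for example, when swapping "a" and "b"). The following is a minimal sketch of a single-pass variant, assuming the input value is a string. The name ``replace_letters_translate`` is hypothetical and not part of the crawler; it only reuses the ``(in_value, in_parameters)`` signature from the example.

```python
from typing import Any


def replace_letters_translate(in_value: Any, in_parameters: dict) -> Any:
    """Hypothetical single-pass variant of `replace_letters` (assumes a str value)."""
    for key in ("in_letters", "out_letters"):
        if key not in in_parameters:
            raise RuntimeError(f"Parameter `{key}` missing.")

    l_in = in_parameters["in_letters"]
    l_out = in_parameters["out_letters"]

    if len(l_in) != len(l_out):
        raise RuntimeError("`in_letters` and `out_letters` must have the same length.")

    # str.maketrans builds a character-to-character table and str.translate
    # applies it in one pass, so a replaced letter is never replaced again.
    return in_value.translate(str.maketrans(l_in, l_out))


# Same parameters as in the documentation example.
result = replace_letters_translate(
    "scan started", {"in_letters": "abc", "out_letters": "xyz"}
)

# Overlapping letter sets: "a" and "b" are swapped instead of collapsing to one letter.
swapped = replace_letters_translate("ab", {"in_letters": "ab", "out_letters": "ba"})
```

With the parameters from the example, this variant gives the same result as the loop-based version. The two differ, for example, when the letter sets overlap.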