Verified commit ea758fd7, authored by Daniel Hornung

WIP: Rename caosdb -> linkahead

parent 22669759
1 merge request: !79 MAINT: linkahead rename
Pipeline #36703 failed
...@@ -10,6 +10,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ...@@ -10,6 +10,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- TableImporter now accepts a `existing_columns` argument which demands that certain columns exist - TableImporter now accepts a `existing_columns` argument which demands that certain columns exist
### Changed ### ### Changed ###
- Name change: `caosdb` -> `linkahead`
- The converters and datatype arguments of TableImporter now may have keys for nonexisting columns - The converters and datatype arguments of TableImporter now may have keys for nonexisting columns
### Deprecated ### ### Deprecated ###
......
...@@ -19,7 +19,7 @@ authors: ...@@ -19,7 +19,7 @@ authors:
- family-names: Luther - family-names: Luther
given-names: Stefan given-names: Stefan
orcid: https://orcid.org/0000-0001-7214-8125 orcid: https://orcid.org/0000-0001-7214-8125
title: CaosDB - Advanced User Tools title: LinkAhead - Advanced User Tools
version: 0.7.0 version: 0.7.0
doi: 10.3390/data4020083 doi: 10.3390/data4020083
date-released: 2023-01-20 date-released: 2023-01-20
\ No newline at end of file
======================== ===========================
The concepts of pycaosdb The concepts of pylinkahead
======================== ===========================
Some text... Some text...
...@@ -14,7 +14,7 @@ ...@@ -14,7 +14,7 @@
# #
# import os # import os
# import sys # import sys
# sys.path.insert(0, os.path.abspath('../caosdb')) # sys.path.insert(0, os.path.abspath('../linkahead'))
import sphinx_rtd_theme import sphinx_rtd_theme
...@@ -190,7 +190,7 @@ epub_exclude_files = ['search.html'] ...@@ -190,7 +190,7 @@ epub_exclude_files = ['search.html']
# Example configuration for intersphinx: refer to the Python standard library. # Example configuration for intersphinx: refer to the Python standard library.
intersphinx_mapping = { intersphinx_mapping = {
"python": ("https://docs.python.org/", None), "python": ("https://docs.python.org/", None),
"caosdb-pylib": ("https://caosdb.gitlab.io/caosdb-pylib/", None), "linkahead-pylib": ("https://linkahead.gitlab.io/linkahead-pylib/", None),
} }
......
============== ==============
CaosDB Crawler LinkAhead Crawler
============== ==============
The `CaosDB The `LinkAhead
crawler <https://gitlab.com/caosdb/caosdb-advanced-user-tools/blob/main/src/linkaheadadvancedtools/crawler.py>`__ crawler <https://gitlab.com/linkahead/linkahead-advanced-user-tools/blob/main/src/linkaheadadvancedtools/crawler.py>`__
is a tool for the automated insertion or update of entities in CaosDB. is a tool for the automated insertion or update of entities in LinkAhead.
Typically, a file structure is crawled, but other things can be crawled as well. Typically, a file structure is crawled, but other things can be crawled as well.
For example tables or HDF5 files. For example tables or HDF5 files.
...@@ -14,7 +14,7 @@ Introduction ...@@ -14,7 +14,7 @@ Introduction
In simple terms, the crawler is a program that scans a directory In simple terms, the crawler is a program that scans a directory
structure, identifies files that will be treated, and generates structure, identifies files that will be treated, and generates
corresponding Entities in CaosDB, possibly filling meta data. During corresponding Entities in LinkAhead, possibly filling meta data. During
this process the crawler can also open files and derive content from this process the crawler can also open files and derive content from
within, for example reading CSV tables and processing individual rows of within, for example reading CSV tables and processing individual rows of
these tables. these tables.
...@@ -32,7 +32,7 @@ is the following: ...@@ -32,7 +32,7 @@ is the following:
Technically, this behaviour can be adjusted to your needs using so Technically, this behaviour can be adjusted to your needs using so
called CFood (pun intended! :-)) Python classes. More details on the called CFood (pun intended! :-)) Python classes. More details on the
different components of the CaosDB Crawler can be found in the different components of the LinkAhead Crawler can be found in the
`developers’ information <#extending-the-crawlers>`__ below. `developers’ information <#extending-the-crawlers>`__ below.
In case you are happy with our suggestion of a standard crawler, feel In case you are happy with our suggestion of a standard crawler, feel
...@@ -75,10 +75,10 @@ The crawler can be executed directly via a Python script (usually called ...@@ -75,10 +75,10 @@ The crawler can be executed directly via a Python script (usually called
``crawl.py``). The script prints the progress and reports potential ``crawl.py``). The script prints the progress and reports potential
problems. The exact behavior depends on your setup. However, you can problems. The exact behavior depends on your setup. However, you can
have a look at the example in the have a look at the example in the
`tests <https://gitlab.indiscale.com/caosdb/src/caosdb-advanced-user-tools/-/blob/main/integrationtests/crawl.py>`__. `tests <https://gitlab.com/linkahead/linkahead-advanced-user-tools/-/blob/main/integrationtests/crawl.py>`__.
.. Note:: The crawler depends on the CaosDB Python client, so make sure to install :doc:`pycaosdb .. Note:: The crawler depends on the LinkAhead Python client, so make sure to install :doc:`pylinkahead
<caosdb-pylib:getting_started>`. <linkahead-pylib:getting_started>`.
Call ``python3 crawl.py --help`` to see what parameters can be provided. Call ``python3 crawl.py --help`` to see what parameters can be provided.
...@@ -90,11 +90,11 @@ Typically, an invocation looks like: ...@@ -90,11 +90,11 @@ Typically, an invocation looks like:
.. Note:: For trying out the above mentioned example crawler from the integration tests, .. Note:: For trying out the above mentioned example crawler from the integration tests,
make sure that the ``extroot`` directory in the ``integrationtests`` folder is used as make sure that the ``extroot`` directory in the ``integrationtests`` folder is used as
CaosDB's extroot directory, and call the crawler indirectly via ``./test.sh``. LinkAhead's extroot directory, and call the crawler indirectly via ``./test.sh``.
In this case ``/someplace/`` identifies the path to be crawled **within In this case ``/someplace/`` identifies the path to be crawled **within
CaosDB's file system**. You can browse the CaosDB file system by LinkAhead's file system**. You can browse the LinkAhead file system by
opening the WebUI of your CaosDB instance and clicking on “File System”. opening the WebUI of your LinkAhead instance and clicking on “File System”.
In the backend, ``crawl.py`` starts a CQL query In the backend, ``crawl.py`` starts a CQL query
``FIND File WHICH IS STORED AT /someplace/**`` and crawls the resulting ``FIND File WHICH IS STORED AT /someplace/**`` and crawls the resulting
...@@ -114,18 +114,18 @@ function ``loadFiles`` contained in the package: ...@@ -114,18 +114,18 @@ function ``loadFiles`` contained in the package:
:: ::
python3 -m linkaheadadvancedtools.loadFiles /opt/caosdb/mnt/extroot python3 -m linkaheadadvancedtools.loadFiles /opt/linkahead/mnt/extroot
``/opt/caosdb/mnt/extroot`` is the root of the file system to be crawled ``/opt/linkahead/mnt/extroot`` is the root of the file system to be crawled
as seen by the CaosDB server (The actual path may vary. This is the one used as seen by the LinkAhead server (The actual path may vary. This is the one used
in the LinkAhead distribution of CaosDB). In this case the root file in the LinkAhead distribution of LinkAhead). In this case the root file
system as seen from within the CaosDB docker process is used. system as seen from within the LinkAhead docker process is used.
You can provide a ``.caosdbignore`` file as a commandline option to the above You can provide a ``.linkaheadignore`` file as a commandline option to the above
loadFiles command. The syntax of that file is the same as for `gitignore loadFiles command. The syntax of that file is the same as for `gitignore
<https://git-scm.com/docs/gitignore>`_ files. Note, that you can have additional <https://git-scm.com/docs/gitignore>`_ files. Note, that you can have additional
``.caosdbignore`` files at lower levels which are appended to the current ignore ``.linkaheadignore`` files at lower levels which are appended to the current ignore
file and have an effect on the respective subtree. file and have an effect on the respective subtree.
...@@ -168,15 +168,15 @@ It should be independent of other data and define the following methods: ...@@ -168,15 +168,15 @@ It should be independent of other data and define the following methods:
updated on the Server must be in ``AbstractCFood.to_be_updated`` after this call. updated on the Server must be in ``AbstractCFood.to_be_updated`` after this call.
As hinted above, the main feature of an ``identifiable`` is its fingerprinting ability: it has As hinted above, the main feature of an ``identifiable`` is its fingerprinting ability: it has
sufficient properties to identify an existing Record in CaosDB so that the CFood can decide which sufficient properties to identify an existing Record in LinkAhead so that the CFood can decide which
Records should be updated by the Crawler instead of inserting a new one. Obviously, this allows the Records should be updated by the Crawler instead of inserting a new one. Obviously, this allows the
Crawler to run twice on the same file structure without duplicating the data in CaosDB. Crawler to run twice on the same file structure without duplicating the data in LinkAhead.
An ``identifiable`` is a Python :py:class:`~caosdb.common.models.Record` object with the features to An ``identifiable`` is a Python :py:class:`~linkahead.common.models.Record` object with the features to
identify the correct Record in CaosDB. This object is used to create a query in order to determine identify the correct Record in LinkAhead. This object is used to create a query in order to determine
whether the Record exists. If the Record does not exist, the ``identifiable`` is used to insert the whether the Record exists. If the Record does not exist, the ``identifiable`` is used to insert the
Record. Thus, after this step the Crawler guarantees that a Record with the features of the Record. Thus, after this step the Crawler guarantees that a Record with the features of the
``identifiable`` exists in CaosDB (either previously existing or newly created). ``identifiable`` exists in LinkAhead (either previously existing or newly created).
An example: An experiment might be uniquely identified by the date when it was conducted and a An example: An experiment might be uniquely identified by the date when it was conducted and a
...@@ -208,13 +208,13 @@ In short, the Crawler interacts with the available CFoods in the following way: ...@@ -208,13 +208,13 @@ In short, the Crawler interacts with the available CFoods in the following way:
#. ``cfood.create_identifiables()`` As described :ref:`above<c-food-introduction>`, create #. ``cfood.create_identifiables()`` As described :ref:`above<c-food-introduction>`, create
identifiables. identifiables.
#. All the identifiables in ``cfood.identifiables`` are searched for existence in the CaosDB #. All the identifiables in ``cfood.identifiables`` are searched for existence in the LinkAhead
instance, and inserted if they do not exist. instance, and inserted if they do not exist.
#. ``cfood.update_identifiables()`` As described :ref:`above<c-food-introduction>`, update the #. ``cfood.update_identifiables()`` As described :ref:`above<c-food-introduction>`, update the
identifiables if their content needs to change. identifiables if their content needs to change.
#. All the identifiables in ``cfood.to_be_updated`` are synced to the CaosDB instance. #. All the identifiables in ``cfood.to_be_updated`` are synced to the LinkAhead instance.
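The four steps above can be sketched in plain Python. This is only an illustration under stated assumptions: ``FakeServer``, ``SketchCFood`` and ``crawl`` are hypothetical stand-ins written for this sketch and are not part of ``linkaheadadvancedtools``, and the fingerprint (a date plus a project name) is an invented example. The point is that running the loop twice over the same input does not duplicate Records.

```python
# Hypothetical stand-ins for illustration only; none of these names
# exist in linkaheadadvancedtools.

class FakeServer:
    """In-memory replacement for a server: maps fingerprints to properties."""

    def __init__(self):
        self.records = {}

    def exists(self, fingerprint):
        return fingerprint in self.records

    def insert(self, fingerprint, properties):
        self.records[fingerprint] = dict(properties)

    def update(self, fingerprint, properties):
        self.records[fingerprint].update(properties)


class SketchCFood:
    """Mimics the two hooks the crawler calls on a CFood."""

    def __init__(self, date, project, data_file):
        self.date = date
        self.project = project
        self.data_file = data_file
        self.identifiables = []
        self.to_be_updated = []

    def create_identifiables(self):
        # The fingerprint: just enough information to find an existing Record.
        self.identifiables.append((self.date, self.project))

    def update_identifiables(self, server):
        # Queue only those identifiables whose content actually changes.
        for fingerprint in self.identifiables:
            if server.records[fingerprint].get("data_file") != self.data_file:
                self.to_be_updated.append(
                    (fingerprint, {"data_file": self.data_file}))


def crawl(cfoods, server):
    for cfood in cfoods:
        cfood.create_identifiables()                      # step 1
        for fingerprint in cfood.identifiables:           # step 2
            if not server.exists(fingerprint):
                server.insert(fingerprint, {"date": fingerprint[0],
                                            "project": fingerprint[1]})
        cfood.update_identifiables(server)                # step 3
        for fingerprint, changes in cfood.to_be_updated:  # step 4
            server.update(fingerprint, changes)


server = FakeServer()
crawl([SketchCFood("2023-01-20", "demo", "run1.csv")], server)
crawl([SketchCFood("2023-01-20", "demo", "run1.csv")], server)  # second run
```

After both calls the fake server holds a single record, because the second run finds the fingerprint and has nothing left to update.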
The following sketch aims to visualize this procedure. The following sketch aims to visualize this procedure.
...@@ -223,11 +223,11 @@ The following sketch aims to visualize this procedure. ...@@ -223,11 +223,11 @@ The following sketch aims to visualize this procedure.
Sketch of how the Crawler uses the CFoods to process objects. Of the four identifiables Sketch of how the Crawler uses the CFoods to process objects. Of the four identifiables
(fingerprints) on the right, only the second does not exist yet and is thus inserted in the (fingerprints) on the right, only the second does not exist yet and is thus inserted in the
second step. Only the identifiables number 2 and 4 have new or changed content, so only these second step. Only the identifiables number 2 and 4 have new or changed content, so only these
are synced to CaosDB in the last step. are synced to LinkAhead in the last step.
.. note:: **Practical hint:** After the call to .. note:: **Practical hint:** After the call to
:py:meth:`~linkaheadadvancedtools.cfood.AbstractCFood.create_identifiables`, the Crawler :py:meth:`~linkaheadadvancedtools.cfood.AbstractCFood.create_identifiables`, the Crawler
guarantees that an ``Experiment`` with those properties exists in CaosDB. In the call to guarantees that an ``Experiment`` with those properties exists in LinkAhead. In the call to
:py:meth:`~linkaheadadvancedtools.cfood.AbstractCFood.update_identifiables`, further properties :py:meth:`~linkaheadadvancedtools.cfood.AbstractCFood.update_identifiables`, further properties
might be added to this Record, e.g. references to data files that were recorded in that might be added to this Record, e.g. references to data files that were recorded in that
experiment or to the person that did the experiment. experiment or to the person that did the experiment.
...@@ -241,7 +241,7 @@ Let’s look at the following Example: ...@@ -241,7 +241,7 @@ Let’s look at the following Example:
>>> # Example CFood >>> # Example CFood
>>> from linkaheadadvancedtools.cfood import AbstractFileCFood, assure_has_property >>> from linkaheadadvancedtools.cfood import AbstractFileCFood, assure_has_property
>>> import caosdb as db >>> import linkahead as db
>>> >>>
>>> class ExampleCFood(AbstractFileCFood): >>> class ExampleCFood(AbstractFileCFood):
... @staticmethod ... @staticmethod
...@@ -333,7 +333,7 @@ In the ``crawl.py`` file, you should set this appropriately: ...@@ -333,7 +333,7 @@ In the ``crawl.py`` file, you should set this appropriately:
>>> fileguide.access = lambda path: "/main/data/" + path >>> fileguide.access = lambda path: "/main/data/" + path
This prefixes all paths that are used in CaosDB with “/main/data/”. In This prefixes all paths that are used in LinkAhead with “/main/data/”. In
CFoods, files can then be accessed using the fileguide as follows: CFoods, files can then be accessed using the fileguide as follows:
.. code:: python .. code:: python
...@@ -342,16 +342,16 @@ CFoods, files can then be accessed using the fileguide as follows: ...@@ -342,16 +342,16 @@ CFoods, files can then be accessed using the fileguide as follows:
# do stuff # do stuff
pass pass
Changing data in CaosDB Changing data in LinkAhead
----------------------- -----------------------
As described above, a Record matching the identifiable will be inserted As described above, a Record matching the identifiable will be inserted
if no such Record existed before. This is typically unproblematic. if no such Record existed before. This is typically unproblematic.
However, what if existing Records need to be modified? Many However, what if existing Records need to be modified? Many
manipulations have the potential of overwriting changes made in manipulations have the potential of overwriting changes made in
CaosDB. Thus, unless the data being crawled is a single source of truth LinkAhead. Thus, unless the data being crawled is a single source of truth
for the information in CaosDB (and changes to the respective data in for the information in LinkAhead (and changes to the respective data in
CaosDB should thus not be possible) changes have to be done with some LinkAhead should thus not be possible) changes have to be done with some
considerations. considerations.
Use the functions ``assure_has_xyz`` defined in the :py:mod:`cfood module <.cfood>` to Use the functions ``assure_has_xyz`` defined in the :py:mod:`cfood module <.cfood>` to
...@@ -374,10 +374,10 @@ function, a security level can be given. ...@@ -374,10 +374,10 @@ function, a security level can be given.
... interactive=False) # the crawler runs without asking intermediate questions ... interactive=False) # the crawler runs without asking intermediate questions
>>> c.crawl(security_level=INSERT) >>> c.crawl(security_level=INSERT)
This assures that every manipulation of data in CaosDB that is done via the functions provided by This assures that every manipulation of data in LinkAhead that is done via the functions provided by
the :py:class:`~linkaheadadvancedtools.guard` class is checked against the provided security level: the :py:class:`~linkaheadadvancedtools.guard` class is checked against the provided security level:
- ``RETRIEVE``: allows only to retrieve data from CaosDB. No manipulation is allowed - ``RETRIEVE``: allows only to retrieve data from LinkAhead. No manipulation is allowed
- ``INSERT``: allows only to insert new entities and the manipulation of those newly inserted ones - ``INSERT``: allows only to insert new entities and the manipulation of those newly inserted ones
- ``UPDATE``: allows all manipulations - ``UPDATE``: allows all manipulations
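The three levels can be read as an ordered scale. The following is a minimal sketch of that reading, not the actual guard implementation: ``SecurityLevel`` and ``is_allowed`` are hypothetical names introduced here, and the operation strings are assumptions made for the example.

```python
from enum import IntEnum


class SecurityLevel(IntEnum):
    """Hypothetical ordering of the levels described above."""
    RETRIEVE = 0
    INSERT = 1
    UPDATE = 2


def is_allowed(level, operation, target_is_new=False):
    """Return True if `operation` is permitted at `level`.

    `target_is_new` marks entities inserted during the current run,
    which may still be manipulated at the INSERT level.
    """
    if operation == "retrieve":
        return True
    if operation == "insert":
        return level >= SecurityLevel.INSERT
    if operation == "update":
        required = SecurityLevel.INSERT if target_is_new else SecurityLevel.UPDATE
        return level >= required
    raise ValueError(f"unknown operation: {operation!r}")
```

With this reading, ``is_allowed(SecurityLevel.INSERT, "update")`` is false for a pre-existing entity but true when ``target_is_new=True``.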
...@@ -390,7 +390,7 @@ If you provide the ``to_be_updated`` member variable of CFoods to those ...@@ -390,7 +390,7 @@ If you provide the ``to_be_updated`` member variable of CFoods to those
``assure...`` functions, the crawler provides another convenient ``assure...`` functions, the crawler provides another convenient
feature: When an update is prevented due to the security level, the feature: When an update is prevented due to the security level, the
update is saved and can subsequently be authorized. If the crawler update is saved and can subsequently be authorized. If the crawler
runs on the CaosDB server, it will try to send a mail which allows to runs on the LinkAhead server, it will try to send a mail which allows to
authorize the change. If it runs as a local script it will notify you authorize the change. If it runs as a local script it will notify you
that there are unauthorized changes and provide a code with which the that there are unauthorized changes and provide a code with which the
crawler can be started to authorize the change. crawler can be started to authorize the change.
...@@ -404,7 +404,7 @@ shows how a set of CFoods can be defined to deal with a complex file structure. ...@@ -404,7 +404,7 @@ shows how a set of CFoods can be defined to deal with a complex file structure.
You can find detailed information on how files need to be structured `here You can find detailed information on how files need to be structured `here
<https://gitlab.com/salexan/check-sfs/-/blob/f-software/filesystem_structure.md>`__ and the source <https://gitlab.com/salexan/check-sfs/-/blob/f-software/filesystem_structure.md>`__ and the source
code of the CFoods `here <https://gitlab.com/caosdb/caosdb-advanced-user-tools>`__. code of the CFoods `here <https://gitlab.com/linkahead/linkahead-advanced-user-tools>`__.
Sources Sources
======= =======
......
...@@ -633,7 +633,7 @@ ...@@ -633,7 +633,7 @@
id="tspan6473" id="tspan6473"
x="89.143967" x="89.143967"
y="95.349686" y="95.349686"
style="font-size:3.88055563px;stroke-width:0.26458332">CaosDB-Server</tspan></text> style="font-size:3.88055563px;stroke-width:0.26458332">LinkAhead-Server</tspan></text>
<image <image
width="20.404667" width="20.404667"
height="25.484667" height="25.484667"
......
...@@ -909,7 +909,7 @@ ...@@ -909,7 +909,7 @@
x="212.41925" x="212.41925"
y="197.43471" y="197.43471"
style="font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;font-size:4.23333311px;font-family:monospace;-inkscape-font-specification:monospace;stroke-width:0.26458332" style="font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;font-size:4.23333311px;font-family:monospace;-inkscape-font-specification:monospace;stroke-width:0.26458332"
id="tspan88248">-crawler.push_identifiables_to_CaosDB()</tspan></tspan></text> id="tspan88248">-crawler.push_identifiables_to_LinkAhead()</tspan></tspan></text>
</g> </g>
<g <g
id="g44584" id="g44584"
Welcome to linkaheadadvancedtools' documentation! Welcome to linkaheadadvancedtools' documentation!
============================================ ============================================
Welcome to the advanced Python tools for CaosDB! Welcome to the advanced Python tools for LinkAhead!
This documentation helps you to :doc:`get started<getting_started>`, explains the most important This documentation helps you to :doc:`get started<getting_started>`, explains the most important
...@@ -13,7 +13,7 @@ This documentation helps you to :doc:`get started<getting_started>`, explains th ...@@ -13,7 +13,7 @@ This documentation helps you to :doc:`get started<getting_started>`, explains th
Getting started <README_SETUP> Getting started <README_SETUP>
Concepts <concepts> Concepts <concepts>
The Caosdb Crawler <crawler> The LinkAhead Crawler <crawler>
YAML data model specification <yaml_interface> YAML data model specification <yaml_interface>
_apidoc/modules _apidoc/modules
......
...@@ -4,10 +4,10 @@ ...@@ -4,10 +4,10 @@
=============================== ===============================
The ``linkaheadadvancedtools`` library features the possibility to create and update The ``linkaheadadvancedtools`` library features the possibility to create and update
CaosDB models using a simplified definition in YAML format. LinkAhead models using a simplified definition in YAML format.
Let's start with an example taken from `model.yml Let's start with an example taken from `model.yml
<https://gitlab.indiscale.com/caosdb/src/caosdb-advanced-user-tools/-/blob/dev/unittests/model.yml>`__ <https://gitlab.com/linkahead/linkahead-advanced-user-tools/-/blob/dev/unittests/model.yml>`__
in the library sources. in the library sources.
.. code-block:: yaml .. code-block:: yaml
...@@ -57,9 +57,9 @@ This example defines 3 ``RecordTypes``: ...@@ -57,9 +57,9 @@ This example defines 3 ``RecordTypes``:
``Textfile``. ``Textfile``.
One major advantage of using this interface (in contrast to the standard python interface) is that properties can be defined and added to record types "on-the-fly". E.g. the three lines for ``firstName`` as sub entries of ``Person`` have two effects on CaosDB: One major advantage of using this interface (in contrast to the standard python interface) is that properties can be defined and added to record types "on-the-fly". E.g. the three lines for ``firstName`` as sub entries of ``Person`` have two effects on LinkAhead:
- A new property with name ``firstName``, datatype ``TEXT`` and description ``first name`` is inserted (or updated, if already present) into CaosDB. - A new property with name ``firstName``, datatype ``TEXT`` and description ``first name`` is inserted (or updated, if already present) into LinkAhead.
- The new property is added as a recommended property to record type ``Person``. - The new property is added as a recommended property to record type ``Person``.
Any further occurrences of ``firstName`` in the yaml file will reuse the definition provided for ``Person``. Any further occurrences of ``firstName`` in the yaml file will reuse the definition provided for ``Person``.
...@@ -70,7 +70,7 @@ Note the difference between the three property declarations of ``LabbookEntry``: ...@@ -70,7 +70,7 @@ Note the difference between the three property declarations of ``LabbookEntry``:
- ``responsible``: This defines and adds a property with name "responsible" to ``LabbookEntry``, which has a datatype ``Person``. ``Person`` is defined above. - ``responsible``: This defines and adds a property with name "responsible" to ``LabbookEntry``, which has a datatype ``Person``. ``Person`` is defined above.
- ``firstName``: This defines and adds a property with the standard data type ``TEXT`` to record type ``Person``. - ``firstName``: This defines and adds a property with the standard data type ``TEXT`` to record type ``Person``.
If the data model depends on record types or properties which already exist in CaosDB, those can be If the data model depends on record types or properties which already exist in LinkAhead, those can be
added using the ``extern`` keyword: ``extern`` takes a list of previously defined names. added using the ``extern`` keyword: ``extern`` takes a list of previously defined names.
...@@ -78,7 +78,7 @@ added using the ``extern`` keyword: ``extern`` takes a list of previously define ...@@ -78,7 +78,7 @@ added using the ``extern`` keyword: ``extern`` takes a list of previously define
Datatypes Datatypes
========= =========
You can use any data type understood by CaosDB as datatype attribute in the yaml model. You can use any data type understood by LinkAhead as datatype attribute in the yaml model.
List attributes are a bit special: List attributes are a bit special:
...@@ -126,7 +126,7 @@ You can use the yaml parser directly in python as follows: ...@@ -126,7 +126,7 @@ You can use the yaml parser directly in python as follows:
This creates a DataModel object containing all entities defined in the yaml file. This creates a DataModel object containing all entities defined in the yaml file.
You can then use the functions from linkaheadadvancedtools.models.data_model.DataModel to synchronize You can then use the functions from linkaheadadvancedtools.models.data_model.DataModel to synchronize
the model with a CaosDB instance, e.g.: the model with a LinkAhead instance, e.g.:
.. code-block:: python .. code-block:: python
......