diff --git a/README_SETUP.md b/README_SETUP.md
index 9b7b27ec056583708a8773ebac49f37ff45d9fd4..19f051636952945fe76b2ab752264031ac43378d 100644
--- a/README_SETUP.md
+++ b/README_SETUP.md
@@ -34,7 +34,7 @@ For testing:
 3. Start an empty (!) CaosDB instance (with the mounted extroot). The
    database will be cleared during testing, so it's important to use
    an empty instance.
-4. Run `test.sh`.
+4. Run `test.sh`. Note that this may modify the content of the `integrationtests/extroot/` directory.
 
 ## Code Formatting
 `autopep8 -i -r ./`
diff --git a/integrationtests/crawl.py b/integrationtests/crawl.py
index bf72b5f74b463f9ece2bd047548dcb22e8d71dac..65600016ed5dff97d3794b61cf540b9d0505698d 100755
--- a/integrationtests/crawl.py
+++ b/integrationtests/crawl.py
@@ -43,7 +43,7 @@ except ModuleNotFoundError:
         return argparse.ArgumentParser()
 
     def print_success(text):
-        print("Success: "+text)
+        print("Success: " + text)
 
 
 def get_parser():
diff --git a/integrationtests/extroot/.cerate_dir b/integrationtests/extroot/.create_dir
similarity index 100%
rename from integrationtests/extroot/.cerate_dir
rename to integrationtests/extroot/.create_dir
diff --git a/src/doc/crawler.rst b/src/doc/crawler.rst
index 7c95dad9ed10c025bc1baf811feb4534fc994175..391c5458801235f132cb21e2c7911ba670c1322c 100644
--- a/src/doc/crawler.rst
+++ b/src/doc/crawler.rst
@@ -75,7 +75,7 @@ The crawler can be executed directly via a python script (usually called
 ``crawl.py``). The script prints the progress and reports potential
 problems. The exact behavior depends on your setup. However, you can
 have a look at the example in the
-`tests <https://gitlab.com/caosdb/caosdb-advanced-user-tools/-/blob/main/integrationtests/full_test/crawl.py>`__.
+`tests <https://gitlab.indiscale.com/caosdb/src/caosdb-advanced-user-tools/-/blob/main/integrationtests/crawl.py>`__.
 
 .. Note:: The crawler depends on the CaosDB Python client, so make sure to
           install :doc:`pycaosdb <caosdb-pylib:getting_started>`.
@@ -86,14 +86,18 @@ Typically, an invocation looks like:
 .. code:: python
 
-   python3 crawl.py "/TestData/"
+   python3 crawl.py /someplace/
 
-In this case ``/TestData/`` identifies the path to be crawled **within
-the CaosDB file system**. You can browse the CaosDB file system by
+.. Note:: To try this out with the above-mentioned example crawler from the integration tests,
+   make sure that the ``extroot`` directory in the ``integrationtests`` folder is used as
+   CaosDB's extroot directory, and call the crawler with ``python3 crawl.py /``.
+
+In this case ``/someplace/`` identifies the path to be crawled **within
+CaosDB's file system**. You can browse the CaosDB file system by
 opening the WebUI of your CaosDB instance and clicking on “File System”.
 
 In the backend, ``crawl.py`` starts a CQL query
-``FIND File WHICH IS STORED AT /TestData/**`` and crawls the resulting
+``FIND File WHICH IS STORED AT /someplace/**`` and crawls the resulting
 files according to your customized ``CFoods``. Crawling may consist of
 two distinct steps:
 
 1. Insertion of files (use
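
As background for the ``crawler.rst`` change above: ``crawl.py`` resolves the given path with the CQL query ``FIND File WHICH IS STORED AT /someplace/**``. The snippet below is a minimal sketch of running that query directly with the CaosDB Python client (pycaosdb); it assumes a working pycaosdb configuration and reuses the documentation's placeholder path ``/someplace/``, so adjust both to your setup.

```python
# Minimal sketch: list the files that the crawler would pick up for a
# given path in CaosDB's file system.
# Assumes a configured pycaosdb connection (e.g. via a pycaosdb.ini).
import caosdb as db

# "/someplace/" is the placeholder path used in the documentation above.
files = db.execute_query("FIND File WHICH IS STORED AT /someplace/**")
print("Files that would be crawled:", len(files))
```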