Commit 9213fee1 authored by Henrik tom Wörden

Merge branch 'dev' into f-openpyxl

parents 3c7177db 8fef480b
1 merge request: !2 FIX: Use openpyxl instead of xlrd
Pipeline #8037 failed
@@ -34,7 +34,7 @@ For testing:
 3. Start an empty (!) CaosDB instance (with the mounted extroot). The
    database will be cleared during testing, so it's important to use
    an empty instance.
-4. Run `test.sh`.
+4. Run `test.sh`. Note that this may modify content of the `integrationtest/extroot/` directory.
 ## Code Formatting
 `autopep8 -i -r ./`
......
@@ -43,7 +43,7 @@ except ModuleNotFoundError:
         return argparse.ArgumentParser()
 def print_success(text):
-    print("Success: "+text)
+    print("Success: " + text)
 def get_parser():
......
 ---
 responsible:
 - Only Responsible
-description: A description of another example analysis.
+description: A description of this example analysis.
 sources:
 - file: "/ExperimentalData/2010_TestProject/2019-02-03/*.dat"
......
 ---
 responsible:
 - Only Responsible
-description: A description of another example experiment.
+description: A description of this example experiment.
 results:
 - file: "/ExperimentalData/2010_TestProject/2019-02-03/*.dat"
......
 ---
 responsible:
 - Only Responsible
-description: A description of another example experiment.
+description: A description of this example experiment.
 sources:
 - /DataAnalysis/2010_TestProject/2019-02-03/results.pdf
......
 ---
 responsible:
 - Only Responsible
-description: A description of another example experiment.
+description: A description of this example experiment.
 results:
 - file: "*.dat"
......
@@ -156,6 +156,7 @@ def setup_package():
         author_email='h.tomwoerden@indiscale.com',
         install_requires=["caosdb>=0.4.0",
                           "openpyxl>=3.0.0",
+                          "pandas>=1.2.0",
                           "xlrd>=2.0",
                           ],
         packages=find_packages('src'),
......
@@ -310,8 +310,6 @@ class Crawler(object):
         if self.interactive and "y" != input("Do you want to continue? (y)"):
             return
-        logger.info("Inserting or updating Records...")
         for cfood in cfoods:
             try:
                 cfood.create_identifiables()
@@ -544,6 +542,10 @@ carefully and if the changes are ok, click on the following link:
         logger.debug(cfood.to_be_updated)
         try:
+            if len(cfood.to_be_updated) > 0:
+                logger.info(
+                    "Updating {} Records...".format(
+                        len(cfood.to_be_updated)))
             guard.safe_update(cfood.to_be_updated, unique=False)
         except FileNotFoundError as e:
             logger.info("Cannot access {}. However, it might be needed for"
@@ -605,6 +607,9 @@ carefully and if the changes are ok, click on the following link:
             logger.debug("No new entities to be inserted.")
         else:
             try:
+                logger.info(
+                    "Inserting {} Records...".format(
+                        len(missing_identifiables)))
                 guard.safe_insert(missing_identifiables, unique=False)
             except Exception as e:
                 DataModelProblems.evaluate_exception(e)
......
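The two logging hunks above share one pattern: report how many Records are affected right before the guarded call, instead of printing one generic message up front. A minimal sketch of that pattern follows. It is not the crawler's actual code: `log_and_apply` and `apply_fn` are hypothetical names introduced here, and `apply_fn` merely stands in for a call such as `guard.safe_update`.

```python
import logging

logger = logging.getLogger("crawler_sketch")  # hypothetical logger name


def log_and_apply(records, action, apply_fn):
    """Log the number of affected records, then apply `apply_fn` to them.

    Empty lists are skipped, so no "Updating 0 Records..." message appears.
    Returns the number of records that were passed on to `apply_fn`.
    """
    if len(records) == 0:
        logger.debug("No records for action %r.", action)
        return 0
    logger.info("%s %d Records...", action, len(records))
    apply_fn(records)
    return len(records)


# Usage with a stand-in for the guarded update call:
applied = []
log_and_apply(["rec_a", "rec_b"], "Updating", applied.extend)
```

Logging the count next to the call keeps the message accurate when only one of the two operations actually runs.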
@@ -75,7 +75,7 @@ The crawler can be executed directly via a python script (usually called
 ``crawl.py``). The script prints the progress and reports potential
 problems. The exact behavior depends on your setup. However, you can
 have a look at the example in the
-`tests <https://gitlab.com/caosdb/caosdb-advanced-user-tools/-/blob/main/integrationtests/full_test/crawl.py>`__.
+`tests <https://gitlab.indiscale.com/caosdb/src/caosdb-advanced-user-tools/-/blob/main/integrationtests/crawl.py>`__.
 .. Note:: The crawler depends on the CaosDB Python client, so make sure to install :doc:`pycaosdb
    <caosdb-pylib:getting_started>`.
@@ -86,14 +86,18 @@ Typically, an invocation looks like:
 .. code:: python
-   python3 crawl.py "/TestData/"
+   python3 crawl.py /someplace/
-In this case ``/TestData/`` identifies the path to be crawled **within
-the CaosDB file system**. You can browse the CaosDB file system by
+.. Note:: For trying out the above mentioned example crawler from the integration tests,
+   make sure that the ``extroot`` directory in the ``integrationtests`` folder is used as
+   CaosDB's extroot directory, and call the crawler indirectly via ``./test.sh``.
+In this case ``/someplace/`` identifies the path to be crawled **within
+CaosDB's file system**. You can browse the CaosDB file system by
 opening the WebUI of your CaosDB instance and clicking on “File System”.
 In the backend, ``crawl.py`` starts a CQL query
-``FIND File WHICH IS STORED AT /TestData/**`` and crawls the resulting
+``FIND File WHICH IS STORED AT /someplace/**`` and crawls the resulting
 files according to your customized ``CFoods``.
 Crawling may consist of two distinct steps: 1. Insertion of files (use
......
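The documentation hunk above states that ``crawl.py`` starts a CQL query of the form ``FIND File WHICH IS STORED AT /someplace/**``. As a rough illustration of how such a query string could be assembled from the path argument, here is a small sketch. ``build_crawl_query`` is a hypothetical helper introduced for this example; it is not part of the crawler or of the CaosDB Python client, and it only builds the string without contacting a server.

```python
def build_crawl_query(path: str) -> str:
    """Return a CQL query string selecting all files stored below `path`.

    `path` refers to a location within CaosDB's file system, not to the
    local disk. Trailing slashes are normalised before `/**` is appended.
    """
    return "FIND File WHICH IS STORED AT {}/**".format(path.rstrip("/"))


print(build_crawl_query("/someplace/"))
# prints: FIND File WHICH IS STORED AT /someplace/**
```

Stripping the trailing slash first means ``/someplace`` and ``/someplace/`` yield the same query.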