MAINT: allow to start the crawler with generic structure elements
Summary
Early tests were always done with dictionaries. However, it should be possible to start the crawler using any other StructureElement as root. This required a small change in crawl.py, where the first directory was previously treated specially.
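For illustration, the intended behavior can be sketched as follows. This is a minimal, self-contained sketch and not the actual crawl.py code: the classes are hypothetical stand-ins for the crawler's structure elements, and `normalize_items` is a made-up helper name used only here.

```python
from typing import List, Union


class StructureElement:
    """Hypothetical stand-in for the crawler's StructureElement base class."""

    def __init__(self, name: str):
        self.name = name


class Directory(StructureElement):
    """Hypothetical stand-in for a directory element."""


class DictElement(StructureElement):
    """Hypothetical stand-in for a non-directory element."""


def normalize_items(
    items: Union[List[StructureElement], StructureElement],
) -> List[StructureElement]:
    """Return the root items as a list.

    A single element is wrapped in a list; a list is passed through
    unchanged. No check for Directory is performed, so any
    StructureElement can serve as root.
    """
    if not isinstance(items, list):
        items = [items]
    return items


# Usage: both call styles yield a list that a crawl loop could iterate over.
root = DictElement("root")
single = normalize_items(root)
several = normalize_items([Directory("data"), root])
```

With this shape, the caller decides which elements form the root level, instead of the crawler deriving them from a directory.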
Focus
Unfortunately, this change requires changes to the cfood.yml files (the root directory needs to be added) and to the unit tests (one more match exists).
You might want to look at the changes using: git diff dev -w
Test Environment
How to set up a test environment for manual testing?
Check List for the Author
Please, prepare your MR for a review. Be sure to write a summary and a focus and create gitlab comments for the reviewer. They should guide the reviewer through the changes, explain your changes and also point out open questions. For further good practices have a look at our review guidelines
- All automated tests pass
- Reference related issues
- Up-to-date CHANGELOG.md (or not necessary)
- Annotations in code (Gitlab comments)
  - Intent of new code
  - Problems with old code
  - Why this implementation?
Check List for the Reviewer
- I understand the intent of this MR
- All automated tests pass
- Up-to-date CHANGELOG.md (or not necessary)
- The test environment setup works and the intended behavior is reproducible in the test environment
- In-code documentation and comments are up-to-date.
- Check: Are there specifications? Are they satisfied?
For further good practices have a look at our review guidelines.
Activity
assigned to @henrik
         return local_converters

-    def start_crawling(self, item: StructureElement,
+    def start_crawling(self, items, Union[list[StructureElement], StructureElement]
Edited by Alexander Schlemmer
changed this line in version 3 of the diff

-        if not isinstance(item, Directory):
-            raise NotImplementedError("Currently only directories are supported as items.")
-
         if self.generalStore is None:
             raise RuntimeError("Should not happen.")

+        if not isinstance(items, list):
+            items = [items]
+
         local_converters = Crawler.create_local_converters(crawler_definition,
                                                            converter_registry)
         # This recursive crawling procedure generates the update list:
         self.updateList: list[db.Record] = []
-        self._crawl(DirectoryConverter.create_children_from_directory(item),
+        self._crawl(items,
mentioned in commit bb370291