F fix crawler overwrite

Merged. Alexander Schlemmer requested to merge f-fix-crawler-overwrite into dev

Summary

Fix for: https://gitlab.com/caosdb/caosdb-crawler/-/issues/23 and #47 (closed)

Test Environment

An integration test is provided. Manual testing requires the most recent dev branch of pylib; otherwise merge_entities lacks the force option.

Check List for the Author

Please prepare your MR for review. Be sure to write a summary and a focus, and create GitLab comments for the reviewer. They should guide the reviewer through the changes, explain your changes, and point out open questions. For further good practices, have a look at our review guidelines.

  • All automated tests pass
  • Reference related issues
  • Up-to-date CHANGELOG.md (or not necessary)
  • Annotations in code (GitLab comments)
    • Intent of new code
    • Problems with old code
    • Why this implementation?

Check List for the Reviewer

  • I understand the intent of this MR
  • All automated tests pass
  • Up-to-date CHANGELOG.md (or not necessary)
  • The test environment setup works and the intended behavior is reproducible in the test environment
  • In-code documentation and comments are up-to-date.
  • Check: Are there specifications? Are they satisfied?

For further good practices have a look at our review guidelines.

Edited by Henrik tom Wörden

Activity

  • Alexander Schlemmer changed the description

  • Problem: this merge request causes merge errors in synchronize of the form:

    E                           Entity a (559, None) has a Property 'project' with value=558
    E                           Entity b (559, None) has a Property 'project' with value=<Record id="558">

    This could probably be resolved by properly preparing the objects to be merged in crawl.py:748:

        # side effect
        record.id = identified_record.id
        # Copy over checksum and size too if it is a file
        if isinstance(record, db.File):
            record._size = identified_record._size
            record._checksum = identified_record._checksum

        # TODO: Replace values that are Records which have an ID just with the ID

        merge_entities(record, identified_record)
        to_be_updated.append(record)
        self.add_to_remote_existing_cache(record)
        del flat[i]
    resolved_references = True
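
    The TODO above could be approached roughly as follows. This is only a minimal sketch, not the crawler's or pylib's actual code: Rec and Prop are hypothetical stand-ins for db.Record and its properties, and replace_references_with_ids is an assumed helper name. It shows how normalizing reference values to plain IDs before merging would make the two 'project' values from the error message compare equal.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Prop:
    """Hypothetical stand-in for a property (name plus value)."""
    name: str
    value: Any


@dataclass
class Rec:
    """Hypothetical stand-in for db.Record; only id and properties matter here."""
    id: Optional[int] = None
    properties: list = field(default_factory=list)


def _to_id(value):
    # A referenced record that already has an ID is replaced by that ID;
    # plain values and records without an ID are left untouched.
    if isinstance(value, Rec) and value.id is not None:
        return value.id
    return value


def replace_references_with_ids(record):
    """Normalize reference values in place (assumed helper, not a pylib function)."""
    for prop in record.properties:
        if isinstance(prop.value, list):
            prop.value = [_to_id(v) for v in prop.value]
        else:
            prop.value = _to_id(prop.value)


# Mirrors the reported conflict: one side holds 558, the other a Record with id 558.
project = Rec(id=558)
a = Rec(id=559, properties=[Prop("project", 558)])
b = Rec(id=559, properties=[Prop("project", project)])
replace_references_with_ids(a)
replace_references_with_ids(b)
```

    In the real code such a step would presumably run on both record and identified_record directly before the merge_entities call; whether this fits the actual entity classes is an assumption that would need checking against pylib.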
  • Alexander Schlemmer marked this merge request as draft

  • Florian Spreckelsen removed review request for @florian

  • Florian Spreckelsen assigned to @florian and unassigned @salexan

  • added 14 commits

  • Florian Spreckelsen changed the description

  • added 1 commit

    • 53ce6836 - FIX: Use force merge with a deepcopy

  • Florian Spreckelsen resolved all threads
