ENH: HDF5 Converter

Florian Spreckelsen requested to merge f-hdf5-converter into dev

Summary

Implements https://gitlab.com/linkahead/linkahead-crawler/-/issues/70 (#101 (closed)). Adds converters for HDF5 files that are already being used in the 3dmmto setup.

Focus

The new converters try to keep HDF5 files as close as possible to dictionaries, with the exception of included datasets and ndarrays. We don't even try to store the latter in LinkAhead directly; instead, we create records corresponding to each ndarray that contain at least the internal path as a property. People may thus search for the ndarray record with respect to groups, datasets, attributes, and all other LinkAhead information, and then use the internal path property to access the array directly in the HDF5 file (which they can download via LinkAhead, of course).
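
To illustrate the idea (this is not the converter code from this MR), here is a minimal sketch that walks an HDF5 file with h5py and maps it to a nested dictionary, replacing each array dataset with a small placeholder that carries its internal path. The helper names `hdf5_to_dict`, `_plain`, and `build_demo_tree`, the `"attrs"` and `"internal_path"` keys, and the demo file layout are all assumptions made for this example.

```python
import h5py
import numpy as np


def _plain(value):
    """Convert h5py/NumPy values to plain Python objects (illustrative helper)."""
    if isinstance(value, bytes):
        return value.decode("utf-8")
    if isinstance(value, np.generic):
        return value.item()
    if isinstance(value, np.ndarray):
        return value.tolist()
    return value


def hdf5_to_dict(node):
    """Recursively map an h5py group to a nested dict.

    Hypothetical layout: attributes go under the "attrs" key (a child that
    is itself named "attrs" would collide; ignored in this sketch).
    """
    result = {"attrs": {key: _plain(val) for key, val in node.attrs.items()}}
    for name, child in node.items():
        if isinstance(child, h5py.Group):
            result[name] = hdf5_to_dict(child)
        elif isinstance(child, h5py.Dataset):
            if child.shape == ():
                # Scalar dataset: keep the value inline.
                result[name] = _plain(child[()])
            else:
                # Array dataset: do not copy the data, only remember
                # where it lives inside the file.
                result[name] = {
                    "internal_path": child.name,
                    "shape": child.shape,
                    "dtype": str(child.dtype),
                }
    return result


def build_demo_tree():
    """Create a small in-memory HDF5 file and convert it."""
    with h5py.File("demo.h5", "w", driver="core", backing_store=False) as f:
        group = f.create_group("measurement")
        group.attrs["operator"] = "alice"
        group.create_dataset("temperature", data=21.5)
        group.create_dataset("values", data=np.zeros((2, 3)))
        return hdf5_to_dict(f)
```

With a placeholder like this, a user who has found the matching record could download the file and read the array with `h5py.File(path)[internal_path][()]`.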

Test Environment

In principle, the new unit test is sufficient. Feel free to also test it with the local-testing-main branch of 3dmmto's server profile by crawling their test data in our Nextcloud (you may need to adjust some extroot paths in the profile, though).

Check List for the Author

Please prepare your MR for a review. Be sure to write a summary and a focus and create GitLab comments for the reviewer. They should guide the reviewer through the changes, explain your changes, and also point out open questions. For further good practices, have a look at our review guidelines.

  • All automated tests pass
  • Reference related issues
  • Up-to-date CHANGELOG.md (or not necessary)
  • Up-to-date JSON schema (or not necessary)
  • Appropriate user and developer documentation (or not necessary)
    • How do I use the software? Assume "stupid" users.
    • How do I develop or debug the software? Assume novice developers.
  • Annotations in code (GitLab comments)
    • Intent of new code
    • Problems with old code
    • Why this implementation?

Check List for the Reviewer

  • I understand the intent of this MR
  • All automated tests pass
  • Up-to-date CHANGELOG.md (or not necessary)
  • Appropriate user and developer documentation (or not necessary)
  • The test environment setup works and the intended behavior is reproducible in the test environment
  • In-code documentation and comments are up-to-date.
  • Check: Are there specifications? Are they satisfied?

For further good practices, have a look at our review guidelines.

Edited by Henrik tom Wörden