CaosDB shall allow search on data stored in other sources.
- Research how such data virtualization can be achieved; 2 people (E)
- Decide on the two best options and outline how each can be implemented
- Discuss options in ST
- Roadmap for implementation
How can rights be preserved?
FOSS Software for Data Virtualization
- Apache Drill: inspired by Google's Dremel, SQL <-> * (active FOSS project, seems well suited)
- Trino (formerly PrestoSQL): originated at Facebook, SQL <-> * (active FOSS project; focus on large-scale distributed systems?)
- PrestoDB: originated at Facebook, SQL <-> * (use Trino instead?)
- JBoss Enterprise Data Services Platform: part of JBoss Enterprise SOA Platform, SQL/XQuery <-> * (use Teiid instead?)
- OpenLink Virtuoso Universal Server: merger of Kubl and OpenLink, SQL/SPARQL <-> * (looks old-fashioned; documentation in bad shape, few contributions to the FOSS repo)
- Teiid (FOSS without community contributions, perhaps because Red Hat maintains it?)
We should check which data sources (*) each of these systems actually supports.
Two Architectural Ideas
Server As Delegator
graph TD
    A[CaosDB Server] --- B[Legacy MySQL Backend]
    A --- C[Virtualization]
    C --- D[SQL]
    C --- E[NoSQL]
    C --- F[RDF]
This allows easy prototyping and will be used for the next step.
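The delegator idea can be sketched in a few lines of Python. This is only an illustration under assumptions: the names `Source`, `SqlSource`, `DelegatingServer` and `make_source` are hypothetical and not part of CaosDB, and in-memory SQLite databases stand in for both the legacy MySQL backend and the SQL endpoint of a virtualization layer such as Drill or Trino.

```python
from __future__ import annotations

import sqlite3
from typing import Protocol


class Source(Protocol):
    """Hypothetical interface: anything the server can forward a search to."""

    def search(self, term: str) -> list[dict]: ...


class SqlSource:
    """Wraps a DB-API connection (stand-in for legacy backend or virtualization layer)."""

    def __init__(self, name: str, conn: sqlite3.Connection, table: str) -> None:
        self.name = name
        self.conn = conn
        self.table = table  # hard-coded by the caller, never user input

    def search(self, term: str) -> list[dict]:
        rows = self.conn.execute(
            f"SELECT id, name FROM {self.table} WHERE name LIKE ? ORDER BY id",
            (f"%{term}%",),
        ).fetchall()
        # Tag every hit with its origin so the caller can tell the sources apart.
        return [{"source": self.name, "id": r[0], "name": r[1]} for r in rows]


class DelegatingServer:
    """Forwards each search to all configured sources and concatenates the hits."""

    def __init__(self, sources: list[Source]) -> None:
        self.sources = sources

    def search(self, term: str) -> list[dict]:
        results: list[dict] = []
        for source in self.sources:
            results.extend(source.search(term))
        return results


def make_source(name: str, table: str, rows: list[tuple[int, str]]) -> SqlSource:
    """Build an in-memory SQLite database with some demo rows (assumption: demo data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(f"CREATE TABLE {table} (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(f"INSERT INTO {table} (id, name) VALUES (?, ?)", rows)
    return SqlSource(name, conn, table)


legacy = make_source("legacy", "entities", [(1, "Experiment A"), (2, "Sample B")])
virtual = make_source("virtualization", "external_records", [(7, "Experiment C")])
server = DelegatingServer([legacy, virtual])
hits = server.search("Experiment")
```

The legacy backend stays untouched in this layout, which is why it lends itself to prototyping: the virtualization source can be added or removed without changing how existing data is queried.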
Virtualization Layer replaces Legacy Backend
graph TD
    A[CaosDB Server]
    A --- C[Virtualization]
    C --- B[Legacy MySQL Backend]
    C --- D[SQL]
    C --- E[NoSQL]
    C --- F[RDF]
Due to differences between CQL and SQL, we would lose significant capabilities here.
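To illustrate what this second layout implies, the following sketch uses a single SQLite connection with attached databases as a stand-in for the virtualization layer. This is an assumption made purely for illustration: it is not how Drill, Trino or Teiid are configured, and the schema and table names (`legacy_db`, `external_db`, `entities`, `records`) are hypothetical. The point is that the server would see one SQL endpoint only, so every query it wants to run has to be expressible in SQL.

```python
import sqlite3

# One connection plays the role of the virtualization layer's single SQL endpoint.
layer = sqlite3.connect(":memory:")

# Attach two independent in-memory databases: one stands in for the legacy
# backend, the other for an external data source.
layer.execute("ATTACH DATABASE ':memory:' AS legacy_db")
layer.execute("ATTACH DATABASE ':memory:' AS external_db")

layer.execute("CREATE TABLE legacy_db.entities (id INTEGER, name TEXT)")
layer.execute("CREATE TABLE external_db.records (id INTEGER, name TEXT)")
layer.execute("INSERT INTO legacy_db.entities VALUES (1, 'Experiment A')")
layer.execute("INSERT INTO external_db.records VALUES (7, 'Experiment C')")

# The server can only reach its own data through the same SQL interface
# that it uses for the external sources.
federated = layer.execute(
    """
    SELECT 'legacy' AS source, id, name FROM legacy_db.entities
    UNION ALL
    SELECT 'external' AS source, id, name FROM external_db.records
    ORDER BY id
    """
).fetchall()
```

Anything the server currently evaluates in CQL against its own backend would have to be translated into queries of this form, which is where capabilities could be lost.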