From fed4c9f7c0479cd8f1ed0bed10743b3ef984c42d Mon Sep 17 00:00:00 2001
From: Daniel <daniel@harvey>
Date: Wed, 11 Mar 2020 12:02:44 +0100
Subject: [PATCH] WIP DOC: Added table documentation.

---
 doc/table_structure.md | 163 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 163 insertions(+)
 create mode 100644 doc/table_structure.md

diff --git a/doc/table_structure.md b/doc/table_structure.md
new file mode 100644
index 0000000..3ec27dc
--- /dev/null
+++ b/doc/table_structure.md
@@ -0,0 +1,163 @@
+This document describes the SQL tables used by CaosDB and how they represent the
+internal data structure.
+
+# Tables #
+
+## entities ##
+All entities (RecordTypes, Records, Properties, ...) have an entry here.  The
+columns are:
+- `id` :: The (unique) ID of this entity.
+- `name` :: Name of the entity.
+- `description` :: A description of the entity.
+- `role` :: The role of the entity (e.g. data type, RecordType, Property,
+  Record, ...).
+- `acl` :: Access control ID.
+
+## isa ##
+
+Inheritance as in "A is a B" is stored in this table.  The columns:
+
+- `child` :: The child entity.
+- `parent` :: The parent entity.
+- `type` :: The inheritance type (either `INHERITANCE` or `SUBTYPING` (what's
+  "subtyping"?)).
+
+**TODO** Why is this table empty on the default example?
+
+## isa_cache ##
+
+Here, information about nested inheritance is cached (upon creation?).
+
+- `child` :: The child entity.
+- `parent` :: The parent entity.
+- `rpath` :: Possible intermediate steps on a path from `child` to `parent`.
+  See the *Inheritance* section for more details.
+
+## foo_data ##
+Property data is stored here, in tables named after the data type:
+- date_data
+- datetime_data
+- double_data
+- integer_data
+- text_data
+- enum_data
+- name_data
+- null_data
+- reference_data
+
+These tables share (generally) the same set of columns:
+- `domain_id` :: The domain of an item, see the *multipurpose subdomains*
+  section.  For "normal" data, this is 0.
+- `entity_id` :: The entity to which this property belongs.  In most cases, this
+  is a "normal" entity like a `Record` or `RecordType`.
+- `property_id` :: In most cases, the property to which the value belongs.
+- `value` :: The value of the property.
+- `status` :: The importance of the property (?).
+- `pidx` :: The property index, becomes nonzero when a property occurs more than
+  once in an entity.
+
+## data_type ##
+The type of properties is stored here.  The columns are:
+- `domain_id` :: Property identitification, same as for plain data?
+- `entity_id` :: Property identitification, same as for plain data?
+- `property_id` :: Property identitification, same as for plain data?
+- `datatype` :: The data type of the property, a reference to an entity.
+
+## files ##
+
+- `file_id` :: ID.
+- `path` :: (Relative) path to the file.
+- `size` :: Size in bytes.
+- `hash` :: Hash of the file contents, as binary. **TODO** Which algorithm?
+- `checked_timestamp` :: Timestamp when last checked? 
+
+## TODO ##
+
+- entity_acl
+- groups
+- logging
+- desc_overrides
+- name_overrides
+- passwd
+- passwords
+- permissions
+- query_template_def :: User-defined query templates?
+- roles :: possible user roles
+- rules :: SQL action/verification hooks?
+- stats :: benchmarking?
+- transaction_log :: benchmarking?
+- units_lin_con :: fraction and unit conversion?
+- user_info :: Metadata for (local?) user accounts
+- user_roles :: User permissions?
+
+# Inheritance caching #
+
+The `isa_cache` table does not only store direct inheritances, but also nested
+parentships.  This is done by adding intermediate inheritance steps in the
+`rpath` column:
+
+- For *direct inheritance*, the `rpath` column contains the child itself.
+- For all other *possible paths* between the child and parent, it contains the
+  entities between, separated by the `>` character.
+  
+## Example ##
+
+Consider this inheritance structure and the resulting table:
+
+```
+   0    Read this from top to bottom: "1 is a 0",
+  / \   "2 is a 0", and so on.
+ 1   2
+  \ /
+   3
+   |
+   4
+   |
+   5
+```
+
+|-------|--------|-------|
+| child | parent | rpath |
+|-------|--------|-------|
+| 1     | 0      | 1     |
+| 2     | 0      | 2     |
+| 3     | 0      | 1     |
+| 3     | 0      | 2     |
+| 3     | 1      | 3     |
+| 3     | 2      | 3     |
+| 4     | 0      | 3>1   |
+| 4     | 0      | 3>2   |
+| 4     | 1      | 3     |
+| 4     | 2      | 3     |
+| 4     | 3      | 4     |
+| 5     | 0      | 4>3>1 |
+| 5     | 0      | 4>3>2 |
+| 5     | 1      | 4>3   |
+| 5     | 2      | 4>3   |
+| 5     | 3      | 4     |
+| 5     | 4      | 5     |
+|-------|--------|-------|
+
+
+# Multipurpose subdomains #
+Multipurpose subdomains are generically used to work with composite
+properties. Currently implemented examples are:
+- Properties with units (if multiple Properties exist)
+- Lists
+- Name overrides
+
+## Example ##
+Let's have a look at this *Record* (simplified XML):
+
+```xml
+<R1>
+    <P1 name="Comment">Hello World</P1>
+    <P2 name="voltage" unit="V">
+        23
+    </P2>
+    <P3 comment="list of something">
+        V1, V2, V3, V4, ...
+    </P3>
+</R1>
+```
+
-- 
GitLab