Skip to content
Snippets Groups Projects
Commit 29eefbba authored by Henrik tom Wörden's avatar Henrik tom Wörden
Browse files

DOC: add query language spec

parent a5d870ff
Branches
Tags
No related merge requests found
**Table of Contents**
[[_TOC_]]
# Example queries
## Simple FIND Query
The following query will return any entity which has the name _ename_ and all its children.
`FIND ename`
The following queries are equivalent and will return any entity which has the name _ename_ and all its children, but only if they are genuin records. Of course, the returned set of entities (henceforth referred to as _resultset_) can also be restricted to recordtypes, properties and files.
`FIND RECORD ename`
`FIND RECORDS ename`
Wildcards use `*` for any characters or none at all. Wildcards for single characters (like the '_' wildcard from mysql) are not implemented yet.
`FIND RECORD en*` returns any entity which has a name beginning with _en_.
Regular expressions must be surrounded by _<<_ and '>>':
`FIND RECORD <<e[aemn]{2,5}>>`
`FIND RECORD <<[cC]am_[0-9]*>>`
*TODO* (Timm):
Describe escape sequences like `\\`, `\*`, `\<<` and `\>>`.
Currently, wildcards and regular expressions are only available for the _simple-find-part_ of the query, i. e. no wildcards/regexps for filters.
## Simple COUNT Query
This query counts entities which have certain properties.
`COUNT ename`
will return the number of entities which have the name _ename_ and all their children.
The syntax of the COUNT queries is equivalent to the FIND queries in any respect (this also applies to wildcards and regular expressions) but one: The prefix is to be `COUNT` instead of `FIND`.
Unlike the FIND queries, the COUNT queries do not return any entities. The result of the query is the number of entities which _would be_ returned if the query was a FIND query.
## Filters
### POV - Property-Operator-Value
The following queries are equivalent and will restrict the result set to entities which have a property named _pname1_ that has a value _val1_.
`FIND ename.pname1=val1`
`FIND ename WITH pname1=val1`
`FIND ename WHICH HAS A PROPERTY pname1=val1`
`FIND ename WHICH HAS A pname1=val1`
Again, the resultset can be restricted to records:
`FIND RECORD ename WHICH HAS A pname1=val1`
_currently known operators:_ `=, !=, <=, <, >=, >` (and cf. next paragraphes!)
#### Special Operator: LIKE
The _LIKE_ can be used with wildcards. The `*` is a wildcard for any (possibly empty) sequence of characters. Examples:
`FIND RECORD ename WHICH HAS A pname1 LIKE va*`
`FIND RECORD ename WHICH HAS A pname1 LIKE va*1`
`FIND RECORD ename WHICH HAS A pname1 LIKE *al1`
_Note:_ The _LIKE_ operator is will only produce expectable results with text properties.
#### Special Case: References
In general a reference can be addressed just like a POV filter. So
`FIND ename1.pname1=ename2`
will also return any entity named _ename1_ which references the entity with name or id _ename2_ via a reference property named _pname1_. However, it will also return any entity with a text property of that name with the string value _ename2_. In order to restrict the result set to reference properties one may make use of special reference operators:
_reference operators:_ `->, REFERENCES, REFERENCE TO`
The query looks like this:
`FIND ename1 WHICH HAS A pname1 REFERENCE TO ename2`
`FIND ename1 WHICH HAS A pname1->ename2`
#### Time Special Case: DateTime
_DateTime operators:_ `=, !=, <, >, IN, NOT IN`
##### `d1=d2`: Equivalence relation.
* ''True'' iff d1 and d2 are equal in every respect (same DateTime flavor, same fields are defined/undefined and all defined fields are equal respectively).
* ''False'' iff they have the same DateTime flavor but have different fields defined or fields with differing values.
* ''Undefined'' otherwise.
Examples:
* `2015-04-03=2015-04-03T00:00:00` is undefined.
* `2015-04-03T00:00:00=2015-04-03T00:00:00.0` is undefined (second precision vs. nanosecond precision).
* `2015-04-03T00:00:00.0=2015-04-03T00:00:00.0` is true.
* `2015-04-03T00:00:00=2015-04-03T00:00:00` is true.
* `2015-04=2015-05` is false.
* `2015-04=2015-04` is true.
##### `d1!=d2`: Intransitive, symmetric relation.
* ''True'' iff `d1=d2` is false.
* ''False'' iff `d1=d2` is true.
* ''Undefined'' otherwise.
Examples:
* `2015-04-03!=2015-04-03T00:00:00` is undefined.
* `2015-04-03T00:00:00!=2015-04-03T00:00:00.0` is undefined.
* `2015-04-03T00:00:00.0!=2015-04-03T00:00:00.0` is false.
* `2015-04-03T00:00:00!=2015-04-03T00:00:00` is false.
* `2015-04!=2015-05` is true.
* `2015-04!=2015-04` is false.
##### `d1>d2`: Transitive, non-symmetric relation.
Semantics depend on the flavors of d1 and d2. If both are...
###### [UTCDateTime](Datatype#datetime)
* ''True'' iff the time of d1 is after the the time of d2 according to [https://en.wikipedia.org/wiki/Coordinated_Universal_Time](UTC)
* ''False'' otherwise.
###### [SemiCompleteDateTime](Datatype#datetime)
* ''True'' iff `d1.ILB>d2.EUB` is true or `d1.ILB=d2.EUB` is true.
* ''False'' iff `d1.EUB<d2.ILB}} is true or {{{d1.EUB=d2.ILB` is true.
* ''Undefined'' otherwise.
Examples:
* `2015>2014` is true.
* `2015-04>2014` is true.
* `2015-01-01T20:15.00>2015-01-01T20:14` is true.
* `2015-04>2015` is undefined.
* `2015>2015-04` is undefined.
* `2015-01-01T20:15>2015-01-01T20:15:15` is undefined.
* `2014>2015` is false.
* `2014-04>2015` is false.
* `2014-01-01>2015-01-01T20:15:30` is false.
##### `d1<d2`: Transitive, non-symmetric relation.
Semantics depend on the flavors of d1 and d2. If both are...
###### [UTCDateTime](Datatype#datetime)
* ''True'' iff the time of d1 is before the the time of d2 according to [UTC](https://en.wikipedia.org/wiki/Coordinated_Universal_Time)
* ''False'' otherwise.
###### [SemiCompleteDateTime](Datatype#datetime)
* ''True'' iff `d1.EUB<d2.ILB` is true or `d1.EUB=d2.ILB` is true.
* ''False'' iff `d1.ILB>d2.EUB}} is true or {{{d1.ILB=d2.EUB` is true.
* ''Undefined'' otherwise.
Examples:
* `2014<2015` is true.
* `2014-04<2015` is true.
* `2014-01-01<2015-01-01T20:15:30` is true.
* `2015-04<2015` is undefined.
* `2015<2015-04` is undefined.
* `2015-01-01T20:15<2015-01-01T20:15:15` is undefined.
* `2015<2014` is false.
* `2015-04<2014` is false.
* `2015-01-01T20:15.00<2015-01-01T20:14` is false.
##### `d1 IN d2`: Transitive, non-symmetric relation.
Semantics depend on the flavors of d1 and d2. If both are...
###### [SemiCompleteDateTime](Datatype#datetime)
* ''True'' iff (`d1.ILB>d2.ILB` is true or `d1.ILB=d2.ILB` is true) and (`d1.EUB<d2.EUB` is true or `d1.EUB=d2.EUB` is true).
* ''False'' otherwise.
Examples:
* `2015-01-01 IN 2015` is true.
* `2015-01-01T20:15:30 IN 2015-01-01` is true.
* `2015-01-01T20:15:30 IN 2015-01-01T20:15:30` is true.
* `2015 IN 2015-01-01` is false.
* `2015-01-01 IN 2015-01-01T20:15:30` is false.
##### `d1 NOT IN d2`: Non-symmetric relation.
Semantics depend on the flavors of d1 and d2. If both are...
###### [SemiCompleteDateTime](Datatype#datetime)
* ''True'' iff `d1.ILB IN d2.ILB` is false.
* ''False'' otherwise.
Examples:
* `2015 NOT IN 2015-01-01` is true.
* `2015-01-01 NOT IN 2015-01-01T20:15:30` is true.
* `2015-01-01 NOT IN 2015` is false.
* `2015-01-01T20:15:30 NOT IN 2015-01-01` is false.
* `2015-01-01T20:15:30 NOT IN 2015-01-01T20:15:30` is false.
##### Note
These semantics follow a three-valued logic with ''true'', ''false'' and ''undefined'' as truth values. Only ''true'' is truth preserving. I.e. only those expressions which evaluate to ''true'' pass the POV filter. `FIND ... WHICH HAS A somedate=2015-01` only returns entities for which `somedate=2015-01` is true. On the other hand, `FIND ... WHICH DOESN'T HAVE A somedate=2015-01` returns entities for which `somedate=2015-01` is false or undefined. Shortly put, `NOT d1=d2` is not equivalent to `d1!=d2`. The latter assertion is stronger.
#### Omitting the Property or the Value
One doesn't have to specify the property or the value at all. The following query filters the result set for entities which have any property with a value greater than _val1_.
`FIND ename WHICH HAS A PROPERTY > val1`
`FIND ename . > val1`
`FIND ename.>val1`
And for references...
`FIND ename1 WHICH HAS A REFERENCE TO ename2`
`FIND ename1 WHICH REFERENCES ename2`
`FIND ename1 . -> ename2`
`FIND ename1.->ename2`
The following query returns entities which have a _pname1_ property with any value.
`FIND ename WHICH HAS A PROPERTY pname1`
`FIND ename WHICH HAS A pname1`
`FIND ename WITH pname1`
`FIND ename . pname1`
`FIND ename.pname1`
### TransactionFilter
*Definition*
sugar:: `HAS BEEN` | `HAVE BEEN` | `HAD BEEN` | `WAS` | `IS` |
negated_sugar:: `HAS NOT BEEN` | `HASN'T BEEN` | `WAS NOT` | `WASN'T` | `IS NOT` | `ISN'T` | `HAVN'T BEEN` |
`HAVE NOT BEEN` | `HADN'T BEEN` | `HAD NOT BEEN`
by_clause:: `BY (ME | username | SOMEONE ELSE (BUT ME)? | SOMEONE ELSE BUT username)`
date:: A date string of the form `YYYY-MM-DD`
datetime:: A datetime string of the form `YYYY-MM-DD hh:mm:ss`
time_clause:: `ON ($date|$datetime) ` Here is plenty of room for more syntactic sugar, e.g. a `TODAY` keyword, and more funcionality, e.g. ranges.
`FIND ename WHICH ($sugar|$negated_sugar)? (NOT)? (CREATED|INSERTED|UPDATED|DELETED) (by_clause time_clause?| time_clause by_clause?)`
*Examples*
`FIND ename WHICH HAS BEEN CREATED BY ME ON 2014-12-24`
`FIND ename WHICH HAS BEEN CREATED BY SOMEONE ELSE ON 2014-12-24`
`FIND ename WHICH HAS BEEN CREATED BY erwin ON 2014-12-24`
`FIND ename WHICH HAS BEEN CREATED BY SOMEONE ELSE BUT erwin ON 2014-12-24`
`FIND ename WHICH HAS BEEN CREATED BY erwin`
`FIND ename . CREATED BY erwin ON `
### File Location
Search for file objects by their location:
`FIND FILE WHICH IS STORED AT a/certain/path/`
#### Wildcards
_STORED AT_ can be used with wildcards similar to unix wildcards.
* `*` matches any characters or none at all, but not the directory separator `/`
* `**` matches any character or none at all.
* A leading `*` is a shortcut for `/**`
* Asterisks directly between two other asterisks are ignored: `***` is the same as `**`.
* Escape character: `\` (E.g. `\\` is a literal backslash. `\*` is a literal star. But `\\*` is a literal backslash followed by a wildcard.)
Examples:
Find any files ending with `.acq`:
`FIND FILE WHICH IS STORED AT *.acq` or
`FIND FILE WHICH IS STORED AT **.acq` or
`FIND FILE WHICH IS STORED AT /**.acq`
Find files stored one directory below `/data/`, ending with `.acq`:
`FIND FILE WHICH IS STORED AT /data/*/*.acq`
Find files stored in `/data/`, ending with `.acq`:
`FIND FILE WHICH IS STORED AT /data/*.acq`
Find files stored in a directory at any depth in the tree below `/data/`, ending with `.acq`:
`FIND FILE WHICH IS STORED AT /data/**.acq`
Find any file in a directory which begins with `2016-02`:
`FIND FILE WHICH IS STORED AT */2016-02*/*`
### Back References
The back reference filters for entities that are referenced by another entity. The following query returns entities of the type _ename1_ which are referenced by _ename2_ entities via the reference property _pname1_.
* `FIND ename1 WHICH IS REFERENCED BY ename2 AS A pname1`
* `FIND ename1 WITH @ ename2 / pname1`
* `FIND ename1 . @ ename2 / pname1`
One may omit the property specification:
* `FIND ename1 WHICH IS REFERENCED BY ename2`
* `FIND ename1 WHICH HAS A PROPERTY @ ename2`
* `FIND ename1 WITH @ ename2`
* `FIND ename1 . @ ename2`
### Combining Filters with Propositional Logic
Any result set can be filtered by logically combining POV filters or back reference filters:
#### Conjunction (AND)
* `FIND ename1 WHICH HAS A PROPERTY pname1=val1 AND A PROPERTY pname2=val2 AND A PROPERTY...`
* `FIND ename1 WHICH HAS A PROPERTY pname1=val1 AND A pname2=val2 AND ...`
* `FIND ename1 . pname1=val1 & pname2=val2 & ...`
#### Disjunction (OR)
* `FIND ename1 WHICH HAS A PROPERTY pname1=val1 OR A PROPERTY pname2=val2 Or A PROPERTY...`
* `FIND ename1 WHICH HAS A PROPERTY pname1=val1 OR A pname2=val2 OR ...`
* `FIND ename1 . pname1=val1 | pname2=val2 | ...`
#### Negation (NOT)
* `FIND ename1 WHICH DOES NOT HAVE A PROPERTY pname1=val1`
* `FIND ename1 WHICH DOESN'T HAVE A pname1=val1`
* `FIND ename1 . NOT pname2=val2`
* `FIND ename1 . !pname2=val2`
#### ... and combinations with parentheses
* `FIND ename1 WHICH HAS A pname1=val1 AND DOESN'T HAVE A pname2<val2 AND ((WHICH HAS A pname3=val3 AND A pname4=val4) OR DOES NOT HAVE A (pname5=val5 AND pname6=val6))`
* `FIND ename1 . pname1=val1 & !pname2<val2 & ((pname3=val3 & pname4=val4) | !(pname5=val5 & pname6=val6))`
* `FIND ename1.pname1=val1&!pname2<val2&((pname3=val3&pname4=val4)|!(pname5=val5&pname6=val6))`
### A Few Important Expressions
* A:: The indistinct article. This is only syntactic suger. Equivalent expressions: `A, AN`
* AND:: The logical _and_. Equivalent expressions: `AND, &`
* FIND:: The beginning of the query.
* NOT:: The logical negation. Equivalent expressions: `NOT, DOESN'T HAVE A PROPERTY, DOES NOT HAVE A PROPERTY, DOESN'T HAVE A, DOES NOT HAVE A, DOES NOT, DOESN'T, IS NOT, ISN'T, !`
* OR:: The logical _or_. Equivalent expressions: `OR, |`
* RECORD,RECORDTYPE,FILE,PROPERTY:: Role expression for restricting the result set to a specific role.
* WHICH:: The marker for the beginning of the filters. Equivalent expressions: `WHICH, WHICH HAS A, WHICH HAS A PROPERTY, WHERE, WITH, .`
* REFERENCE:: This one is tricky: `REFERENCE TO` expresses a the state of _having_ a reference property. `REFERENCED BY` expresses the state of _being_ referenced by another entity.
* COUNT:: `COUNT` works like `FIND` but doesn't return the entities.
## Versioning
Since Caosdb 0.2 entities are optionally version controlled. The query language will be extended to include versioning in the future. A current minimal implementation introduces the `ANY VERSION OF` modifier which can be used to return all matching versions in the results of `COUNT`, `FIND`, and `SELECT` queries.
### Example
* `FIND ANY VERSION OF RECORD WITH pname=value` returns the all past and present versions of records with `pname = value`.
### Scope and current limitations
* The `ANY VERSION OF` modifier currently the only expression for taking the versioning into account when using the query language.
* Subproperties are not supported yet, e.g. `FIND ANY VERSION OF ENTITY WHICH IS REFERENCED BY ename WITH ...`. This applies to all cases where you specify properties of *referenced* entities or *referencing* entities.
### Future
* Add `(LATEST|LAST|OLDEST|NEWEST|FIRST) VERSION OF` modifiers.
* Add `(ANY|LATEST|LAST|OLDEST|NEWEST|FIRST) VERSION (BEFORE|AFTER) (<timestamp>|<transaction id>|<entity@version>) OF` modifier.
* Add support for subproperties, e.g. `FIND ANY VERSION OF ENTITY WHICH IS REFERENCED BY ename WITH ...`.
# Future
* *Sub Queries* (or *Sub Properties*): `FIND ename WHICH HAS A pname WHICH HAS A subpname=val`. This is like: `FIND AN experiment WHICH HAS A camera WHICH HAS A 'serial number'= 1234567890`
* *More Logic*, especially `ANY`, `ALL`, `NONE`, and `SUCH THAT` key words (and equivalents) for logical quantisation: `FIND ename1 SUCH THAT ALL ename2 WHICH HAVE A REFERENCE TO ename1 HAVE A pname=val`. This is like `FIND experiment SUCH THAT ALL person WHICH ARE REFERENCED BY THIS experiment AS conductor HAVE AN 'academic title'=professor.`
......@@ -11,6 +11,7 @@ Welcome to caosdb-server's documentation!
Getting started <README_SETUP>
Concepts <concepts>
tutorials
Query Language <CaosDB-Query-Language>
administration/*
development/*
API documentation<_apidoc/packages>
......
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Please register or to comment