Verified Commit c6cd0f3f authored by Timm Fitschen

Merge branch 'f-filesystem-link' into f-filesystem-import

parents 622106f2 42e48560
1 merge request: !77 Draft: ENH: file system: import
Pipeline #47567 failed
Showing with 132 additions and 1103 deletions
...@@ -15,6 +15,7 @@ ...@@ -15,6 +15,7 @@
# typical build dirs # typical build dirs
build/ build/
bin/ bin/
lib/
target/ target/
_apidoc/ _apidoc/
src/doc/development/api/xml/out/ src/doc/development/api/xml/out/
......
...@@ -28,6 +28,7 @@ guidelines](https://gitlab.com/caosdb/caosdb/-/blob/dev/REVIEW_GUIDELINES.md) ...@@ -28,6 +28,7 @@ guidelines](https://gitlab.com/caosdb/caosdb/-/blob/dev/REVIEW_GUIDELINES.md)
- [ ] Up-to-date CHANGELOG.md (or not necessary) - [ ] Up-to-date CHANGELOG.md (or not necessary)
- [ ] Up-to-date JSON schema (or not necessary) - [ ] Up-to-date JSON schema (or not necessary)
- [ ] Appropriate user and developer documentation (or not necessary) - [ ] Appropriate user and developer documentation (or not necessary)
- Update / write published documentation (`make doc`).
- How do I use the software? Assume "stupid" users. - How do I use the software? Assume "stupid" users.
- How do I develop or debug the software? Assume novice developers. - How do I develop or debug the software? Assume novice developers.
- [ ] Annotations in code (Gitlab comments) - [ ] Annotations in code (Gitlab comments)
...@@ -41,7 +42,8 @@ guidelines](https://gitlab.com/caosdb/caosdb/-/blob/dev/REVIEW_GUIDELINES.md) ...@@ -41,7 +42,8 @@ guidelines](https://gitlab.com/caosdb/caosdb/-/blob/dev/REVIEW_GUIDELINES.md)
- [ ] I understand the intent of this MR - [ ] I understand the intent of this MR
- [ ] All automated tests pass - [ ] All automated tests pass
- [ ] Up-to-date CHANGELOG.md (or not necessary) - [ ] Up-to-date CHANGELOG.md (or not necessary)
- [ ] Appropriate user and developer documentation (or not necessary) - [ ] Appropriate user and developer documentation (or not necessary), also in published
documentation.
- [ ] The test environment setup works and the intended behavior is reproducible in the test - [ ] The test environment setup works and the intended behavior is reproducible in the test
environment environment
- [ ] In-code documentation and comments are up-to-date. - [ ] In-code documentation and comments are up-to-date.
......
...@@ -5,10 +5,79 @@ All notable changes to this project will be documented in this file. ...@@ -5,10 +5,79 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/), The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html). and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
## [Unreleased] ## ## [Unreleased]
### Added
### Changed
### Deprecated
### Removed
### Fixed
### Security
## [0.12.1] - 2023-12-13
(Timm Fitschen)
### Fixed
* Insufficient permission checks during subproperty filters of SELECT queries
when an entity with retrieve permissions references one without
[linkahead-server#244](https://gitlab.com/linkahead/linkahead-server/-/issues/244)
* Insufficient permission checks in queries when a name of an invisible record
is used in a filter where a visible record references the invisible one
[linkahead-server#242](https://gitlab.com/linkahead/linkahead-server/-/issues/242)
### Security
This is an important security patch release. The bugs
[linkahead-server#244](https://gitlab.com/linkahead/linkahead-server/-/issues/244)
and
[linkahead-server#242](https://gitlab.com/linkahead/linkahead-server/-/issues/242)
possibly leak sensitive data when an attacker with read access to linkahead
(i.e. the attacker needs an active user account or anonymous needs to be
enabled) can guess the name of entities or properties of referenced entities
and construct a malicious FIND or SELECT statement and when the attacker has
read permissions for an entity which references the entities containing the
sensitive information. See the bug reports for more information.
## [0.12.0] - 2023-10-25
(Timm Fitschen)
### Fixed
* `FIND ENTITY <ID> is broken`.
[linkahead-server#323](https://gitlab.indiscale.com/caosdb/src/caosdb-server/-/issues/323)
* Unknown Server Error when inserting an Entity.
[linkahead-mariadbbackend](https://gitlab.indiscale.com/caosdb/src/caosdb-mysqlbackend/-/issues/48)
## [0.11.0] 2023-10-13 ##
### Added ### ### Added ###
* Configuration options `REST_RESPONSE_LOG_FORMAT` and
`GRPC_RESPONSE_LOG_FORMAT` which control the format and information included
in the log message of any response of the respective API. See
`conf/core/server.conf` for more information.
* REST API: Permanent redirect from "FileSystem" to "FileSystem/".
### Fixed ###
* Inheritance of the unit is not working. (GRPC API)
[linkahead-server#264](https://gitlab.indiscale.com/caosdb/src/caosdb-server/-/issues/264)
* Curly brackets in query lead to unexpected server error.
[linkahead-server#138](https://gitlab.com/linkahead/linkahead-server/-/issues/138)
* Wrong url returned by FileSystem resource behind proxy.
* `NullPointerException` in GRPC API converters when executing SELECT query on
NULL values.
* Fix parsing of decimal numbers. Fixes https://gitlab.com/linkahead/linkahead-server/-/issues/239
## [0.10.0] - 2023-06-02 ##
(Florian Spreckelsen)
### Changed ### ### Changed ###
* The default behavior of the query `FIND SomeName [...]` (as well as COUNT and SELECT) is being * The default behavior of the query `FIND SomeName [...]` (as well as COUNT and SELECT) is being
...@@ -26,12 +95,16 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ...@@ -26,12 +95,16 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
* The InsertFilesInDir FlagJob now creates File entities without a name. The previous behavior * The InsertFilesInDir FlagJob now creates File entities without a name. The previous behavior
caused severe performance problems for very large numbers of files. Issue: [#197](https://gitlab.com/caosdb/caosdb-server/-/issues/197) caused severe performance problems for very large numbers of files. Issue: [#197](https://gitlab.com/caosdb/caosdb-server/-/issues/197)
### Deprecated ###
### Removed ###
### Fixed ### ### Fixed ###
* Unexpected Server Error when inserting an Entity.
[#216](https://gitlab.com/caosdb/caosdb-server/-/issues/216)
* Bad performance due to the execution of unnecessary jobs during retrieval.
[#189](https://gitlab.com/caosdb/caosdb-server/-/issues/189)
* Query Language: Parentheses change filter to subproperty filter
[#203](https://gitlab.com/caosdb/caosdb-server/-/issues/203)
* Searching for values in scientific notation
[#143](https://gitlab.com/caosdb/caosdb-server/-/issues/143)
* Denying a role permission has no effect * Denying a role permission has no effect
[#196](https://gitlab.com/caosdb/caosdb-server/-/issues/196). See security [#196](https://gitlab.com/caosdb/caosdb-server/-/issues/196). See security
notes below. notes below.
...@@ -52,6 +125,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ...@@ -52,6 +125,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- Nested queries. - Nested queries.
- Global entity permissions. - Global entity permissions.
- DOC: Data model tutorial.
- Removed old documentation directory `/doc/`, migrated non-duplicate content to `/src/doc/`.
## [0.9.0] - 2023-01-19 ## [0.9.0] - 2023-01-19
......
...@@ -23,6 +23,6 @@ authors: ...@@ -23,6 +23,6 @@ authors:
given-names: Stefan given-names: Stefan
orcid: https://orcid.org/0000-0001-7214-8125 orcid: https://orcid.org/0000-0001-7214-8125
title: "CaosDB - Server" title: "CaosDB - Server"
version: 0.8.1 version: 0.12.1
doi: 10.3390/data4020083 doi: 10.3390/data4020083
date-released: 2022-11-07 date-released: 2023-12-13
...@@ -3,7 +3,7 @@ ...@@ -3,7 +3,7 @@
## For Building and Running the Server ## For Building and Running the Server
* `>=caosdb-proto 0.3.0` * `>=caosdb-proto 0.3.0`
* `>=caosdb-mysqlbackend 5.0.0` * `>=caosdb-mysqlbackend 7.0.0`
* `>=Java 11` * `>=Java 11`
* `>=Apache Maven 3.6.0` * `>=Apache Maven 3.6.0`
* `>=Make 4.2` * `>=Make 4.2`
...@@ -13,7 +13,7 @@ ...@@ -13,7 +13,7 @@
## For Deploying a Web User Interface (optional) ## For Deploying a Web User Interface (optional)
* `>=caosdb-webui 0.8.0` * `>=caosdb-webui 0.13.0`
## For Building the Documentation (optional) ## For Building the Documentation (optional)
......
...@@ -163,6 +163,9 @@ sources (if you called `make run` previously). ...@@ -163,6 +163,9 @@ sources (if you called `make run` previously).
`$ make test` `$ make test`
You can run a single unit test with
`mvn test -X -Dtest=TestCQL#testDecimalNumber`
## Setup Eclipse ## Setup Eclipse
...@@ -238,6 +241,7 @@ Stand-alone documentation is built using Sphinx: `make doc` ...@@ -238,6 +241,7 @@ Stand-alone documentation is built using Sphinx: `make doc`
- recommonmark - recommonmark
- sphinx - sphinx
- sphinx-rtd-theme - sphinx-rtd-theme
- sphinx-a4doc
- sphinxcontrib-plantuml - sphinxcontrib-plantuml
- javasphinx :: `pip3 install --user javasphinx` - javasphinx :: `pip3 install --user javasphinx`
- Alternative, if javasphinx fails because python3-sphinx is too recent: - Alternative, if javasphinx fails because python3-sphinx is too recent:
......
...@@ -26,23 +26,25 @@ guidelines of the CaosDB Project ...@@ -26,23 +26,25 @@ guidelines of the CaosDB Project
5. Merge the release branch into the main branch. 5. Merge the release branch into the main branch.
6. Tag the latest commit of the main branch with `v<VERSION>`. 6. Wait for the main branch pipelines to pass.
7. Delete the release branch. 7. Tag the latest commit of the main branch with `v<VERSION>`.
8. Merge the main branch back into the dev branch. 8. Delete the release branch.
9. Update the versions for the next developement round: 9. Merge the main branch back into the dev branch.
10. Update the versions for the next developement round:
* [pom.xml](./pom.xml) with a `-SNAPSHOT` suffix * [pom.xml](./pom.xml) with a `-SNAPSHOT` suffix
* `src/doc/conf.py` * `src/doc/conf.py`
* `CHANGELOG.md`: Re-add the `[Unreleased]` section. * `CHANGELOG.md`: Re-add the `[Unreleased]` section.
10. Add a gitlab release in the respective repository: 11. Add a gitlab release in the respective repository:
https://gitlab.indiscale.com/caosdb/src/caosdb-server/-/releases https://gitlab.com/linkahead/linkahead-server/-/releases
Add a description, which can be a copy&paste from the CHANGELOG, possibly prepended by: Add a description, which can be a copy&paste from the CHANGELOG, possibly prepended by:
```md ```md
# Changelog # Changelog
[See full changelog](https://gitlab.indiscale.com/caosdb/src/caosdb-server/-/blob/${TAG}/CHANGELOG.md) [See full changelog](https://gitlab.com/linkahead/linkahead-server/-/blob/${TAG}/CHANGELOG.md)
``` ```
caosdb-webui @ 6e4db2f9
Subproject commit 421b2dce5199a5a2c96bcc638543c1fd51d48870 Subproject commit 6e4db2f99e1d441bbda9ccca85fae45526018406
# #
# This file is a part of the CaosDB Project. # This file is a part of the CaosDB Project.
# #
# Copyright (C) 2021 Timm Fitsche <t.fitschen@indiscale.com> # Copyright (C) 2021-2023 Timm Fitsche <t.fitschen@indiscale.com>
# Copyright (C) 2021 IndiScale GmbH <info@indiscale.com> # Copyright (C) 2021-2023 IndiScale GmbH <info@indiscale.com>
# #
# This program is free software: you can redistribute it and/or modify # This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as # it under the terms of the GNU Affero General Public License as
...@@ -22,6 +22,7 @@ ...@@ -22,6 +22,7 @@
# DOMAIN_ID, ENTITY_ID, TRANSACTION, JOB, JOB_FAILURE_SEVERITY # DOMAIN_ID, ENTITY_ID, TRANSACTION, JOB, JOB_FAILURE_SEVERITY
# general rules # general rules
0,0,INSERT,GenerateEntityId,ERROR
0,0,INSERT,CheckPropValid,ERROR 0,0,INSERT,CheckPropValid,ERROR
0,0,INSERT,CheckParValid,ERROR 0,0,INSERT,CheckParValid,ERROR
0,0,INSERT,CheckParOblPropPresent,ERROR 0,0,INSERT,CheckParOblPropPresent,ERROR
...@@ -30,8 +31,7 @@ ...@@ -30,8 +31,7 @@
0,0,UPDATE,CheckParValid,ERROR 0,0,UPDATE,CheckParValid,ERROR
0,0,UPDATE,CheckParOblPropPresent,ERROR 0,0,UPDATE,CheckParOblPropPresent,ERROR
0,0,UPDATE,CheckValueParsable,ERROR 0,0,UPDATE,CheckValueParsable,ERROR
0,0,DELETE,CheckReferenceDependencyExistent,ERROR 0,0,DELETE,CheckDependenciesBeforeDeletion,ERROR
0,0,DELETE,CheckChildDependencyExistent,ERROR
# role specific rules # role specific rules
......
...@@ -8,6 +8,10 @@ verbose = true ...@@ -8,6 +8,10 @@ verbose = true
property.LOG_DIR = log property.LOG_DIR = log
property.REQUEST_TIME_LOGGER_LEVEL = OFF property.REQUEST_TIME_LOGGER_LEVEL = OFF
# the root logger has level "WARN" but we want to log the GRPC traffic in INFO level:
logger.grpc_response_log.name = org.caosdb.server.grpc.LoggingInterceptor
logger.grpc_response_log.level = INFO
## appenders ## appenders
# stderr # stderr
appender.stderr.type = Console appender.stderr.type = Console
......
...@@ -66,7 +66,7 @@ MYSQL_USER_NAME=caosdb ...@@ -66,7 +66,7 @@ MYSQL_USER_NAME=caosdb
# Password for the user # Password for the user
MYSQL_USER_PASSWORD=random1234 MYSQL_USER_PASSWORD=random1234
# Schema of mysql procedures and tables which is required by this CaosDB instance # Schema of mysql procedures and tables which is required by this CaosDB instance
MYSQL_SCHEMA_VERSION=v6.0-SNAPSHOT MYSQL_SCHEMA_VERSION=v8.0-SNAPSHOT
# -------------------------------------------------- # --------------------------------------------------
...@@ -97,6 +97,21 @@ GRPC_SERVER_PORT_HTTPS=8443 ...@@ -97,6 +97,21 @@ GRPC_SERVER_PORT_HTTPS=8443
# HTTP port of the grpc end-point # HTTP port of the grpc end-point
GRPC_SERVER_PORT_HTTP= GRPC_SERVER_PORT_HTTP=
# --------------------------------------------------
# Response Log formatting (this cannot be configured by the logging frame work
# and thus has to be configured here).
# --------------------------------------------------
# Logging format of the GRPC API.
# Known keys: user-agent, local-address, remote-address, method.
# 'OFF' turns off the logging.
GRPC_RESPONSE_LOG_FORMAT={method} {local-address} {remote-address} {user-agent}
# Logging format of the REST API.
# Known keys: see column "Variable name" at https://javadocs.restlet.talend.com/2.4/jse/api/index.html?org/restlet/util/Resolver.html
# 'OFF' turns off the logging.
# Leaving this empty means using restlet's default settings.
REST_RESPONSE_LOG_FORMAT=
# -------------------------------------------------- # --------------------------------------------------
# HTTPS options # HTTPS options
# -------------------------------------------------- # --------------------------------------------------
...@@ -222,4 +237,4 @@ ENTITY_VERSIONING_ENABLED=true ...@@ -222,4 +237,4 @@ ENTITY_VERSIONING_ENABLED=true
# Enabling the state machine extension # Enabling the state machine extension
# EXT_STATE_ENTITY=ENABLE # EXT_STATE_ENTITY=ENABLE
\ No newline at end of file
...@@ -46,11 +46,14 @@ class = org.caosdb.server.accessControl.Pam ...@@ -46,11 +46,14 @@ class = org.caosdb.server.accessControl.Pam
# scripts or the misc/pam_authentication/ldap_authentication.sh script here. # scripts or the misc/pam_authentication/ldap_authentication.sh script here.
; pam_script = ./misc/pam_authentication/pam_authentication.sh ; pam_script = ./misc/pam_authentication/pam_authentication.sh
default_status = ACTIVE default_status = ACTIVE
# Only users which fulfill these criteria are accepted. # Only users which fulfill these criteria are accepted. The values are
# user/group name(s) separated by whitespaces
;include.user = [uncomment and put your users here] ;include.user = [uncomment and put your users here]
;include.group = [uncomment and put your groups here] ;include.group = [uncomment and put your groups here]
;exclude.user = [uncomment and put excluded users here] ;exclude.user = [uncomment and put excluded users here]
;exclude.group = [uncomment and put excluded groups here] ;exclude.group = [uncomment and put excluded groups here]
# It is typically necessary to add at least one admin # It is typically necessary to add at least one admin
;user.[uncomment a set a username here].roles = administration ;user.[uncomment and set a username here].roles = administration
# Several roles are separated by commas
;user.[uncomment and set a username here].roles = role1, role2, role with spaces
Author: Timm Fitschen
Email: timm.fitschen@ds.mpg.de
Date: Older than 2016
Some features of CaosDB are available to registered users only. Making any changes to the data stock via HTTP requires authentication by `username` _plus_ `password`. Both are to be sent as HTTP headers, with the password hashed by the SHA-512 algorithm:
| `username:` | `$username` |
|-------------|-------------|
| `password:` | `$SHA512ed_password` |
# Sessions
## Login
### Request Challenge
* `GET http://host:port/login?username=$username`
* `GET http://host:port/login` with `username` header
*no password required to be sent over http*
The request returns an AuthToken with a login challenge as a cookie. The AuthToken is a dictionary of the following form:
{scope=$scope;
mode=LOGIN;
offerer=$offerer;
auth=$auth;
expires=$expires;
date=$date;
hash=$hash;
session=$session;
}
$scope:: A uri pattern string. Example: ` {**/*} `
$mode:: `ONETIME`, `SESSION`, or `LOGIN`
$offerer:: A valid username
$auth:: A valid username
$expires:: A `YYYY-MM-DD HH:mm:ss[.nnnn]` date string
$date:: A `YYYY-MM-DD HH:mm:ss[.nnnn]` date string
$hash:: A string
$session:: A string
The challenge is solved by concatenating the `$hash` string with the SHA-512 hash of the user's `$password` string and calculating the SHA-512 hash of the result. Pseudo code:
$solution = sha512($hash + sha512($password))
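The pseudo code above can be sketched in Python as follows. This is only an illustration under assumptions that the text does not state: strings are encoded as UTF-8 before hashing, and digests are represented as lowercase hexadecimal strings. The function names and the sample values are hypothetical.

```python
import hashlib


def sha512_hex(text: str) -> str:
    """Return the SHA-512 digest of a string as lowercase hex.

    Assumption: UTF-8 encoding and hex representation; the document does
    not specify either.
    """
    return hashlib.sha512(text.encode("utf-8")).hexdigest()


def solve_challenge(challenge_hash: str, password: str) -> str:
    """Compute sha512($hash + sha512($password))."""
    return sha512_hex(challenge_hash + sha512_hex(password))


# Hypothetical values: "abc123" stands in for the $hash field of the AuthToken.
solution = solve_challenge("abc123", "secret")
print(solution)
```

The result would then take the place of `$hash` in the AuthToken cookie, as described in the next section.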
### Send Solution
The old `$hash` string in the cookie has to be replaced by `$solution`, and the cookie is to be sent with the next request:
`PUT http://host:port/login`
The server will return the user's entity in the HTTP body, e.g.
<Response ...>
<User name="$username" ...>
...
</User>
</Response>
and a new AuthToken with `$mode=SESSION`, a new expiration date and so on. This AuthToken cookie is to be sent with every request.
### Logout
Send
`PUT http://host:port/logout`
with a valid AuthToken cookie. No new AuthToken will be returned and no AuthToken with that `$session` will be accepted anymore.
# TEXT
* Description: TEXT stores any text value.
* Range: Any [utf-8](https://en.wikipedia.org/wiki/UTF-8) encodable sequence of characters with a maximum of 65,535 bytes. (Simply put: In most cases, any text with fewer than 65,535 letters and spaces will work. But if you use special characters like `à`, `€` or non-Latin letters, the number of bytes needed to store the text increases, and the effective maximum length is smaller than 65,535. A bad-case scenario would be a text in Chinese: Chinese characters need about three times the space of letters from the Latin alphabet, so only 21,845 Chinese characters can be stored within this datatype. Which is still quite a lot I guess :D)
* Examples:
* `Am Faßberg 17, D-37077 Göttingen, Germany`
* `Experiment went well until the problem with the voltmeter occurred. Don't use the results after that.`
* `someone@email.org`
* `Abstract: bla bla bla ...`
* `Head of Group`
* `http://www.bmp.ds.mpg.de`
*
A. Schlemmer, S. Berg, TK Shajahan, S. Luther, U. Parlitz,
Quantifying Spatiotemporal Complexity of Cardiac Dynamics using Ordinal Patterns,
37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2015, doi: 10.1109/EMBC.2015.7319283
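The byte arithmetic behind the range can be checked with a few lines of Python. This is a client-side sketch only; the helper names are hypothetical and the server's own validation is not shown here.

```python
MAX_TEXT_BYTES = 65535  # upper limit stated for the TEXT data type


def utf8_size(text: str) -> int:
    """Number of bytes needed to store the text in UTF-8."""
    return len(text.encode("utf-8"))


def fits_in_text(text: str) -> bool:
    """True if the UTF-8 encoding of the text stays within the limit."""
    return utf8_size(text) <= MAX_TEXT_BYTES


print(utf8_size("a"))       # 1 byte for a plain Latin letter
print(utf8_size("à"))       # 2 bytes
print(utf8_size("€"))       # 3 bytes
print(utf8_size("中"))      # 3 bytes for a common Chinese character
print(MAX_TEXT_BYTES // 3)  # 21845 three-byte characters fit
```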
----
# BOOLEAN
* Description: BOOLEAN stores boolean `TRUE` or `FALSE`. It is therefore suitable for any variable that represents that something is the case or not.
* Accepted Values: `TRUE` or `FALSE`, case insensitive (i.e. it doesn't matter if you use capitals or small letters).
* Note: You could also use a TEXT datatype to represent booleans (or even INTEGER or DOUBLE). But it makes a lot of sense to use this special datatype as it ensures that only the two possible values, `TRUE` or `FALSE` are inserted into the database. Every other input would be rejected. This helps to keep the database understandable and to avoid mistakes.
----
# INTEGER
* Description: INTEGER stores integer numbers. If you need floating point variables, take a look at DOUBLE.
* Range: `-2147483648` to `2147483647`, `-0` is interpreted and stored as `0`.
* Note: This rather limited range is just provisional. It can be extended with low effort as soon as requested.
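The stated range is that of a signed 32-bit integer. A minimal client-side check could look like this; the constant and function names are hypothetical.

```python
INT_MIN = -2**31      # -2147483648
INT_MAX = 2**31 - 1   #  2147483647


def is_valid_integer(value: int) -> bool:
    """True if the value lies within the range given for INTEGER."""
    return INT_MIN <= value <= INT_MAX


print(is_valid_integer(2147483647))  # True
print(is_valid_integer(2147483648))  # False
print(-0 == 0)                       # True: -0 and 0 are the same integer
```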
----
# DOUBLE
* Description: DOUBLE stores floating point numbers with a double precision as defined by [IEEE 754](https://en.wikipedia.org/wiki/IEEE_floating_point).
* Range:
* From `2.2250738585072014E-308` to `1.7976931348623157E308` (negative and positive) with a precision of 15 decimals.
* Any other decimal number _might work_ but it is not guaranteed.
* `-0`, `0`, `NaN`, `-inf` and `inf`
* Note: The server generates a warning when the precision of the submitted DOUBLE value is too high to be preserved.
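The limits above are those of an IEEE 754 double, which Python's `float` also uses on common platforms. The sketch below prints them and shows one possible way to detect a decimal literal whose precision cannot be preserved. The round-trip check is an assumption made for illustration; how the server actually decides to emit its warning is not described here.

```python
import sys
from decimal import Decimal

print(sys.float_info.min)  # smallest positive normalized double
print(sys.float_info.max)  # largest finite double
print(sys.float_info.dig)  # decimal digits that are always preserved


def survives_round_trip(literal: str) -> bool:
    """True if the decimal literal equals the double it is parsed into.

    Hypothetical helper: compares the exact decimal value of the literal
    with the shortest decimal representation of the parsed float.
    """
    return Decimal(literal) == Decimal(repr(float(literal)))


print(survives_round_trip("0.1"))
print(survives_round_trip("0.12345678901234567890"))
```

With this check, `0.1` is reported as preserved while the 20-digit literal is not.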
----
# DATETIME
The DateTime data type currently exists in three flavors which are dynamically chosen during parsing on the server side. The flavors have different ranges, support of time zones and intended use cases. Only the first two flavors are actually implemented for storage and queries. The third one is implemented for queries exclusively.
## UTCDateTime
* Description: This DATETIME flavor stores values which represent a single point of time according to [UTC](https://en.wikipedia.org/wiki/Coordinated_Universal_Time) with the format specified by [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) (Combined date and time). It does support [UTC Leap Seconds](https://en.wikipedia.org/wiki/Leap_second) and time zones.
* Range: From `-9999-01-01T00:00:00.0UTC` to `9999-12-31T23:59:59.999999999UTC` with nanosecond precision.
* Examples:
* `2016-01-01T13:23:00.0CEST` which means _January 1, 2016, 1:23 PM, Central European Summer Time_.
* `-800-01-01T13:23:00.0` which means _January 1, 800 BC, 1:23 PM, UTC_.
* Note:
* It is allowed to omit the nanosecond part of a UTCDateTime (`2016-01-01T13:23:00CEST`). This indicates a precision of seconds for a UTCDateTime value.
## Date
* Description: This DATETIME flavor stores values which represent a single date, month or year according to the [Gregorian calendar](https://en.wikipedia.org/wiki/Gregorian_Calendar). A month/year is conceived as a single date with the precision of a month/year. This concept is useful if you try to understand the query semantics which are explained [elsewhere](./QueryLanguage#POVDateTime).
* Format: `Y[YYY][-MM[-dd]]` (where square brackets mean that the expression is optional).
* Range: Any valid date according to the Gregorian calendar from `-9999-01-01` to `9999-12-31` (and respective dates with lower precision, e.g. the year `-9999`). There is no year `0`.
* Note: Date is a specialization of [#SemiCompleteDateTime].
## SemiCompleteDateTime
* Description: A generalization of the _Date_ and _UTCDateTime_ flavors. In general, there is no time zone support. Although this flavor is not yet storable in general, it is already implemented for search queries. I.e. you can already search for `FIND ... date>2015-04-03T20:15`.
* Format: `Y[YYY][-MM[-dd[Thh:[mm[:ss[.ns]]]]]]`.
* Special Properties: For every SemiCompleteDateTime _d_ there exists an _Inclusive Lower Bound_ (`d.ILB`) and an _Exclusive Upper Bound_ (`d.EUB`). That means, a SemiCompleteDateTime can be interpreted as an interval of time. E.g. `2015-01` is the half-open interval `[2015-01-01T00:00:00.0, 2015-02-01T00:00:00.0)`. ILB and EUB are both UTCDateTimes. These properties are important for the semantics of the query language, especially the [operators](./QueryLanguage#POVDateTime).
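The interval interpretation can be sketched in Python. This is a simplified illustration, not the server's parser: it only handles year, year-month and year-month-day values, only years from 1 to 9998 (a limit of Python's `datetime`), and ignores the time part. The function name `bounds` is hypothetical.

```python
from datetime import datetime, timedelta


def bounds(value: str) -> tuple[datetime, datetime]:
    """Return (ILB, EUB) for a value of the form Y[YYY][-MM[-dd]]."""
    parts = [int(p) for p in value.split("-")]
    if len(parts) == 1:
        # Precision of a year: the interval spans the whole year.
        (year,) = parts
        return datetime(year, 1, 1), datetime(year + 1, 1, 1)
    if len(parts) == 2:
        # Precision of a month: the interval ends at the start of the next month.
        year, month = parts
        lower = datetime(year, month, 1)
        if month == 12:
            return lower, datetime(year + 1, 1, 1)
        return lower, datetime(year, month + 1, 1)
    if len(parts) == 3:
        # Precision of a day: the interval spans 24 hours.
        lower = datetime(*parts)
        return lower, lower + timedelta(days=1)
    raise ValueError(f"unsupported value: {value!r}")


for example in ("2015", "2015-12", "2015-04-03"):
    ilb, eub = bounds(example)
    print(example, "->", f"[{ilb.isoformat()}, {eub.isoformat()})")
```

A query operator such as `>` can then be understood as a comparison against these bounds rather than against a single instant.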
## Future Flavors
Please file a new feature request as soon as you need them.
* Time:: For a time of the day (without the date). Supports time zones.
* FragmentaryDateTime:: For any fragmentary DateTime. That is an arbitrary combination of year, month, day of week, day of month, day of year, hour of day, minute, seconds (and nanoseconds). This flavor is useful for recurrent events like a bus schedule (_Saturday, 7:30_) or the time of a standing order for money transfer (_third day of the month_).
----
# REFERENCE
* Description: REFERENCE values store the [Valid ID](./Glossary#valid-id) of an existing entity. They are useful to establish links between two entities.
* Accepted Values: Any [Valid ID](./Glossary#valid-id) or [Valid Unique Existing Name](./Glossary#valid-unique-existing-name) or [Valid Unique Temporary ID](./Glossary#valid-unique-temporary-id) or [Valid Unique Prospective Name](./Glossary#valid-unique-prospective-pame).
* Note:
* After being processed successfully by the server, the REFERENCE value is normalized to a [Valid ID](./Glossary#valid-id). I.e. it is guaranteed that a REFERENCE value of a valid property is a positive integer.
## FILE
* Description: A FILE is a special REFERENCE. It only allows entity IDs which belong to a File.
## RecordType as a data type
* Furthermore, any RecordType can be used as a data type. This is a variant of the REFERENCE data type where any entity is a valid value which is a child of the RecordType in question.
* Example:
* Let `Person` be a RecordType, `Bertrand Russel` be a child of `Person`. Then `Bertrand Russel` is a valid value for a property with a `Person` data type.
# LIST
* Description: A LIST is always a list of something which has another data type, e.g. a LIST of TEXT values, a LIST of REFERENCE values, etc. Here we call TEXT resp. REFERENCE the **Element Data Type**. The LIST data type allows you to store an arbitrary empty or non-empty ordered set (with duplicates) of values of the *same* data type into one property. Each value must be a valid value of the Element Data Type.
* Example:
* LIST of INTEGER: ```[0, 2, 4, 5, 8, 2, 3, 6, 7]```
* LIST of Person, while `Person` is a RecordType: ```['Bertrand Russel', 'Mahatma Ghandi', 'Mother Therese']```
Version: 0.1.0r1
Author: Timm Fitschen
Email: timm.fitschen@ds.mpg.de
Date: 2017-12-17
# Introduction
CaosDB is a database management system that stores its data in `Entities`. An `Entity` can be thought of as the equivalent to tables, rows, columns and the tuples that fill the tables of a traditional RDBMS. Entities are not only used to store the data; they also define the structure of the data.
# Formal Definition
An `Entity` may have
* a `domain`
* an `id`
* a `role`
* a `name`
* a `data type`
* a `Set of Values`
* a `Set of Properties`
* a `Set of Parents`
A `domain` contains an `Entity`.
An `id` is an arbitrary string.
A `role` is an arbitrary string. Especially, it may be one of the following strings:
* `RecordType`
* `Record`
* `Relation`
* `Property`
* `File`
* `QueryTemplate`
* `Domain`
* `Unit`
* `Rule`
* `DataType`
* `Remote`
A `name` is an arbitrary string.
A `data type` contains an `Entity`. Note: this is not necessarily a `Data Type`.
## Set of Values
A `Set of Values` is a mapping from an `index` to a finite set of `Values`.
An `index` is an interval of non-negative integers starting with zero.
### Value
A `Value` may have a `data type` and/or a `unit`.
A `data type` is an `Entity`. Note: this is not necessarily a `Data Type`.
A `unit` is an arbitrary string.
## Data Type
A `Data Type` is an `Entity` with role `DataType`.
### Reference Data Type
A `Reference Data Type` is a `Data Type`. It may have a `scope`.
A `scope` contains an `Entity`.
### Collection Data Type
A `Collection Data Type` is a `Data Type`. It may have an ordered set of `elements`.
## Record Type
A `Record Type` is an `Entity` with role `RecordType`.
## Record
A `Record` is an `Entity` with role `Record`.
## Relation
A `Relation` is an `Entity` with role `Relation`.
## Property
A `Property` is an `Entity` with role `Property`. It is also referred to as `Abstract Property`.
## File
A `File` is an `Entity` with role `File`.
A `File` may have
* a `path`
* a `size`
* a `checksum`
A `path` is an arbitrary string.
A `size` is a non-negative integer.
A `checksum` is an ordered pair (`method`,`result`).
A `method` is an arbitrary string.
A `result` is an arbitrary string.
## QueryTemplate
A `QueryTemplate` is an `Entity` with role `QueryTemplate`.
## Domain
A `Domain` is an `Entity` with role `Domain`.
## Unit
A `Unit` is an `Entity` with role `Unit`.
## Rule
A `Rule` is an `Entity` with role `Rule`.
## Remote
A `Remote` is an `Entity` with role `Remote`.
## Set of Parents
A `Set of Parents` is a set of `Parents`.
### Parent
A `Parent` may contain another `Entity`.
A `Parent` may have an `affiliation`.
An `affiliation` may be one of the following strings:
* `subtyping`
* `instantiation`
* `membership`
* `parthood`
* `realization`
## Set of Properties
A `Set of Properties` is a triple (`index`, set of `Implemented Properties`, `Phrases`).
An `index` is a bijective mapping from an interval of non-negative integer numbers starting with zero to the set of `Implemented Properties`.
### Implemented Property
An `Implemented Property` contains another `Entity`.
An `Implemented Property` may have an `importance`.
An `Implemented Property` may have a `maximum cardinality`.
An `Implemented Property` may have a `minimum cardinality`.
An `Implemented Property` may have an `import`.
An `importance` is an arbitrary string. It may be one of the following strings:
* `obligatory`
* `recommended`
* `suggested`
* `fix`
A `maximum cardinality` is a non-negative integer.
A `minimum cardinality` is a non-negative integer.
An `import` is an arbitrary string. It may be one of the following strings:
* `fix`
* `none`
### Phrases
`Phrases` are a mapping from the cartesian product of the `index` with itself to a `predicate`.
A `predicate` is an arbitrary string.
Author: Timm Fitschen
Email: timm.fitschen@ds.mpg.de
Date: 2014-06-17
# Info
There are several ways to utilize the file server component of CaosDB. It is possible to upload a file or a whole folder including subfolders via HTTP and the _drop off box_. It is possible to download a file via HTTP identified by its ID or by its path in the internal file system. Furthermore, it is possible to get the file's metadata via HTTP as XML.
# File upload
## Drop off box
The drop off box is a directory on the CaosDB server's local file system, specified in the `server.conf` file in the server's basepath (something like `~/CaosDB/server/server.conf`). The key in `server.conf` is called `dropoffbox`. Since the drop off box directory is writable for all, users can push their files or complete folders into that folder via `mv` or `cp` (recommended!). The server deletes files older than their maximum lifetime (24 hours by default, specified in `server.conf`). Within their lifetime, however, a user can prompt the server to pick up the file (or folder) from the drop off box in order to transfer it to the internal file system.
Now, the user may send a pick up request as `POST http://host:port/FilesDropOff` with a body like the following:
<Post>
<File pickup="$path_dropoffbox" destination="$path_filesystem" description="$description" generator="$generator"/>
...
</Post>
whereby
* $path_dropoffbox is the actual relative path of the dropped file or folder in the DropOffBox,
* $path_filesystem is the designated relative path of that object in the internal file system,
* $description is a description of the file to be uploaded,
* $generator is the tool or client used for pushing this file.
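A pick up body of this shape can be assembled with Python's standard library. This is a sketch only: the helper name and the sample paths are hypothetical, and sending the request (including authentication) is left out.

```python
import xml.etree.ElementTree as ET


def build_pickup_body(pickup: str, destination: str,
                      description: str, generator: str) -> str:
    """Build the <Post> document for a single drop off box pick up."""
    post = ET.Element("Post")
    ET.SubElement(post, "File", {
        "pickup": pickup,            # relative path inside the DropOffBox
        "destination": destination,  # relative path in the internal file system
        "description": description,
        "generator": generator,      # tool or client pushing the file
    })
    return ET.tostring(post, encoding="unicode")


# Hypothetical example values.
body = build_pickup_body(
    pickup="data/run1.dat",
    destination="experiments/run1.dat",
    description="First run",
    generator="example-client",
)
print(body)
```

Building the document with an XML library rather than string formatting takes care of escaping characters such as `&` or `"` in descriptions and paths.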
After a successful pick up the server will return:
<Response>
<File description="$description" path="$path" id="$id" checksum="$checksum" size="$size" />
...
</Response>
whereby
* $id is the new generated id of that file and
* $path is the path of the submitted file or folder relative to the file system's root.
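The pick-up request body described above can be assembled programmatically. The following is a minimal sketch, not the official client: the function name `build_pickup_body` and the sample values are hypothetical, and only the element and attribute names are taken from the description above. Sending the body to `http://host:port/FilesDropOff` is left out.

```python
import xml.etree.ElementTree as ET


def build_pickup_body(files):
    """Build the <Post> body of a drop off box pick-up request.

    `files` is a list of dicts with the keys pickup, destination,
    description and generator (the attribute names documented above).
    """
    post = ET.Element("Post")
    for entry in files:
        ET.SubElement(post, "File", {
            "pickup": entry["pickup"],            # path relative to the drop off box
            "destination": entry["destination"],  # path in the internal file system
            "description": entry["description"],
            "generator": entry["generator"],
        })
    return ET.tostring(post, encoding="unicode")


# Illustrative values only.
body = build_pickup_body([{
    "pickup": "measurements/run1.dat",
    "destination": "experiments/2014/run1.dat",
    "description": "Raw data of run 1",
    "generator": "example-client",
}])
print(body)
```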
## HTTP upload stream
### Files
File upload via HTTP is implemented in a way consistent with [rfc1867](http://www.ietf.org/rfc/rfc1867.txt). This is a de-facto standard that defines a file upload as a part of an HTML form submission. The concept is not elaborated here. Note, however, that this protocol is not designed for uploads of complete structured folders. Therefore the CaosDB file components have to impose that structure on the upload protocol.
CaosDB's file upload resource does exclusively accept POST requests of MIME media type `multipart/form-data`. The first part of each POST body is expected to be a form-data text field, containing information about the files to be uploaded. It has to meet the following requirements:
* `Content-type: text/plain; charset=UTF-8`
* `Content-disposition: form-data; name="FileRepresentation"`
If the content type of the first part is not `text/plain; charset=UTF-8` the server will return error 418. If the body is not actually encoded in UTF-8 the server's behaviour is not defined. If the field name of the first part is not `FileRepresentation` the server will return error 419.
The body of that first part is to be an xml document of the following form:
<Post>
<File upload="$temporary_identifier" destination="$path_filesystem" description="$description" checksum="$checksum" size="$size"/>
...
</Post>
whereby
* $temporary_identifier is simply an arbitrary name, which will be used to associate this `<File>` tag with an uploaded file in the other form-data parts,
* $path_filesystem is the designated relative path of that object in the internal file system,
* $description is a description of the file to be uploaded,
* $size is the file's size in bytes,
* $checksum is a SHA-512 Hash of the file.
The other parts (which must be at least one) may have any appropriate media type. `application/octet-stream` is a good choice for it is the default for any upload file according to [rfc1867](http://www.ietf.org/rfc/rfc1867.txt). Their field name may be any name meeting the requirements of [rfc1867](http://www.ietf.org/rfc/rfc1867.txt) (most notably they must be unique within this POST). But in order to identify the corresponding xml file representation of each file the `filename` parameter of the content-disposition header has to be set to the proper $temporary_identifier. The Content-disposition type must be `form-data`:
* `Content-disposition: form-data; name="$any_name"; filename="$temporary_identifier"`
Finally, the body of each of these parts has to contain the file encoded in the proper `Content-Transfer-Encoding`.
If a file part has a `filename` parameter which doesn't occur in the xml file representation the server will return error 420. The file will not be stored anywhere. If an xml file representation has no corresponding file to be uploaded (i.e. there is no part with the same `filename`) the server will return error 421. Some other error might occur if the checksum, the size, the destination etc. are somehow corrupted.
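The layout of such an upload can be sketched as follows. This is an illustration under assumptions, not a reference client: the helper names, the part names `file0`, `file1`, ... and the sample data are hypothetical, and the checksum is assumed to be transmitted as a hexadecimal SHA-512 digest (the text above only says "SHA-512 Hash").

```python
import hashlib
import xml.etree.ElementTree as ET


def file_representation(uploads):
    """Build the XML of the first part.

    `uploads` is a list of tuples
    (temporary_identifier, destination, description, content_bytes).
    """
    post = ET.Element("Post")
    for tmp_id, destination, description, content in uploads:
        ET.SubElement(post, "File", {
            "upload": tmp_id,
            "destination": destination,
            "description": description,
            # Assumption: hex digest is the expected checksum format.
            "checksum": hashlib.sha512(content).hexdigest(),
            "size": str(len(content)),
        })
    return ET.tostring(post, encoding="unicode")


def multipart_body(boundary, uploads):
    """Assemble a multipart/form-data body as bytes."""
    delimiter = b"--" + boundary.encode("ascii")
    chunks = [
        delimiter,
        b'Content-Disposition: form-data; name="FileRepresentation"',
        b"Content-Type: text/plain; charset=UTF-8",
        b"",
        file_representation(uploads).encode("utf-8"),
    ]
    for number, (tmp_id, _dest, _descr, content) in enumerate(uploads):
        header = (f'Content-Disposition: form-data; name="file{number}"; '
                  f'filename="{tmp_id}"')
        chunks += [
            delimiter,
            header.encode("utf-8"),
            b"Content-Type: application/octet-stream",
            b"",
            content,
        ]
    chunks += [delimiter + b"--", b""]
    return b"\r\n".join(chunks)


uploads = [("upload1", "texts/hello.txt", "A greeting", b"Hello, world!")]
body = multipart_body("AaB03x", uploads)
```

The `filename` parameter of each file part repeats the `upload` attribute of the corresponding `<File>` tag, which is how the server is described to match the two.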
### Folders
Uploading folders works in a similar way. The first part of the `multipart/form-data` document is to be the representation of the folders:
<Post>
<File upload="$temporary_identifier" destination="$path_filesystem" description="$description" checksum="$checksum" size="$size"/>
...
</Post>
The root folder is represented by a part which has a header of the form:
* `Content-disposition: form-data; name="$any_name"; filename="$temporary_identifier/"`
The slash at the end of the `filename` indicates that this is a folder, not a file. Consequently, the body of this part will be ignored and should be empty.
Any file with the name `$filename` in the root folder is represented by a part which has a header of the form:
* `Content-disposition: form-data; name="$any_name"; filename="$temporary_identifier/$filename"`
Any sub folder with the name `$subfolder` is represented by a part which has a header of the form:
* `Content-disposition: form-data; name="$any_name"; filename="$temporary_identifier/$subfolder/"`
Likewise, a complete directory tree can be transferred by appending the structure to the `filename` header field.
**Example**:
Given the structure
rootfolder/
rootfolder/file1
rootfolder/subfolder/
rootfolder/subfolder/file2
an upload document would have the following form:
... (HTTP Header)
Content-type: multipart/form-data, boundary=AaB03x
--AaB03x
content-disposition: form-data; name="FileRepresentation"
<Post>
<File upload="temp1234" destination="$path_filesystem" description="$description" checksum="$checksum" size="$size"/>
</Post>
--AaB03x
content-disposition: form-data; name="random_name1"; filename="temp1234/"
--AaB03x
content-disposition: form-data; name="random_name2"; filename="temp1234/file1"
Hello, world! This is file1.
--AaB03x
content-disposition: form-data; name="random_name3"; filename="temp1234/subfolder/"
--AaB03x
content-disposition: form-data; name="random_name4"; filename="temp1234/subfolder/file2"
Hello, world! This is file2.
--AaB03x--
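The `filename` values used in the example above follow mechanically from the folder structure. A small sketch of that mapping, with a hypothetical helper name and the folder contents passed in as plain data rather than read from disk:

```python
def folder_filenames(temporary_identifier, entries):
    """Return the `filename` parameter for every part of a folder upload.

    `entries` is a list of (relative_path, is_folder) pairs describing the
    contents of the root folder. Folders get a trailing slash.
    """
    names = [temporary_identifier + "/"]  # the root folder itself
    for relative_path, is_folder in entries:
        name = temporary_identifier + "/" + relative_path.strip("/")
        names.append(name + "/" if is_folder else name)
    return names


names = folder_filenames("temp1234", [
    ("file1", False),
    ("subfolder", True),
    ("subfolder/file2", False),
])
for name in names:
    print(name)
```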
# Introduction
API Version 0.1.0
A Message is a way of communication between the server and a client. The main purpose is to inform the clients about errors which occurred during transactions, issue warnings when entities have a certain state or just explicitly confirm that a transaction was successful. Messages represent information that is not persistent or just the reproducible outcome of a transaction. Messages are not stored aside from logging.
# Message Classes And Their Properties
## Message (generic super class)
A `Message` must be either a `Server Message` or a `Client Message`.
A `Message` must have a `description`. A `description` is a string and a human-readable explanation of the meaning and/or purpose of the message. The description must not have leading or trailing whitespaces. The description should be kept in English. For the time being there is no mechanism to indicate that the description is written in other languages. This could be changed in later versions of this API.
## Server Message
A `Server Message` is a Message issued by the server. It must not be issued by clients.
A `Server Message` may be either a `Standard Server Message` or a `Non-Standard Server Message`.
### Standard Server Message
A `Standard Server Message` is one of a set of predefined messages with a certain meaning. The set of these `Standard Server Messages` is maintained and documented in the Java code of the server. There should be a server resource for these definitions in order to have an always up-to-date documentation of the messages on every server.
A `Standard Server Message` must have an `id`. An `id` is a non-empty string that uniquely identifies a standard server message. An id should consist only of ASCII compliant upper-case Latin alphabetic letters from `A` to `Z` and the underscore character `_`.
An `id` of a `Standard Server Message` must not start with the string `NSSM_`.
A `Standard Server Message` must have a `type`. A `type` is one of these strings: `Info`, `Warning`, `Error`, or `Success`.
#### Error Message
A `Server Message` with type `Error` is also called `Error Message` and sometimes just `Error`. An `Error Message` indicates that a request has *failed*. It informs about the reasons for that failure or the nature of the problems which occurred. The description of each error message should explain the error and indicate if and how the client can remedy the problems with her request.
#### Warning Message
A `Server Message` with type `Warning` is also called `Warning Message` and sometimes just `Warning`. A `Warning Message` indicates that certain *irregularities* occurred during the processing of the request or that the client requested something that is *not recommended but not strictly forbidden*.
#### Info Message
A `Server Message` with type `Info` is also called `Info Message` and sometimes just `Info`. An `Info Message` is a means to inform the client about *arbitrary events* which occurred during the processing of the request and which are *not* to be considered *erroneous* or *non-recommended*. These info messages are primarily intended to make the processing of the request more understandable for the client. Info messages are not meant to be used for debugging.
#### Success Message
A `Server Message` with type `Success` is also called a `Success Message`. A `Success Message` indicates the successful *state change* due to portions of a request or the whole request. A success message must not be issued if the request fails.
### Non-Standard Server Message
A `Non-Standard Server Message` may be issued by any non-standard server plugin or extension. It is a placeholder for extensions to the Message API.
A `Non-Standard Server Message` may have an `id`. An `id` is a non-empty string. It should consist only of ASCII compliant upper-case Latin alphabetic letters from `A` to `Z` and the underscore character `_`. However, the id should not be equal to any id from the set of predefined standard server messages. Furthermore, the id of a non-standard server message should start with the string `NSSM_`.
A `Non-Standard Server Message` may have a `type`. A `type` is a non-empty string. It should consist only of ASCII compliant upper-case or lower-case Latin alphabetic letters from `a` to `z`, from `A` to `Z`, and the underscore character `_`. If the type is equal to one of the above-mentioned types, it must have the same meaning and the same effects on the request as the respective type from above. Especially, a message with type `Error` must not be issued unless the request actually fails. Likewise a `Success` must not be issued unless the request actually caused a *state change* of the server.
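The id rules for standard and non-standard server messages can be checked mechanically. The following sketch treats both the "must" and the "should" rules as reportable problems; the function name and the wording of the returned strings are hypothetical.

```python
import re

# Upper-case ASCII letters and the underscore, as described above.
ID_PATTERN = re.compile(r"[A-Z_]+")


def check_message_id(message_id, standard):
    """Return a list of rule violations for a server message id."""
    problems = []
    if not message_id:
        problems.append("id must be a non-empty string")
    elif not ID_PATTERN.fullmatch(message_id):
        problems.append("id should consist only of A-Z and _")
    if standard and message_id.startswith("NSSM_"):
        problems.append("standard id must not start with NSSM_")
    if not standard and message_id and not message_id.startswith("NSSM_"):
        problems.append("non-standard id should start with NSSM_")
    return problems


print(check_message_id("ENTITY_DOES_NOT_EXIST", standard=True))
print(check_message_id("NSSM_MY_ID", standard=False))
print(check_message_id("my-id", standard=False))
```

The sketch does not check a non-standard id against the set of predefined standard ids, since that set lives in the server code.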
## Client Message
A `Client Message` may have an `ignore` flag. The `ignore` flag can have one of these values: `no`, `yes`, `warn`, `silent`
A `Client Message` is a message issued by a client. It should not be issued by the server. A `Client Message` may be completely ignored by clients. A client message must not be ignored by the server. A `Client Message` which cannot be understood by the server must result in an error, unless the `ignore` flag states otherwise.
### Ignore Flag
If the `ignore` flag is set to `no` the server must not ignore the client message. If the server cannot understand the client message an error must be issued. This will cause the transaction to fail.
If the `ignore` flag is set to `yes` the server must ignore the client message.
If the `ignore` flag is set to `warn` the server should not ignore the message. If the server cannot understand the client message, a warning must be issued. The transaction will not fail due to this warning.
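The effect of the `ignore` flag on a client message that the server cannot understand can be summarized in a small decision function. This is only a sketch with hypothetical names and return values; the value `silent` is deliberately left unimplemented because its semantics are not described in this section.

```python
def outcome_for_unknown_message(ignore="no"):
    """Return what happens to a client message the server cannot understand.

    The default "no" mirrors the rule that such a message results in an
    error unless the ignore flag states otherwise.
    """
    if ignore == "no":
        return "error"    # the transaction fails
    if ignore == "warn":
        return "warning"  # a warning is issued, the transaction continues
    if ignore == "yes":
        return "ignored"  # the server ignores the message
    if ignore == "silent":
        # Not specified in this section, so no behaviour is assumed here.
        raise NotImplementedError("semantics of 'silent' are not documented here")
    raise ValueError(f"unknown ignore flag: {ignore!r}")


for flag in ("no", "warn", "yes"):
    print(flag, "->", outcome_for_unknown_message(flag))
```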
## Message Parameters
A `Message` may have zero or more parameters. A `Message Parameter` is a triple of a `key`, a `value`, and an optional `type`. It is intended to facilitate the processing and representation of messages by clients and the server. For example, consider an `Error Message` which states that a certain server state cannot be reached because there is an entity with certain features. Then it is useful to refer to the entity via the parameters. A client can then resolve the entity and just show it or generate a URI for this entity.
A `key` is a non-empty string which should consist only of ASCII compliant lower-case Latin alphabetic letters from `a` to `z` and the minus character `-`. A `key` must be unique among the keys of the message parameters.
A `value` is a possibly empty, arbitrary string which must not have leading or trailing white spaces.
A `Message Parameter` may have a `type`. The `type` of a `Message Parameter` is also called a `Message Parameter Type`. A `Message Parameter Type` is a non-empty string which should consist only of ASCII compliant lower-case Latin alphabetic letters from `a` to `z` and the minus character `-`. A message parameter type may be one of these strings: `entity-id`, `entity-name`, `entity-cuid`, `property-index`, `parent-id`, `parent-name`.
A `Message Parameter` with a type which begins with `entity-` is also called an `Entity Message Parameter`. The value of an `Entity Message Parameter` must refer to an entity—via its id, name, or cuid, respectively.
A `Message Parameter` with a type which begins with `property-` is also called a `Property Message Parameter`. The value of such a parameter must refer to an entity's property. In the case of the `property-index` type the value refers to a property via a zero-based index (among the list of properties of that entity). The list of properties in question must belong to the `Message Bearer` which must in turn be an `Entity`.
A `Message Parameter` with a type which begins with `parent-` is also called a `Parent Message Parameter`. The value of such a parameter must refer to an entity's parent via its id or name, respectively.
## Message Bearer
A `Message` must have a single `Message Bearer`, or, equivalently, a `Message` `belongs to` a single `Message Bearer`. The message is usually considered to carry information about the message bearer if not stated otherwise. The message's subject should be the message bearer itself, so to speak. Although, possibly indicated by a `Message Parameter` the message may be additionally or solely concerned with other things than the message bearer. Please note: The message bearer may also indicate the context of evaluation of the message parameters, e.g. when the type of the message parameter is `property-index`.
A `Message Bearer` may be an `Entity`, a `Property`, a `Container`, a `Request`, a `Response`, or a `Transaction`.
# Representation and Serialization
Messages can be serialized and deserialized by means of XML.
## XML Representation
A `Message` is serialized into a single XML Element Node (hereafter the *root element*) with zero or more Child Nodes.
#### Root Element Tag
The root element's tag of a `Server Message` must be equal to its `type` if and only if the type is equal to one of the allowed types of a `Standard Server Message` (even if it is a Non-Standard Server Message). Otherwise the root tag is just 'ServerMessage'.
```xml
<Error/><!--an Error Message-->
<Warning/><!--a Warning Message-->
<Info/><!--an Info Message-->
<Success/><!--a Success Message-->
<ServerMessage/> <!--a Non-Standard Server Message with a non-standard type-->
```
The root element's tag of a `Client Message` must be 'ClientMessage'. E.g.
```xml
<ClientMessage/><!--a Client Message-->
```
#### Root Element Attributes
The root element must have the attribute nodes `id` and/or `ignore` if and only if the message has the corresponding properties. The root element must have a 'type' attribute only if the message has a type property and if the type is not equal to the root element's tag. The values of the attributes must equal the corresponding properties. E.g.
```xml
<Error id="ENTITY_DOES_NOT_EXIST" type="Error"/><!--this and the next element are equivalent-->
<Error id="ENTITY_DOES_NOT_EXIST"/>
<ServerMessage type="CustomType"/><!--has no id-->
<ServerMessage id="NSSM_MY_ID"/><!--has no type-->
```
or
```xml
<ClientMessage id="CM_MY_ID" ignore="warn"/>
```
All other Attributes should be ignored.
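The rules for the root element's tag and attributes of a `Server Message` can be sketched in a few lines. The function name is hypothetical, and the sketch covers only the tag, `id`, and `type`, not the child elements.

```python
import xml.etree.ElementTree as ET

STANDARD_TYPES = ("Info", "Warning", "Error", "Success")


def serialize_server_message(message_type=None, message_id=None):
    """Serialize the root element of a server message."""
    # The tag equals the type only for the four standard types.
    tag = message_type if message_type in STANDARD_TYPES else "ServerMessage"
    root = ET.Element(tag)
    if message_id is not None:
        root.set("id", message_id)
    # The type attribute is only needed when the tag does not carry it.
    if message_type is not None and message_type != tag:
        root.set("type", message_type)
    return ET.tostring(root, encoding="unicode")


print(serialize_server_message("Error", "ENTITY_DOES_NOT_EXIST"))
print(serialize_server_message("CustomType"))
print(serialize_server_message(message_id="NSSM_MY_ID"))
```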
#### Description Element
The root element must have exactly one Child Element Node with tag 'Description' if and only if the message has a `description` property. The string value of the message's description must be the first Child Text Node of the 'Description' Element. E.g.
```xml
<ServerMessage>
<Description>This is a description.</Description>
</ServerMessage>
```
Please note: Any leading or trailing whitespaces of the Text Node must be stripped during the deserialization.
All other Attributes and Child Nodes should be ignored.
#### Parameters Element
The root element must have exactly one Child Element Node with tag 'Parameters' if the message has at least one `parameter`. The 'Parameters' Element in turn must have a single Child Element Node for each parameter which are called `Parameter Elements`.
A `Parameter Element` must have a tag equal to the `key` of the parameter.
It must have a `type` attribute equal to the `type` property of the parameter if and only if the parameter has a type. And it must have a first Child Text Node which is equal to the parameter's `value`. E.g.
```xml
<ClientMessage>
<Parameters>
<param-one type="entity-name">Experiment</param-one><!--One parameter with key="param-one", value="Experiment", and type="entity-name"-->
</Parameters>
</ClientMessage>
```
Please note: Any leading or trailing whitespaces of the Text Node must be stripped during the deserialization.
All other Attributes and Child Nodes below the 'Parameters' Element should be ignored.
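Deserialization according to the rules above might look like the following sketch. The function name, the dictionary layout of the result, and the description text in the sample are hypothetical; unknown attributes and child nodes are simply not read, which has the effect of ignoring them.

```python
import xml.etree.ElementTree as ET

STANDARD_TYPES = ("Info", "Warning", "Error", "Success")


def parse_message(xml_text):
    """Deserialize a single message element into a plain dict."""
    root = ET.fromstring(xml_text)
    message = {
        "tag": root.tag,
        "id": root.get("id"),
        "type": root.get("type"),
        "ignore": root.get("ignore"),
        "description": None,
        "parameters": {},
    }
    # For the standard types, the root tag itself carries the type.
    if message["type"] is None and root.tag in STANDARD_TYPES:
        message["type"] = root.tag
    description = root.find("Description")
    if description is not None:
        # Leading and trailing whitespace is stripped on deserialization.
        message["description"] = (description.text or "").strip()
    parameters = root.find("Parameters")
    if parameters is not None:
        for element in parameters:
            message["parameters"][element.tag] = {
                "value": (element.text or "").strip(),
                "type": element.get("type"),
            }
    return message


sample = """
<Error id="ENTITY_DOES_NOT_EXIST">
  <Description>  Entity does not exist.  </Description>
  <Parameters>
    <param-one type="entity-name">Experiment</param-one>
  </Parameters>
</Error>
"""
parsed = parse_message(sample)
print(parsed)
```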
The Paging flag splits the retrieval of a (possibly huge) number of entities into pages.
# Syntax
flag = name, [":", value];
name = "P";
value = [ index ], ["L", length];
index = ? any non-negative integer ?;
length = ? any positive integer ?;
# Semantics
The `index` (starting with zero) denotes the index of the first entity to be retrieved. The `length` is the number of entities on that page. If `length` is omitted, the default number of entities is returned (as configured by a server constant called ...). If only the `name` is given, the paging behaves as if the `index` were zero.
# Examples
`https://caosdb/Entities/all?flags=P:24L50` returns 50 entities starting with the 25th entity which would be retrieved without paging.
`https://caosdb/Entities/all?flags=P:24` returns the default number of entities starting with the 25th entity which would be retrieved without paging.
`https://caosdb/Entities/all?flags=P:L50` returns 50 entities starting with the first entity which would be retrieved without paging.
`https://caosdb/Entities/all?flags=P` returns the default number of entities starting with the first entity which would be retrieved without paging.
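The syntax and semantics above can be captured by a short parser. This is a sketch with a hypothetical function name; since the name of the server constant for the default length is not given here, the default is passed in as a parameter and left as `None` when unknown.

```python
import re

# name, optionally followed by ":" and a value of the form [index]["L" length]
PAGING = re.compile(r"P(?::(?P<index>\d+)?(?:L(?P<length>\d+))?)?")


def parse_paging_flag(flag, default_length=None):
    """Return (index, length) for a paging flag such as 'P:24L50'."""
    match = PAGING.fullmatch(flag)
    if match is None:
        raise ValueError(f"not a paging flag: {flag!r}")
    index = int(match.group("index") or 0)  # omitted index means zero
    length = match.group("length")
    return index, int(length) if length is not None else default_length


for flag in ("P:24L50", "P:24", "P:L50", "P"):
    print(flag, "->", parse_paging_flag(flag))
```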
# Example queries
## Simple FIND Query
The following query will return any entity which has the name _ename_ and all its children.
`FIND ename`
The following queries are equivalent and will return any entity which has the name _ename_ and all its children, but only if they are genuine records. Of course, the returned set of entities (henceforth referred to as _resultset_) can also be restricted to recordtypes, properties and files.
`FIND RECORD ename`
`FIND RECORDS ename`
Wildcards use `*` for any characters or none at all. Wildcards for single characters (like the `_` wildcard from MySQL) are not implemented yet.
`FIND RECORD en*` returns any entity which has a name beginning with _en_.
Regular expressions must be surrounded by `<<` and `>>`:
`FIND RECORD <<e[aemn]{2,5}>>`
`FIND RECORD <<[cC]am_[0-9]*>>`
*TODO* (Timm):
Describe escape sequences like `\\`, `\*`, `\<<` and `\>>`.
Currently, wildcards and regular expressions are only available for the _simple-find-part_ of the query, i. e. no wildcards/regexps for filters.
## Simple COUNT Query
This query counts entities which have certain properties.
`COUNT ename`
will return the number of entities which have the name _ename_ and all their children.
The syntax of the COUNT queries is equivalent to the FIND queries in any respect (this also applies to wildcards and regular expressions) but one: The prefix is to be `COUNT` instead of `FIND`.
Unlike the FIND queries, the COUNT queries do not return any entities. The result of the query is the number of entities which _would be_ returned if the query was a FIND query.
## Filters
### POV - Property-Operator-Value
The following queries are equivalent and will restrict the result set to entities which have a property named _pname1_ that has a value _val1_.
`FIND ename.pname1=val1`
`FIND ename WITH pname1=val1`
`FIND ename WHICH HAS A PROPERTY pname1=val1`
`FIND ename WHICH HAS A pname1=val1`
Again, the resultset can be restricted to records:
`FIND RECORD ename WHICH HAS A pname1=val1`
_currently known operators:_ `=, !=, <=, <, >=, >` (and cf. next paragraphs!)
#### Special Operator: LIKE
The _LIKE_ can be used with wildcards. The `*` is a wildcard for any (possibly empty) sequence of characters. Examples:
`FIND RECORD ename WHICH HAS A pname1 LIKE va*`
`FIND RECORD ename WHICH HAS A pname1 LIKE va*1`
`FIND RECORD ename WHICH HAS A pname1 LIKE *al1`
_Note:_ The _LIKE_ operator will only produce predictable results with text properties.
#### Special Case: References
In general a reference can be addressed just like a POV filter. So
`FIND ename1.pname1=ename2`
will also return any entity named _ename1_ which references the entity with name or id _ename2_ via a reference property named _pname1_. However, it will also return any entity with a text property of that name with the string value _ename2_. In order to restrict the result set to reference properties one may make use of special reference operators:
_reference operators:_ `->, REFERENCES, REFERENCE TO`
The query looks like this:
`FIND ename1 WHICH HAS A pname1 REFERENCE TO ename2`
`FIND ename1 WHICH HAS A pname1->ename2`
#### Special Case: DateTime
_DateTime operators:_ `=, !=, <, >, IN, NOT IN`
##### `d1=d2`: Equivalence relation.
* _True_ iff d1 and d2 are equal in every respect (same DateTime flavor, same fields are defined/undefined and all defined fields are equal respectively).
* _False_ iff they have the same DateTime flavor but have different fields defined or fields with differing values.
* _Undefined_ otherwise.
Examples:
* `2015-04-03=2015-04-03T00:00:00` is undefined.
* `2015-04-03T00:00:00=2015-04-03T00:00:00.0` is undefined (second precision vs. nanosecond precision).
* `2015-04-03T00:00:00.0=2015-04-03T00:00:00.0` is true.
* `2015-04-03T00:00:00=2015-04-03T00:00:00` is true.
* `2015-04=2015-05` is false.
* `2015-04=2015-04` is true.
##### `d1!=d2`: Intransitive, symmetric relation.
* _True_ iff `d1=d2` is false.
* _False_ iff `d1=d2` is true.
* _Undefined_ otherwise.
Examples:
* `2015-04-03!=2015-04-03T00:00:00` is undefined.
* `2015-04-03T00:00:00!=2015-04-03T00:00:00.0` is undefined.
* `2015-04-03T00:00:00.0!=2015-04-03T00:00:00.0` is false.
* `2015-04-03T00:00:00!=2015-04-03T00:00:00` is false.
* `2015-04!=2015-05` is true.
* `2015-04!=2015-04` is false.
##### `d1>d2`: Transitive, non-symmetric relation.
Semantics depend on the flavors of d1 and d2. If both are...
###### [UTCDateTime](Datatype#datetime)
* _True_ iff the time of d1 is after the time of d2 according to [UTC](https://en.wikipedia.org/wiki/Coordinated_Universal_Time)
* _False_ otherwise.
###### [SemiCompleteDateTime](Datatype#datetime)
* _True_ iff `d1.ILB>d2.EUB` is true or `d1.ILB=d2.EUB` is true.
* _False_ iff `d1.EUB<d2.ILB` is true or `d1.EUB=d2.ILB` is true.
* _Undefined_ otherwise.
Examples:
* `2015>2014` is true.
* `2015-04>2014` is true.
* `2015-01-01T20:15:00>2015-01-01T20:14` is true.
* `2015-04>2015` is undefined.
* `2015>2015-04` is undefined.
* `2015-01-01T20:15>2015-01-01T20:15:15` is undefined.
* `2014>2015` is false.
* `2014-04>2015` is false.
* `2014-01-01>2015-01-01T20:15:30` is false.
##### `d1<d2`: Transitive, non-symmetric relation.
Semantics depend on the flavors of d1 and d2. If both are...
###### [UTCDateTime](Datatype#datetime)
* _True_ iff the time of d1 is before the time of d2 according to [UTC](https://en.wikipedia.org/wiki/Coordinated_Universal_Time)
* _False_ otherwise.
###### [SemiCompleteDateTime](Datatype#datetime)
* _True_ iff `d1.EUB<d2.ILB` is true or `d1.EUB=d2.ILB` is true.
* _False_ iff `d1.ILB>d2.EUB` is true or `d1.ILB=d2.EUB` is true.
* _Undefined_ otherwise.
Examples:
* `2014<2015` is true.
* `2014-04<2015` is true.
* `2014-01-01<2015-01-01T20:15:30` is true.
* `2015-04<2015` is undefined.
* `2015<2015-04` is undefined.
* `2015-01-01T20:15<2015-01-01T20:15:15` is undefined.
* `2015<2014` is false.
* `2015-04<2014` is false.
* `2015-01-01T20:15:00<2015-01-01T20:14` is false.
##### `d1 IN d2`: Transitive, non-symmetric relation.
Semantics depend on the flavors of d1 and d2. If both are...
###### [SemiCompleteDateTime](Datatype#datetime)
* _True_ iff (`d1.ILB>d2.ILB` is true or `d1.ILB=d2.ILB` is true) and (`d1.EUB<d2.EUB` is true or `d1.EUB=d2.EUB` is true).
* _False_ otherwise.
Examples:
* `2015-01-01 IN 2015` is true.
* `2015-01-01T20:15:30 IN 2015-01-01` is true.
* `2015-01-01T20:15:30 IN 2015-01-01T20:15:30` is true.
* `2015 IN 2015-01-01` is false.
* `2015-01-01 IN 2015-01-01T20:15:30` is false.
##### `d1 NOT IN d2`: Transitive, non-symmetric relation.
Semantics depend on the flavors of d1 and d2. If both are...
###### [SemiCompleteDateTime](Datatype#datetime)
* _True_ iff `d1 IN d2` is false.
* _False_ otherwise.
Examples:
* `2015 NOT IN 2015-01-01` is true.
* `2015-01-01 NOT IN 2015-01-01T20:15:30` is true.
* `2015-01-01 NOT IN 2015` is false.
* `2015-01-01T20:15:30 NOT IN 2015-01-01` is false.
* `2015-01-01T20:15:30 NOT IN 2015-01-01T20:15:30` is false.
##### Note
These semantics follow a three-valued logic with _true_, _false_ and _undefined_ as truth values. Only _true_ is truth preserving. I.e. only those expressions which evaluate to _true_ pass the POV filter. `FIND ... WHICH HAS A somedate=2015-01` only returns entities for which `somedate=2015-01` is true. On the other hand, `FIND ... WHICH DOESN'T HAVE A somedate=2015-01` returns entities for which `somedate=2015-01` is false or undefined. In short, `NOT d1=d2` is not equivalent to `d1!=d2`. The latter assertion is stronger.
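The three-valued comparison of SemiCompleteDateTime values can be illustrated with a small model. It is a sketch under assumptions, not the server's implementation: `ILB` and `EUB` are read here as "inclusive lower bound" and "exclusive upper bound" of the time span a partial date covers, only the precisions year, month, day, minute and second are handled, and Python's `None` stands for _undefined_. The function names are hypothetical.

```python
from datetime import datetime, timedelta

# (format, unit) pairs for the precisions handled by this sketch.
_FORMATS = [
    ("%Y", "year"),
    ("%Y-%m", "month"),
    ("%Y-%m-%d", "day"),
    ("%Y-%m-%dT%H:%M", "minute"),
    ("%Y-%m-%dT%H:%M:%S", "second"),
]
_STEPS = {
    "day": timedelta(days=1),
    "minute": timedelta(minutes=1),
    "second": timedelta(seconds=1),
}


def bounds(text):
    """Return (ILB, EUB) of the time span covered by a partial date string."""
    for fmt, unit in _FORMATS:
        try:
            lower = datetime.strptime(text, fmt)
        except ValueError:
            continue  # try the next precision
        if unit == "year":
            upper = lower.replace(year=lower.year + 1)
        elif unit == "month":
            upper = lower.replace(year=lower.year + lower.month // 12,
                                  month=lower.month % 12 + 1)
        else:
            upper = lower + _STEPS[unit]
        return lower, upper
    raise ValueError(f"unsupported date string: {text!r}")


def greater(d1, d2):
    """d1>d2: True, False, or None (undefined)."""
    ilb1, eub1 = bounds(d1)
    ilb2, eub2 = bounds(d2)
    if ilb1 >= eub2:
        return True
    if eub1 <= ilb2:
        return False
    return None  # the two spans overlap


def contained(d1, d2):
    """d1 IN d2: True or False."""
    ilb1, eub1 = bounds(d1)
    ilb2, eub2 = bounds(d2)
    return ilb1 >= ilb2 and eub1 <= eub2


print(greater("2015", "2014"), greater("2015-04", "2015"), greater("2014-04", "2015"))
print(contained("2015-01-01", "2015"), contained("2015", "2015-01-01"))
```

A POV filter would then keep only those entities for which the comparison yields `True`, while a negated filter keeps those with `False` or `None`.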
#### Omitting the Property or the Value
One doesn't have to specify the property or the value at all. The following query filters the result set for entities which have any property with a value greater than _val1_.
`FIND ename WHICH HAS A PROPERTY > val1`
`FIND ename . > val1`
`FIND ename.>val1`
And for references...
`FIND ename1 WHICH HAS A REFERENCE TO ename2`
`FIND ename1 WHICH REFERENCES ename2`
`FIND ename1 . -> ename2`
`FIND ename1.->ename2`
The following query returns entities which have a _pname1_ property with any value.
`FIND ename WHICH HAS A PROPERTY pname1`
`FIND ename WHICH HAS A pname1`
`FIND ename WITH pname1`
`FIND ename WITH A pname1`
`FIND ename WITH A PROPERTY pname1`
`FIND ename WITH PROPERTY pname1`
`FIND ename . pname1`
`FIND ename.pname1`
### TransactionFilter
*Definition*
sugar:: `HAS BEEN` | `HAVE BEEN` | `HAD BEEN` | `WAS` | `IS`
negated_sugar:: `HAS NOT BEEN` | `HASN'T BEEN` | `WAS NOT` | `WASN'T` | `IS NOT` | `ISN'T` | `HAVN'T BEEN` | `HAVE NOT BEEN` | `HADN'T BEEN` | `HAD NOT BEEN`
by_clause:: `BY (ME | username | SOMEONE ELSE (BUT ME)? | SOMEONE ELSE BUT username)`
datetime:: A datetime string of the form `YYYY[-MM[-DD(T| )[hh[:mm[:ss[.nnn][(+|-)zzzz]]]]]]`
time_clause:: `[AT|ON|IN|BEFORE|AFTER|UNTIL|SINCE] (datetime) `
`FIND ename WHICH (sugar|negated_sugar)? (NOT)? (CREATED|INSERTED|UPDATED) (by_clause time_clause?| time_clause by_clause?)`
*Examples*
`FIND ename WHICH HAS BEEN CREATED BY ME ON 2014-12-24`
`FIND ename WHICH HAS BEEN CREATED BY SOMEONE ELSE ON 2014-12-24`
`FIND ename WHICH HAS BEEN CREATED BY erwin ON 2014-12-24`
`FIND ename WHICH HAS BEEN CREATED BY SOMEONE ELSE BUT erwin ON 2014-12-24`
`FIND ename WHICH HAS BEEN CREATED BY erwin`
`FIND ename WHICH HAS BEEN INSERTED SINCE 2021-04`
Note that `SINCE` and `UNTIL` are inclusive, while `BEFORE` and `AFTER` are not.
### File Location
Search for file objects by their location:
`FIND FILE WHICH IS STORED AT a/certain/path/`
#### Wildcards
_STORED AT_ can be used with wildcards similar to unix wildcards.
* `*` matches any characters or none at all, but not the directory separator `/`
* `**` matches any character or none at all.
* A leading `*` is a shortcut for `/**`
* A star directly between two other stars is ignored: `***` is the same as `**`.
* Escape character: `\` (E.g. `\\` is a literal backslash. `\*` is a literal star. But `\\*` is a literal backslash followed by a wildcard.)
Examples:
Find any files ending with `.acq`:
`FIND FILE WHICH IS STORED AT *.acq` or
`FIND FILE WHICH IS STORED AT **.acq` or
`FIND FILE WHICH IS STORED AT /**.acq`
Find files stored one directory below `/data/`, ending with `.acq`:
`FIND FILE WHICH IS STORED AT /data/*/*.acq`
Find files stored in `/data/`, ending with `.acq`:
`FIND FILE WHICH IS STORED AT /data/*.acq`
Find files stored in a directory at any depth in the tree below `/data/`, ending with `.acq`:
`FIND FILE WHICH IS STORED AT /data/**.acq`
Find any file in a directory which begins with `2016-02`:
`FIND FILE WHICH IS STORED AT */2016-02*/*`
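One way to reason about these wildcard rules is to translate a _STORED AT_ pattern into a regular expression. The following sketch does that in Python; it is an illustration of the rules listed above, not the server's matching code. The function names are hypothetical, and it assumes that stored paths are absolute (start with `/`) and that the whole path has to match.

```python
import re


def stored_at_to_regex(pattern):
    """Translate a STORED AT pattern into a compiled regular expression."""
    out = []
    i, n = 0, len(pattern)
    while i < n:
        ch = pattern[i]
        if ch == "\\" and i + 1 < n:
            # Escape character: the next character is taken literally.
            out.append(re.escape(pattern[i + 1]))
            i += 2
        elif ch == "*":
            j = i
            while j < n and pattern[j] == "*":
                j += 1
            if j - i >= 2:
                out.append(".*")      # `**`, `***`, ...: anything, including `/`
            elif i == 0:
                out.append("/.*")     # a leading `*` is a shortcut for `/**`
            else:
                out.append("[^/]*")   # `*`: anything except `/`
            i = j
        else:
            out.append(re.escape(ch))
            i += 1
    return re.compile("".join(out))


def is_stored_at(path, pattern):
    """Return True if the whole path matches the pattern."""
    return stored_at_to_regex(pattern).fullmatch(path) is not None


print(is_stored_at("/data/sub/file.acq", "*.acq"))
print(is_stored_at("/data/a/file.acq", "/data/*.acq"))
print(is_stored_at("/data/a/b/file.acq", "/data/**.acq"))
```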
### Back References
The back reference filters for entities that are referenced by another entity. The following query returns entities of the type _ename1_ which are referenced by _ename2_ entities via the reference property _pname1_.
* `FIND ename1 WHICH IS REFERENCED BY ename2 AS A pname1`
* `FIND ename1 WITH @ ename2 / pname1`
* `FIND ename1 . @ ename2 / pname1`
One may omit the property specification:
* `FIND ename1 WHICH IS REFERENCED BY ename2`
* `FIND ename1 WHICH HAS A PROPERTY @ ename2`
* `FIND ename1 WITH @ ename2`
* `FIND ename1 . @ ename2`
### Combining Filters with Propositional Logic
Any result set can be filtered by logically combining POV filters or back reference filters:
#### Conjunction (AND)
* `FIND ename1 WHICH HAS A PROPERTY pname1=val1 AND A PROPERTY pname2=val2 AND A PROPERTY...`
* `FIND ename1 WHICH HAS A PROPERTY pname1=val1 AND A pname2=val2 AND ...`
* `FIND ename1 . pname1=val1 & pname2=val2 & ...`
#### Disjunction (OR)
* `FIND ename1 WHICH HAS A PROPERTY pname1=val1 OR A PROPERTY pname2=val2 OR A PROPERTY...`
* `FIND ename1 WHICH HAS A PROPERTY pname1=val1 OR A pname2=val2 OR ...`
* `FIND ename1 . pname1=val1 | pname2=val2 | ...`
#### Negation (NOT)
* `FIND ename1 WHICH DOES NOT HAVE A PROPERTY pname1=val1`
* `FIND ename1 WHICH DOESN'T HAVE A pname1=val1`
* `FIND ename1 . NOT pname2=val2`
* `FIND ename1 . !pname2=val2`
#### ... and combinations with parentheses
* `FIND ename1 WHICH HAS A pname1=val1 AND DOESN'T HAVE A pname2<val2 AND ((WHICH HAS A pname3=val3 AND A pname4=val4) OR DOES NOT HAVE A (pname5=val5 AND pname6=val6))`
* `FIND ename1 . pname1=val1 & !pname2<val2 & ((pname3=val3 & pname4=val4) | !(pname5=val5 & pname6=val6))`
* `FIND ename1.pname1=val1&!pname2<val2&((pname3=val3&pname4=val4)|!(pname5=val5&pname6=val6))`
### A Few Important Expressions
* A:: The indefinite article. This is only syntactic sugar. Equivalent expressions: `A, AN`
* AND:: The logical _and_. Equivalent expressions: `AND, &`
* FIND:: The beginning of the query.
* NOT:: The logical negation. Equivalent expressions: `NOT, DOESN'T HAVE A PROPERTY, DOES NOT HAVE A PROPERTY, DOESN'T HAVE A, DOES NOT HAVE A, DOES NOT, DOESN'T, IS NOT, ISN'T, !`
* OR:: The logical _or_. Equivalent expressions: `OR, |`
* RECORD,RECORDTYPE,FILE,PROPERTY:: Role expression for restricting the result set to a specific role.
* WHICH:: The marker for the beginning of the filters. Equivalent expressions: `WHICH, WHICH HAS A, WHICH HAS A PROPERTY, WHERE, WITH (A), .`
* REFERENCE:: This one is tricky: `REFERENCE TO` expresses the state of _having_ a reference property. `REFERENCED BY` expresses the state of _being_ referenced by another entity.
* COUNT:: `COUNT` works like `FIND` but doesn't return the entities.
# Future
* *Sub Queries* (or *Sub Properties*): `FIND ename WHICH HAS A pname WHICH HAS A subpname=val`. This is like: `FIND AN experiment WHICH HAS A camera WHICH HAS A 'serial number'= 1234567890`
* *More Logic*, especially `ANY`, `ALL`, `NONE`, and `SUCH THAT` key words (and equivalents) for logical quantification: `FIND ename1 SUCH THAT ALL ename2 WHICH HAVE A REFERENCE TO ename1 HAVE A pname=val`. This is like `FIND experiment SUCH THAT ALL person WHICH ARE REFERENCED BY THIS experiment AS conductor HAVE AN 'academic title'=professor.`
Author: Timm Fitschen
Email: timm.fitschen@ds.mpg.de
Date: 2013-02-23
# No Proposal
http://caosdb/register
# Proposal
## Add User
* A POST request is to be sent to `http://host:port/User`.
* This requires authentication as user _admin_ (default password: _adminpw_).
* Http body:
<Post>
<User name="${username}" password="${md5ed_password}" />
</Post>
## Delete User
* DELETE Request
* admin authentication required.
* Http body:
<Delete>
<User name="${username}"/>
</Delete>
The user to be deleted may also be identified by his id (`id="${id}"`) instead of his name.