Commit 7b4dce40 authored by Henrik tom Wörden

DOC: add information on sql profiling

parent f138ba9a
1 merge request: !21 Release v0.4.0
Pipeline #7913 canceled
......@@ -14,7 +14,7 @@ representative state, Python scripts are used. The scripts can be found in the
repository, located at [https://gitlab.indiscale.com/caosdb/src/caosdb-dev-tools](https://gitlab.indiscale.com/caosdb/src/caosdb-dev-tools) in the folder
`benchmarking`:
### `fill_database.py` ###
### Python Script `fill_database.py` ###
This command-line script fills the database with enough data to represent an actual
real-life case; it can easily create hundreds of thousands of Entities.
......@@ -26,14 +26,14 @@ of the Entities into CaosDB is done in chunks of a defined size.
Users can tell the script to store times needed for the insertion of each chunk into a tsv file.
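The chunked insertion with optional per-chunk timing can be sketched as follows. This is a minimal illustration and not the code of `fill_database.py`: the function name `insert_in_chunks`, the callable `insert_chunk`, and the TSV column names are assumptions made for this example.

```python
import csv
import time


def insert_in_chunks(entities, insert_chunk, chunk_size, tsv_path=None):
    """Insert `entities` chunk by chunk and record the time needed per chunk.

    `insert_chunk` is a hypothetical callable that performs the actual
    insertion of one list of entities (for example via a CaosDB client).
    """
    timings = []
    for start in range(0, len(entities), chunk_size):
        chunk = entities[start:start + chunk_size]
        t0 = time.perf_counter()
        insert_chunk(chunk)
        elapsed = time.perf_counter() - t0
        timings.append((start // chunk_size, len(chunk), elapsed))

    # Optionally store the measured times in a tab-separated file.
    if tsv_path is not None:
        with open(tsv_path, "w", newline="") as tsv_file:
            writer = csv.writer(tsv_file, delimiter="\t")
            writer.writerow(["chunk", "size", "seconds"])  # assumed column names
            writer.writerows(timings)
    return timings
```

Passing a small `chunk_size` gives a finer time resolution at the cost of more requests; whether that trade-off matters depends on the server setup.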
### `measure_execution_time.py` ###
### Python Script `measure_execution_time.py` ###
A somewhat outdated script which executes a given query a number of times and then saves statistics
about the `TransactionBenchmark` readings (see below for more information about the transaction
benchmarks) delivered by the server.
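The general idea of repeating a query and summarizing the measurements can be sketched as below. Note the substitution: this sketch records client-side wall-clock time, not the server's `TransactionBenchmark` readings, and `run_query` is a hypothetical callable standing in for whatever executes the query. It assumes `repetitions` is at least 1.

```python
import statistics
import time


def measure_execution_time(run_query, repetitions):
    """Call `run_query` repeatedly and return simple timing statistics."""
    durations = []
    for _ in range(repetitions):
        t0 = time.perf_counter()
        run_query()  # hypothetical callable which executes the query once
        durations.append(time.perf_counter() - t0)

    return {
        "n": len(durations),
        "mean": statistics.mean(durations),
        # The sample standard deviation needs at least two measurements.
        "stdev": statistics.stdev(durations) if len(durations) > 1 else 0.0,
        "min": min(durations),
        "max": max(durations),
    }
```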
### `sql_routine_measurement.py`
### Python Script `sql_routine_measurement.py`
......@@ -65,45 +65,29 @@ performance_schema=ON
```
Start the server.
### Benchmarking SQL commands ###
### MariaDB General Query Log ###
MariaDB and MySQL have a feature to enable the logging of SQL queries' times. This logging must be
enabled on the SQL server as described in the [upstream documentation](https://mariadb.com/kb/en/general-query-log/). For the Docker
environment LinkAhead, this can conveniently be done with `linkahead mysqllog {on,off,store}`.
Alternatively, to enable the SQL general log manually, log into the SQL server and run:
enabled on the SQL server as described in the [upstream documentation](https://mariadb.com/kb/en/general-query-log/):
Add the following to the MySQL configuration:
```
log_output=TABLE
general_log
```
or run the following SQL statements:
```sql
SET GLOBAL log_output = 'TABLE';
SET GLOBAL general_log = 'ON';
```
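With `log_output=TABLE`, the logged statements end up in the table `mysql.general_log`. As a rough sketch of how such log rows could be summarized, the following helper counts logged statements by their leading keyword. It assumes the rows were fetched beforehand with something like `SELECT command_type, argument FROM mysql.general_log`; the helper name `summarize_general_log` is made up for this example.

```python
from collections import Counter


def summarize_general_log(rows):
    """Count logged SQL statements by their first keyword.

    `rows` is an iterable of `(command_type, argument)` pairs, assumed to
    come from the `mysql.general_log` table.
    """
    counts = Counter()
    for command_type, argument in rows:
        if command_type != "Query":
            continue  # skip Connect, Quit and other non-query events
        if isinstance(argument, bytes):
            # Database drivers may return the argument column as bytes.
            argument = argument.decode("utf-8", errors="replace")
        words = argument.split()
        counts[words[0].upper() if words else "(empty)"] += 1
    return counts
```

Such a summary only shows which kinds of statements dominate the log; it says nothing about how long they took.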
### External JVM profilers ###
In the Docker environment LinkAhead, this can conveniently be
done with `linkahead mysqllog {on,off,store}`.
In addition to the transaction benchmarks, it is possible to benchmark the server execution via
external Java profilers. For example, [VisualVM](https://visualvm.github.io/) can connect to JVMs running locally or remotely
(e.g. in a Docker container). To enable this in LinkAhead's Docker environment, set
### MariaDB Slow Query Log ###
See [slow query log docs](https://mariadb.com/kb/en/slow-query-log-overview/)
```yaml
devel:
profiler: true
```
Alternatively, start the server (without Docker) with the `run-debug-single` make target; it will expose
the JMX interface, by default on port 9090.
Most profilers, such as VisualVM, only gather cumulative data for call trees; they do not provide
complete call graphs (as callgrind/kcachegrind would do). They also do not differentiate between
calls with different query strings, as long as the Java process flow is the same (for example, `FIND
Record 1234` and `FIND Record A WHICH HAS A Property B WHICH HAS A Property C>100` would be handled
equally).
#### Example settings for VisualVM
In the sampler settings, you may want to add these expressions to the blocked
packages: `org.restlet.**, com.mysql.**`. Branches of the call tree which lie
entirely inside the blacklist will become leaves. Alternatively, specify a
whitelist, for example with `org.caosdb.server.database.backend.implementation.**`,
if you only want to see the time spent for certain MySQL calls.
### MariaDB Performance Schema ###
The most detailed information on execution times can be acquired using the performance schema.
### Manual Java-side benchmarking ###
......@@ -140,6 +124,35 @@ Additionally, the server should be started via `make run-debug` (instead of
| `ExecuteQuery` | ? | ? |
| | | |
### External JVM profilers ###
In addition to the transaction benchmarks, it is possible to benchmark the server execution via
external Java profilers. For example, [VisualVM](https://visualvm.github.io/) can connect to JVMs running locally or remotely
(e.g. in a Docker container). To enable this in LinkAhead's Docker environment, set
```yaml
devel:
profiler: true
```
Alternatively, start the server (without Docker) with the `run-debug-single` make target; it will expose
the JMX interface, by default on port 9090.
Most profilers, such as VisualVM, only gather cumulative data for call trees; they do not provide
complete call graphs (as callgrind/kcachegrind would do). They also do not differentiate between
calls with different query strings, as long as the Java process flow is the same (for example, `FIND
Record 1234` and `FIND Record A WHICH HAS A Property B WHICH HAS A Property C>100` would be handled
equally).
#### Example settings for VisualVM
In the sampler settings, you may want to add these expressions to the blocked
packages: `org.restlet.**, com.mysql.**`. Branches of the call tree which lie
entirely inside the blacklist will become leaves. Alternatively, specify a
whitelist, for example with `org.caosdb.server.database.backend.implementation.**`,
if you only want to see the time spent for certain MySQL calls.
## How to set up a representative database ##
For reproducible results, it makes sense to start off with an empty database and fill it using the
`fill_database.py` script, for example like this:
......