From 7b4dce40e24631ad4a39be946a5dd3a763065654 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Henrik=20tom=20W=C3=B6rden?= <henrik@trineo.org>
Date: Fri, 28 May 2021 08:35:38 +0200
Subject: [PATCH] DOC: add information on sql profiling

---
 doc/devel/Benchmarking.md | 79 +++++++++++++++++++++++----------------
 1 file changed, 46 insertions(+), 33 deletions(-)

diff --git a/doc/devel/Benchmarking.md b/doc/devel/Benchmarking.md
index 9184b64a..ab1c9451 100644
--- a/doc/devel/Benchmarking.md
+++ b/doc/devel/Benchmarking.md
@@ -14,7 +14,7 @@ representative state, Python scripts are used. The scripts can be found in the
 repository, located at [https://gitlab.indiscale.com/caosdb/src/caosdb-dev-tools](https://gitlab.indiscale.com/caosdb/src/caosdb-dev-tools) in the folder `benchmarking`:
 
 
-### `fill_database.py` ###
+### Python Script `fill_database.py` ###
 
 This commandline script is meant for filling the database with enough data to represeny an actual
 real-life case, it can easily create hundreds of thousands of Entities.
@@ -26,14 +26,14 @@ of the Entities into CaosDB is done in chunks of a defined size. Users can tell the script to
 store times needed for the insertion of each chunk into a tsv file.
 
 
-### `measure_execution_time.py` ###
+### Python Script `measure_execution_time.py` ###
 
 A somewhat outdated script which executes a given query a number of times and then save statistics
 about the `TransactionBenchmark` readings (see below for more information about the transaction
 benchmarks) delivered by the server.
 
 
-### `sql_routine_measurement.py`
+### Python Script `sql_routine_measurement.py`
 
 
 
@@ -65,45 +65,29 @@ performance_schema=ON
 ```
 Start the server.
 
-### Benchmarking SQL commands ###
+### MariaDB General Query Log ###
 
 MariaDB and MySQL have a feature to enable the logging of SQL queries' times. This logging must be
-turned on on the SQL server as described in the [upstream documentation](https://mariadb.com/kb/en/general-query-log/). For the Docker
-environment LinkAhead, this can conveniently be done with `linkahead mysqllog {on,off,store}`.
-
-Alternatively, you can enable the SQL general logs, log into the SQL server and do:
+turned on on the SQL server as described in the [upstream documentation](https://mariadb.com/kb/en/general-query-log/):
+Add to the mysql configuration:
+```
+log_output=TABLE
+general_log
+```
+or calling
 ```sql
 SET GLOBAL log_output = 'TABLE';
 SET GLOBAL general_log = 'ON';
 ```
 
-### External JVM profilers ###
+In the Docker environment LinkAhead, this can conveniently be
+done with `linkahead mysqllog {on,off,store}`.
 
-Additionally to the transaction benchmarks, it is possible to benchmark the server execution via
-external Java profilers. For example, [VisualVM](https://visualvm.github.io/) can connect to JVMs running locally or remotely
-(e.g. in a Docker container). To enable this in LinkAhead's Docker environment, set
+### MariaDB Slow Query Log ###
+See [slow query log docs](https://mariadb.com/kb/en/slow-query-log-overview/)
 
-```yaml
-devel:
-  profiler: true
-```
-Alternatively, start the server (without docker) with the `run-debug-single` make target, it will expose
-the JMX interface, by default on port 9090.
-
-Most profilers, like as VisualVM, only gather cumulative data for call trees, they do not provide
-complete call graphs (as callgrind/kcachegrind would do). They also do not differentiate between
-calls with different query strings, as long as the Java process flow is the same (for example, `FIND
-Record 1234` and `FIND Record A WHICH HAS A Property B WHICH HAS A Property C>100` would be handled
-equally).
-
-
-#### Example settings for VisualVM
-
-In the sampler settings, you may want to add these expressions to the blocked
-packages: `org.restlet.**, com.mysql.**`. Branches on the call tree which are
-entirely inside the blacklist, will become leaves. Alternatively, specify a
-whitelist, for example with `org.caosdb.server.database.backend.implementation.**`,
-if you only want to see the time spent for certain MySQL calls.
+### MariaDB Performance Schema ###
+The most detailed information on execution times can be acquired using the performance schema.
 
 ### Manual Java-side benchmarking #
 
@@ -140,6 +124,35 @@ Additionally, the server should be started via `make run-debug` (instead of
 | `ExecuteQuery` | ? | ? | | | | |
 
 
+### External JVM profilers ###
+
+Additionally to the transaction benchmarks, it is possible to benchmark the server execution via
+external Java profilers. For example, [VisualVM](https://visualvm.github.io/) can connect to JVMs running locally or remotely
+(e.g. in a Docker container). To enable this in LinkAhead's Docker environment, set
+
+```yaml
+devel:
+  profiler: true
+```
+Alternatively, start the server (without docker) with the `run-debug-single` make target, it will expose
+the JMX interface, by default on port 9090.
+
+Most profilers, like as VisualVM, only gather cumulative data for call trees, they do not provide
+complete call graphs (as callgrind/kcachegrind would do). They also do not differentiate between
+calls with different query strings, as long as the Java process flow is the same (for example, `FIND
+Record 1234` and `FIND Record A WHICH HAS A Property B WHICH HAS A Property C>100` would be handled
+equally).
+
+
+#### Example settings for VisualVM
+
+In the sampler settings, you may want to add these expressions to the blocked
+packages: `org.restlet.**, com.mysql.**`. Branches on the call tree which are
+entirely inside the blacklist, will become leaves. Alternatively, specify a
+whitelist, for example with `org.caosdb.server.database.backend.implementation.**`,
+if you only want to see the time spent for certain MySQL calls.
+
+
 ## How to set up a representative database ##
 For reproducible results, it makes sense to start off with an empty database and fill it using the
 `fill_database.py` script, for example like this:
-- 
GitLab