From 7b4dce40e24631ad4a39be946a5dd3a763065654 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Henrik=20tom=20W=C3=B6rden?= <henrik@trineo.org>
Date: Fri, 28 May 2021 08:35:38 +0200
Subject: [PATCH] DOC: add information on sql profiling

---
 doc/devel/Benchmarking.md | 79 +++++++++++++++++++++++----------------
 1 file changed, 46 insertions(+), 33 deletions(-)

diff --git a/doc/devel/Benchmarking.md b/doc/devel/Benchmarking.md
index 9184b64a..ab1c9451 100644
--- a/doc/devel/Benchmarking.md
+++ b/doc/devel/Benchmarking.md
@@ -14,7 +14,7 @@ representative state, Python scripts are used.  The scripts can be found in the
 repository, located at [https://gitlab.indiscale.com/caosdb/src/caosdb-dev-tools](https://gitlab.indiscale.com/caosdb/src/caosdb-dev-tools) in the folder
 `benchmarking`:
 
-### `fill_database.py` ###
+### Python Script `fill_database.py` ###
 
 This commandline script is meant for filling the database with enough data to represeny an actual
 real-life case, it can easily create hundreds of thousands of Entities.
@@ -26,14 +26,14 @@ of the Entities into CaosDB is done in chunks of a defined size.
 
 Users can tell the script to store times needed for the insertion of each chunk into a tsv file.
 
-### `measure_execution_time.py` ###
+### Python Script `measure_execution_time.py` ###
 
 A somewhat outdated script which executes a given query a number of times and then save statistics
 about the `TransactionBenchmark` readings (see below for more information about the transaction
 benchmarks) delivered by the server.
 
 
-### `sql_routine_measurement.py` 
+### Python Script `sql_routine_measurement.py` ###
 
 
 
@@ -65,45 +65,29 @@ performance_schema=ON
 ```
 Start the server.
 
-### Benchmarking SQL commands ###
+### MariaDB General Query Log ###
 
 MariaDB and MySQL have a feature to enable the logging of SQL queries' times.  This logging must be
-turned on on the SQL server as described in the [upstream documentation](https://mariadb.com/kb/en/general-query-log/).  For the Docker
-environment LinkAhead, this can conveniently be done with `linkahead mysqllog {on,off,store}`.
-
-Alternatively, you can enable the SQL general logs, log into the SQL server and do:
+enabled on the SQL server as described in the [upstream documentation](https://mariadb.com/kb/en/general-query-log/).
+To do so, add the following to the MySQL configuration:
+```
+log_output=TABLE
+general_log
+```
+or execute the following statements on the SQL server:
 ```sql
 SET GLOBAL log_output = 'TABLE';
 SET GLOBAL general_log = 'ON';
 ```
 
-### External JVM profilers ###
+In LinkAhead's Docker environment, this can conveniently be
+done with `linkahead mysqllog {on,off,store}`.
 
-Additionally to the transaction benchmarks, it is possible to benchmark the server execution via
-external Java profilers.  For example, [VisualVM](https://visualvm.github.io/) can connect to JVMs running locally or remotely
-(e.g. in a Docker container).  To enable this in LinkAhead's Docker environment, set
+### MariaDB Slow Query Log ###
+See the [slow query log documentation](https://mariadb.com/kb/en/slow-query-log-overview/).
 
-```yaml
-devel:
-  profiler: true
-```
-Alternatively, start the server (without docker) with the `run-debug-single` make target, it will expose
-the JMX interface, by default on port 9090.
-
-Most profilers, like as VisualVM, only gather cumulative data for call trees, they do not provide
-complete call graphs (as callgrind/kcachegrind would do).  They also do not differentiate between
-calls with different query strings, as long as the Java process flow is the same (for example, `FIND
-Record 1234` and `FIND Record A WHICH HAS A Property B WHICH HAS A Property C>100` would be handled
-equally).
-
-
-#### Example settings for VisualVM 
-
-In the sampler settings, you may want to add these expressions to the blocked
-packages: `org.restlet.**, com.mysql.**`.  Branches on the call tree which are
-entirely inside the blacklist, will become leaves.  Alternatively, specify a
-whitelist, for example with `org.caosdb.server.database.backend.implementation.**`,
-if you only want to see the time spent for certain MySQL calls.
+### MariaDB Performance Schema ###
+The most detailed information on execution times can be acquired using the performance schema.
 
 ### Manual Java-side benchmarking #
 
@@ -140,6 +124,35 @@ Additionally, the server should be started via `make run-debug` (instead of
 | `ExecuteQuery`                       | ?                                            | ?                             |
 |                                      |                                              |                               |
 
+### External JVM profilers ###
+
+In addition to the transaction benchmarks, it is possible to benchmark the server execution via
+external Java profilers.  For example, [VisualVM](https://visualvm.github.io/) can connect to JVMs running locally or remotely
+(e.g. in a Docker container).  To enable this in LinkAhead's Docker environment, set
+
+```yaml
+devel:
+  profiler: true
+```
+Alternatively, start the server (without Docker) with the `run-debug-single` make target, which exposes
+the JMX interface, by default on port 9090.
+
+Most profilers, such as VisualVM, only gather cumulative data for call trees; they do not provide
+complete call graphs (as callgrind/kcachegrind would do).  They also do not differentiate between
+calls with different query strings, as long as the Java process flow is the same (for example, `FIND
+Record 1234` and `FIND Record A WHICH HAS A Property B WHICH HAS A Property C>100` would be handled
+equally).
+
+
+#### Example settings for VisualVM ####
+
+In the sampler settings, you may want to add these expressions to the blocked
+packages: `org.restlet.**, com.mysql.**`.  Branches of the call tree which are
+entirely inside the blacklist will become leaves.  Alternatively, specify a
+whitelist, for example with `org.caosdb.server.database.backend.implementation.**`,
+if you only want to see the time spent for certain MySQL calls.
+
+
 ## How to set up a representative database ##
 For reproducible results, it makes sense to start off with an empty database and fill it using the
 `fill_database.py` script, for example like this:
-- 
GitLab