There were a total of five SSH connections opened at the start of each
test. Only two of these are currently required: the SSL certificate
directory check and the actual command that restarts MaxScale. Two of the
three remaining commands, stopping of MaxScale and copying of the
configuration, can be made conditional or combined into other
commands.
The stopping of MaxScale is done to prevent it from interfering with the
cluster setup process. As MaxScale does nothing if nothing is wrong, it is
safe to make the restart conditional so that it is done only when a
problem in the cluster setup is detected.
The final SSH command, the MaxScale health check via maxadmin, can be
removed as it is redundant: the daemonization already covers this by
exiting only after MaxScale is ready.
A certain templated parameter was only substituted when the VMs were
provisioned. This needs to be handled by the test framework to allow
changes into Galera clusters configuration.
Also made the startup of the "lesser" nodes parallel so minimize the
startup time.
The galera configurations need pre-processing before they can be
used. Switched to std::endl to automatically flush the output at the end
of each line. This makes it easier to see what is happening when the tests
are ran by buildbot. Also removed the extra startup of the servers that
was done right after installing the database.
Grouped all binlogrouter and avrorouter tests so that they are executed as
the last tests. This helps prevent some side effects that result from the
"aggressive" replication modifications the tests do. Also removed some
commented out test cases.
If the replication is broken between the nodes, it is now fixed in
parallel on all nodes instead of doing it one server at a time.
This reduces the time from about 120 seconds to 13 seconds. The time was
measured by running the check_backend test first with all backends broken
and then with the fixed backends subtracting time of the latter from the
former.
As the wait_for_monitor function guarantees that the monitor notices the
state change, we can skip the replication fixing which was somewhat
superficial in the first place.
The tests were consistently unstable and as a result of this did not
provide any actionable output. In addition to this these two test were the
longest running tests in the whole MaxScale test suite so a re-design was
warranted.
Instead of emulating a client and a server failure, testing functionality
provides for a test that is faster, more precise and provides more
actionable output. Due to the single-threadedness of the new test, no
cross-thread depencies are present. In addition to this, the superfluous
log flushing was not done as it almost always happened after all
transactions were already complete.
The estimated savings in test time alone is around 1100 seconds (roughly
18 minutes).
This will simply cause a task to be posted to each worker.
If the workers are running normally, the task will reach the
workers and the associated semaphore posted, and the REST-API
call will return. If any worker is not running normally, the
task will not be processed and the REST-API call will hang.
Figure out the console width and adjust output accordingly.
In default mode use '\n' as separator (necessary for making the
session query output sensible) and in tsv mode ','.
Whether or not a session should retain its statements is now
a property of the session. This in preparation for making the
whole functionality a property that can be enabled and disabled
at runtime, of the service.
As the router is the only one that knows what backends a particular
statement has been sent to, it is the responsibility of the router
to keep the session bookkeeping up to date. If it doesn't we will
know what statements a session has received (provided at least some
component in the routing chain has RCAP_TYPE_STMT_INPUT capability),
but not how long their processing took. Currently only readwritesplit
does that.
All queries are stored and not just COM_QUERY as that makes the
overall bookkeeping simpler; at clientReply() time we do not need to
know whether or not to bookkeep information, we can just do it.
When session information is queried for, we report as much information
we have available.
The main class was getting unwieldly and too general. Dividing the fields
helps adding support for other operation types.
This commit leaves most data duplicated, later commits clean up the affected code.