Going the belt-and-suspenders way of both sleeping and waiting for the
monitor should make sure MaxScale has at least some time to start up, query
the servers and do a single iteration of monitoring.
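Roughly, the idea looks like this (a sketch only; the helper name and the
five-second grace period are illustrative, not the actual test framework
API):

    #include <chrono>
    #include <thread>

    // Sketch: give MaxScale an unconditional grace period to start up
    // (the belt), then wait one full monitor interval so the servers have
    // been queried and a monitoring iteration has completed (the
    // suspenders).
    void wait_for_startup_and_monitor(std::chrono::milliseconds monitor_interval)
    {
        std::this_thread::sleep_for(std::chrono::seconds(5));
        std::this_thread::sleep_for(monitor_interval);
    }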
Readconnroute with router_options=slave will use a master if one is
available. The test still expected the old behavior where masters were
never used with router_options=slave.
After a session command fails on all connected slaves but succeeds on the
master, the servers that failed are discarded from the pool of valid
servers. If there are available servers, they will be taken into use when
the next read query is performed.
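A minimal sketch of the discard-and-reuse behavior (the types and names are
illustrative, not the actual readwritesplit code):

    #include <algorithm>
    #include <vector>

    struct Backend
    {
        bool in_use = false;
        bool session_command_failed = false;
    };

    // Backends where the session command failed are no longer valid and
    // are dropped from the pool.
    void discard_failed_backends(std::vector<Backend>& pool)
    {
        pool.erase(std::remove_if(pool.begin(), pool.end(),
                                  [](const Backend& b) {
                                      return b.session_command_failed;
                                  }),
                   pool.end());
    }

    // On the next read query, an available but unconnected backend is
    // taken into use if no connected one remains.
    Backend* pick_backend_for_read(std::vector<Backend>& pool)
    {
        for (Backend& b : pool)
        {
            if (b.in_use)
            {
                return &b;
            }
        }
        for (Backend& b : pool)
        {
            if (!b.in_use)
            {
                b.in_use = true;    // taken into use lazily
                return &b;
            }
        }
        return nullptr;
    }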
When a query is routed to an unconnected slave, it is taken into use and
the session command history is replayed. If the execution of the history
failed and there were more session commands to be executed, any queued
queries would get lost. This was due to a missing check that would have
detected that the server had been discarded.
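The fix, in spirit (all names here are illustrative stubs, not the real
router code):

    #include <deque>
    #include <string>

    struct Backend
    {
        bool discarded = false;
    };

    // Stubbed helpers; the real replay and routing logic lives in the router.
    static bool replay_history(Backend&) { return false; }   // pretend it fails
    static void route_query(Backend&, const std::string&) {}

    void on_slave_taken_into_use(Backend& b, std::deque<std::string>& queued)
    {
        if (!replay_history(b))
        {
            b.discarded = true;     // execution of the history failed
        }

        // The '!b.discarded' test below is the previously missing check:
        // without it, the queued queries were drained into a discarded
        // server and lost.
        while (!queued.empty() && !b.discarded)
        {
            route_query(b, queued.front());
            queued.pop_front();
        }
    }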
The test unnecessarily did an unconditional drop of the user, resulting in
errors. The test also no longer stops MaxScale if it wasn't started in the
first place.
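One way to avoid the error is to make the drop conditional; a sketch with a
hypothetical user name (DROP USER IF EXISTS requires MariaDB 10.1.3 or
later):

    #include <mysql.h>

    // Sketch: dropping a user that may not exist no longer produces an
    // error when IF EXISTS is used. The user name is hypothetical.
    void drop_test_user(MYSQL* conn)
    {
        mysql_query(conn, "DROP USER IF EXISTS 'testuser'@'%'");
    }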
Allowing calls to select_connect_backend_servers even when all slaves are
connected solves the debug assertion in that function which was triggered
when the execution of a queued query caused a new connection to be
created.
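In spirit, the change makes the function safe to call unconditionally (an
illustrative sketch, not the real signature):

    #include <vector>

    struct SlaveConn
    {
        bool connected = false;
    };

    // When every slave is already connected, the selection loop simply
    // finds nothing to do instead of asserting.
    void select_connect_backend_servers(std::vector<SlaveConn>& slaves)
    {
        for (SlaveConn& s : slaves)
        {
            if (!s.connected)
            {
                s.connected = true;     // connect the new candidate
            }
        }
    }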
The test now sets a two minute timeout for all larger operations. This
prevents excessive waiting when the test is executed.
Removed the row count output to stdout to prevent excessive logging when
the terminal contents are written to a file.
Also renamed the file to match the test case name. This should remove the
need to look up the source-file-to-test-name mapping in CMakeLists.txt.
The tests can now wait for a number of monitor intervals. This removes the
need to have hard-coded sleeps in the code and makes monitor tests more
robust under heavier load.
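A sketch of the idea, assuming the test knows the configured
monitor_interval (the extra margin is an assumption):

    #include <chrono>
    #include <thread>

    // Waiting for N monitor intervals (plus a small margin) instead of a
    // hard-coded sleep scales with the configuration and the machine load.
    void wait_for_monitor(int ticks, std::chrono::milliseconds monitor_interval)
    {
        std::this_thread::sleep_for(ticks * monitor_interval
                                    + std::chrono::milliseconds(100));
    }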
Increased the amount of time that the tests sleep while they wait for
states to change. This should make them more tolerant of server load by
allowing more time for things to settle down.
The code that selects the candidate backend always returned the root
master if the server bitmask contained the master bit. This should only be
done if the master bit is the only bit in the bitmask; when other bits are
set, the normal candidate selection code should be used.
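Sketched out, the corrected logic looks like this (bit values and helper
names are assumptions, not the actual readconnroute constants):

    #include <cstdint>

    constexpr uint32_t BIT_MASTER = 0x01;   // assumed bit layout
    constexpr uint32_t BIT_SLAVE  = 0x02;

    struct Server;
    static Server* get_root_master()                     { return nullptr; }  // stub
    static Server* normal_candidate_selection(uint32_t)  { return nullptr; }  // stub

    Server* select_candidate(uint32_t bitmask)
    {
        if (bitmask == BIT_MASTER)
        {
            // The master bit is the only bit: the root master is the answer.
            return get_root_master();
        }
        // Other bits are set as well: use the normal candidate selection
        // instead of unconditionally returning the master.
        return normal_candidate_selection(bitmask);
    }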
Also added a query to the expanded test case to make sure the connection
actually works.
The test case now verifies that the servers are actually load balanced
correctly. This test reveals a problem in readconnroute: with
router_options=master,slave, the master is always preferred over the
slaves if one is available.
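The verification can be done along these lines (a sketch; the listener port
and credentials are assumptions): each connection asks the backend for its
@@server_id and the distribution of the counts shows how the connections
were balanced.

    #include <mysql.h>
    #include <map>
    #include <string>

    std::map<std::string, int> count_connections(int iterations)
    {
        std::map<std::string, int> counts;

        for (int i = 0; i < iterations; i++)
        {
            MYSQL* conn = mysql_init(nullptr);
            if (mysql_real_connect(conn, "127.0.0.1", "testuser", "testpwd",
                                   nullptr, 4008, nullptr, 0)
                && mysql_query(conn, "SELECT @@server_id") == 0)
            {
                if (MYSQL_RES* res = mysql_store_result(conn))
                {
                    if (MYSQL_ROW row = mysql_fetch_row(res))
                    {
                        counts[row[0]]++;   // one hit for this backend
                    }
                    mysql_free_result(res);
                }
            }
            mysql_close(conn);
        }

        return counts;
    }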
The two operations return different types of results and need to be
treated differently in order for them to be handled correctly in 2.2.
This fixes the unexpected internal state errors that happened in all 2.2
versions due to a wrong assumption made by readwritesplit. This fix is not
necessary for newer versions, as the LOAD DATA LOCAL INFILE processing is
done with a simpler and more robust method.
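Assuming the two operations are LOAD DATA INFILE and LOAD DATA LOCAL
INFILE, the difference shows up in the first response packet: the server
answers the former with a normal OK/ERR result but answers the LOCAL
variant with a local-infile request (first payload byte 0xfb) that asks the
client to stream the file. A sketch of the distinction:

    #include <cstdint>

    enum class ResponseType
    {
        OK_OR_ERR,          // LOAD DATA INFILE: a complete result
        LOCAL_INFILE_REQ,   // LOAD DATA LOCAL INFILE: the client must
                            // now send the file contents
    };

    ResponseType classify_response(const uint8_t* payload)
    {
        return payload[0] == 0xfb ? ResponseType::LOCAL_INFILE_REQ
                                  : ResponseType::OK_OR_ERR;
    }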
Now, if a test is invoked with '-l', then MaxScale is assumed to
be running locally using a configuration file suitable for the
test that is invoked. Further, at the end of the test, the log
files of MaxScale are not downloaded (obviously).
As a side-effect, an environment variable no_maxscale_log_copy, similar to
the existing no_backend_log_copy, has been introduced; it can be used to
unconditionally prevent the downloading of MaxScale log files.
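The check itself is straightforward (a sketch; only the variable name comes
from the change):

    #include <cstdlib>

    // If no_maxscale_log_copy is set in the environment, the MaxScale log
    // download step is skipped, mirroring the existing no_backend_log_copy.
    bool should_copy_maxscale_logs()
    {
        return std::getenv("no_maxscale_log_copy") == nullptr;
    }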
The test may fail if the client/MaxScale/MariaDB combo is too slow.
TODO, maybe: today I saw mysql_query() hang (in a poll) when MaxScale
disconnected. Is there something that can be done about that?
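One possible mitigation, offered only as an assumption rather than
something the test does: a client-side read timeout bounds how long
mysql_query() can block if the peer goes away silently.

    #include <mysql.h>

    // Sketch: MYSQL_OPT_READ_TIMEOUT limits how long reads may block, so
    // mysql_query() cannot hang in poll() indefinitely. Host, credentials
    // and the timeout value are illustrative.
    MYSQL* connect_with_read_timeout(const char* host, unsigned int port)
    {
        MYSQL* conn = mysql_init(nullptr);
        unsigned int read_timeout = 30;     // seconds
        mysql_options(conn, MYSQL_OPT_READ_TIMEOUT, &read_timeout);
        return mysql_real_connect(conn, host, "testuser", "testpwd",
                                  nullptr, port, nullptr, 0);
    }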
I added a source directory, base, for stuff that should become part of a common
utility library in the future.
A slight increase of the timeout to 300 seconds should solve some test
failures. The test is extremely hard on the machines and even when run
locally, the test slows down to a crawl.