The `global` parameter causes the time window defined by the `time`
parameter to be applied at the instance level instead of the session
level. This means that a write from one connection will cause all other
connections to use the master for a certain period of time.
Using a configurable time window for consistency is not good as it is not
absolute and cannot adjust to how servers behave.
One example that demonstrates this is when a slave is normally lagging
behind by less than a second but some event causes the lag to spike up to
several seconds. In this case the configured time window would no longer
guarantee consistency.
Another reason to avoid a "static" time window is the fact taht it
prevents load balancing in the cases where slaves catch up to the master
within time window. This happens when time is configured to a higher value
to avoid inconsistencies at all costs.
Added a test case that verified the feature works.
The SHOW DATABASES greatly slows down the test. By doing that only from
time to time, the test time drops from roughly 160 seconds to 15
seconds. There's also no point in continuing the test after a failure has
been seen.
Also added a missing sync_slaves to the database creation to make sure
they are created on all servers before the test starts.
When a system test program is invoked with the flag '-l', it will
assume MaxScale is running on 127.0.0.1 using a configuration that
is compatible with the test.
NOTE: Currently any test that directly or indirectly sshes to the
MaxScale node will fail. With a bit of setup that could also
be made to work.
In some configs /etc/my.cnf.d/* files access rights are
very limited and server can not read configuration.
It leads to broken settings and unability to setup
replication during the test
Due to incorrect SSL certs copying to backend and wrong setting in
maxscale.cnf it was not possible to active backend SSL.
Additionally, one more maxscale restart added to 'sql_queries' test
to reproduce SSL bug in 2.4.1.
Also ssl.cnf tuned in order to reproduce SSL bug
By moving the setting up of the test environment from the constructor
to a separate setup()-function, it is possible to introduce virtual
functions and make it easier to do things differently depending on
whether the backend is MariaDB, Galera och Clustrix.
When checking the state of a Clustrix node, we do so in steps:Z
- Is Clustrix installed
- Is Clustrix running
- Can Clustrix be accessed using root
- Can Clustrix be accessed using the test user
and deal with a failure at each point.
The test case covers a few bugs that were fixed by the previous
commits. The first part of the test covers the case when master
reconnection fails while session command history is being executed. The
second part of the test makes sure exceeding the session command history
will prevent master reconnections from taking place.
The test now uses standard library threads and lambda functions to make
the code simpler. Also made the test more robust by ignoring any errors
that are caused by the exhaustion of available client side TCP ports.
The hard limit of 10 seconds is too strict when taking into account the
fact that infinite refreshes was possible before the bug was fixed. This
also makes testing a lot easier where rapid reloads are necessary.
Certain MariaDB connectors will use the direct execution for batching
COM_STMT_PREPARE and COM_STMT_EXECUTE execution without waiting for the
COM_STMT_PREPARE to complete. In these cases the COM_STMT_EXECUTE (and
other COM_STMT commands as well) will use the special ID 0xffffffff. When
this is detected, it should be substituted with the ID of the latest
statement that was prepared.
By injecting messages into the maxscale.log from the test, the reader of
the log can easily see the "synchronization" with the test case. This does
affect the test timing so it can only be used to see whether non-timing
related functionality is correct.
The test appears to fail when the throttling is unable to keep the QPS
high enough for the test to pass. To reduce the likelihood of this, lower
the limit to 500 QPS.
In theory, the minimum delay of one millisecond in the delayed_call limits
the filter to a maximum QPS of 1000 as each query would wait for at least
a millisecond before being routed. This is yet to be proven but it would
explain why the tests are having a hard time approaching that level of
QPS.