The test should stop MaxScale while it fixes the replication so that the
standalone master detection is not triggered.
Also removed leading spaces from the messages and fixed a possible crash
when a NULL value is given to `ssh_node`.
Tests the routing behavior with master_failure_mode=error_on_write and
allow_master_changes=true. Because an error is sent instead of the
connection being closed when the master fails, the connection can resume
execution if a new master becomes available.
Added test cases that verify that the functionality works as
expected. Also made Mariadb_nodes::change_master less verbose when one of
the nodes is down.
Added some helper functions into the MaxScale class and default parameters
into the connection creation functions. Also made the ip() function
const-correct.
Timing out the statements and adding a LIMIT clause to the DELETE
statement should rule out backend server related problems. If the test
still times out, the problem is most likely in MaxScale.
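
A rough sketch of what the bounded statements could look like. The helper
names and the `test.t1` table are hypothetical, and the use of MariaDB's
`SET STATEMENT max_statement_time=... FOR` prefix is an assumption about how
the timeout is applied; the test itself may do this differently.

```cpp
#include <string>

// Hypothetical helper: wraps a statement so that the server aborts it after
// `timeout_s` seconds (SET STATEMENT ... FOR syntax, MariaDB 10.1 and later).
std::string with_timeout(const std::string& sql, int timeout_s)
{
    return "SET STATEMENT max_statement_time=" + std::to_string(timeout_s)
           + " FOR " + sql;
}

// Hypothetical helper: a DELETE that removes at most `max_rows` rows, so a
// single statement cannot run for an unbounded time on a large table.
std::string bounded_delete(const std::string& table, int max_rows)
{
    return "DELETE FROM " + table + " LIMIT " + std::to_string(max_rows);
}
```

For example, `with_timeout(bounded_delete("test.t1", 100), 30)` produces a
DELETE that touches at most 100 rows and is given 30 seconds to complete.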
This commit introduces changes that fix the relay master detection that
was broken by the merge from 2.1 into 2.2 by commit
1ecd791887994209eb29e56e1271f8c407cd0cdf.
In 2.2, the master server ID is used to detect whether a slave is actually
replicating from a master. The value is still displayed even if the slave
is not actively replicating from a master. The commit in 2.1 causes this
value to be stored unconditionally if it is available. By checking the
value of Last_IO_Errno and comparing it to a list of known error codes, we
know whether the slave is replicating properly.
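
The check can be sketched roughly as follows. This is not the monitor code:
the struct, the function name and the contents of the list are assumptions
made for illustration. The list here holds client-side connection error
codes (2002, 2003 and 2013) and is assumed to mean "the master is merely
unreachable"; any other non-zero Last_IO_Errno is treated as broken
replication.

```cpp
#include <algorithm>
#include <array>

// Hypothetical subset of the SHOW SLAVE STATUS output.
struct SlaveStatus
{
    long master_server_id; // Master_Server_Id, 0 or less if not available
    int  last_io_errno;    // Last_IO_Errno, 0 if there was no error
};

// Assumed list of tolerated codes: CR_CONNECTION_ERROR (2002),
// CR_CONN_HOST_ERROR (2003) and CR_SERVER_LOST (2013).
constexpr std::array<int, 3> tolerated_io_errors = {2002, 2003, 2013};

// Returns true if the master server ID can be used to build the
// replication tree, i.e. the ID is set and the IO thread has not reported
// an error outside the tolerated list.
bool master_id_is_usable(const SlaveStatus& status)
{
    if (status.master_server_id <= 0)
    {
        return false;
    }

    if (status.last_io_errno == 0)
    {
        return true;
    }

    return std::find(tolerated_io_errors.begin(), tolerated_io_errors.end(),
                     status.last_io_errno) != tolerated_io_errors.end();
}
```

The point of the sketch is only that the master server ID is no longer
stored unconditionally; it is stored when the error code says the slave is
still configured correctly.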
The slave detection in 2.2 correctly identifies a broken slave with a
stopped IO thread. Due to this, the test case must be modified to check
that the relay master is not a slave if the IO thread is stopped.
The test appears to fail due to connection errors when it attempts to
check whether MaxScale is still alive. To reduce the chance of the backend
server still refusing connections after the initial spike, the sleep
before the check was increased.
The code had a note in it stating that the test uses custom code
backported from 2.2 and that it should be updated to use common code once
merged into 2.2.
Added a small sleep to make sure the monitor picks up the changes in the
topology.
When the test finishes and is about to check whether MaxScale is alive,
the servers should be cleared from maintenance mode and the replication
should be fixed. This way the test will clean up after itself.
When the test changes the master, it should reset the slave configuration
on the new master. This way no circular replication topologies are formed
and the monitor can be expected to perform correctly.
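
A minimal sketch of the reset, assuming it is done with the standard
MariaDB statements STOP SLAVE and RESET SLAVE ALL. The helper name is
hypothetical and the test may issue the statements in another way.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: statements to run on the server that has just been
// made the master. RESET SLAVE ALL removes the stored master connection
// details, so the new master no longer replicates from any other server.
std::vector<std::string> reset_slave_statements()
{
    return {"STOP SLAVE", "RESET SLAVE ALL"};
}
```

Without the second statement the new master would still hold a connection
definition pointing at the old master, which is how a circular topology can
form once the old master is redirected to the new one.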
When auto_failover has been disabled due to failure to perform automatic
failover, the test should reset the number of failures. Resetting it
before the check for the log message allows the test to still fail if the
message is missing.
Instead of the current master rejoining the diverged master, the current
master should remain as the master server. This behavior should be
explained by the extra GTID event injected by the failover process.
In `generate_traffic_and_check`, the insertion of the data into the master
and the subsequent read of it will now be done inside a transaction. This
keeps the behavior consistent with the `check` function that only inserts
one row.
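
The statement sequence could look roughly like this. The helper name, the
`test.t1` table and its `id` column are hypothetical and only illustrate
the shape of the change.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: builds the statements for inserting `rows` rows and
// reading them back, all inside one transaction.
std::vector<std::string> insert_and_check_statements(int first_id, int rows)
{
    std::vector<std::string> sql;
    sql.push_back("BEGIN");

    for (int i = 0; i < rows; i++)
    {
        sql.push_back("INSERT INTO test.t1 (id) VALUES ("
                      + std::to_string(first_id + i) + ")");
    }

    // The read happens before COMMIT, so it is part of the same transaction
    // as the inserts.
    sql.push_back("SELECT COUNT(*) FROM test.t1 WHERE id >= "
                  + std::to_string(first_id));
    sql.push_back("COMMIT");
    return sql;
}
```

Keeping the SELECT inside the open transaction means it is executed on the
same server as the inserts, so the result does not depend on how far the
slaves have replicated.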