The `MYSQL_ROW row` variable was being overwritten by the extra query done
by the SST method detection code. Moving it into its own function prevents
this and makes the code significantly easier to comprehend.
Added a test case that reproduced the problem (MaxScale crashed) and
verifies that the patch fixes the problem.
The previous core check would pick up any file in /tmp/ that would start
with the `core` prefix. This included some npm generated files which are
created if MaxCtrl is built on the MaxScale machine.
The test appears to hang when the `SET sql_log_bin = 0` statement is
executed. Removing this seems to fix it and is OK as that's not what the
test aims to check.
If a MaxScale-generated configuration defines an empty value, it is
ignored with the assumption that the next modification will cause the
problem to correct itself.
The test should stop MaxScale when it is fixing the replication to prevent
the triggering of the standalone master detection.
Also removed leading spaces from the messages and fixed a possible crash with a
NULL value given to `ssh_node`.
Added some helper functions into the MaxScale class and default parameters
into the connection creation functions. Also made the ip() function const
correct.
Timing out the statements and adding a LIMIT clause to the DELETE
statement should rule out backend server related problems. If the test
still times out, the problem is most likely in MaxScale.
This commit introduces changes that fix the relay master detection that
was broken by the merge from 2.1 into 2.2 by commit
1ecd791887994209eb29e56e1271f8c407cd0cdf.
In 2.2, the master server ID is used to detect whether a slave is actually
replicating from a master. The value is still displayed even if the slave
is not actively replicating from a master. The commit in 2.1 causes this
value to be stored unconditionally if it is available. By checking the
value of Last_IO_Errno and comparing it to a list of known error codes, we
know whether the slave is replicating properly.
The slave detection in 2.2 correctly identifies a broken slave with a
stopped IO thread. Due to this, the test case must be modified to check
that the relay master is not a slave if the IO thread is stopped.
The test appears to fail due to connection errors when it attempts to
check whether MaxScale is still alive. To offset the chance of the backend
server still refusing connections after the initial spike, the sleep
before the check was increased.
The code had a note in that states that the test uses custom code
backported from 2.2 and that it should be updated to use common code once
merged into 2.2.
Added a small sleep to make sure the monitor picks up the changes in the
topology.
When the test finishes and is about to check whether MaxScale is alive,
the servers should be cleared from maintenance mode and the replication
should be fixed. This way the test will clean up after itself.
When the test changes the master, it should reset the slave configuration
on the new master. This way no circular replication topologies are formed
and the monitor can be expected to perform correctly.
When auto_failover has been disabled due to failure to perform automatic
failover, the test should reset the number of failures. Resetting it
before the check for the log message allows the test to still fail if the
message is missing.
Instead of the current master rejoining to the diverged master, the
current master should remain as the master server. This behavior should be
explained by the extra GTID event injected by the failover process.