Since the servers are not modified before or during the wait, the waiting
can be done in the preparation method. This simplifies the actual failover
somewhat, and allows the monitor to keep running normally while waiting for
the log to clear.
When the replication status from the external master is checked, the
pending status must be used. This makes sure that the SlaveStatusArray and
the server state are sync.
Also extended the message that was logged when the external master was
lost. By adding the network address there, it makes it easier to see where
the server was replicating from if only the log file is available.
This reduces the ambiguity of server id:s in the slave status contents.
If a slave connection has been seen properly connected at an earlier time,
it can be trusted to report the correct master server id. This also
fixes some wrong status assignment edge cases with the SERVER_WAS_SLAVE-bit.
The bit will be removed in a later commit.
Even this does not solve the situation when MaxScale is started with
some servers down.
The auto_failover is a more reliable solution and should be used instead. Several
unused parameters were removed, although they can still be defined in the config
file. Updated documentation on the relevant parts.
The monitor now detects when a server has changed such that a replication
graph rebuild is needed and only then rebuilds the graph and detects
cycles and master.
Also, some old code is no longer called in the monitor cycle. It will be
removed in later commits. Refactored some of the related functions.
Not yet used, as more is needed to replace the old code. The
algorithm is based on counting the total number of slave nodes
a server has, possibly in multiple layers and/or cycles.
This makes the code clearer and reduces race conditions, as the monitor
could be writing SERVER->status while a router is reading it. Also,
the time during which the SERVER struct is locked drops to a fraction.
The master down verification through slaves won't work with this commit. It needs to be
redesigned to handle multiple slave connections or removed. Also, only the first row of
slave status data is used by the monitor, so multiple slave connections are still
incorrectly handled.
5.1 to 5.3 are officially not supported anymore, so support can be removed from
the monitor. This allows removing the config parameter "mysql51_replication".
MASTER_GTID_WAIT uses gtid_slave_pos when comparing to the target gtid. This creates
problems with multi-domain gtids. It's simpler to just query the server for its
gtids repeatedly. Also, the method is now in MariaDBServer.