This reduces the ambiguity of server id:s in the slave status contents.
If a slave connection has been seen properly connected at an earlier time,
it can be trusted to report the correct master server id. This also
fixes some wrong status assignment edge cases with the SERVER_WAS_SLAVE-bit.
The bit will be removed in a later commit.
Even this does not solve the situation when MaxScale is started with
some servers down.
Uses mostly the status functions for reading the flags. Strickly
speaking this breaks the REST API since in some cases (status combinations)
the printed string is different from what was printed before.
The master failure verification would not work if the slaves did not have
a state change since MaxScale had started. This can be fixed by treating
the startup of MaxScale as an event of sorts.
The auto_failover is a more reliable solution and should be used instead. Several
unused parameters were removed, although they can still be defined in the config
file. Updated documentation on the relevant parts.
The command is saved in a function object which is read by the monitor
thread. This way, manual and automatic cluster modification commands are
ran in the same step of a monitor cycle.
This update required several modifications in related code.
The monitor now detects when a server has changed such that a replication
graph rebuild is needed and only then rebuilds the graph and detects
cycles and master.
Also, some old code is no longer called in the monitor cycle. It will be
removed in later commits. Refactored some of the related functions.
Not yet used, as more is needed to replace the old code. The
algorithm is based on counting the total number of slave nodes
a server has, possibly in multiple layers and/or cycles.
This makes the code clearer and reduces race conditions, as the monitor
could be writing SERVER->status while a router is reading it. Also,
the time during which the SERVER struct is locked drops to a fraction.
Since monitors are now freed at MaxScale exit, the server data should be freed. Also,
gtid domain variables are now initialized with a common constant.
The updating of GTIDs was only considered successful if both the current
GTID position and binlog GTID positions were non-empty. If a slave has no
binlogged events, the GTID update would always fail.
This change in behavior caused the mysqlmon_failover_auto and
mysqlmon_failver_manual tests to break. The test disabled the binary log
on one of the servers which caused it to be left out from the rejoining
process.
The master down verification through slaves won't work with this commit. It needs to be
redesigned to handle multiple slave connections or removed. Also, only the first row of
slave status data is used by the monitor, so multiple slave connections are still
incorrectly handled.
5.1 to 5.3 are officially not supported anymore, so support can be removed from
the monitor. This allows removing the config parameter "mysql51_replication".