In theory, the value of m_master could change between reading it to
local variable and stopping monitor. To be on the safe side, stop the
monitor first.
1. Move some remaining class data private.
2. Linebreak long lines.
3. Move current master autoselection inside class method.
4. Remove single-use constant #defines.
5. Monitor status is only written inside loop.
This commit introduces changes that fix the relay master detection that
was broken by the merge from 2.1 into 2.2 by commit
1ecd791887994209eb29e56e1271f8c407cd0cdf.
In 2.2, the master server ID is used to detect whether a slave is actually
replicating from a master. The value is still displayed even if the slave
is not actively replicating from a master. The commit in 2.1 causes this
value to be stored unconditionally if it is available. By checking the
value of Last_IO_Errno and comparing it to a list of known error codes, we
know whether the slave is replicating properly.
The slave detection in 2.2 correctly identifies a broken slave with a
stopped IO thread. Due to this, the test case must be modified to check
that the relay master is not a slave if the IO thread is stopped.
When MaxScale thinks that the master has failed, it tries to verify it by
seeing if the slave server is receiving events. There was a missing IO
thread status check in the slave_receiving_events function which caused
the failover to wait until the verification timed out.
The relay master detection logic also lacked a check for the slave SQL
thread status. The code should check the state of the SQL thread to
determine whether the server is actually a functional slave to a master.
The fix causes a regression in the failover functionality as there is a
dependency between the slave's master ID and how the failover
performs. This dependency should not exist but fixing it causes a problem
with the mysqlmon_rejoin_bad2 test.
Startup now done in a static method. Constructor initializes some values.
Config parameters loaded in a separate method. Some things still need
looking.
When the IO thread of a relay master is stopped, the knowledge that it is
not a real master but a relay master is lost. To prevent this loss of
information, the master server's server_id value should always be stored
if it is available.
Previously, if the list contained servers that were not monitored by
the monitor yet were valid servers, an error value would be returned
and the monitor failed to start.
With this update, the non-monitored servers are simply ignored when
forming the final list.
Also, added printing of the list to diagnostics.
If the master is replicating from an external master, the monitor will save the
host:port of the external server. During demotion, the old master stops the external
replication while the new master begins it. Also, any commands that would add
to gtid have to be omitted when an external master is in play.