The monitor queried the session-specific domain id, which does not follow the global
value while the session is alive. This caused the monitor to follow the wrong gtid
domain if the domain was changed after MaxScale was started. This patch modifies the
query to read the global value instead. Even this is not fool-proof, as existing
sessions can issue writes with the old domain, confusing the gtid-parsing.
Renamed to MariaDBServer. The objects have a pointer to the underlying
MXS_MONITORED_SERVER. The purpose is to have the monitor mainly use
MariaDBServer instead of the current mix of MXS_MONITORED_SERVER* and
MySqlServerInfo and to simplify the mapping between the two. Also,
many methods can be moved to the MariaDBServer class later on.
Some functions have been converted from MXS_MONITORED_SERVER* to
MariaDBServer.
1. Move some remaining class data private.
2. Linebreak long lines.
3. Move current master autoselection inside class method.
4. Remove single-use constant #defines.
5. Monitor status is only written inside loop.
This commit introduces changes that fix the relay master detection that
was broken by the merge from 2.1 into 2.2 by commit
1ecd791887994209eb29e56e1271f8c407cd0cdf.
In 2.2, the master server ID is used to detect whether a slave is actually
replicating from a master. The value is still displayed even if the slave
is not actively replicating from a master. The commit in 2.1 causes this
value to be stored unconditionally if it is available. By checking the
value of Last_IO_Errno and comparing it to a list of known error codes, we
know whether the slave is replicating properly.
The slave detection in 2.2 correctly identifies a broken slave with a
stopped IO thread. Due to this, the test case must be modified to check
that the relay master is not a slave if the IO thread is stopped.
When MaxScale thinks that the master has failed, it tries to verify it by
seeing if the slave server is receiving events. There was a missing IO
thread status check in the slave_receiving_events function which caused
the failover to wait until the verification timed out.
The relay master detection logic also lacked a check for the slave SQL
thread status. The code should check the state of the SQL thread to
determine whether the server is actually a functional slave to a master.
The fix causes a regression in the failover functionality as there is a
dependency between the slave's master ID and how the failover
performs. This dependency should not exist but fixing it causes a problem
with the mysqlmon_rejoin_bad2 test.
Startup now done in a static method. Constructor initializes some values.
Config parameters loaded in a separate method. Some things still need
looking.