The settings are different from the other fields in that they should not change
on their own. Most manipulation functions only require the settings.
Also takes into use a class for host and port data.
The monitor queries for logged in users with super-privileges and kicks them out to
prevent writes to master. Normal users can stay since their writes are prevented by
read_only. Also, the master-status is removed from the master manually to signal to
routers that no more writes should go to master.
If gtid of master is unknown (as is typical when master is down when MaxScale
starts) the domain id is guessed from the slaves instead. This is usually
safe.
All servers are now updated in their own threads simultaneously. This
should reduce the possibility of having significantly different gtid:s
shown for different servers.
This fixes some situations where MaxAdmin/MaxCtrl would block and wait
until a monitor operation or tick is complete. This also fixes a deadlock
caused by calling monitor diagnostics inside a monitor script.
Concurrency is enabled by adding one mutex per server object to protect
array-like fields from concurrent reading/writing.
The monitor now continuously updates a list of enabled server events. When
promoting a new master in failover/switchover, only events that were enabled
on the previous master are enabled on the new. This avoids enabling events
that may have been disabled on the master yet stayed in the SLAVESIDE_DISABLED-
state on the slave.
In the case of reset-replication command, events on the new master are only
enabled if the monitor had a master when the command was launched. Otherwise
all events remain disabled.
To allow MariaDBMon to be used with Clustrix we need to handle
Clustrix separately as its apparent version is 5.0.45, which is
lower than what MariaDBMon supports. Further, we must ensure that
Clustrix does not query the slave status as there are no slaves
in the M/S sense in a Clustrix cluster.
NOTE: Once there is a specific Clustrix monitor, this code should
be removed.
Previously, if the server had no gtid:s, the method would fail leading to
a confusing error message. This could even totally stop the monitor from working
if a recent server version (10.X) did not have any gtid events.
The main class was getting unwieldly and too general. Dividing the fields
helps adding support for other operation types.
This commit leaves most data duplicated, later commits clean up the affected code.
The removing and slave status updating is now separated to a function.
As the MariaDBServer object now contains the updated slave connections,
keeping track of removed connections is no longer required.
The two cases are now separated. In switchover, the promotion and
demotion targets can swap connections between each other without worry.
In failover, the two connection lists must be merged semi-intelligently.
The slave connections of the two servers are now saved to the operation
descriptor object at the start of the operation. This allows slave status
updating during the operation.
Several small changes:
Binlog is flushed at the end of old master demotion.
Only new master is required to catch up to old master.
Use the same replication check method as failover.
No longer writes events to the master, as this creates problems if the
promoted server was not the overall master. Instead, the slave status
output is inspected.
In progress, does not yet overwrite existing code.
The new promotion mechanism automatically retries queries which timed out. It also
handles multimaster situations correctly.