Currently the state change explanations are only added to mariadbmon. They
are less relevant for Galera clusters as they themselves explain why they
change their states but should still be added to make them easier to
analyze.
The event that isn't explained and is most often encountered is the loss
of a Slave status. Most often the loss of a Slave status happens because
either the IO thread or the SQL thread has stopped. Printing the states of
the threads as well as the latest error should hint at what caused the
outage.
The information can be added to the REST API in 2.5 where the monitors can
add extra information to the server JSON.
During switchover/failover, server events are altered. The ALTER
EVENT command automatically modifies the event charset and collation
to the values of the connetion running the query. This may cause
the event to become invalid.
Fixed this by changing connection charset and collation to the ones
in the event description just before altering it.
The settings are different from the other fields in that they should not change
on their own. Most manipulation functions only require the settings.
Also takes into use a class for host and port data.
The monitor queries for logged in users with super-privileges and kicks them out to
prevent writes to master. Normal users can stay since their writes are prevented by
read_only. Also, the master-status is removed from the master manually to signal to
routers that no more writes should go to master.
If gtid of master is unknown (as is typical when master is down when MaxScale
starts) the domain id is guessed from the slaves instead. This is usually
safe.
All servers are now updated in their own threads simultaneously. This
should reduce the possibility of having significantly different gtid:s
shown for different servers.
This fixes some situations where MaxAdmin/MaxCtrl would block and wait
until a monitor operation or tick is complete. This also fixes a deadlock
caused by calling monitor diagnostics inside a monitor script.
Concurrency is enabled by adding one mutex per server object to protect
array-like fields from concurrent reading/writing.
The monitor now continuously updates a list of enabled server events. When
promoting a new master in failover/switchover, only events that were enabled
on the previous master are enabled on the new. This avoids enabling events
that may have been disabled on the master yet stayed in the SLAVESIDE_DISABLED-
state on the slave.
In the case of reset-replication command, events on the new master are only
enabled if the monitor had a master when the command was launched. Otherwise
all events remain disabled.