The monitor now
- frequently pings the health port of each server
- less frequently checks system.membership for the actual
number of available nodes
and acts accordingly.
Currently, the servers whose status is updated are the ones listed in the
configuration file. Later this will be changed so that the servers listed
in the configuration file are only used for bootstrapping the monitor
and server objects are then created dynamically according to what is
found in the cluster.
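The two check frequencies described above can be sketched roughly as
follows. This is a minimal illustration, not the monitor's actual code:
ClusterMonitor, ping_health, count_members and membership_every are
hypothetical names, and the callbacks stand in for the real health-port
ping and the system.membership query.

```python
class ClusterMonitor:
    """Pings every server on each tick and counts cluster members less often.

    All names are hypothetical; the callbacks are supplied by the caller.
    """

    def __init__(self, servers, ping_health, count_members, membership_every=5):
        self.servers = list(servers)
        self.ping_health = ping_health          # cheap per-server check
        self.count_members = count_members      # costlier cluster-wide check
        self.membership_every = membership_every
        self.ticks = 0
        self.status = {}
        self.available_nodes = None

    def tick(self):
        self.ticks += 1
        # Frequent part: one health ping per known server.
        for server in self.servers:
            self.status[server] = self.ping_health(server)
        # Infrequent part: only every Nth tick.
        if self.ticks % self.membership_every == 0:
            self.available_nodes = self.count_members()
```

With membership_every=3, three calls to tick() ping each server three
times but query the membership only once.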
The function stores the current server status in the monitored
server's mon_prev_status and pending_status fields.
It is meant to be called at the start of the monitor loop, before the
pending status fields are updated.
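A minimal sketch of the idea, assuming the status is a plain bitmask.
Only mon_prev_status and pending_status come from the description above;
MonitoredServer, the status field and store_server_status are
hypothetical names used for illustration.

```python
from dataclasses import dataclass


@dataclass
class MonitoredServer:
    status: int = 0           # currently published status (assumed field)
    mon_prev_status: int = 0  # status as it was before this monitor pass
    pending_status: int = 0   # status being built up during this pass


def store_server_status(server: MonitoredServer) -> None:
    """Snapshot the current status before the monitor loop modifies it."""
    server.mon_prev_status = server.status
    server.pending_status = server.status
```

Calling this first means later changes to pending_status can be compared
against mon_prev_status to detect state transitions.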
There is a race condition between the addition of the DCB into epoll and
the execution of the event that initializes the protocol pointer for the
DCB and sends the handshake to the client. If a hangup event occurred
before the handshake was sent, the DCB could be freed before the code
that sends the handshake is executed.
By picking the worker who owns the DCB before the DCB is placed into the
owner's epoll instance, we make sure no events arrive on the DCB while the
control is transferred from the accepting worker to the owning
worker.
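One way to picture the ordering is the single-threaded sketch below. It
is an illustration under assumptions, not the actual implementation:
Worker, accept_connection and the dict-based DCB are hypothetical, and
the polled set stands in for the owner's epoll instance.

```python
from collections import deque


class Worker:
    """Hypothetical worker with a task queue and a set of polled DCB ids."""

    def __init__(self, name):
        self.name = name
        self.tasks = deque()
        self.polled = set()   # stand-in for the worker's epoll instance
        self.log = []

    def post(self, task):
        self.tasks.append(task)

    def run_pending(self):
        while self.tasks:
            self.tasks.popleft()()


def accept_connection(dcb, owner):
    # The owner is chosen first; the DCB is not yet polled by anyone,
    # so no hangup event can be delivered during the handoff.
    dcb["owner"] = owner.name

    def start_on_owner():
        # Registration and handshake run in the owner's own task.
        owner.polled.add(dcb["id"])
        owner.log.append(("handshake", dcb["id"]))

    owner.post(start_on_owner)
```

Until the owner runs its pending tasks, the DCB is in no poll set at all,
which is the property the change relies on.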
The Galera documentation tells us to use the galera_new_cluster command to
start a new Galera cluster. This should prevent the problems of nodes
failing to join the cluster either on the initial startup or after a node
goes down.
When poll_add_dcb was called for a DCB that had once been in the polling
system but was subsequently removed, the DCB would appear twice in the
worker's list of DCBs. This caused a hang when the DCB was the last one
in the worker's list and dcb_foreach_local was called.
To prevent the aforementioned problem, the DCBs are now added and removed
directly to and from the workers instead of indirectly via poll_add_dcb
and poll_remove_dcb.
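The sketch below illustrates a worker-owned DCB list in which an entry
can never appear twice. It is an assumption about the general shape, not
the actual fix: WorkerRegistry and its methods are hypothetical, and the
duplicate guard is added here only to make the invariant explicit.

```python
class WorkerRegistry:
    """Hypothetical per-worker DCB list that rejects duplicate entries."""

    def __init__(self):
        self._dcbs = []

    def add(self, dcb):
        # Compare by identity so a re-added DCB is never listed twice.
        if any(existing is dcb for existing in self._dcbs):
            return False
        self._dcbs.append(dcb)
        return True

    def remove(self, dcb):
        before = len(self._dcbs)
        self._dcbs = [existing for existing in self._dcbs if existing is not dcb]
        return len(self._dcbs) != before

    def foreach_local(self, func):
        # Iterate over a copy so callbacks may add or remove DCBs safely.
        for dcb in list(self._dcbs):
            func(dcb)
```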
If the connection to the master is lost, knowing what type of error
caused the call to handleError helps deduce the real reason for it.
Logging the idle time of the connection helps detect when the
wait_timeout of a connection is exceeded.
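As a rough illustration of what such a log line could contain, the
sketch below formats the error type together with the idle time. The
function name, the message wording and the error type string are all
hypothetical; only the idea of logging both values comes from the text
above.

```python
import time


def describe_connection_error(error_type, last_activity, now=None):
    """Build a log message with the error type and the connection idle time.

    last_activity and now are timestamps in seconds on the same clock.
    """
    if now is None:
        now = time.monotonic()
    idle = now - last_activity
    return (f"Lost connection to master ({error_type}), "
            f"connection idle for {idle:.1f} seconds")
```

An idle time close to the server's wait_timeout value suggests that the
server closed the connection because the timeout expired.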
If the session doesn't match the required username or remote address, the
match data is not allocated. This also doubles as a replacement of the
active member variable.
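A minimal sketch of using the presence of the match data as the activity
flag. FilterSession, MatchData and the constructor parameters are
hypothetical names; the actual session class is not shown in this
description.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class MatchData:
    """Hypothetical scratch space used while matching statements."""
    captures: list = field(default_factory=list)


class FilterSession:
    def __init__(self, user, remote, required_user=None, required_remote=None):
        matches = ((required_user is None or user == required_user) and
                   (required_remote is None or remote == required_remote))
        # Allocate match data only for sessions the filter applies to.
        self.match_data: Optional[MatchData] = MatchData() if matches else None

    def is_active(self):
        # No separate boolean is needed: missing match data means inactive.
        return self.match_data is not None
```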
The script adds config and log files into a zip archive. Passwords in
config files are censored. It also attempts to read the current status by
calling maxctrl. If a core file exists, it runs gdb on it to gather the
call stack. The script is installed to the binary file directory.
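The archiving and censoring steps could look roughly like the sketch
below. It is not the actual script: build_report, censor_passwords, the
.cnf suffix check and the <censored> placeholder are assumptions, and
the maxctrl and gdb steps are left out.

```python
import re
import zipfile
from pathlib import Path

# Matches "password = value" style lines, including prefixed keys such as
# "monitor_password". The key format is an assumption.
_PASSWORD_LINE = re.compile(
    r"^([ \t]*[\w.]*password[\w.]*[ \t]*=).*$",
    re.IGNORECASE | re.MULTILINE,
)


def censor_passwords(text):
    """Replace the value of every password setting with a placeholder."""
    return _PASSWORD_LINE.sub(r"\g<1><censored>", text)


def build_report(archive_path, files):
    """Write the given files into a zip archive, censoring config files."""
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in map(Path, files):
            if not path.is_file():
                continue  # skip files that do not exist on this system
            text = path.read_text(errors="replace")
            if path.suffix == ".cnf":
                text = censor_passwords(text)
            archive.writestr(path.name, text)
```

Censoring happens before the text is written to the archive, so the
original password never reaches the zip file.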
The code used a rather questionable method for parsing SQL statements
instead of using the query classifier for detecting transaction start and
stop events.