Both the replication lag and the message printing state are saved in SERVER,
although the values are mostly used by readwritesplit. A log message is printed
both when a server goes over the limit and when it comes back below it.
Due to the concurrent nature of the check, the message may be printed multiple
times before all threads have observed the new message state.
Documentation updated to explain the change.
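As an illustration only, the check might look like the following standalone
sketch; the Server struct, the over_limit field and the check_replication_lag()
function are invented for the example and are not the actual SERVER code:

    #include <atomic>
    #include <cstdio>

    // Hypothetical per-server state; the real SERVER struct differs.
    struct Server
    {
        std::atomic<int>  replication_lag{0};
        std::atomic<bool> over_limit{false};   // the message printing state
    };

    // Called when readwritesplit evaluates a candidate server. The message is
    // printed only when the state changes, but because the load and the store
    // are separate operations, several threads may print it before the new
    // state becomes visible to all of them.
    void check_replication_lag(Server* srv, const char* name, int limit)
    {
        int lag = srv->replication_lag.load();

        if (lag > limit && !srv->over_limit.load())
        {
            srv->over_limit.store(true);
            std::printf("Replication lag of '%s' is over the limit: %d > %d\n",
                        name, lag, limit);
        }
        else if (lag <= limit && srv->over_limit.load())
        {
            srv->over_limit.store(false);
            std::printf("Replication lag of '%s' is back below the limit\n", name);
        }
    }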
MaxScale server objects are now created for all Clustrix nodes.
Currently the name is "Clustrix-Server-N", where N is the node number.
The server is created using runtime_create_server(), which has been
modified so that it can optionally skip persisting the created server.
That is probably only a temporary solution, as a monitor should not
need to include .../core/internal-stuff.
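A rough sketch of the idea, with a stub in place of the real function; the
parameter list of runtime_create_server() shown here is an assumption and
does not match the actual signature:

    #include <cstdio>

    // Hypothetical signature and stub: the real runtime_create_server()
    // takes more parameters and its persistence flag may look different.
    bool runtime_create_server(const char* name, const char* address,
                               const char* port, const char* protocol,
                               bool persist)
    {
        std::printf("create server %s at %s:%s (%s), persist=%d\n",
                    name, address, port, protocol, persist);
        return true;
    }

    // One server object per discovered node, named "Clustrix-Server-N",
    // created without persisting it to a configuration file.
    void create_node_server(int node_id, const char* ip)
    {
        char name[64];
        std::snprintf(name, sizeof(name), "Clustrix-Server-%d", node_id);
        runtime_create_server(name, ip, "3306", "mariadbbackend",
                              /*persist=*/false);
    }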
Now the monitor
- will frequently ping the health port of each server
- will less frequently check the actual number of available nodes
  from system.membership
and act accordingly (a sketch of the loop follows below).
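A minimal sketch of such a two-rate monitor tick; the class, the helper
functions, the health port number and the intervals are all assumptions made
for illustration:

    #include <chrono>
    #include <string>
    #include <vector>

    using Clock = std::chrono::steady_clock;

    struct Node
    {
        std::string address;
        bool        healthy = false;
    };

    // Stubs standing in for the real HTTP health check and SQL query.
    static bool http_ping(const std::string&, int) { return true; }
    static int  query_membership_count(const std::string&) { return 3; }

    class ClustrixMonitor
    {
    public:
        void tick()
        {
            // Frequent check: ping the health port of every known node.
            for (auto& node : m_nodes)
            {
                node.healthy = http_ping(node.address, 3581);   // assumed port
            }

            // Infrequent check: read the number of available nodes from
            // system.membership and act accordingly.
            auto now = Clock::now();
            if (now - m_last_membership_check >= m_membership_interval)
            {
                m_last_membership_check = now;
                m_available_nodes = query_membership_count(m_nodes.front().address);
            }
        }

    private:
        std::vector<Node>    m_nodes{{"192.168.0.1"}, {"192.168.0.2"}};
        int                  m_available_nodes = 0;
        std::chrono::seconds m_membership_interval{60};
        Clock::time_point    m_last_membership_check = Clock::now();
    };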
Currently, the updated servers are the ones listed in the configuration
file. Later this will be changed so that the servers listed in the
configuration file are used only for bootstrapping the monitor, with
server objects then created dynamically according to what is found in
the cluster.
The function stores the current server status in the monitored
server's mon_prev_status and pending_status fields.
It is meant to be used at the start of the monitor loop, before the
pending status fields are updated.
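Roughly, the helper does something like the following; the field names come
from the description above, but the surrounding types are simplified stand-ins:

    #include <cstdint>

    // Simplified stand-ins for MaxScale's server and monitored-server types.
    struct SERVER
    {
        uint64_t status = 0;
    };

    struct MXS_MONITORED_SERVER
    {
        SERVER*  server = nullptr;
        uint64_t mon_prev_status = 0;
        uint64_t pending_status = 0;
    };

    // Stash the current server status at the start of the monitor loop,
    // before the pending status bits are updated for this tick.
    void stash_current_status(MXS_MONITORED_SERVER* ptr)
    {
        ptr->mon_prev_status = ptr->server->status;
        ptr->pending_status  = ptr->server->status;
    }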
There is a race condition between the addition of the DCB into epoll and
the execution of the event that initializes the protocol pointer of the DCB
and sends the handshake to the client. If a hangup event occurred before
the handshake was sent, the DCB could be freed before the code that sends
the handshake was executed.
By picking the worker that owns the DCB before the DCB is placed into the
owner's epoll instance, we make sure that no events arrive on the DCB while
control is transferred from the accepting worker to the owning worker.
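Conceptually, the new ordering looks roughly like this; all names here are
illustrative and the stubs only stand in for the real worker and epoll
machinery:

    #include <functional>

    // Illustrative sketch; names and structure are not MaxScale's actual code.
    struct DCB;
    struct Worker { int id = 0; };

    static Worker g_worker;

    static Worker* choose_owner() { return &g_worker; }   // stub
    static void add_to_epoll(Worker*, DCB*) {}            // stub: register with the worker's epoll
    static void init_and_handshake(DCB*) {}               // stub: set protocol pointer, send handshake

    // Stub: in reality this posts the task to the worker's event loop so that
    // it runs on the same thread that later processes the DCB's epoll events.
    static void execute_on(Worker*, const std::function<void()>& task) { task(); }

    struct DCB { Worker* owner = nullptr; };

    void accept_client(DCB* dcb)
    {
        // Pick the owning worker before the DCB is placed into any epoll
        // instance, so no events can arrive while control is handed over
        // from the accepting worker to the owning worker.
        dcb->owner = choose_owner();

        execute_on(dcb->owner, [dcb]() {
            init_and_handshake(dcb);          // handshake is sent first
            add_to_epoll(dcb->owner, dcb);    // only after this can events arrive
        });
    }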
Fixed the documentation on the argument to maxkeys, which is a directory,
and added a short paragraph about alternative key file locations. Also
documented that keys are read from the directory that the `datadir`
parameter points to.
The Galera documentation tells us to use the galera_new_cluster command to
start a new Galera cluster. This should prevent problems with nodes failing
to join the cluster, either on the initial startup or after a node goes
down.
When poll_add_dcb was called for a DCB that had once been in the polling
system but was subsequently removed, the DCB would appear twice in the
worker's list of DCBs. This caused a hang when the DCB was the last one in
the worker's list and dcb_foreach_local was called.
To prevent this, the DCBs are now added to and removed from the workers
directly, instead of indirectly via poll_add_dcb and poll_remove_dcb.
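A simplified sketch of the per-worker bookkeeping this implies; the class and
function names are invented and only illustrate the direct add/remove idea:

    #include <algorithm>
    #include <vector>

    // Illustrative per-worker bookkeeping; not MaxScale's actual structures.
    struct DCB { int fd = -1; };

    class Worker
    {
    public:
        // Called once when this worker takes ownership of the DCB; replaces
        // the implicit insertion that poll_add_dcb() used to do.
        void register_dcb(DCB* dcb)
        {
            m_dcbs.push_back(dcb);
        }

        // Called once when the DCB is closed; replaces the implicit removal
        // that poll_remove_dcb() used to do.
        void unregister_dcb(DCB* dcb)
        {
            m_dcbs.erase(std::remove(m_dcbs.begin(), m_dcbs.end(), dcb),
                         m_dcbs.end());
        }

        // Rough equivalent of dcb_foreach_local(): iterate this worker's DCBs.
        template <class Func>
        void foreach_dcb(Func func)
        {
            for (DCB* dcb : m_dcbs)
            {
                func(dcb);
            }
        }

    private:
        std::vector<DCB*> m_dcbs;
    };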
If the connection to the master is lost, knowing what type of error
caused the call to handleError helps deduce the real reason for it.
Logging the idle time of the connection helps detect when the
wait_timeout of a connection has been exceeded.
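The added logging could look roughly like this; the error-type enum, the
Backend struct and the message format are assumptions made for the example:

    #include <cstdio>
    #include <ctime>

    // Illustrative only; not readwritesplit's actual handleError() code.
    enum class ErrorType { PERMANENT, TRANSIENT };

    struct Backend
    {
        const char* name;
        time_t      last_activity;   // last time the connection was used
    };

    void handle_error(Backend* master, ErrorType type, const char* errmsg)
    {
        double idle = difftime(time(nullptr), master->last_activity);

        // Knowing the error type and the idle time makes it easier to see
        // whether e.g. the server's wait_timeout was exceeded.
        std::printf("Lost connection to master '%s' after %.0f seconds of "
                    "idleness: %s (error type: %s)\n",
                    master->name, idle, errmsg,
                    type == ErrorType::PERMANENT ? "permanent" : "transient");
    }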
If the session doesn't match the required username or remote address, the
match data is not allocated. This also doubles as a replacement for the
active member variable.
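As a sketch, the session-level check might look like this; the class and
member names are invented for illustration:

    #include <memory>
    #include <string>

    // Illustrative filter session; not the actual MaxScale filter code.
    struct MatchData
    {
        // State used while matching queries for this session.
    };

    class FilterSession
    {
    public:
        FilterSession(const std::string& user, const std::string& remote,
                      const std::string& required_user,
                      const std::string& required_remote)
        {
            // Allocate the match data only if the session matches the
            // required username and remote address. A null pointer then also
            // serves the purpose of the old 'active' member variable.
            if ((required_user.empty() || user == required_user)
                && (required_remote.empty() || remote == required_remote))
            {
                m_match_data = std::make_unique<MatchData>();
            }
        }

        bool active() const
        {
            return m_match_data != nullptr;
        }

    private:
        std::unique_ptr<MatchData> m_match_data;
    };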