Was deprecated in 2.3. Similar features are in MariaDB-Monitor.
One system test was modified to use MariaDB-Monitor instead. Some parts
of the test are disabled for now to make it pass.
The monitor now continuously updates a list of enabled server events. When
promoting a new master in failover/switchover, only events that were enabled
on the previous master are enabled on the new. This avoids enabling events
that may have been disabled on the master yet stayed in the SLAVESIDE_DISABLED-
state on the slave.
In the case of reset-replication command, events on the new master are only
enabled if the monitor had a master when the command was launched. Otherwise
all events remain disabled.
Since the current node id can be obtained using the function gtmnid()
the queries for finding out whether a node is in the quorum and whether
it is softfailed can be made simpler.
When a softfailed node is finally revoked, it will appear as the
single node in a functioning Clustrix cluster. To ensure that the
Clustrix monitor will not stick to that node, if the node that is
used as hub is softfailed, it is immediately replaced with another
node.
Worker::STOPPED -> MONITOR_STATE_STOPPED
Worker::POLLING -> MONITOR_STATE_RUNNING
Worker::PROCESSING -> MONITOR_STATE_RUNNING
By defining the monitor state from the worker state there is
no risk they will ever get out of sync. And there is one thing
less to maintain.
When the servers of a service are defined by a monitor, then
at startup all servers of the monitor should be added to relevant
services. Likewise, when a server is added to or removed from a
monitor at runtime, those changes should affect services as well.
However, whether that should happen or not depends upon the monitor.
In the case of the Clustrix monitor this should not happen as it
adds and removes servers depending on the runtime state of the
Clustrix cluster.
The services whose servers are defined using a monitor, will
now be populated from the monitor.
Note, no consideration has yet been given to runtime changes.
The manipulation functions are currently static so that the container can be initialized
if required. This will be fixed later.
The new functions are taken into use in monitor management.
The default ECMAScript syntax appears to be broken on CentOS 7 which
effectively prevents its use in most cases. A more reliable alternative
would be to use the bundled PCRE2 library but the basic POSIX regular
expressions seem to work.
The likely reason for a node being down is that some cluster level
modifications have been performed. Consequently a cluster check should
be triggered in that case.
When checking the node info, also include information about wheter
a node is being SOFTFAILed. If it is, turn on the `Being Drained`
bit.
A node is SOFTFAILed with the intention of removing it, so better
not to create new connections to it as they later would be broken
when the node is actually taken down.
It is now possible to [un]softfail a Clustrix node via MaxScale
using a Clustrix monitor module command.
In case a node is successfully softfailed, the `Being Drained` bit
will automatically turned on. Similarly, if a node is successfully
unsoftfailed, the `Being Drained` bit will be cleared.
Since the settings are now protected fields, all related functions were
moved inside the monitor class. mon_ping_or_connect_to_db() is now a method
of MXS_MONITORED_SERVER. The connection settings class is defined inside the
server since that is the class actually using the settings.
Once the monitor has been able to connect to a clustrix node
and obtain the clustrix nodes, it'll primarily use those nodes
when looking for a Clustrix node to be used as the "hub".
With this change it is sufficient (but perhaps unwise) to provide
a single node boostrap node in the configuration file.
Some other rearrangements and renamings of functions has also been
made.
Convention needs to be that the runtime object creating other objects
needs to incorporate its own name in the name of any object created.
Together with the '@@' prefix that ensures that the created name will
be reasonably globally unique.
The QueryResult-object remembers if a conversion failed. This makes checking
for errors more convenient, as just one check per row is required. The conversion
functions always return a valid value.
If a server cannot be used, close the associated MYSQL connection.
Further, when an existing connection is used, verify that the server
is still part of the quorum.
From system.membership we can find out what server exist in the
cluster while system.nodeinfo contains information about those
servers. If a node goes down, it will disappear from system.nodeinfo,
but not from system.membership. Consequently, we must start from
system.membership and then fetch more information from system.nodeinfo.
Incidentally, a query like
SELECT ms.nid, ni.iface_ip
FROM system.membership AS ms
LEFT JOIN system.nodeinfo AS ni ON ms.nid=ni.nodeid;
should provide all information in one go, but it seems that such joins
are not supported on the system tables.
The node infos of the Clustrix servers are now kept around and
and updated based upon changing conditions instead of regularly
being re-created.
Further, the server is now looked up by name only right after
having been created (and that only due to runtime_create_server()
currently being used).
The state of the dynamically created server is now updated directly
as a result of the health-check ping, while the state of the bootstrap
servers is updated during the tick()-call according to the monitor
"protocol".
MaxScale server objects are now created for all Clustrix nodes.
Currently the name is "Clustrix-Server-N" where N is the number
of the node.
The server is created using runtime_create_server() that has been
modified so that it optionally will not persist the created server.
That is probably just a temporary solution as a monitor should not
need to include .../core/internal-stuff.