When a softfailed node is finally removed from the cluster, it will
appear as the single node of a functioning Clustrix cluster. To ensure
that the Clustrix monitor does not stick to such a node, a node that is
being used as the hub is immediately replaced with another node if it
is softfailed.
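For illustration, a minimal sketch of that hub-replacement rule follows;
the names used (Node, choose_hub, ensure_hub) are hypothetical and do not
correspond to the actual monitor code.

// Illustrative sketch only; not the actual Clustrix monitor implementation.
#include <vector>

struct Node
{
    int  nid;          // Clustrix node id
    bool softfailed;   // true if the node has been softfailed
};

// Return a node that may be used as the hub; never a softfailed one.
const Node* choose_hub(const std::vector<Node>& nodes)
{
    for (const auto& node : nodes)
    {
        if (!node.softfailed)
        {
            return &node;
        }
    }
    return nullptr;   // No usable node; the monitor has to retry later.
}

// Keep the current hub unless it has been softfailed, in which case it is
// immediately replaced with another, non-softfailed node.
const Node* ensure_hub(const Node* current, const std::vector<Node>& nodes)
{
    return (current && !current->softfailed) ? current : choose_hub(nodes);
}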
When the servers of a service are defined by a monitor, all servers of
the monitor should be added to the relevant services at startup.
Likewise, when a server is added to or removed from a monitor at
runtime, those changes should affect the services as well. However,
whether that should happen depends upon the monitor: in the case of the
Clustrix monitor it should not, as the monitor itself adds and removes
servers depending on the runtime state of the Clustrix cluster.
The services whose servers are defined using a monitor will now be
populated from the monitor.
Note that no consideration has yet been given to runtime changes.
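For illustration, a service whose servers are defined using a monitor could
be configured along the following lines. The object names and credentials
are placeholders, and it is assumed here that the monitor module is called
`clustrixmon` and that the service-side parameter is called `cluster`.

[TheClustrixMonitor]
type=monitor
module=clustrixmon
servers=BootstrapNode1
user=monitor_user
password=monitor_pw

[TheService]
type=service
router=readwritesplit
cluster=TheClustrixMonitor
user=service_user
password=service_pw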
It is now possible to [un]softfail a Clustrix node via MaxScale
using a Clustrix monitor module command.
If a node is successfully softfailed, the `Being Drained` bit will
automatically be turned on. Similarly, if a node is successfully
unsoftfailed, the `Being Drained` bit will be cleared.
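For example, invocations along these lines should work with maxctrl; the
module name and the exact argument order are assumptions here, not verified
against the registered module commands.

maxctrl call command clustrixmon softfail TheClustrixMonitor TheNode1
maxctrl call command clustrixmon unsoftfail TheClustrixMonitor TheNode1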
Once the monitor has been able to connect to a Clustrix node and obtain
the cluster's nodes, it will primarily use those nodes when looking for
a Clustrix node to be used as the "hub".
With this change it is sufficient (but perhaps unwise) to provide a
single bootstrap node in the configuration file.
Some other rearrangements and renamings of functions have also been
made.
From system.membership we can find out what servers exist in the
cluster, while system.nodeinfo contains information about those
servers. If a node goes down, it will disappear from system.nodeinfo,
but not from system.membership. Consequently, we must start from
system.membership and then fetch more information from system.nodeinfo.
Incidentally, a query like
SELECT ms.nid, ni.iface_ip
FROM system.membership AS ms
LEFT JOIN system.nodeinfo AS ni ON ms.nid=ni.nodeid;
should provide all information in one go, but it seems that such joins
are not supported on the system tables.
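As an illustration of the two-step approach, a minimal standalone sketch
using the MySQL C API follows; the host, credentials and column handling
are placeholders, and this is not the monitor's actual code.

// Fetch the node ids from system.membership first, then fill in the
// addresses of the nodes that are up from system.nodeinfo.
#include <mysql.h>
#include <cstdio>
#include <map>
#include <string>

int main()
{
    MYSQL* conn = mysql_init(nullptr);
    if (!mysql_real_connect(conn, "bootstrap-host", "monitor_user", "monitor_pw",
                            nullptr, 3306, nullptr, 0))
    {
        std::fprintf(stderr, "connect failed: %s\n", mysql_error(conn));
        return 1;
    }

    std::map<std::string, std::string> ip_by_nid;   // nid -> iface_ip

    // Step 1: every node that is a member of the cluster, even if it is down.
    if (mysql_query(conn, "SELECT nid FROM system.membership") == 0)
    {
        MYSQL_RES* res = mysql_store_result(conn);
        while (MYSQL_ROW row = mysql_fetch_row(res))
        {
            ip_by_nid[row[0]] = "";   // IP not known yet.
        }
        mysql_free_result(res);
    }

    // Step 2: details for the nodes that are currently up.
    if (mysql_query(conn, "SELECT nodeid, iface_ip FROM system.nodeinfo") == 0)
    {
        MYSQL_RES* res = mysql_store_result(conn);
        while (MYSQL_ROW row = mysql_fetch_row(res))
        {
            auto it = ip_by_nid.find(row[0]);
            if (it != ip_by_nid.end())
            {
                it->second = row[1] ? row[1] : "";
            }
        }
        mysql_free_result(res);
    }

    for (const auto& kv : ip_by_nid)
    {
        std::printf("node %s: %s\n", kv.first.c_str(),
                    kv.second.empty() ? "(down)" : kv.second.c_str());
    }

    mysql_close(conn);
    return 0;
}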
The node infos of the Clustrix servers are now kept around and
updated based upon changing conditions, instead of regularly being
re-created.
Further, a server is now looked up by name only right after it has
been created (and that only because runtime_create_server() is
currently being used).
The state of the dynamically created servers is now updated directly
as a result of the health-check ping, while the state of the bootstrap
servers is updated during the tick() call according to the monitor
"protocol".
Now the monitor
- will frequently ping the health port of each server, and
- will less frequently check the actual number of available nodes
from system.membership,
and act accordingly.
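A minimal sketch of that two-cadence loop follows; the helper functions
and the intervals are made-up placeholders, not the monitor's actual code
or values.

#include <chrono>
#include <cstdio>
#include <thread>

// Placeholder helpers; in the real monitor this work is done as part of
// its normal operation.
void ping_health_ports()
{
    std::puts("pinging the health port of each server");
}

void check_membership()
{
    std::puts("checking system.membership for the number of available nodes");
}

int main()
{
    using namespace std::chrono;
    const auto ping_interval  = seconds(1);    // frequent
    const auto check_interval = seconds(60);   // less frequent
    auto last_check = steady_clock::now() - check_interval;

    for (;;)
    {
        ping_health_ports();

        if (steady_clock::now() - last_check >= check_interval)
        {
            check_membership();
            last_check = steady_clock::now();
        }

        std::this_thread::sleep_for(ping_interval);
    }
}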
Currently, the updated servers are the ones listed in the configuration
file. Subsequently this will be changed so that the servers listed in
the configuration file are only used for bootstrapping the monitor, and
server objects are then created dynamically according to what is found
in the cluster.