The logic was weird, as the permission checking function assumes a disconnected
server as fine. The checking is now done when starting the main loop and lacking
grants print errors but does not stop the monitor.
Since monitors are now freed at MaxScale exit, the server data should be freed. Also,
gtid domain variables are now initialized with a common constant.
StartMonitor() now takes a MXS_MONITOR_INSTANCE and returns
true, if the monitor could be started and false otherwise.
So, the setup is such that in createInstance(), the instance
data is created and then using startMonitor() and stopMonitor()
the monitor is started/stopped. Finally in destroyInstance(),
the actual instance data is deleted.
The following type name changes
MXS_MONITOR_OBJECT -> MXS_MONITOR_API
MXS_SPECIFIC_MONITOR -> MXS_MONITOR_INSTANCE
Further, the 'handle' instance variable of what used to be
called MXS_MONITOR_OBJECT has been renamed to 'api'.
An example, what used to look like
mon->module->stopMonitor(mon->handle);
now looks like
mon->api->stopMonitor(mon->instance);
which makes it more obvious what is going on.
CreateInstance() (renamed from initMonitor()) and destroyInstance()
(renamed from finishMonitor()) have now tentatively been
implemented for all monitors.
Next step is to
1) change the prototype of startMonitor() to
bool (*startMonitor)(MXS_SPECIFIC_MONITOR*,
const MXS_MONITOR_PARAMETER*);
and assume that mon->handle will always contain the
instance,
2) not delete any data in stopMonitor(),
3) add monitorCreateAll() that calls createInstance() for all
monitors (and call that in main()), and
4) add monitorDestroyAll() that calls destroyInstance() for
all monitors (and call that in main()).
Now, all monitor functions but startMonitor takes a
MXS_SPECIFIC_MONITOR instead of MXS_MONITOR. That is, startMonitor
is now like a static factory member returning a new specific
monitor instance and the other functions are like member functions
of that instance.
Instead of using void there's now a MXS_SPECIFIC_MONITOR struct
from which monitor specific types can be derived. This change
does not bring about other benefits than a bit of clarity but
this is the first step in clearing up the monitor API.
The updating of GTIDs was only considered successful if both the current
GTID position and binlog GTID positions were non-empty. If a slave has no
binlogged events, the GTID update would always fail.
This change in behavior caused the mysqlmon_failover_auto and
mysqlmon_failver_manual tests to break. The test disabled the binary log
on one of the servers which caused it to be left out from the rejoining
process.
The master down verification through slaves won't work with this commit. It needs to be
redesigned to handle multiple slave connections or removed. Also, only the first row of
slave status data is used by the monitor, so multiple slave connections are still
incorrectly handled.
5.1 to 5.3 are officially not supported anymore, so support can be removed from
the monitor. This allows removing the config parameter "mysql51_replication".
In this case, the server was already a slave and is not being demoted. Also, the file may
contain queries which cannot be ran while a slave connection is running.
The sql queries are given in two text files, defined by options promotion_sql_file
and demotion_sql_file. The files must exist when monitor starts. The files are read
line by line, ignoring empty lines and lines starting with '#'. All other lines
are sent to the server being promoted, demoted or rejoined. Any error in opening
a file, reading it or executing the contents will cause the entire operation to
fail.
The filed defined in demotion_sql_file is also ran when rejoining a server. This
is to ensure a previously failed master is "demoted" properly when it joins the
cluster.
The functions do not set errno on all invalid input, so it's best to check
endptr.
Also, strtoll is now used for server id scanning through QueryResult.