Add missing listener JSON diagnostics call. Check that the
diagnostics_json function exists before calling it.
As the protocol modules don't have diagnostics functions, they aren't
called.
Replace hard-coded strings with constant parameters. This makes it
slightly cleaner.
Previously, if the list contained servers that were not monitored by
the monitor yet were valid servers, an error value would be returned
and the monitor failed to start.
With this update, the non-monitored servers are simply ignored when
forming the final list.
Also, added printing of the list to diagnostics.
"servers_no_promotion" is a comma-separated list of servers
which cannot be chosen when selecting a new master during failover
(auto or manual), or when automatically selecting a new master
for switchover (currently disabled).
The servers in the list are redirected normally and can be promoted
by switchover when manually selecting a new master.
Now detects some erroneous situations before starting switchover.
Switchover can be activated without specifying current master.
In this case, the cluster master server is selected.
The internal header directory conflicted with in-source builds causing a
build failure. This is fixed by renaming the internal header directory to
something other than maxscale.
The renaming pointed out a few problems in a couple of source files that
appeared to include internal headers when the headers were in fact public
headers.
Fixed maxctrl in-source builds by making the copying of the sources
optional.
The error message was not 100% accurate about the value. In addition to
that, neither the value itself nor the monitor or parameter names were
printed in the error message.
The state of the monitored servers is only persisted if the states of the
servers have changed. This removes the unnecessary disk IO caused by the
writing on the monitor journal.
As the passive parameter is only used by the failover and the failover can
only be initiated by the monitor, there is no true need to synchronize the
reads and write of this parameter.
As all runtime changes are protected by the runtime lock, only partial
reads are of concern. For the supported platforms, this is not a practical
problem and it only confuses the reader when other variables are modified
without atomic operations.
The master failure was assumed to be the only master related event for
each monitoring loop. If the master was switched by an external actor, the
monitor tracking would be out of sync.
Using timestamps to detect whether MaxScale was active or passive can
cause problems if multiple events happen at the same time. This can be
avoided by separating events into actively observed and passively observed
events. This clarifies the logic by removing the ambiguity of timestamps.
As the monitoring threads are separate from the worker threads, it is
prudent to use atomic operations to modify and read the state of the
MaxScale. This will impose an happens-before relation between MaxScale
being set into passive mode and events being classified as being passively
observed.
Moved mon_process_failover() from monitor.cc to mysql_mon.cc. Renamed
some functions and variables related to previous failover functionality
to avoid confusion.
The parameter handling for monitors can now be done in a consistent manner
by establishing a rule that the monitor owns the parameter object as long
as it is running. This will allow parameters to be added and removed
safely both from outside and inside monitors.
Currently this functionality is only used by mysqlmon to disable failover
after an attempt to perform a failover has failed.
The `failover` and `failover_timeout` parameters are now declared as a
part of the mysqlmon module. Changed the implementation of the failover
function so that the dependencies on the monitor struct can be removed or
moved into parameters.
Split the state change processing and failover handling into two separate
functions and added a call to the failover function into mysqlmon. This
prevents unintended behavior when failover is enabled for non-mysqlmon
monitors. The parameter itself still needs to be moved into mysqlmon.
Moved the failover documentation to the mysqlmon documentation as it is
specific to this monitor.
As the monitor event is now stored in the server, it can be re-used when
the event is converted to string form. This also fixes the problem of
state calculation taking place when the event happened in the past.
The failover command is simulated by executing a call to /usr/bin/echo
with all possible monitor parameters. This allows testing of the failover
mechanism without actually using the failover command.
The timestamp of the last change from passive to active is now
tracked. This, with the timestamps of the last master_down and master_up
events, allows detection of cases when MaxScale was failed over but the
failover was not done.
Currently, only a warning is logged if no new master has appeared within
90 seconds of a master_down event and MaxScale was set to active from
passive.
The last event and when the event was triggered is now shown for all
servers. The latest change from passive to active is also shown.
When an event occurs on a server, it is now stored so that the last event
for each server is known. This allows a state change to trigger an event
even if, at the time of the event, no action was taken.
This change is only cosmetic as no functionality is implemented.
The `script_timeout` and `journal_max_age` parameters weren't handled in
the monitor alteration code.
Also added missing documentation to maxadmin help output for
`alter monitor`.
If an invoked script must access servers, it needs credentials.
When invoked, a script can now be provided with the monitor
credentials of MaxScale using the variable CREDENTIALS.
It will be expanded like
user:password@[...]:N1,user:password@[...]:N2
for every server the monitor in question is monitoring. That is,
irrespective of whether it is a master or a slave, running or not.
Thus, a failover script could be specified like:
[MyMonitor]
type=monitor
module=mysqlmon
...
script=.../failover.sh --credentials=$CREDENTIALS --slaves=$SLAVELIST
events=master_down
Note, it may make sense to introduce specific failover (and switchover)
keywords, but with the above addition it is possible to start
experimenting with failover scripts.
The CHILDREN parameter expands to a list of server IPs and ports that are
direct descendants of the server that initiated the event.
Also added a note that the variables can expand to empty strings if
nothing matches the criteria of the variable.
The scripts now replace the PARENT variable with the IP and port of the
server that is the direct parent node of the server that initiated the
event. For master-slave clusters, this will be the master IP if the server
that triggered the event is a slave.
When the subprocess outputs a line, the message should be logged
immediately. This allows automated timestamps for the output of the
executed subprocess.
Moved 4 byte get/set into utils header. The byte packing functions in
maxscale/protocol/mysql.h should be migrated to the utils directory where
they can also be used by non-mysql code.
The temporary files are now generated with mkstemp. This will prevent
conflicts with multiple monitors operating on the same temporary journal
even though it is impossible in practice.
Added missing error messages to a couple of the functions.
All monitors now persist the state of the server in a monitor journal
file.
Moved the removal of stale journals into the core and removed them from
the monitor journal interface.
The monitors should only be reused if they have the same name and they use
the same module. This way the only difference is in configuration.
Fixed MaxCtrl detection of bad options and altered monitor creation test
to expect correct results. Also improved some of the error messages.
If a destroyed monitor is created again, it will be reused. This should
prevent excessive memory growth when the same monitor is created and
destroyed again.