84 Commits

Author SHA1 Message Date
Esa Korhonen
09df017528 MXS-1886 Better auto-rejoin error description and tolerance
Auto-rejoin now explains more accurately if a server cannot be joined due
to conflicting gtid.

Also, auto-rejoin is no longer disabled if a join fails. Usually the
failure is due to the server not replying fast enough with query completion.
The query is often completed anyway. This can lead to some log spam.
2018-06-15 13:11:10 +03:00
Esa Korhonen
9e68d8ec3d MXS-1886 Auto-failover error tolerance
Auto-failover is no longer considered to have failed if the preconditions
are not met. An error message with the failed checks is printed once, but
the checks are repeated every loop as long as the master is down.
2018-06-15 12:52:03 +03:00
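The "print once, keep checking" behaviour described in 9e68d8ec3d can be illustrated with a minimal Python sketch. The class name `FailoverChecker`, its `tick` method, and the decision to re-arm the warning once the checks pass are all assumptions made for illustration; the monitor itself is not written this way.

```python
from typing import Callable, List


class FailoverChecker:
    """Hypothetical helper: repeats precondition checks, logs failures once."""

    def __init__(self, log: Callable[[str], None]) -> None:
        self._log = log
        self._warned = False

    def tick(self, failed_checks: List[str]) -> bool:
        """Call once per monitor loop while the master is down.

        Returns True when no precondition failed, so failover may proceed.
        """
        if not failed_checks:
            # Assumption: the warning is re-armed once the checks pass.
            self._warned = False
            return True
        if not self._warned:
            self._log("Failover preconditions not met: " + "; ".join(failed_checks))
            self._warned = True
        return False
```

Calling `tick` on every loop keeps evaluating the preconditions, while the log receives only the first failure report.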
Esa Korhonen
fb56de641a MXS-1859 Add options for enforcing read_only on slaves
If the feature is enabled (default off), at the end of a monitor loop
(once server states are known), read_only is enabled on slave servers
that do not already have it set.
2018-05-18 15:29:56 +03:00
Esa Korhonen
739edcbe22 MXS-1639 Run user-given sql commands during promotion, demotion and rejoin
The sql queries are given in two text files, defined by options promotion_sql_file
and demotion_sql_file. The files must exist when monitor starts. The files are read
line by line, ignoring empty lines and lines starting with '#'. All other lines
are sent to the server being promoted, demoted or rejoined. Any error in opening
a file, reading it or executing the contents will cause the entire operation to
fail.

The file defined in demotion_sql_file is also run when rejoining a server. This
is to ensure a previously failed master is "demoted" properly when it joins the
cluster.
2018-04-19 17:01:36 +03:00
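The file handling described in 739edcbe22 (skip empty lines and lines starting with '#', send everything else, fail the whole operation on any error) can be sketched in Python as follows. The function names and the `execute` callback are hypothetical, and stripping surrounding whitespace before testing a line is an assumption the commit message does not state.

```python
from typing import Callable, Iterable, List


def read_sql_commands(lines: Iterable[str]) -> List[str]:
    """Return the statements to send, skipping blank lines and '#' comments."""
    commands = []
    for raw in lines:
        # Assumption: surrounding whitespace is stripped before the checks.
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        commands.append(line)
    return commands


def run_sql_file(lines: Iterable[str], execute: Callable[[str], bool]) -> bool:
    """Send each command in order; the first failure fails the operation."""
    for command in read_sql_commands(lines):
        if not execute(command):
            return False
    return True
```

In use, `lines` would be the open promotion or demotion file and `execute` a function that runs one statement on the target server and reports success.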
Johan Wikman
b67ab83486 Revert "Use dedicated header in NDBClusterMon"
This reverts commit b9d80f6061d6b536d7a15febf0367e5f6dba0e84.
2018-02-24 15:43:15 +02:00
Esa Korhonen
b9d80f6061 Use dedicated header in NDBClusterMon
NDBClusterMonitor used the MariaDBMonitor header instead of its own.
2018-02-20 11:09:04 +02:00
Esa Korhonen
a0d9c7da74 External master server support for failover/switchover
If the master is replicating from an external master, the monitor will save the
host:port of the external server. During demotion, the old master stops the external
replication while the new master begins it. Also, any commands that would add
to gtid have to be omitted when an external master is in play.
2018-02-08 18:44:08 +02:00
Esa Korhonen
1cf3de4a74 Add config parameter for excluding servers from failover
"servers_no_promotion" is a comma-separated list of servers
which cannot be chosen when selecting a new master during failover
(auto or manual), or when automatically selecting a new master
for switchover (currently disabled).

The servers in the list are redirected normally and can be promoted
by switchover when manually selecting a new master.
2018-02-07 14:07:10 +02:00
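As a rough illustration of 1cf3de4a74, the exclusion list can be applied as a filter when picking a new master. This Python sketch assumes servers are identified by plain name strings; `parse_server_list` and `failover_candidates` are hypothetical helpers, not monitor functions.

```python
from typing import List, Set


def parse_server_list(value: str) -> Set[str]:
    """Split a comma-separated server list, ignoring blanks and spaces."""
    return {name.strip() for name in value.split(",") if name.strip()}


def failover_candidates(slaves: List[str], servers_no_promotion: str) -> List[str]:
    """Return the slaves that may be promoted, preserving their order."""
    excluded = parse_server_list(servers_no_promotion)
    return [name for name in slaves if name not in excluded]
```

Excluded servers are only removed from the candidate list; they remain in the cluster and would still be redirected to the new master.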
Markus Mäkelä
79652301d8 Fix split of mysqlmon sources
For some reason, the source code of mysqlmon was split into C and C++
sources. This caused problems by effectively discarding all changes from
2.1 that are merged into 2.2.

This commit merges the changes that were added to the wrong file into
the correct file.
2017-12-21 16:24:03 +02:00
Markus Mäkelä
79afaa447e Merge branch '2.1' into 2.2 2017-12-12 13:23:02 +02:00
Esa Korhonen
45834a89b5 MXS-1533: Rename "auto_join" to "auto_rejoin" 2017-12-04 09:59:29 +02:00
Esa Korhonen
508ce3a703 MXS-1491: Failover can be executed manually
Also, renamed config setting "failover" to "auto_failover". Removed
setting "switchover" as it is now always enabled.
2017-12-04 09:41:00 +02:00
Esa Korhonen
90f6d78a58 MXS-1533: Add automatic join feature
When enabled, the monitor will redirect servers to replicate from the
current master. Standalone servers and servers replicating from a slave
are redirected.
2017-12-04 09:37:16 +02:00
Markus Mäkelä
d5d41349ae MXS-1509: Add ignore_external_masters parameter
The new parameter allows ignoring of master servers that are external to
the monitor configuration. This allows sub-trees of the actual replication
tree to be used as fully fledged replication trees.
2017-11-30 12:39:00 +02:00
Esa Korhonen
23cd294dad MXS-1533: Better handling of multi-domain gtids
If the gtid_domain_pos of the master is ever modified,
gtid-variables will have multiple domains. Generally, we are
only interested in the most recent domain. This is tracked in
the gtid_domain_id values, and the value of the master is used for
filtering the correct domain from all gtid-values.

Also, use gtid_current_pos instead of gtid_slave_pos. The
advantage of current_pos is that the same variable works also
for master servers. The gtid-handling is now more thorough and
detects some weird situations.
2017-11-27 16:31:13 +02:00
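The domain filtering described in 23cd294dad can be sketched in Python. MariaDB writes a GTID as `domain-server_id-sequence` and a multi-domain position as a comma-separated list of such triplets; the `Gtid` type and both functions below are hypothetical names used only for illustration.

```python
from typing import List, NamedTuple, Optional


class Gtid(NamedTuple):
    domain: int
    server_id: int
    sequence: int


def parse_gtid_list(text: str) -> List[Gtid]:
    """Parse a value such as '0-1-100,1-2-250' into triplets."""
    gtids = []
    for item in text.split(","):
        item = item.strip()
        if item:
            domain, server_id, sequence = (int(part) for part in item.split("-"))
            gtids.append(Gtid(domain, server_id, sequence))
    return gtids


def gtid_for_domain(text: str, domain: int) -> Optional[Gtid]:
    """Return the triplet for the given domain, or None if it is absent."""
    for gtid in parse_gtid_list(text):
        if gtid.domain == domain:
            return gtid
    return None
```

With the master's gtid_domain_id as `domain`, the same filter can be applied to the gtid_current_pos value read from any server in the cluster.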
Esa Korhonen
b63c6504a3 MXS-1513: Switchover script
First version of switchover script. Unsafe to run as it has no
timeouts for most queries. Also, removed code launching the
previous switchover_script.
2017-11-16 10:51:12 +02:00
Johan Wikman
4cf01fa88f Remove 'failover_script' parameter
As the failover is now internal to MySQL Monitor, no failover
script parameter is needed.
2017-11-07 16:05:44 +02:00
Markus Mäkelä
0be39b8545 MXS-1493: Improve master failure detection
The master failure can now be verified by checking whether the slaves are
still connected to the master. If the slaves do not receive any events from
the master, the connections are considered down after a configurable limit.

Added two parameters for controlling whether the check is done and for how
long the monitor waits before doing the failover.
2017-10-27 15:31:18 +03:00
Esa Korhonen
63c7550196 MXS-1490 Prepare for failover functionality addition
Moved mon_process_failover() from monitor.cc to mysql_mon.cc. Renamed
some functions and variables related to previous failover functionality
to avoid confusion.
2017-10-25 12:24:29 +03:00
Markus Mäkelä
95ac9d501c MXS-1494: Add replication credentials to mysqlmon
The credentials used for slave servers can now be controlled with the
replication_user and replication_password parameters.
2017-10-24 23:44:46 +03:00
Johan Wikman
df816ea2a9 MXS-1460 Add failover_script parameter
The failover script can now be specified in the configuration file.
2017-10-03 15:24:29 +03:00
Johan Wikman
267a45ad63 MXS-1441 Add switchover_script parameter
If a switchover_script parameter is given, its value will be used as
the switchover script. Otherwise the default, currently just echo,
will be used.

The MySQL Monitor now introduces two script variables, CURRENT_MASTER
and NEW_MASTER, that contain information about the current and new
master respectively.

Switchover is performed only if switchover has been enabled and MaxScale
is *not* in passive mode.
2017-10-03 13:51:08 +03:00
Johan Wikman
438b4e0341 Merge branch '2.2' into 2.2-mrm 2017-10-02 15:49:08 +03:00
Johan Wikman
8d03876e3e Rename MXS_MONITOR_SERVERS to MXS_MONITORED_SERVER
An element in a linked list is not a list.
2017-10-02 15:05:17 +03:00
Johan Wikman
ff467e218a MXS-1441 Add switchover and switchover_timeout config vars
Tentative documentation.

With the 'switchover' config variable the switchover functionality
can be enabled. If enabled, a REST API endpoint will appear through
which the switchover can be initiated.

Switchover can only be performed when MaxScale is in active mode,
and failover will be disabled for the duration of the switchover.
Failover is enabled again only if the switchover succeeds.

Might be easier to expose that REST API always and only change
the behaviour when calling it, instead of making it appear and
re-appear.
2017-09-28 13:29:21 +03:00
Markus Mäkelä
d4fd34cecd MXS-1446: Move failover parameters into mysqlmon
The `failover` and `failover_timeout` parameters are now declared as a
part of the mysqlmon module. Changed the implementation of the failover
function so that the dependencies on the monitor struct can be removed or
moved into parameters.
2017-09-28 08:23:34 +03:00
Markus Mäkelä
53bf21f785 MXS-1262: Use monitor journals in all monitors
All monitors now persist the state of the server in a monitor journal
file.

Moved the removal of stale journals into the core and removed them from
the monitor journal interface.
2017-08-11 04:09:08 +03:00
Markus Mäkelä
b448b129d0 MXS-1262: Move journal_max_age to MaxScale core
The parameter is now defined in the monitor. Further refactoring is needed
to make the interface of the journal system simpler.
2017-08-11 04:09:08 +03:00
Markus Mäkelä
837d57f4f4 MXS-1262: Move monitor journals into the core
The journaling functionality is now in the core. Only the MySQL Monitor is
using it.
2017-08-11 04:09:07 +03:00
MassimilianoPinto
cb57e10761 Develop merge
Develop merge
2017-06-29 15:34:22 +02:00
Johan Wikman
f546a17e77 Update change date of 2.2 2017-06-01 10:24:20 +03:00
Markus Mäkelä
b434c94563 Prevent monitor deadlocks with repeated restarts
If a monitor is started and stopped before the external monitoring thread
has had time to start, a deadlock will occur.

The first thing that the monitoring threads do is read the monitor handle
from the monitor object. This handle is given as the return value of
startMonitor and it is stored in the monitor object. As this can still be
NULL when the monitor thread starts, the threads use locks to prevent
this.

The correct way to prevent this is to pass the handle as the thread
parameter so that no locks are required.
2017-05-04 09:17:48 +03:00
Markus Mäkelä
bbcfe98651 Add stale journal file detection
Added a configurable maximum age for the mysqlmon journal files. If the
file is older than the configured value, it will be ignored and removed.
2017-03-17 13:37:37 +02:00
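The stale-file rule from bbcfe98651 (ignore and remove a journal older than the configured maximum age) might look like this in Python. Using the file modification time as the age is an assumption, as are the function name and its parameters; the commit does not say how the age is measured.

```python
import os
import time
from typing import Optional


def load_journal_if_fresh(path: str, max_age_seconds: float,
                          now: Optional[float] = None) -> Optional[bytes]:
    """Return the journal contents, or None if it is missing or stale.

    A stale journal is deleted so that it is not considered again.
    """
    if now is None:
        now = time.time()
    try:
        # Assumption: age is measured from the file modification time.
        modified = os.path.getmtime(path)
    except OSError:
        return None
    if now - modified > max_age_seconds:
        os.remove(path)
        return None
    with open(path, "rb") as journal:
        return journal.read()
```

The optional `now` argument exists only so that the age check can be exercised without waiting for a file to become old.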
Markus Mäkelä
08cc7a9515 Make MySQL monitor crash-safe
The MySQL monitor stores the server states in a backup file which can be
used to restore the state of the servers even if MaxScale is stopped in an
uncontrolled fashion.
2017-03-17 11:12:48 +02:00
Markus Mäkelä
916cb4df08 Rename failover and failover_recovery
The names of the parameters were misleading as MaxScale doesn't perform
the actual failover but only detects if one has been done.
2017-03-03 18:45:20 +02:00
Markus Mäkelä
e7c7caebad Add option for failover recovery in mysqlmon
The `failover_recovery` option allows failed servers to rejoin the
cluster. This should make using MaxScale with two node clusters easier.

One use case for this is when the replication-manager promotes the last
node in the cluster as the master. When this is done, the slave
configuration is cleared and the read-only mode is disabled. Since the
failover requires that the server is not configured as a slave and that it
is not in read-only mode, it is safe to use `failover_recovery` with
replication-manager.
2017-02-20 11:20:53 +02:00
Johan Wikman
5648f708af Update license to BSL 1.1 2017-02-14 21:42:28 +02:00
Esa Korhonen
b187afdcf4 Move config_runtime.h and externcmd.h to core
+ some cleanup
2017-01-24 13:05:21 +02:00
Esa Korhonen
680401cf8e Rename public types and constants in monitor.h
Preparing to split monitor.h into module and core sections. Also
changed a few comments in monitor.h.
2017-01-17 15:47:13 +02:00
Markus Mäkelä
a196420c2d Move monitor script processing and launching into the core
This removes parts of the nearly identical code from all monitors.

The removal of monitor type specific event checking is done based on the
assumption that only the monitor that is monitoring the server can be the
cause for a state change. This removes the need to actually check that the
state change is relevant for each monitor and allows the event handling to
be moved into the core.
2017-01-11 14:21:57 +02:00
Markus Mäkelä
5a290cb0b8 Use module parameters in monitors
All monitors now declare the parameters that they use. This allows the
core to check the validity of the parameters before they are passed to the
monitor. It also simplifies the processing of the parameters as they are
guaranteed to be valid.
2017-01-05 09:58:11 +02:00
Johan Wikman
1333da0712 Remove skygw_utils.h
The general purpose stuff in skygw_utils.h was moved to utils.h
and the corresponding implementation from skygw_utils.cc to utils.c.
Includes updated accordingly.

Skygw_utils.h is now only used by log_manager and by mlist, which
is only used by log_manager. Consequently, skygw_utils.h was moved
to server/maxscale.

Utils.h needs a separate overhaul.
2016-10-14 19:50:54 +03:00
Johan Wikman
c03b8079fd Move @file comment
Where it exists, the @file comment has now been moved to be
consistently right after the license blurb.
2016-10-14 13:20:52 +03:00
Johan Wikman
1a978be6b6 Cleanup header files
- All now include maxscale/cdefs.h as the very first file.
- MXS_[BEGIN|END]_DECLS added to all C-headers.
  Strictly speaking not necessary for private headers, but
  does not hurt either.
- Include guards moved to the very top of the file.
- #pragma once added.
2016-10-14 11:54:37 +03:00
Johan Wikman
76430e060f maxconfig.h renamed to config.h 2016-10-13 22:59:39 +03:00
Johan Wikman
e41589be10 Move headers from server/include to include/maxscale
- Headers now to be included as <maxscale/xyz.h>
- First step, no cleanup of headers has been made. Only moving
  from one place to another + necessary modifications.
2016-10-13 16:19:20 +03:00
Markus Makela
9b2209a8d1 Log only one warning when failover is initiated
Mysqlmon would log a warning at every monitoring interval when failover
was initiated.
2016-10-06 16:53:37 +03:00
Markus Makela
c919511ba7 Implement simple failover mode into mysqlmon
The mysqlmon simple failover mode allows it to direct write traffic to a
secondary node. This enables a very simple failover mode with MaxScale
when it is used in a two node master-slave setup.
2016-09-26 11:00:16 +03:00
Markus Makela
46c8a6f66b MXS-839: Detect multi-master topologies with mysqlmon
The mysqlmon now supports proper detection of multi-master topologies by
building a directed graph out of the monitored servers. If cycles are found in
this graph, they are assigned a master group ID. All servers with a positive
master group ID will receive the Master status unless they have `@@read_only`
enabled.

This new functionality can be enabled with the 'multimaster' boolean
parameter.
2016-09-12 15:57:27 +03:00
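One standard way to find the cycles described in 46c8a6f66b is Tarjan's strongly connected components algorithm; whether the commit uses exactly this approach is not stated, so the Python sketch below is only an illustration. The graph is assumed to be a mapping from each server ID to the IDs it replicates from, and both function names are hypothetical.

```python
from typing import Dict, List


def assign_master_groups(replicates_from: Dict[int, List[int]]) -> Dict[int, int]:
    """Map each server ID to a master group ID (0 means 'not in a cycle')."""
    index: Dict[int, int] = {}
    lowlink: Dict[int, int] = {}
    stack: List[int] = []
    on_stack = set()
    groups = {node: 0 for node in replicates_from}
    counter = 0
    next_group = 1

    def visit(node: int) -> None:
        nonlocal counter, next_group
        index[node] = lowlink[node] = counter
        counter += 1
        stack.append(node)
        on_stack.add(node)
        for master in replicates_from.get(node, []):
            if master not in index:
                visit(master)
                lowlink[node] = min(lowlink[node], lowlink[master])
            elif master in on_stack:
                lowlink[node] = min(lowlink[node], index[master])
        if lowlink[node] == index[node]:
            # 'node' is the root of a component; pop its members.
            component = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.append(member)
                if member == node:
                    break
            # Only components with more than one server form a cycle.
            if len(component) > 1:
                for member in component:
                    groups[member] = next_group
                next_group += 1

    for node in list(replicates_from):
        if node not in index:
            visit(node)
    return groups


def master_candidates(groups: Dict[int, int], read_only: Dict[int, bool]) -> List[int]:
    """Servers in a cycle that do not have read_only enabled."""
    return sorted(s for s, g in groups.items() if g > 0 and not read_only.get(s, False))
```

For servers 1 and 2 replicating from each other and server 3 replicating from 1, the first two share a positive group ID and server 3 keeps group 0.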
Markus Makela
d745781bd0 MXS-839: Store additional server information in mysqlmon
Mysqlmon now stores the values of read_only, slave_sql_running,
slave_io_running, the name and position of the master's binlog and the
replication configuration status of the slave.

This allows more detailed server information to be displayed with the
`show monitor <name>` diagnostic interface. In addition to this, the new
structure used to store them provides an easy way to store information
that is specific to a monitor and the servers it monitors.

These new status variables can be used to implement better multi-master
detection in mysqlmon by using the value of read_only to resolve
situations where multiple master candidates are available.
2016-09-12 15:57:27 +03:00