| # MariaDB Monitor
 | |
| 
 | |
| Table of Contents
 | |
| =================
 | |
| 
 | |
|   * [Overview](#overview)
 | |
|   * [Master selection](#master-selection)
 | |
|   * [Configuration](#configuration)
 | |
|   * [Common Monitor Parameters](#common-monitor-parameters)
 | |
|   * [MariaDB Monitor optional parameters](#mariadb-monitor-optional-parameters)
 | |
|      * [assume_unique_hostnames](#assume_unique_hostnames)
 | |
|      * [detect_replication_lag](#detect_replication_lag)
 | |
|      * [detect_stale_master](#detect_stale_master)
 | |
|      * [detect_stale_slave](#detect_stale_slave)
 | |
|      * [mysql51_replication](#mysql51_replication)
 | |
|      * [multimaster](#multimaster)
 | |
|      * [ignore_external_masters](#ignore_external_masters)
 | |
|      * [detect_standalone_master](#detect_standalone_master)
 | |
|      * [failcount](#failcount)
 | |
|      * [allow_cluster_recovery](#allow_cluster_recovery)
 | |
|      * [enforce_read_only_slaves](#enforce_read_only_slaves)
 | |
|      * [maintenance_on_low_disk_space](#maintenance_on_low_disk_space)
 | |
|   * [Cluster manipulation operations](#cluster-manipulation-operations)
 | |
|      * [Operation details](#operation-details)
 | |
|      * [Manual activation](#manual-activation)
 | |
|      * [Automatic activation](#automatic-activation)
 | |
|      * [Limitations and requirements](#limitations-and-requirements)
 | |
|      * [External master support](#external-master-support)
 | |
|      * [Configuration parameters](#configuration-parameters)
 | |
|         * [auto_failover](#auto_failover)
 | |
|         * [auto_rejoin](#auto_rejoin)
 | |
|         * [switchover_on_low_disk_space](#switchover_on_low_disk_space)
 | |
|         * [replication_user and replication_password](#replication_user-and-replication_password)
 | |
|         * [failover_timeout and switchover_timeout](#failover_timeout-and-switchover_timeout)
 | |
|         * [verify_master_failure and master_failure_timeout](#verify_master_failure-and-master_failure_timeout)
 | |
|         * [servers_no_promotion](#servers_no_promotion)
 | |
|         * [promotion_sql_file and demotion_sql_file](#promotion_sql_file-and-demotion_sql_file)
 | |
|         * [handle_server_events](#handle_server_events)
 | |
|   * [Troubleshooting](#troubleshooting)
 | |
|      * [Failover/switchover fails](#failoverswitchover-fails)
 | |
|      * [Slave detection shows external masters](#slave-detection-shows-external-masters)
 | |
|   * [Using the MariaDB Monitor With Binlogrouter](#using-the-mariadb-monitor-with-binlogrouter)
 | |
|   * [Example 1 - Monitor script](#example-1---monitor-script)
 | |
| 
 | |
| ## Overview
 | |
| 
 | |
| MariaDB Monitor monitors a Master-Slave replication cluster. It probes the
 | |
| state of the backends and assigns server roles such as master and slave, which
 | |
| are used by the routers when deciding where to route a query. It can also modify
 | |
| the replication cluster by performing failover, switchover and rejoin. Backend
 | |
| server versions older than MariaDB/MySQL 5.5 are not supported. Failover and
 | |
| other similar operations require MariaDB 10.0.2 or later.
 | |
| 
 | |
| Up until MariaDB MaxScale 2.2.0, this monitor was called _MySQL Monitor_.
 | |
| 
 | |
| ## Master selection
 | |
| 
 | |
| Only one backend can be master at any given time. A master must be running
 | |
| (successfully connected to by the monitor) and its *read_only*-setting must be
 | |
| off. A master may not be replicating from another server in the monitored
 | |
| cluster unless the master is part of a multimaster group. Master selection
 | |
| prefers to select the server with the most slaves, possibly in multiple
 | |
| replication layers. Only slaves reachable by a chain of running relays or
 | |
| directly connected to the master count.  When multiple servers are tied for
 | |
| master status, the server which appears earlier in the `servers`-setting of the
 | |
| monitor is selected.
 | |
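
The selection preference described above can be sketched roughly as follows. This is an illustrative model only, not MaxScale's implementation: the function names and the dictionary-based topology are hypothetical, multimaster groups are left out, and it assumes that only running slaves are counted.

```python
from typing import Dict, List, Optional


def count_slaves(master: str, slaves_of: Dict[str, List[str]],
                 running: Dict[str, bool]) -> int:
    """Count running slaves reachable from `master` through running servers."""
    seen = {master}
    stack = [master]
    count = 0
    while stack:
        node = stack.pop()
        for slave in slaves_of.get(node, []):
            if slave in seen or not running.get(slave, False):
                continue
            seen.add(slave)
            count += 1
            stack.append(slave)  # a running slave may act as a relay
    return count


def select_master(servers: List[str], slaves_of: Dict[str, List[str]],
                  running: Dict[str, bool], read_only: Dict[str, bool],
                  replicating: Dict[str, bool]) -> Optional[str]:
    """Pick a master candidate; `servers` is in configuration order."""
    candidates = [s for s in servers
                  if running.get(s, False)
                  and not read_only.get(s, True)
                  and not replicating.get(s, False)]
    if not candidates:
        return None
    # max() returns the first maximal element, so ties go to the server
    # that appears earlier in the `servers` list.
    return max(candidates, key=lambda s: count_slaves(s, slaves_of, running))
```

With `servers=server1,server2,server3`, where only `server2` has a slave, this sketch picks `server2`. If no server has slaves and all are writable, it picks `server1`.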
| 
 | |
| Servers in a cyclical replication topology (multimaster group) are interpreted
 | |
| as having all the servers in the group as slaves. Even from a multimaster group
 | |
| only one server is selected as the overall master.
 | |
| 
 | |
| After a master has been selected, the monitor prefers to stick with the choice
 | |
| even if other potential masters with more slave servers are available. Only if
 | |
| the current master is clearly unsuitable does the monitor try to select another
 | |
| master. An existing master turns invalid if:
 | |
| 
 | |
| 1. It is unwritable (*read_only* is on).
 | |
| 2. It has been down for more than *failcount* monitor passes and has no running
 | |
| slaves. Running slaves behind a downed relay count.
 | |
| 3. It did not previously replicate from another server in the cluster but it
 | |
| is now replicating.
 | |
| 4. It was previously part of a multimaster group but is no longer, or the
 | |
| multimaster group is replicating from a server not in the group.
 | |
| 
 | |
| Cases 1 and 2 cover the situations in which the DBA, an external script or even
 | |
| another MaxScale has modified the cluster such that the old master can no longer
 | |
| act as master. Cases 3 and 4 are less severe. In these cases the topology has
 | |
| changed significantly and the master should be re-selected, although the old
 | |
| master may still be the best choice.
 | |
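
As a rough illustration, the four invalidation rules can be written as a single check. The `MasterState` fields below are hypothetical names chosen for this sketch and do not correspond to MaxScale internals.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MasterState:
    read_only: bool = False
    passes_down: int = 0            # consecutive monitor passes the master was down
    running_slaves: int = 0         # includes running slaves behind a downed relay
    replicated_before: bool = False
    replicates_now: bool = False
    in_group_before: bool = False   # was part of a multimaster group
    in_group_now: bool = False
    group_has_outside_master: bool = False


def invalidation_rule(m: MasterState, failcount: int) -> Optional[int]:
    """Return the number of the rule that invalidates the master, or None."""
    if m.read_only:
        return 1
    if m.passes_down > failcount and m.running_slaves == 0:
        return 2
    if not m.replicated_before and m.replicates_now:
        return 3
    if m.in_group_before and (not m.in_group_now or m.group_has_outside_master):
        return 4
    return None
```

Note how a single running slave keeps a downed master valid under rule 2, which is why non-standalone masters are less likely to lose their status.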
| 
 | |
| The master change described above is different from failover and switchover
 | |
| described in section
 | |
| [Cluster manipulation operations](#cluster-manipulation-operations).
 | |
| A master change only modifies the server roles inside MaxScale but does not
 | |
| modify the cluster other than changing the targets of read and write queries.
 | |
| Failover and switchover perform a master change on their own.
 | |
| 
 | |
| As a general rule, it's best to avoid situations where the cluster has multiple
 | |
| standalone servers, separate master-slave pairs or separate multimaster groups.
 | |
| Due to master invalidation rule 2, a standalone master can easily lose the
 | |
| master status to another valid master if it goes down. The new master probably
 | |
| does not have the same data as the previous one. Non-standalone masters are less
 | |
| vulnerable, as a single running slave or multimaster group member will keep the
 | |
| master valid even when down.
 | |
| 
 | |
| ## Configuration
 | |
| 
 | |
| A minimal configuration for a monitor requires a set of servers for monitoring
 | |
| and a username and a password to connect to these servers.
 | |
| 
 | |
| ```
 | |
| [MyMonitor]
 | |
| type=monitor
 | |
| module=mariadbmon
 | |
| servers=server1,server2,server3
 | |
| user=myuser
 | |
| password=mypwd
 | |
| ```
 | |
| 
 | |
| From MaxScale 2.2.1 onwards, the module name is `mariadbmon` instead of
 | |
| `mysqlmon`. The old name can still be used.
 | |
| 
 | |
| The `user` requires the REPLICATION CLIENT privilege to successfully monitor the
 | |
| state of the servers. SUPER privilege is required for cluster manipulation
 | |
| features such as failover.
 | |
| 
 | |
| ```
 | |
| MariaDB [(none)]> grant replication client on *.* to 'maxscale'@'maxscalehost';
 | |
| Query OK, 0 rows affected (0.00 sec)
 | |
| ```
 | |
| 
 | |
| ## Common Monitor Parameters
 | |
| 
 | |
| For a list of optional parameters that all monitors support, read the
 | |
| [Monitor Common](Monitor-Common.md) document.
 | |
| 
 | |
| ## MariaDB Monitor optional parameters
 | |
| 
 | |
| These are optional parameters specific to the MariaDB Monitor. Failover,
 | |
| switchover and rejoin-specific parameters are listed in their own
 | |
| [section](#cluster-manipulation-operations).
 | |
| 
 | |
| ### `assume_unique_hostnames`
 | |
| 
 | |
| Boolean, default: ON. When active, the monitor assumes that server hostnames and
 | |
| ports are consistent between the MaxScale configuration file server definitions
 | |
| and the "SHOW ALL SLAVES STATUS" outputs of the servers. Specifically, the
 | |
| monitor assumes that if server A is replicating from server B, then A must have
 | |
| a slave connection with `Master_Host` and `Master_Port` equal to B's address and
 | |
| port in the configuration file. If this is not the case, e.g. an IP is used in
 | |
| the server while a hostname is given in the file, the monitor will misinterpret
 | |
| the topology.
 | |
| 
 | |
| This setting must be ON to use any cluster operation features such as failover
 | |
| or switchover, because MaxScale uses the addresses and ports in the
 | |
| configuration file when issuing "CHANGE MASTER TO"-commands.
 | |
| 
 | |
| If the network configuration is such that the addresses MaxScale uses to connect
 | |
| to backends are different from the ones the servers use to connect to each
 | |
| other, `assume_unique_hostnames` should be set to OFF. In this mode, MaxScale
 | |
| uses server id:s it queries from the servers and the `Master_Server_Id` fields
 | |
| of the slave connections to deduce which server is replicating from which. This
 | |
| is not perfect though, since MaxScale doesn't know the id:s of servers it has
 | |
| never connected to (e.g. server has been down since MaxScale was started). Also,
 | |
| the `Master_Server_Id`-field may have an incorrect value if the slave connection
 | |
| has not been established. MaxScale will only trust the value if the monitor has
 | |
| seen the slave connection IO thread connected at least once. If this is not the
 | |
| case, the slave connection is ignored.
 | |
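
A minimal sketch of the id-based deduction used when `assume_unique_hostnames` is OFF, under the assumptions stated above. The function and parameter names are hypothetical and only model the described behavior.

```python
from typing import Dict, Optional


def resolve_master(master_server_id: int, io_thread_seen_connected: bool,
                   known_ids: Dict[int, str]) -> Optional[str]:
    """Map a slave connection's Master_Server_Id to a monitored server name.

    `known_ids` holds the server ids the monitor has queried so far, so a
    server that has been down since startup is missing from it.
    """
    if not io_thread_seen_connected:
        # The id may be incorrect before the connection has been established,
        # so the slave connection is ignored.
        return None
    return known_ids.get(master_server_id)  # None if the id was never seen
```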
| 
 | |
| ### `detect_replication_lag`
 | |
| 
 | |
| Deprecated and unused as of MaxScale 2.3. Can be defined but is ignored.
 | |
| 
 | |
| Replication lag detection is effectively always on. The monitor uses the
 | |
| "Seconds_Behind_Master"-field of "SHOW SLAVE STATUS" to get the replication
 | |
| lag.
 | |
| 
 | |
| ### `detect_stale_master`
 | |
| 
 | |
| Allow previous master to be available even in case of stopped or misconfigured
 | |
| replication.
 | |
| 
 | |
| Starting from MaxScale 2.0.0 this feature is enabled by default. It is disabled
 | |
| by default in MaxScale 1.4.3 and below.
 | |
| 
 | |
| This allows services that depend on master and slave roles to continue
 | |
| functioning as long as the master server is available. This is a situation
 | |
| which can happen if all slave servers are unreachable or the replication
 | |
| breaks for some reason.
 | |
| 
 | |
| ```
 | |
| detect_stale_master=true
 | |
| ```
 | |
| 
 | |
| ### `detect_stale_slave`
 | |
| 
 | |
| Treat running slave servers without a master server as valid slave servers.
 | |
| 
 | |
| This feature is enabled by default.
 | |
| 
 | |
| If a slave server loses its master server, the replication is considered broken.
 | |
| With this parameter, slaves that have lost their master but have been slaves of
 | |
| a master server can retain their slave status even without a master. This means
 | |
| that when a slave loses its master, it can still be used for reads.
 | |
| 
 | |
| If this feature is disabled, a server is considered a valid slave if and only if
 | |
| it has a running master server monitored by this monitor.
 | |
| 
 | |
| ```
 | |
| detect_stale_slave=true
 | |
| ```
 | |
| 
 | |
| ### `mysql51_replication`
 | |
| 
 | |
| Deprecated and unused as of MaxScale 2.3. Can be defined but is ignored.
 | |
| 
 | |
| ### `multimaster`
 | |
| 
 | |
| Deprecated and unused as of MaxScale 2.3. Can be defined but is ignored.
 | |
| 
 | |
| ### `ignore_external_masters`
 | |
| 
 | |
| Ignore any servers that are not monitored by this monitor but are a part of the
 | |
| replication topology. This option was added in MaxScale 2.1.12 and is disabled
 | |
| by default.
 | |
| 
 | |
| MaxScale detects if a master server replicates from an external server. When
 | |
| this is detected, the server is assigned the `Slave` and `Slave of External
 | |
| Server` labels and will be treated as a slave server. Most of the time this
 | |
| topology is used when MaxScale is used for read scale-out without master
 | |
| servers, a Galera cluster with read replicas being a prime example of this
 | |
| setup. Sometimes this is not the desired behavior and the external master server
 | |
| should be ignored. Most of the time this is due to multi-source replication.
 | |
| 
 | |
| When this option is enabled, all servers that have the `Master, Slave, Slave of
 | |
| External Server, Running` labels will instead get the `Master, Running` labels.
 | |
| 
 | |
| ### `detect_standalone_master`
 | |
| 
 | |
| Detect standalone master servers. This feature takes a boolean parameter and is
 | |
| enabled by default.
 | |
| 
 | |
| This setting controls whether a standalone server can be a master. A standalone
 | |
| server is a server from which no other server in the cluster is attempting to
 | |
| replicate. In most cases this should be left on.
 | |
| 
 | |
| ### `failcount`
 | |
| 
 | |
| Number of consecutive monitor passes a master server must be down before it is
 | |
| considered failed. At this point, automatic failover is performed if enabled
 | |
| (`auto_failover=true`). If automatic failover is not on, the monitor will try to
 | |
| search for another server to fulfill the master role. See section
 | |
| [Master selection](#master-selection)
 | |
| for more details. Changing the master may break replication as queries could be
 | |
| routed to a server without previous events. To prevent this, avoid having
 | |
| multiple valid master servers in the cluster.
 | |
| 
 | |
| The default value is 5 failures.
 | |
| 
 | |
| ### `allow_cluster_recovery`
 | |
| 
 | |
| Deprecated and unused as of MaxScale 2.3. Can be defined but is ignored.
 | |
| 
 | |
| ### `enforce_read_only_slaves`
 | |
| 
 | |
| This feature is disabled by default. If set to ON, the monitor attempts to set
 | |
| the server `read_only` flag to ON on any slave server with `read_only` OFF. The
 | |
| flag is checked at every monitor iteration. The monitor user requires the
 | |
| SUPER-privilege for this feature to work. While the `read_only`-flag is ON, only
 | |
| users with the SUPER-privilege can write to the backend server. If temporary
 | |
| write access is required, this feature should be disabled before attempting to
 | |
| disable `read_only`. Otherwise the monitor would quickly re-enable it.
 | |
| 
 | |
| ### `maintenance_on_low_disk_space`
 | |
| 
 | |
| This feature is enabled by default. If a running server that is not the master
 | |
| or a relay master is out of disk space the server is set to maintenance mode.
 | |
| Such servers are not used for router sessions and are ignored when performing a
 | |
| failover or other cluster modification operation. See the general monitor
 | |
| parameters [disk_space_threshold](./Monitor-Common.md#disk_space_threshold) and
 | |
| [disk_space_check_interval](./Monitor-Common.md#disk_space_check_interval)
 | |
| on how to enable disk space monitoring.
 | |
| 
 | |
| Once a server has been put to maintenance mode, the disk space situation
 | |
| of that server is no longer updated. The server will not be taken out of
 | |
| maintenance mode even if more disk space becomes available. The maintenance
 | |
| flag must be removed manually:
 | |
| ```
 | |
| maxadmin clear server server2 Maint
 | |
| maxctrl clear server server2 Maint
 | |
| ```
 | |
| 
 | |
| ## Cluster manipulation operations
 | |
| 
 | |
| Starting with MaxScale 2.2.1, MariaDB Monitor supports replication cluster
 | |
| modification. The operations implemented are:
 | |
| - _failover_, which replaces a failed master with a slave
 | |
| - _switchover_, which swaps a running master with a slave
 | |
| - _rejoin_, which directs servers to replicate from the master
 | |
| - _reset-replication_ (added in MaxScale 2.3.0), which deletes binary logs and
 | |
| resets gtid:s
 | |
| 
 | |
| See [operation details](#operation-details) for more information on the
 | |
| implementation of the commands.
 | |
| 
 | |
| The cluster operations require that the monitor user (`user`) has the following
 | |
| privileges:
 | |
| 
 | |
| - SUPER, to modify slave connections and set globals such as *read\_only*
 | |
| - REPLICATION CLIENT, to list slave connections
 | |
| - RELOAD, to flush binary logs
 | |
| - PROCESS, to check if the *event\_scheduler* process is running
 | |
| - SHOW DATABASES and EVENTS, to list and modify server events
 | |
| 
 | |
| In addition, the monitor needs to know which username and password a
 | |
| slave should use when starting replication. These are given in
 | |
| `replication_user` and `replication_password`.
 | |
| 
 | |
| The user can define files with SQL statements which are executed on any server
 | |
| being demoted or promoted by cluster manipulation commands. See the sections on
 | |
| `promotion_sql_file` and `demotion_sql_file` for more information.
 | |
| 
 | |
| The monitor can manipulate scheduled server events when promoting or demoting a
 | |
| server. See the section on `handle_server_events` for more information.
 | |
| 
 | |
| All cluster operations can be activated manually through MaxAdmin/MaxCtrl. See
 | |
| section [Manual activation](#manual-activation) for more details.
 | |
| 
 | |
| ### Operation details
 | |
| 
 | |
| **Failover** replaces a failed master with a running slave. It does the
 | |
| following:
 | |
| 
 | |
| 1. Select the most up-to-date slave of the old master to be the new master. The
 | |
| selection criteria are as follows, in descending priority:
 | |
|       1. gtid_IO_pos (latest event in relay log)
 | |
|       2. gtid_current_pos (most processed events)
 | |
|       3. log_slave_updates is on
 | |
|       4. disk space is not low
 | |
| 2. If the new master has unprocessed relay log items, cancel and try again
 | |
| later.
 | |
| 3. Prepare the new master:
 | |
|       1. Remove the slave connection the new master used to replicate from the old
 | |
|       master.
 | |
|       2. Disable the *read\_only*-flag.
 | |
|       3. Enable scheduled server events (if event handling is on). Only events that were
 | |
|       enabled on the old master are enabled.
 | |
|       4. Run the commands in `promotion_sql_file`.
 | |
|       5. Start replication from external master if one existed.
 | |
| 4. Redirect all other slaves to replicate from the new master:
 | |
|       1. STOP SLAVE and RESET SLAVE
 | |
|       2. CHANGE MASTER TO
 | |
|       3. START SLAVE
 | |
| 5. Check that all slaves are replicating.
 | |
| 
 | |
| Failover is considered successful if steps 1 to 3 succeeded, as the cluster then has at
 | |
| least a valid master server.
 | |
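
The promotion criteria in failover step 1 amount to a lexicographic comparison. The sketch below is an illustration, not the monitor's code: it assumes a single gtid domain so that positions can be compared by sequence number, and the `Candidate` type and its field names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    name: str
    gtid_io_seq: int         # sequence number of gtid_IO_pos (latest event in relay log)
    gtid_current_seq: int    # sequence number of gtid_current_pos (processed events)
    log_slave_updates: bool
    disk_space_low: bool


def pick_new_master(slaves: List[Candidate]) -> Optional[Candidate]:
    """Return the best promotion candidate, or None if there are no slaves."""
    if not slaves:
        return None
    # Tuples compare element by element, which gives the descending priority.
    return max(slaves, key=lambda c: (c.gtid_io_seq, c.gtid_current_seq,
                                      c.log_slave_updates, not c.disk_space_low))


def has_unprocessed_relay_log(c: Candidate) -> bool:
    """Assumed check for step 2: events were received but not yet applied."""
    return c.gtid_io_seq > c.gtid_current_seq
```

If `has_unprocessed_relay_log` is true for the selected candidate, failover would be cancelled and retried later, as described in step 2.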
| 
 | |
| **Switchover** swaps a running master with a running slave. It does the
 | |
| following:
 | |
| 
 | |
| 1. Prepare the old master for demotion:
 | |
|       1. Stop any external replication.
 | |
|       2. Kill connections from super-users since *read\_only* does not affect them.
 | |
|       3. Enable the *read\_only*-flag to stop writes.
 | |
|       4. Disable scheduled server events (if event handling is on).
 | |
|       5. Run the commands in `demotion_sql_file`.
 | |
|       6. Flush the binary log (FLUSH LOGS) so that all events are on disk.
 | |
| 2. Wait for the new master to catch up with the old master.
 | |
| 3. Promote new master and redirect slaves as in failover steps 3 and 4. Also
 | |
| redirect the demoted old master.
 | |
| 4. Check that all slaves are replicating.
 | |
| 
 | |
| Similar to failover, switchover is considered successful if the new master was
 | |
| successfully promoted.
 | |
| 
 | |
| **Rejoin** joins a standalone server to the cluster or redirects a slave
 | |
| replicating from a server other than the master. A standalone server is joined
 | |
| by:
 | |
| 
 | |
| 1. Run the commands in `demotion_sql_file`.
 | |
| 2. Enable the *read\_only*-flag.
 | |
| 3. Disable scheduled server events (if event handling is on).
 | |
| 4. Start replication: CHANGE MASTER TO and START SLAVE.
 | |
| 
 | |
| A server which is replicating from the wrong master is redirected simply with
 | |
| STOP SLAVE, RESET SLAVE, CHANGE MASTER TO and START SLAVE commands.
 | |
| 
 | |
| **Reset-replication** (added in MaxScale 2.3.0) deletes binary logs and resets
 | |
| gtid:s. This destructive command is meant for situations where the gtid:s in the
 | |
| cluster are out of sync while the actual data is known to be in sync. The
 | |
| operation proceeds as follows:
 | |
| 
 | |
| 1. Reset gtid:s and delete binary logs on all servers:
 | |
|       1. Stop (STOP SLAVE) and delete (RESET SLAVE ALL) all slave connections.
 | |
|       2. Enable the *read\_only*-flag.
 | |
|       3. Disable scheduled server events (if event handling is on).
 | |
|       4. Delete binary logs (RESET MASTER).
 | |
|       5. Set the sequence number of *gtid\_slave\_pos* to zero. This also affects
 | |
|  *gtid\_current\_pos*.
 | |
| 2. Prepare new master:
 | |
|       1. Disable the *read\_only*-flag.
 | |
|       2. Enable scheduled server events (if event handling is on). Events are only enabled
 | |
|       if the cluster had a master server when starting the reset-replication operation.
 | |
|       Only events that were enabled on the previous master are enabled on the new.
 | |
| 3. Direct other servers to replicate from the new master as in the other
 | |
| operations.
 | |
| 
 | |
| ### Manual activation
 | |
| 
 | |
| Cluster operations can be activated manually through the REST API, MaxCtrl or
 | |
| MaxAdmin. The commands are only performed when MaxScale is in active mode. The
 | |
| commands generally match their automatic versions. The exception is _rejoin_, in
 | |
| which the manual command allows rejoining even when the joining server has empty
 | |
| gtid:s. This rule allows the user to force a rejoin on a server without binary
 | |
| logs.
 | |
| 
 | |
| All commands require the monitor instance name as the first parameter. Failover
 | |
| selects the new master server automatically and does not require additional
 | |
| parameters. Rejoin requires the name of the joining server as second parameter.
 | |
| Replication reset accepts the name of the new master server as second parameter.
 | |
| If not given, the current master is selected.
 | |
| 
 | |
| Switchover takes one to three parameters. If only the monitor name is given,
 | |
| switchover will autoselect both the slave to promote and the current master as
 | |
| the server to be demoted. If two parameters are given, the second parameter is
 | |
| interpreted as the slave to promote. If three parameters are given, the third
 | |
| parameter is interpreted as the current master. The user-given current master is
 | |
| compared to the master server currently deduced by the monitor and if the two
 | |
| are unequal, an error is given.
 | |
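
The way switchover interprets its one to three parameters can be modelled as follows. This is a hypothetical sketch: `autoselect_slave` stands in for the monitor's own candidate selection, and `current_master` is the master currently deduced by the monitor.

```python
from typing import Callable, Sequence, Tuple


def resolve_switchover(args: Sequence[str], current_master: str,
                       autoselect_slave: Callable[[], str]) -> Tuple[str, str]:
    """Return (server_to_promote, server_to_demote) for a switchover call.

    `args` is [monitor_name, optional_new_master, optional_current_master].
    """
    if not 1 <= len(args) <= 3:
        raise ValueError("switchover takes one to three parameters")
    promote = args[1] if len(args) >= 2 else autoselect_slave()
    if len(args) == 3 and args[2] != current_master:
        raise ValueError(
            f"'{args[2]}' is not the current master '{current_master}'")
    return promote, current_master
```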
| 
 | |
| Example commands are below:
 | |
| ```
 | |
| call command mariadbmon failover MyMonitor
 | |
| call command mariadbmon rejoin MyMonitor OldMasterServ
 | |
| call command mariadbmon reset-replication MyMonitor
 | |
| call command mariadbmon reset-replication MyMonitor NewMasterServ
 | |
| call command mariadbmon switchover MyMonitor
 | |
| call command mariadbmon switchover MyMonitor NewMasterServ
 | |
| call command mariadbmon switchover MyMonitor NewMasterServ OldMasterServ
 | |
| ```
 | |
| 
 | |
| The commands follow the standard module command syntax. All require the monitor
 | |
| configuration name (MyMonitor) as the first parameter. For switchover, the
 | |
| last two parameters define the server to promote (NewMasterServ) and the server
 | |
| to demote (OldMasterServ). For rejoin, the server to join (OldMasterServ) is
 | |
| required. Replication reset optionally accepts the server to promote
 | |
| (NewMasterServ).
 | |
| 
 | |
| It is safe to perform manual operations even with automatic failover, switchover
 | |
| or rejoin enabled since automatic operations cannot happen simultaneously
 | |
| with manual ones.
 | |
| 
 | |
| If a switchover or failover fails, automatic failover is disabled to prevent
 | |
| master changes to a possibly malfunctioning cluster. Automatic failover can be
 | |
| turned on manually via the REST API, MaxCtrl or MaxAdmin. Example commands are listed
 | |
| below.
 | |
| 
 | |
| ```
 | |
| maxadmin alter monitor MariaDB-Monitor auto_failover=true
 | |
| maxctrl alter monitor MariaDB-Monitor auto_failover true
 | |
| ```
 | |
| 
 | |
| When a cluster modification is initiated via the REST-API, the URL path is of the
 | |
| form:
 | |
| ```
 | |
| /v1/maxscale/modules/mariadbmon/<operation>?<monitor-instance>&<server-param1>&<server-param2>
 | |
| ```
 | |
| - `<operation>` is the name of the command: _failover_, _switchover_, _rejoin_
 | |
| or _reset-replication_.
 | |
| - `<monitor-instance>` is the monitor section name from the MaxScale
 | |
| configuration file.
 | |
| - `<server-param1>` and `<server-param2>` are server parameters as described
 | |
| above for MaxAdmin. Only _switchover_ accepts both, _failover_ doesn't need any
 | |
| and both _rejoin_ and _reset-replication_ accept one.
 | |
| 
 | |
| Given a MaxScale configuration file like
 | |
| ```
 | |
| [Cluster1]
 | |
| type=monitor
 | |
| module=mariadbmon
 | |
| servers=server1, server2, server3, server4
 | |
| ...
 | |
| ```
 | |
| with the assumption that `server2` is the current master, then the URL
 | |
| path for making `server4` the new master would be:
 | |
| ```
 | |
| /v1/maxscale/modules/mariadbmon/switchover?Cluster1&server4&server2
 | |
| ```
 | |
| 
 | |
| Example REST-API paths for other commands are listed below.
 | |
| ```
 | |
| /v1/maxscale/modules/mariadbmon/failover?Cluster1
 | |
| /v1/maxscale/modules/mariadbmon/rejoin?Cluster1&server3
 | |
| /v1/maxscale/modules/mariadbmon/reset-replication?Cluster1&server3
 | |
| ```
 | |
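
For scripting, the URL paths above can be assembled with a small helper. The helper below is a hypothetical convenience function, not part of MaxScale. It only builds the path string and checks the number of server parameters each operation accepts, as described in this section.

```python
# Minimum and maximum number of server parameters per operation.
ALLOWED_SERVER_PARAMS = {
    "failover": (0, 0),
    "rejoin": (1, 1),
    "reset-replication": (0, 1),
    "switchover": (0, 2),
}


def module_command_path(operation: str, monitor: str, *servers: str) -> str:
    """Build the REST-API path for a mariadbmon cluster operation."""
    if operation not in ALLOWED_SERVER_PARAMS:
        raise ValueError(f"unknown operation: {operation}")
    low, high = ALLOWED_SERVER_PARAMS[operation]
    if not low <= len(servers) <= high:
        raise ValueError(
            f"{operation} accepts {low} to {high} server parameters")
    query = "&".join((monitor, *servers))
    return f"/v1/maxscale/modules/mariadbmon/{operation}?{query}"
```

For example, `module_command_path("switchover", "Cluster1", "server4", "server2")` produces the switchover path shown earlier.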
| 
 | |
| ### Automatic activation
 | |
| 
 | |
| Failover can activate automatically if `auto_failover` is on. The activation
 | |
| begins when the master has been down at least `failcount` monitor iterations.
 | |
| Before modifying the cluster, the monitor checks that all prerequisites for the
 | |
| failover are fulfilled. If the cluster does not seem ready, an error is printed
 | |
| and the cluster is rechecked during the next monitor iteration.
 | |
| 
 | |
| Switchover can also activate automatically with the
 | |
| `switchover_on_low_disk_space`-setting. The operation begins if the master
 | |
| server is low on disk space but otherwise the operating logic is quite similar
 | |
| to automatic failover.
 | |
| 
 | |
| Rejoin stands for starting replication on a standalone server or redirecting a
 | |
| slave replicating from the wrong master (any server that is not the cluster
 | |
| master). The rejoined servers are directed to replicate from the current cluster
 | |
| master server, forcing the replication topology to a 1-master-N-slaves
 | |
| configuration.
 | |
| 
 | |
| A server is categorized as standalone if the server has no slave connections,
 | |
| not even stopped ones. A server is replicating from the wrong master if the
 | |
| slave IO thread is connected but the master server id seen by the slave does not
 | |
| match the cluster master id. Alternatively, the IO thread may be stopped or
 | |
| connecting but the master server host or port information differs from the
 | |
| cluster master info. These criteria mean that a STOP SLAVE does not yet set a
 | |
| slave as standalone.
 | |
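
The categorization above can be summarized in code. This is a simplified model with hypothetical types. It treats the IO thread state as one of three strings and flags a server as soon as any of its slave connections points at the wrong master, which is an assumption of the sketch.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SlaveConn:
    io_state: str           # "connected", "connecting" or "stopped"
    master_server_id: int
    master_host: str
    master_port: int


@dataclass
class Master:
    server_id: int
    host: str
    port: int


def classify(conns: List[SlaveConn], master: Master) -> str:
    """Return "standalone", "wrong_master" or "ok" for a monitored server."""
    if not conns:
        # No slave connections at all, not even stopped ones.
        return "standalone"
    for c in conns:
        if c.io_state == "connected":
            if c.master_server_id != master.server_id:
                return "wrong_master"
        elif (c.master_host, c.master_port) != (master.host, master.port):
            # IO thread stopped or connecting: compare host and port instead.
            return "wrong_master"
    return "ok"
```

A server with a stopped slave connection that still points at the cluster master is classified as `ok`, which reflects the note that STOP SLAVE alone does not make a server standalone.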
| 
 | |
| With `auto_rejoin` active, the monitor will try to rejoin any servers matching
 | |
| the above requirements. Rejoin does not obey `failcount` and will attempt to
 | |
| rejoin any valid servers immediately. When activating rejoin manually, the
 | |
| user-designated server must fulfill the same requirements.
 | |
| 
 | |
| ### Limitations and requirements
 | |
| 
 | |
| Switchover and failover only understand simple topologies. They will not work if
 | |
| the cluster has multiple masters, relay masters, or if the topology is circular.
 | |
| The server cluster is assumed to be well-behaving with no significant
 | |
| replication lag and all commands that modify the cluster complete in a few
 | |
| seconds (faster than `backend_read_timeout` and `backend_write_timeout`).
 | |
| 
 | |
| The backends must all use GTID-based replication, and the domain id should not
 | |
| change during a switchover or failover. Master and slaves must have
 | |
| well-behaving GTIDs with no extra events on slave servers.
 | |
| 
 | |
| Failover cannot be performed if MaxScale was started only after the master
 | |
| server went down. This is because MaxScale needs reliable information on the
 | |
| gtid domain of the cluster and the replication topology in general to properly
 | |
| select the new master.
 | |
| 
 | |
| Failover may lose events. If a master goes down before sending new events to at
 | |
| least one slave, those events are lost when a new master is chosen. If the old
 | |
| master comes back online, the other servers have likely moved on with a
 | |
| diverging history and the old master can no longer join the replication cluster.
 | |
| To minimize the chance of this happening, use
 | |
| [semisynchronous replication](https://mariadb.com/kb/en/library/semisynchronous-replication/).
 | |
| 
 | |
| Switchover requires that the cluster is "frozen" for the duration of the
 | |
| operation. This means that no data modifying statements such as INSERT or UPDATE
 | |
| are executed and the GTID position of the master server is stable. When
 | |
| switchover begins, the monitor sets the global *read_only* flag on the old
 | |
| master backend to stop any updates. *read_only* does not affect users with the
 | |
| SUPER-privilege so any such user can issue writes during a switchover. These
 | |
| writes have a high chance of breaking replication, because the write may not be
 | |
| replicated to all slaves before they switch to the new master. To prevent this,
 | |
| any users who commonly do updates should not have the SUPER-privilege. For even
 | |
| more security, the only SUPER-user session during a switchover should be the
 | |
| MaxScale monitor user.
 | |
| 
 | |
| When mixing rejoin with failover/switchover, the backends should have
 | |
| *log_slave_updates* on. The rejoining server is likely lagging behind the rest
 | |
| of the cluster. If the current cluster master does not have binary logs from the
 | |
| moment the rejoining server lost connection, the rejoining server cannot
 | |
| continue replication. This is an issue if the master has changed and
 | |
| the new master does not have *log_slave_updates* on.
 | |
| 
 | |
| If an automatic cluster operation such as auto-failover or auto-rejoin fails,
 | |
| all cluster modifying operations are disabled for `failcount` monitor iterations,
 | |
| after which the operation may be retried. Similar logic applies if the cluster is
 | |
| unsuitable for such operations, e.g. replication is not using GTID.
 | |
| 
 | |
| ### External master support
 | |
| 
 | |
| The monitor detects if a server in the cluster is replicating from an external
 | |
| master (a server that is not monitored by the monitor). If the replicating
 | |
| server is the cluster master server, then the cluster itself is considered to
 | |
| have an external master.
 | |
| 
 | |
| If a failover/switchover happens, the new master server is set to replicate from
 | |
| the cluster external master server. The username and password for the replication
 | |
| are defined in `replication_user` and `replication_password`. The address and
 | |
| port used are the ones shown by `SHOW ALL SLAVES STATUS` on the old cluster
 | |
| master server. In the case of switchover, the old master also stops replicating
 | |
| from the external server to preserve the topology.
 | |
| 
 | |
| After failover the new master is replicating from the external master. If the
 | |
| failed old master comes back online, it is also replicating from the external
 | |
| server. To normalize the situation, either have *auto_rejoin* on or manually
 | |
| execute a rejoin. This will redirect the old master to the current cluster
 | |
| master.
 | |
| 
 | |
| ### Configuration parameters
 | |
| 
 | |
| #### `auto_failover`
 | |
| 
 | |
| Enable automated master failover. This parameter expects a boolean value and the
 | |
| default value is false.
 | |
| 
 | |
| When automatic failover is enabled, traditional MariaDB Master-Slave clusters
 | |
| will automatically elect a new master if the old master goes down and stays down
 | |
| for the number of monitor iterations given in `failcount`. Failover will not take place when
 | |
| MaxScale is configured as a passive instance. For details on how MaxScale
 | |
| behaves in passive mode, see the documentation on `failover_timeout` below.
 | |
| 
 | |
| The monitor user must have the SUPER and RELOAD privileges for failover to work.
 | |
| 
 | |
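| As a minimal sketch, automatic failover could be enabled in a monitor section
 | |
| such as the one below. The section name, the server names and the `failcount`
 | |
| value are illustrative assumptions, and credentials and other required
 | |
| settings are omitted.
 | |
| 
 | |
| ```
 | |
| [MariaDB-Monitor]
 | |
| type=monitor
 | |
| module=mariadbmon
 | |
| servers=server1,server2,server3
 | |
| auto_failover=true
 | |
| failcount=5
 | |
| ```
 | |
| 
 | |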
| #### `auto_rejoin`
 | |
| 
 | |
| Enable automatic joining of servers to the cluster. This parameter expects a
 | |
| boolean value and the default value is false.
 | |
| 
 | |
| When enabled, the monitor will attempt to direct standalone servers and servers
 | |
| replicating from a relay master to the main cluster master server, enforcing a
 | |
| 1-master-N-slaves configuration.
 | |
| 
 | |
| For example, consider the following event series.
 | |
| 
 | |
| 1. Slave A goes down
 | |
| 2. Master goes down and a failover is performed, promoting Slave B
 | |
| 3. Slave A comes back
 | |
| 
 | |
| Slave A is still trying to replicate from the downed master, since it wasn't
 | |
| online during failover. If `auto_rejoin` is on, Slave A will quickly be
 | |
| redirected to Slave B, the current master.
 | |
| 
 | |
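| In the monitor section, enabling the feature would look roughly like this:
 | |
| 
 | |
| ```
 | |
| auto_rejoin=true
 | |
| ```
 | |
| 
 | |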
| #### `switchover_on_low_disk_space`
 | |
| 
 | |
| This feature is disabled by default. If enabled, the monitor will attempt to
 | |
| switch over a master server low on disk space with a slave. The switch is only
 | |
| done if a slave without disk space issues is found. If
 | |
| `maintenance_on_low_disk_space` is also enabled, the old master (now a slave)
 | |
| will be put into maintenance during the next monitor iteration.
 | |
| 
 | |
| For this parameter to have any effect, `disk_space_threshold` must be specified
 | |
| for the [server](../Getting-Started/Configuration-Guide.md#disk_space_threshold)
 | |
| or the [monitor](./Monitor-Common.md#disk_space_threshold).
 | |
| Also, [disk_space_check_interval](./Monitor-Common.md#disk_space_check_interval)
 | |
| must be defined for the monitor.
 | |
| ```
 | |
| switchover_on_low_disk_space=true
 | |
| ```
 | |
| 
 | |
| #### `replication_user` and `replication_password`
 | |
| 
 | |
| The username and password of the replication user. These are given as the values
 | |
| for `MASTER_USER` and `MASTER_PASSWORD` whenever a `CHANGE MASTER TO` command is
 | |
| executed.
 | |
| 
 | |
| Both `replication_user` and `replication_password` parameters must be defined if
 | |
| a custom replication user is used. If neither of the parameters is defined, the
 | |
| `CHANGE MASTER TO`-command will use the monitor credentials for the replication
 | |
| user.
 | |
| 
 | |
| The credentials used for replication must have the `REPLICATION SLAVE`
 | |
| privilege.
 | |
| 
 | |
| `replication_password` uses the same encryption scheme as other password
 | |
| parameters. If password encryption is in use, `replication_password` must be
 | |
| encrypted with the same key to avoid erroneous decryption.
 | |
| 
 | |
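| A sketch of a custom replication user definition in the monitor section. The
 | |
| user name and password below are placeholders, not values taken from any real
 | |
| setup.
 | |
| 
 | |
| ```
 | |
| replication_user=repl
 | |
| replication_password=repl_password
 | |
| ```
 | |
| 
 | |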
| #### `replication_master_ssl`
 | |
| 
 | |
| Type: bool Default: off
 | |
| 
 | |
| If set to ON, any `CHANGE MASTER TO`-command generated will set `MASTER_SSL=1` to enable
 | |
| encryption for the replication stream. This setting should only be enabled if the backend
 | |
| servers are configured for ssl. This typically means setting *ssl_ca*, *ssl_cert* and
 | |
| *ssl_key* in the server configuration file. Additionally, credentials for the replication
 | |
| user should require an encrypted connection (e.g. `ALTER USER repl@'%' REQUIRE SSL;`).
 | |
| 
 | |
| If the setting is left OFF, `MASTER_SSL` is not set at all, which will preserve existing
 | |
| settings when redirecting a slave connection.
 | |
| 
 | |
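| Assuming the backends are already configured for SSL, the setting could be
 | |
| enabled in the monitor section as follows:
 | |
| 
 | |
| ```
 | |
| replication_master_ssl=true
 | |
| ```
 | |
| 
 | |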
| #### `failover_timeout` and `switchover_timeout`
 | |
| 
 | |
| Time limit for failover and switchover operations. The default
 | |
| values are 90 seconds for both. `switchover_timeout` is also used as the time
 | |
| limit for a rejoin operation. Rejoin should rarely time out, since it is a
 | |
| faster operation than switchover.
 | |
| 
 | |
| The timeouts are specified as documented
 | |
| [here](../Getting-Started/Configuration-Guide.md#durations). If no explicit unit
 | |
| is provided, the value is interpreted as seconds in MaxScale 2.4. In subsequent
 | |
| versions a value without a unit may be rejected. Note that since the granularity
 | |
| of the timeouts is seconds, a timeout specified in milliseconds will be rejected,
 | |
| even if the duration is longer than a second.
 | |
| 
 | |
| If no successful failover/switchover takes place within the configured time
 | |
| period, a message is logged and automatic failover is disabled. This prevents
 | |
| further automatic modifications to the misbehaving cluster.
 | |
| 
 | |
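| For illustration, the following sketch raises both time limits and gives the
 | |
| unit explicitly. The values are arbitrary examples, not recommendations.
 | |
| 
 | |
| ```
 | |
| failover_timeout=120s
 | |
| switchover_timeout=120s
 | |
| ```
 | |
| 
 | |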
| #### `verify_master_failure` and `master_failure_timeout`
 | |
| 
 | |
| Enable additional master failure verification for automatic failover.
 | |
| `verify_master_failure` is a boolean value (default: true) which enables this
 | |
| feature and `master_failure_timeout` defines the timeout (default: 10 seconds).
 | |
| 
 | |
| The master failure timeout is specified as documented
 | |
| [here](../Getting-Started/Configuration-Guide.md#durations). If no explicit unit
 | |
| is provided, the value is interpreted as seconds in MaxScale 2.4. In subsequent
 | |
| versions a value without a unit may be rejected. Note that since the granularity
 | |
| of the timeout is seconds, a timeout specified in milliseconds will be rejected,
 | |
| even if the duration is longer than a second.
 | |
| 
 | |
| Failure verification is performed by checking whether the slave servers are
 | |
| still connected to the master and receiving events. An event is either a change
 | |
| in the *Gtid_IO_Pos*-field of the `SHOW SLAVE STATUS` output or a heartbeat
 | |
| event. Effectively, if a slave has received an event within
 | |
| `master_failure_timeout` duration, the master is not considered down when
 | |
| deciding whether to failover, even if MaxScale cannot connect to the master.
 | |
| `master_failure_timeout` should be longer than the `Slave_heartbeat_period` of
 | |
| the slave connection to be effective.
 | |
| 
 | |
| If every slave loses its connection to the master (*Slave_IO_Running* is not
 | |
| "Yes"), master failure is considered verified regardless of timeout. This allows
 | |
| faster failover when the master properly disconnects.
 | |
| 
 | |
| For automatic failover to activate, the `failcount` requirement must also be
 | |
| met.
 | |
| 
 | |
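| As a sketch, the defaults described above could be written out explicitly in
 | |
| the monitor section, with the unit included:
 | |
| 
 | |
| ```
 | |
| verify_master_failure=true
 | |
| master_failure_timeout=10s
 | |
| ```
 | |
| 
 | |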
| #### `servers_no_promotion`
 | |
| 
 | |
| This is a comma-separated list of server names that will not be chosen for
 | |
| master promotion during a failover or autoselected for switchover. This does not
 | |
| affect switchover if the user selects the server to promote. Using this setting
 | |
| can disrupt new master selection for failover such that a non-optimal server is
 | |
| chosen. At worst, this will cause replication to break. Alternatively, failover
 | |
| may fail if all valid promotion candidates are in the exclusion list.
 | |
| 
 | |
| ```
 | |
| servers_no_promotion=backup_dc_server1,backup_dc_server2
 | |
| ```
 | |
| 
 | |
| #### `promotion_sql_file` and `demotion_sql_file`
 | |
| 
 | |
| These optional settings are paths to text files with SQL statements in them.
 | |
| During promotion or demotion, the contents are read line-by-line and executed on
 | |
| the backend. Use these settings to execute custom statements on the servers to
 | |
| complement the built-in operations.
 | |
| 
 | |
| Empty lines or lines starting with '#' are ignored. Any results returned by the
 | |
| statements are ignored. All statements must succeed for the failover, switchover
 | |
| or rejoin to continue. The monitor user may require additional privileges and
 | |
| grants for the custom commands to succeed.
 | |
| 
 | |
| When promoting a slave to master during switchover or failover, the
 | |
| `promotion_sql_file` is read and executed on the new master server after its
 | |
| read-only flag is disabled. The commands are run *before* starting replication
 | |
| from an external master if any.
 | |
| 
 | |
| `demotion_sql_file` is run on an old master during demotion to slave, before the
 | |
| old master starts replicating from the new master. The file is also run before
 | |
| rejoining a standalone server to the cluster, as the standalone server is
 | |
| typically a former master server. When redirecting a slave replicating from a
 | |
| wrong master, the sql-file is not executed.
 | |
| 
 | |
| Since the queries in the files are run during operations which modify
 | |
| replication topology, care is required. If `promotion_sql_file` contains data
 | |
| modification (DML) queries, the new master server may not be able to
 | |
| successfully replicate from an external master. `demotion_sql_file` should never
 | |
| contain DML queries, as these may not replicate to the slave servers before
 | |
| slave threads are stopped, breaking replication.
 | |
| 
 | |
| ```
 | |
| promotion_sql_file=/home/root/scripts/promotion.sql
 | |
| demotion_sql_file=/home/root/scripts/demotion.sql
 | |
| ```
 | |
| #### `handle_server_events`
 | |
| 
 | |
| This setting is on by default. If enabled, the monitor continuously queries the
 | |
| servers for enabled scheduled events and uses this information when performing
 | |
| cluster operations, enabling and disabling events as appropriate.
 | |
| 
 | |
| When a server is being demoted, any events with "ENABLED" status are set to
 | |
| "SLAVESIDE_DISABLED". When a server is being promoted to master, events that are either
 | |
| "SLAVESIDE_DISABLED" or "DISABLED" are set to "ENABLED" if the same event was also enabled
 | |
| on the old master server last time it was successfully queried. Events are considered
 | |
| identical if they have the same schema and name. When a standalone server is rejoined to
 | |
| the cluster, its events are also disabled since it is now a slave.
 | |
| 
 | |
| The monitor does not check whether the same events were disabled and enabled during a
 | |
| switchover or failover/rejoin. All events that meet the criteria above are altered.
 | |
| 
 | |
| The monitor does not enable or disable the event scheduler itself. For the
 | |
| events to run on the new master server, the scheduler should be enabled by the
 | |
| admin. Enabling it in the server configuration file is recommended.
 | |
| 
 | |
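| A minimal sketch of enabling the event scheduler in the MariaDB server
 | |
| configuration file. This belongs in the backend server configuration, not in
 | |
| the MaxScale configuration, and the option group used here is an assumption
 | |
| about how the server configuration is organized.
 | |
| 
 | |
| ```
 | |
| [mariadb]
 | |
| event_scheduler=ON
 | |
| ```
 | |
| 
 | |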
| Events running at high frequency may cause replication to break in a failover
 | |
| scenario. If an old master which was failed over restarts, its event scheduler
 | |
| will be on if set in the server configuration file. Its events will also
 | |
| remember their "ENABLED"-status and run when scheduled. This may happen before
 | |
| the monitor rejoins the server and disables the events. This should only be an
 | |
| issue for events running more often than the monitor interval or events that run
 | |
| immediately after the server has restarted.
 | |
| 
 | |
| ## Troubleshooting
 | |
| 
 | |
| ### Failover/switchover fails
 | |
| 
 | |
| Before performing failover or switchover, the MariaDB Monitor first checks that
 | |
| prerequisites are fulfilled, printing any found errors. This should catch and
 | |
| explain most issues with failover or switchover not working. If the operations
 | |
| are attempted and still fail, then most likely one of the commands the monitor
 | |
| issued to a server failed or timed out. The log should explain which query failed.
 | |
| To print out all queries sent to the servers, start MaxScale with
 | |
| `--debug=enable-statement-logging`. This setting prints all queries sent to the
 | |
| backends by monitors and authenticators.
 | |
| 
 | |
| A typical reason for failure is that a command such as `STOP SLAVE` takes longer than the
 | |
| `backend_read_timeout` of the monitor, causing the connection to break. As of 2.3, the
 | |
| monitor will retry most such queries if the failure was caused by a timeout. The retrying
 | |
| continues until the total time for a failover or switchover has been spent. If the log
 | |
| shows warnings or errors about commands timing out, increasing the backend timeout
 | |
| settings of the monitor should help. Other settings to look at are `query_retries` and
 | |
| `query_retry_timeout`. These are general MaxScale settings described in the
 | |
| [Configuration guide](../Getting-Started/Configuration-Guide.md). Setting
 | |
| `query_retries` to 2 is a reasonable first try.
 | |
| 
 | |
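| The adjustments described above might look like the following sketch. The
 | |
| monitor section name and the timeout values are illustrative assumptions and
 | |
| should be tuned to the environment.
 | |
| 
 | |
| ```
 | |
| [maxscale]
 | |
| query_retries=2
 | |
| 
 | |
| [MariaDB-Monitor]
 | |
| type=monitor
 | |
| module=mariadbmon
 | |
| backend_read_timeout=5s
 | |
| backend_write_timeout=5s
 | |
| ```
 | |
| 
 | |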
| ### Slave detection shows external masters
 | |
| 
 | |
| If a slave is shown in _maxadmin_ or _maxctrl_ as "Slave of External Server" instead of
 | |
| "Slave", the reason is likely that the "Master_Host"-setting of the replication connection
 | |
| does not match the MaxScale server definition. As of 2.3.2, the MariaDB Monitor by default
 | |
| assumes that the slave connections (as shown by `SHOW ALL SLAVES STATUS`) use the exact
 | |
| same "Master_Host" as used in the MaxScale configuration file server definitions. This is
 | |
| controlled by the setting [assume_unique_hostnames](#assume_unique_hostnames).
 | |
| 
 | |
| ## Using the MariaDB Monitor With Binlogrouter
 | |
| 
 | |
| Since MaxScale 2.2, the monitor can detect a replication setup that includes a
 | |
| Binlog Server. To use this, add the binlog server to the monitor's list of
 | |
| servers, but only if its _master_id_ identity is set.
 | |
| 
 | |
| For additional information, read the
 | |
| [Replication Proxy](../Tutorials/Replication-Proxy-Binlog-Router-Tutorial.md)
 | |
| tutorial.
 | |
| 
 | |
| ## Example 1 - Monitor script
 | |
| 
 | |
| Here is an example shell script which sends an email to admin@my.org
 | |
| when a server goes down.
 | |
| 
 | |
| ```
 | |
| #!/usr/bin/env bash
 | |
| 
 | |
| #This script assumes that the local mail server is configured properly
 | |
| #The second argument is the event type
 | |
| event=${2#*=}
 | |
| server=${3#*=}
 | |
| message="A server has gone down at $(date)."
 | |
| echo "$message" | mail -s "The event was $event for server $server." admin@my.org
 | |
| 
 | |
| ```
 | |
| 
 | |
| Here is a monitor configuration that only triggers the script when a master
 | |
| or a slave server goes down.
 | |
| 
 | |
| ```
 | |
| [Database-Monitor]
 | |
| type=monitor
 | |
| module=mariadbmon
 | |
| servers=server1,server2
 | |
| script=mail_to_admin.sh
 | |
| events=master_down,slave_down
 | |
| ```
 | |
| 
 | |
| When a master or a slave server goes down, the script is executed, a mail is
 | |
| sent and the administrator will be immediately notified of any possible
 | |
| problems.  This is just a simple example showing what you can do with MaxScale
 | |
| and monitor scripts.
 | 
