# Automatic Failover With MariaDB Monitor

The [MariaDB Monitor](../Monitors/MariaDB-Monitor.md) can not only monitor
the state of a MariaDB master-slave cluster but can also perform
_failover_ and _switchover_. In addition, in some circumstances it can
_rejoin_ a master that has gone down and later reappears.

Note that the failover (and switchover and rejoin) functionality is only
supported in conjunction with GTID-based replication and initially only
for simple topologies, that is, 1 master and several slaves.

The failover, switchover and rejoin functionality are inherent parts of
the _MariaDB Monitor_, but neither automatic failover nor automatic rejoin
is enabled by default.

The following examples have been written with the assumption that there
are four servers - `server1`, `server2`, `server3` and `server4` - of
which `server1` is the initial master and the other servers are slaves.
In addition there is a monitor called _TheMonitor_ that monitors those
servers.

Somewhat simplified, the MaxScale configuration file would look like:
```
[server1]
type=server
address=192.168.121.51
port=3306
protocol=MariaDBBackend

[server2]
...

[server3]
...

[server4]
...

[TheMonitor]
type=monitor
module=mariadbmon
servers=server1,server2,server3,server4
...
```
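
Because the monitor refers to servers by section name, a misspelled or missing server section is an easy mistake to make. The following is a minimal sketch of a sanity check, assuming the configuration can be read as plain INI text by Python's `configparser`. The helper name `undefined_monitor_servers` and the shortened sample are illustrative assumptions, not part of MaxScale; the `server2` address is taken from the `maxctrl list servers` output shown in this tutorial.

```python
import configparser

# Shortened sample for illustration only: server3 is deliberately left undefined.
SAMPLE = """
[server1]
type=server
address=192.168.121.51
port=3306
protocol=MariaDBBackend

[server2]
type=server
address=192.168.121.190
port=3306

[TheMonitor]
type=monitor
module=mariadbmon
servers=server1,server2,server3
"""


def undefined_monitor_servers(config_text, monitor):
    """Return the servers listed by the monitor that have no type=server section."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    names = [n.strip() for n in parser[monitor]["servers"].split(",") if n.strip()]
    return [
        name for name in names
        if not parser.has_section(name) or parser[name].get("type") != "server"
    ]


print(undefined_monitor_servers(SAMPLE, "TheMonitor"))
```

With the sample above the check reports `server3`, since the monitor lists it but no such section exists.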

# Manual Failover

If everything is in order, the state of the cluster will look something
like this:
```
$ maxctrl list servers
┌─────────┬─────────────────┬──────┬─────────────┬─────────────────┐
│ Server  │ Address         │ Port │ Connections │ State           │
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
│ server1 │ 192.168.121.51  │ 3306 │ 0           │ Master, Running │
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
│ server2 │ 192.168.121.190 │ 3306 │ 0           │ Slave, Running  │
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
│ server3 │ 192.168.121.112 │ 3306 │ 0           │ Slave, Running  │
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
│ server4 │ 192.168.121.201 │ 3306 │ 0           │ Slave, Running  │
└─────────┴─────────────────┴──────┴─────────────┴─────────────────┘
```
If the master now for any reason goes down, then the cluster state will
look like this:
```
$ maxctrl list servers
┌─────────┬─────────────────┬──────┬─────────────┬────────────────┐
│ Server  │ Address         │ Port │ Connections │ State          │
├─────────┼─────────────────┼──────┼─────────────┼────────────────┤
│ server1 │ 192.168.121.51  │ 3306 │ 0           │ Down           │
├─────────┼─────────────────┼──────┼─────────────┼────────────────┤
│ server2 │ 192.168.121.190 │ 3306 │ 0           │ Slave, Running │
├─────────┼─────────────────┼──────┼─────────────┼────────────────┤
│ server3 │ 192.168.121.112 │ 3306 │ 0           │ Slave, Running │
├─────────┼─────────────────┼──────┼─────────────┼────────────────┤
│ server4 │ 192.168.121.201 │ 3306 │ 0           │ Slave, Running │
└─────────┴─────────────────┴──────┴─────────────┴────────────────┘
```
Note that the status for `server1` is _Down_.
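
If you want to detect this situation from a script rather than by eye, the table can be parsed. The sketch below assumes the output has exactly the five-column box layout shown above; the function names `parse_server_states` and `has_no_master` are hypothetical helpers, not part of MaxScale or MaxCtrl.

```python
# Two rows copied from the output above, kept short for illustration.
SAMPLE_OUTPUT = """
┌─────────┬─────────────────┬──────┬─────────────┬────────────────┐
│ Server  │ Address         │ Port │ Connections │ State          │
├─────────┼─────────────────┼──────┼─────────────┼────────────────┤
│ server1 │ 192.168.121.51  │ 3306 │ 0           │ Down           │
├─────────┼─────────────────┼──────┼─────────────┼────────────────┤
│ server2 │ 192.168.121.190 │ 3306 │ 0           │ Slave, Running │
└─────────┴─────────────────┴──────┴─────────────┴────────────────┘
"""


def parse_server_states(table_text):
    """Map each server name to the set of state flags in its State column."""
    states = {}
    for line in table_text.splitlines():
        line = line.strip()
        if not line.startswith("│"):
            continue  # border rows and the command line itself are skipped
        cells = [cell.strip() for cell in line.strip("│").split("│")]
        if len(cells) != 5 or cells[0] == "Server":
            continue  # skip the header row and anything unexpected
        states[cells[0]] = {flag.strip() for flag in cells[4].split(",")}
    return states


def has_no_master(states):
    """Return True when no server carries the Master flag."""
    return not any("Master" in flags for flags in states.values())


states = parse_server_states(SAMPLE_OUTPUT)
print(sorted(states["server1"]), has_no_master(states))
```

Parsing the human-readable table is fragile by nature, so treat this as an illustration of the state flags rather than a robust integration.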

Since failover is by default _not_ enabled, the failover mechanism must be
invoked manually:
```
$ maxctrl call command mariadbmon failover TheMonitor
OK
```
There are quite a few arguments, so let's look at each one separately:

* `call command` indicates that it is a module command that is to be
  invoked,
* `mariadbmon` indicates the module whose command we want to invoke (that
  is the MariaDB Monitor),
* `failover` is the command we want to invoke, and
* `TheMonitor` is the first and only argument to that command, the name of
  the monitor as specified in the configuration file.
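
The argument breakdown above maps directly onto an argument list, which is how the command could be issued from a script. This is a sketch only: the helper names are hypothetical, and treating an exit status of zero plus the `OK` line as success is an assumption based on the output shown above.

```python
import subprocess


def failover_command(monitor, module="mariadbmon"):
    """Build the argument list for a manual failover, one element per word."""
    return ["maxctrl", "call", "command", module, "failover", monitor]


def run_failover(monitor):
    """Invoke maxctrl and report whether it printed OK (assumed success marker)."""
    result = subprocess.run(
        failover_command(monitor), capture_output=True, text=True
    )
    return result.returncode == 0 and result.stdout.strip() == "OK"


# Only print the command here; run_failover() needs a live MaxScale instance.
print(" ".join(failover_command("TheMonitor")))
```

Passing the arguments as a list avoids shell quoting issues if a monitor name ever contains unusual characters.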

The MariaDB Monitor will now autonomously deduce which slave is the most
appropriate one to be promoted to master, promote it to master and modify
the other slaves accordingly.

If we now check the cluster state we will see that one of the remaining
slaves has been made into master.

```
$ maxctrl list servers
┌─────────┬─────────────────┬──────┬─────────────┬─────────────────┐
│ Server  │ Address         │ Port │ Connections │ State           │
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
│ server1 │ 192.168.121.51  │ 3306 │ 0           │ Down            │
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
│ server2 │ 192.168.121.190 │ 3306 │ 0           │ Master, Running │
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
│ server3 │ 192.168.121.112 │ 3306 │ 0           │ Slave, Running  │
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
│ server4 │ 192.168.121.201 │ 3306 │ 0           │ Slave, Running  │
└─────────┴─────────────────┴──────┴─────────────┴─────────────────┘
```
If `server1` now reappears, it will not be rejoined to the cluster, as
shown by the following output:
```
$ maxctrl list servers
┌─────────┬─────────────────┬──────┬─────────────┬─────────────────┐
│ Server  │ Address         │ Port │ Connections │ State           │
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
│ server1 │ 192.168.121.51  │ 3306 │ 0           │ Running         │
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
│ server2 │ 192.168.121.190 │ 3306 │ 0           │ Master, Running │
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
│ server3 │ 192.168.121.112 │ 3306 │ 0           │ Slave, Running  │
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
│ server4 │ 192.168.121.201 │ 3306 │ 0           │ Slave, Running  │
└─────────┴─────────────────┴──────┴─────────────┴─────────────────┘
```
Had `auto_rejoin=true` been specified in the monitor section, then an
attempt to rejoin `server1` would have been made.

In MaxScale 2.2.1, rejoining cannot be initiated manually, but in a
subsequent version a command to that effect will be provided.

# Automatic Failover

To enable automatic failover, simply add `auto_failover=true` to the
monitor section in the configuration file.
```
[TheMonitor]
type=monitor
module=mariadbmon
servers=server1,server2,server3,server4
auto_failover=true
...
```
When everything is running fine, the cluster state looks as follows:
```
$ maxctrl list servers
┌─────────┬─────────────────┬──────┬─────────────┬─────────────────┐
│ Server  │ Address         │ Port │ Connections │ State           │
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
│ server1 │ 192.168.121.51  │ 3306 │ 0           │ Master, Running │
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
│ server2 │ 192.168.121.190 │ 3306 │ 0           │ Slave, Running  │
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
│ server3 │ 192.168.121.112 │ 3306 │ 0           │ Slave, Running  │
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
│ server4 │ 192.168.121.201 │ 3306 │ 0           │ Slave, Running  │
└─────────┴─────────────────┴──────┴─────────────┴─────────────────┘
```
If `server1` now goes down, failover will automatically be performed and
an existing slave promoted to new master.
```
$ maxctrl list servers
┌─────────┬─────────────────┬──────┬─────────────┬────────────────────────┐
│ Server  │ Address         │ Port │ Connections │ State                  │
├─────────┼─────────────────┼──────┼─────────────┼────────────────────────┤
│ server1 │ 192.168.121.51  │ 3306 │ 0           │ Down                   │
├─────────┼─────────────────┼──────┼─────────────┼────────────────────────┤
│ server2 │ 192.168.121.190 │ 3306 │ 0           │ Master, Slave, Running │
├─────────┼─────────────────┼──────┼─────────────┼────────────────────────┤
│ server3 │ 192.168.121.112 │ 3306 │ 0           │ Slave, Running         │
├─────────┼─────────────────┼──────┼─────────────┼────────────────────────┤
│ server4 │ 192.168.121.201 │ 3306 │ 0           │ Slave, Running         │
└─────────┴─────────────────┴──────┴─────────────┴────────────────────────┘
```
If you are continuously monitoring the server states, you may notice for a
brief period that the state of `server1` is _Down_ and the state of
`server2` is still _Slave, Running_.
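
Because of that brief intermediate state, a script that reacts to a failover should poll until a new master actually appears instead of trusting a single snapshot. The following is a rough sketch of such a loop. The function `wait_for_new_master` and the `fetch_states` callback (which is assumed to return a mapping from server name to a set of state flags) are hypothetical, and the snapshots below are in-memory stand-ins for real monitor output.

```python
import time


def wait_for_new_master(fetch_states, old_master, timeout=30.0, interval=1.0,
                        sleep=time.sleep, clock=time.monotonic):
    """Poll until a server other than old_master is Master; return its name or None."""
    deadline = clock() + timeout
    while True:
        states = fetch_states()
        for name, flags in states.items():
            if name != old_master and "Master" in flags:
                return name
        if clock() >= deadline:
            return None  # gave up: no new master within the timeout
        sleep(interval)


# In-memory snapshots that imitate the transition described above.
snapshots = iter([
    {"server1": {"Down"}, "server2": {"Slave", "Running"}},
    {"server1": {"Down"}, "server2": {"Master", "Running"}},
])
print(wait_for_new_master(lambda: next(snapshots), "server1", interval=0.0))
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting.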

# Rejoin

To enable automatic rejoin, simply add `auto_rejoin=true` to the
monitor section in the configuration file.
```
[TheMonitor]
type=monitor
module=mariadbmon
servers=server1,server2,server3,server4
auto_rejoin=true
...
```
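
Since both `auto_failover` and `auto_rejoin` are off unless set, it can be useful to check which of them a given monitor section actually enables. A minimal sketch, assuming the monitor section can be read as INI text with Python's `configparser`; the helper name `automatic_features` is hypothetical, and the elided `...` line is left out of the sample because `configparser` would reject it.

```python
import configparser

SAMPLE = """
[TheMonitor]
type=monitor
module=mariadbmon
servers=server1,server2,server3,server4
auto_rejoin=true
"""


def automatic_features(config_text, monitor):
    """Report which automatic operations are enabled, defaulting to disabled."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    section = parser[monitor]
    return {
        "auto_failover": section.getboolean("auto_failover", fallback=False),
        "auto_rejoin": section.getboolean("auto_rejoin", fallback=False),
    }


print(automatic_features(SAMPLE, "TheMonitor"))
```

For the sample above only `auto_rejoin` is reported as enabled, so a failed master would be rejoined when it reappears, but the failover itself would still have to be triggered manually.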

When automatic rejoin is enabled, the MariaDB Monitor will attempt to
rejoin a failed master as a slave, if it reappears.

When everything is running fine, the cluster state looks as follows:
```
$ maxctrl list servers
┌─────────┬─────────────────┬──────┬─────────────┬─────────────────┐
│ Server  │ Address         │ Port │ Connections │ State           │
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
│ server1 │ 192.168.121.51  │ 3306 │ 0           │ Master, Running │
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
│ server2 │ 192.168.121.190 │ 3306 │ 0           │ Slave, Running  │
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
│ server3 │ 192.168.121.112 │ 3306 │ 0           │ Slave, Running  │
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
│ server4 │ 192.168.121.201 │ 3306 │ 0           │ Slave, Running  │
└─────────┴─────────────────┴──────┴─────────────┴─────────────────┘
```
Assuming `auto_failover=true` has been specified in the configuration
file, when `server1` goes down for some reason, failover will be performed
and we end up with the following cluster state:
```
$ maxctrl list servers
┌─────────┬─────────────────┬──────┬─────────────┬─────────────────┐
│ Server  │ Address         │ Port │ Connections │ State           │
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
│ server1 │ 192.168.121.51  │ 3306 │ 0           │ Down            │
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
│ server2 │ 192.168.121.190 │ 3306 │ 0           │ Master, Running │
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
│ server3 │ 192.168.121.112 │ 3306 │ 0           │ Slave, Running  │
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
│ server4 │ 192.168.121.201 │ 3306 │ 0           │ Slave, Running  │
└─────────┴─────────────────┴──────┴─────────────┴─────────────────┘
```
If `server1` now reappears, the MariaDB Monitor will detect that and
attempt to rejoin the old master as a slave.

Whether rejoining will succeed depends upon the actual state of the old
master. For instance, if the old master was modified and the changes had
not been replicated to the new master before the old master went down,
then automatic rejoin will not be possible.
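
The reasoning behind that condition can be illustrated with MariaDB GTID positions, which have the form `domain-server-sequence`. The sketch below is a deliberate simplification and not the monitor's actual check: it assumes you somehow know the position the new master had reached when it was promoted, and it only compares sequence numbers per replication domain. The function names and the example positions are hypothetical.

```python
def parse_gtid_pos(text):
    """Parse 'domain-server-seq[,domain-server-seq...]' into {domain: seq}."""
    position = {}
    for triplet in text.split(","):
        triplet = triplet.strip()
        if not triplet:
            continue
        domain, _server, seq = (int(part) for part in triplet.split("-"))
        position[domain] = seq
    return position


def old_master_has_unreplicated_events(old_master_pos, promotion_pos):
    """True if the old master is ahead of what the new master had at promotion."""
    old = parse_gtid_pos(old_master_pos)
    promoted = parse_gtid_pos(promotion_pos)
    return any(seq > promoted.get(domain, -1) for domain, seq in old.items())


# Old master wrote up to sequence 105, but only 103 had reached the new master.
print(old_master_has_unreplicated_events("0-1-105", "0-1-103"))
# Everything the old master wrote had been replicated before it went down.
print(old_master_has_unreplicated_events("0-1-103", "0-1-103"))
```

In the first case the old master holds events the rest of the cluster never saw, which is the situation in which automatic rejoin is not possible.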

If rejoining can be performed, then the cluster state will end up looking
like:
```
$ maxctrl list servers
┌─────────┬─────────────────┬──────┬─────────────┬─────────────────┐
│ Server  │ Address         │ Port │ Connections │ State           │
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
│ server1 │ 192.168.121.51  │ 3306 │ 0           │ Slave, Running  │
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
│ server2 │ 192.168.121.190 │ 3306 │ 0           │ Master, Running │
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
│ server3 │ 192.168.121.112 │ 3306 │ 0           │ Slave, Running  │
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
│ server4 │ 192.168.121.201 │ 3306 │ 0           │ Slave, Running  │
└─────────┴─────────────────┴──────┴─────────────┴─────────────────┘
```

# Switchover

Switchover is for cases when you explicitly want to move the master
role from one server to another.

If we continue from the cluster state at the end of the previous example
and want to make `server1` master again, then we must issue the following
command:
```
$ maxctrl call command mariadbmon switchover TheMonitor server1 server2
OK
```
There are quite a few arguments, so let's look at each one separately:

* `call command` indicates that it is a module command that is to be
  invoked,
* `mariadbmon` indicates the module whose command we want to invoke,
* `switchover` is the command we want to invoke,
* `TheMonitor` is the first argument to the command, the name of the monitor
  as specified in the configuration file,
* `server1` is the second argument to the command, the name of the server we
  want to make into _master_, and
* `server2` is the third argument to the command, the name of the _current_
  _master_.
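
The argument order (new master first, current master second) is easy to get backwards, so a script may want to derive the current master from the observed states instead of hard-coding it. A sketch under that assumption follows; the helper names are hypothetical, and the `states` mapping stands in for parsed `maxctrl list servers` output.

```python
def current_master(states):
    """Return the single server flagged Master, or raise if there is not exactly one."""
    masters = [name for name, flags in states.items() if "Master" in flags]
    if len(masters) != 1:
        raise ValueError(f"expected exactly one master, found {masters}")
    return masters[0]


def switchover_command(monitor, new_master, states, module="mariadbmon"):
    """Build the switchover argument list: monitor, new master, current master."""
    old_master = current_master(states)
    if new_master == old_master:
        raise ValueError(f"{new_master} is already the master")
    return ["maxctrl", "call", "command", module, "switchover",
            monitor, new_master, old_master]


# Cluster state at the end of the rejoin example above.
states = {
    "server1": {"Slave", "Running"},
    "server2": {"Master", "Running"},
    "server3": {"Slave", "Running"},
    "server4": {"Slave", "Running"},
}
print(" ".join(switchover_command("TheMonitor", "server1", states)))
```

For the state shown, this reproduces the command line used in this section.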

If the command executes successfully, we will end up with the following
cluster state:
```
$ maxctrl list servers
┌─────────┬─────────────────┬──────┬─────────────┬─────────────────┐
│ Server  │ Address         │ Port │ Connections │ State           │
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
│ server1 │ 192.168.121.51  │ 3306 │ 0           │ Master, Running │
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
│ server2 │ 192.168.121.190 │ 3306 │ 0           │ Slave, Running  │
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
│ server3 │ 192.168.121.112 │ 3306 │ 0           │ Slave, Running  │
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
│ server4 │ 192.168.121.201 │ 3306 │ 0           │ Slave, Running  │
└─────────┴─────────────────┴──────┴─────────────┴─────────────────┘
```