299 lines
22 KiB
Markdown
299 lines
22 KiB
Markdown
# Automatic Failover With MariaDB Monitor
|
|
|
|
The [MariaDB Monitor](../Monitors/MariaDB-Monitor.md) is not only capable
|
|
of monitoring the state of a MariaDB master-slave cluster but is also
|
|
capable of performing _failover_ and _switchover_. In addition, in some
|
|
circumstances it is capable of _rejoining_ a master that has gone down and
|
|
later reappears.
|
|
|
|
Note that the failover (and switchover and rejoin) functionality is only
|
|
supported in conjunction with GTID-based replication and initially only
|
|
for simple topologies, that is, 1 master and several slaves.
|
|
|
|
The failover, switchover and rejoin functionality are inherent parts of
|
|
the _MariaDB Monitor_, but neither automatic failover nor automatic rejoin
|
|
are enabled by default.
|
|
|
|
The following examples have been written with the assumption that there
|
|
are four servers - `server1`, `server2`, `server3` and `server4` - of
|
|
which `server1` is the initial master and the other servers are slaves.
|
|
In addition there is a monitor called _TheMonitor_ that monitors those
|
|
servers.
|
|
|
|
Somewhat simplified, the MaxScale configuration file would look like:
|
|
```
|
|
[server1]
|
|
type=server
|
|
address=192.168.121.51
|
|
port=3306
|
|
protocol=MariaDBBackend
|
|
|
|
[server2]
|
|
...
|
|
|
|
[server3]
|
|
...
|
|
|
|
[server4]
|
|
...
|
|
|
|
[TheMonitor]
|
|
type=monitor
|
|
module=mariadbmon
|
|
servers=server1,server2,server3,server4
|
|
...
|
|
```
|
|
# Manual Failover
|
|
If everything is in order, the state of the cluster will look something
|
|
like this:
|
|
```
|
|
$ maxctrl list servers
|
|
┌─────────┬─────────────────┬──────┬─────────────┬─────────────────┐
|
|
│ Server │ Address │ Port │ Connections │ State │
|
|
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
|
|
│ server1 │ 192.168.121.51 │ 3306 │ 0 │ Master, Running │
|
|
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
|
|
│ server2 │ 192.168.121.190 │ 3306 │ 0 │ Slave, Running │
|
|
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
|
|
│ server3 │ 192.168.121.112 │ 3306 │ 0 │ Slave, Running │
|
|
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
|
|
│ server4 │ 192.168.121.201 │ 3306 │ 0 │ Slave, Running │
|
|
└─────────┴─────────────────┴──────┴─────────────┴─────────────────┘
|
|
```
|
|
If the master now for any reason goes down, then the cluster state will
|
|
look like this:
|
|
```
|
|
$ maxctrl list servers
|
|
┌─────────┬─────────────────┬──────┬─────────────┬────────────────┐
|
|
│ Server │ Address │ Port │ Connections │ State │
|
|
├─────────┼─────────────────┼──────┼─────────────┼────────────────┤
|
|
│ server1 │ 192.168.121.51 │ 3306 │ 0 │ Down │
|
|
├─────────┼─────────────────┼──────┼─────────────┼────────────────┤
|
|
│ server2 │ 192.168.121.190 │ 3306 │ 0 │ Slave, Running │
|
|
├─────────┼─────────────────┼──────┼─────────────┼────────────────┤
|
|
│ server3 │ 192.168.121.112 │ 3306 │ 0 │ Slave, Running │
|
|
├─────────┼─────────────────┼──────┼─────────────┼────────────────┤
|
|
│ server4 │ 192.168.121.201 │ 3306 │ 0 │ Slave, Running │
|
|
└─────────┴─────────────────┴──────┴─────────────┴────────────────┘
|
|
```
|
|
Note that the status for `server1` is _Down_.
|
|
|
|
Since failover is by default _not_ enabled, the failover mechanism must be
|
|
invoked manually:
|
|
```
|
|
$ maxctrl call command mariadbmon failover TheMonitor
|
|
OK
|
|
```
|
|
There are quite a few arguments, so let's look at each one separately
|
|
* `call command` indicates that it is a module command that is to be
|
|
invoked,
|
|
* `mariadbmon` indicates the module whose command we want to invoke (that
|
|
is the MariaDB Monitor),
|
|
* `failover` is the command we want to invoke, and
|
|
* `TheMonitor` is the first and only argument to that command, the name of
|
|
the monitor as specified in the configuration file.
|
|
|
|
The MariaDB Monitor will now autonomously deduce which slave is the most
|
|
appropriate one to be promoted to master, promote it to master and modify
|
|
the other slaves accordingly.
|
|
|
|
If we now check the cluster state we will see that one of the remaining
|
|
slaves has been made into master.
|
|
|
|
```
|
|
$ maxctrl list servers
|
|
┌─────────┬─────────────────┬──────┬─────────────┬─────────────────┐
|
|
│ Server │ Address │ Port │ Connections │ State │
|
|
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
|
|
│ server1 │ 192.168.121.51 │ 3306 │ 0 │ Down │
|
|
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
|
|
│ server2 │ 192.168.121.190 │ 3306 │ 0 │ Master, Running │
|
|
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
|
|
│ server3 │ 192.168.121.112 │ 3306 │ 0 │ Slave, Running │
|
|
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
|
|
│ server4 │ 192.168.121.201 │ 3306 │ 0 │ Slave, Running │
|
|
└─────────┴─────────────────┴──────┴─────────────┴─────────────────┘
|
|
```
|
|
If `server1` now reappears, it will not be rejoined to the cluster, as
|
|
shown by the following output:
|
|
```
|
|
$ maxctrl list servers
|
|
┌─────────┬─────────────────┬──────┬─────────────┬─────────────────┐
|
|
│ Server │ Address │ Port │ Connections │ State │
|
|
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
|
|
│ server1 │ 192.168.121.51 │ 3306 │ 0 │ Running │
|
|
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
|
|
│ server2 │ 192.168.121.190 │ 3306 │ 0 │ Master, Running │
|
|
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
|
|
│ server3 │ 192.168.121.112 │ 3306 │ 0 │ Slave, Running │
|
|
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
|
|
│ server4 │ 192.168.121.201 │ 3306 │ 0 │ Slave, Running │
|
|
└─────────┴─────────────────┴──────┴─────────────┴─────────────────┘
|
|
```
|
|
Had `auto_rejoin=true` been specified in the monitor section, then an
|
|
attempt to rejoin `server1` would have been made.
|
|
|
|
In MaxScale 2.2.1, rejoining cannot be initiated manually, but in a
|
|
subsequent version a command to that effect will be provided.
|
|
|
|
# Automatic Failover
|
|
|
|
To enable automatic failover, simply add `auto_failover=true` to the
|
|
monitor section in the configuration file.
|
|
```
|
|
[TheMonitor]
|
|
type=monitor
|
|
module=mariadbmon
|
|
servers=server1,server2,server3,server4
|
|
auto_failover=true
|
|
...
|
|
```
|
|
When everything is running fine, the cluster state looks like follows:
|
|
```
|
|
$ maxctrl list servers
|
|
┌─────────┬─────────────────┬──────┬─────────────┬─────────────────┐
|
|
│ Server │ Address │ Port │ Connections │ State │
|
|
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
|
|
│ server1 │ 192.168.121.51 │ 3306 │ 0 │ Master, Running │
|
|
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
|
|
│ server2 │ 192.168.121.190 │ 3306 │ 0 │ Slave, Running │
|
|
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
|
|
│ server3 │ 192.168.121.112 │ 3306 │ 0 │ Slave, Running │
|
|
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
|
|
│ server4 │ 192.168.121.201 │ 3306 │ 0 │ Slave, Running │
|
|
└─────────┴─────────────────┴──────┴─────────────┴─────────────────┘
|
|
```
|
|
If `server1` now goes down, failover will automatically be performed and
|
|
an existing slave promoted to new master.
|
|
```
|
|
$ maxctrl list servers
|
|
┌─────────┬─────────────────┬──────┬─────────────┬────────────────────────┐
|
|
│ Server │ Address │ Port │ Connections │ State │
|
|
├─────────┼─────────────────┼──────┼─────────────┼────────────────────────┤
|
|
│ server1 │ 192.168.121.51 │ 3306 │ 0 │ Down │
|
|
├─────────┼─────────────────┼──────┼─────────────┼────────────────────────┤
|
|
│ server2 │ 192.168.121.190 │ 3306 │ 0 │ Master, Slave, Running │
|
|
├─────────┼─────────────────┼──────┼─────────────┼────────────────────────┤
|
|
│ server3 │ 192.168.121.112 │ 3306 │ 0 │ Slave, Running │
|
|
├─────────┼─────────────────┼──────┼─────────────┼────────────────────────┤
|
|
│ server4 │ 192.168.121.201 │ 3306 │ 0 │ Slave, Running │
|
|
└─────────┴─────────────────┴──────┴─────────────┴────────────────────────┘
|
|
```
|
|
If you are continuously monitoring the server states, you may notice for a
|
|
brief period that the state of `server1` is _Down_ and the state of
|
|
`server2` is still _Slave, Running_.
|
|
|
|
# Rejoin
|
|
|
|
To enable automatic rejoin, simply add `auto_rejoin=true` to the
|
|
monitor secion in the configuration file.
|
|
```
|
|
[TheMonitor]
|
|
type=monitor
|
|
module=mariadbmon
|
|
servers=server1,server2,server3,server4
|
|
auto_rejoin=true
|
|
...
|
|
```
|
|
|
|
When automatic rejoin is enabled, the MariaDB Monitor will attempt to
|
|
rejoin a failed master as a slave, if it reappears.
|
|
|
|
When everything is running fine, the cluster state looks like follows:
|
|
```
|
|
$ maxctrl list servers
|
|
┌─────────┬─────────────────┬──────┬─────────────┬─────────────────┐
|
|
│ Server │ Address │ Port │ Connections │ State │
|
|
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
|
|
│ server1 │ 192.168.121.51 │ 3306 │ 0 │ Master, Running │
|
|
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
|
|
│ server2 │ 192.168.121.190 │ 3306 │ 0 │ Slave, Running │
|
|
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
|
|
│ server3 │ 192.168.121.112 │ 3306 │ 0 │ Slave, Running │
|
|
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
|
|
│ server4 │ 192.168.121.201 │ 3306 │ 0 │ Slave, Running │
|
|
└─────────┴─────────────────┴──────┴─────────────┴─────────────────┘
|
|
```
|
|
Assuming `auto_failover=true` has been specified in the configuration
|
|
file, when `server1` goes down for some reason, failover will be performed
|
|
and we end up with the following cluster state:
|
|
```
|
|
$ maxctrl list servers
|
|
┌─────────┬─────────────────┬──────┬─────────────┬─────────────────┐
|
|
│ Server │ Address │ Port │ Connections │ State │
|
|
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
|
|
│ server1 │ 192.168.121.51 │ 3306 │ 0 │ Down │
|
|
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
|
|
│ server2 │ 192.168.121.190 │ 3306 │ 0 │ Master, Running │
|
|
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
|
|
│ server3 │ 192.168.121.112 │ 3306 │ 0 │ Slave, Running │
|
|
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
|
|
│ server4 │ 192.168.121.201 │ 3306 │ 0 │ Slave, Running │
|
|
└─────────┴─────────────────┴──────┴─────────────┴─────────────────┘
|
|
```
|
|
If `server1` now reappears, the MariaDB Monitor will detect that and
|
|
attempt to rejoin the old master as a slave.
|
|
|
|
Whether rejoining will succeed depends upon the actual state of the old
|
|
master. For instance, if the old master was modified and the changes had
|
|
not been replicated to the new master, before the old master went down,
|
|
then automatic rejoin will not be possible.
|
|
|
|
If rejoining can be performed, then the cluster state will end up looking
|
|
like:
|
|
```
|
|
$ maxctrl list servers
|
|
┌─────────┬─────────────────┬──────┬─────────────┬─────────────────┐
|
|
│ Server │ Address │ Port │ Connections │ State │
|
|
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
|
|
│ server1 │ 192.168.121.51 │ 3306 │ 0 │ Slave, Running │
|
|
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
|
|
│ server2 │ 192.168.121.190 │ 3306 │ 0 │ Master, Running │
|
|
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
|
|
│ server3 │ 192.168.121.112 │ 3306 │ 0 │ Slave, Running │
|
|
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
|
|
│ server4 │ 192.168.121.201 │ 3306 │ 0 │ Slave, Running │
|
|
└─────────┴─────────────────┴──────┴─────────────┴─────────────────┘
|
|
```
|
|
|
|
# Switchover
|
|
|
|
Switchover is for cases when you explicitly want to move the master
|
|
role from one server to another.
|
|
|
|
If we continue from the cluster state at the end of the previous example
|
|
and want to make `server1` master again, then we must issue the following
|
|
command:
|
|
```
|
|
$ maxctrl call command mariadbmon switchover TheMonitor server1 server2
|
|
OK
|
|
```
|
|
There are quite a few arguments, so let's look at each one separately
|
|
* `call command` indicates that it is a module command that is to be
|
|
invoked,
|
|
* `mariadbmon` indicates the module whose command we want to invoke,
|
|
* `switchover` is the command we want to invoke, and
|
|
* `TheMonitor` is the first argument to the command, the name of the monitor
|
|
as specified in the configuration file,
|
|
* `server1` is the second argument to the command, the name of the server we
|
|
want to make into _master_, and
|
|
* `server2` is the third argument to the command, the name of the _current_
|
|
_master_.
|
|
|
|
If the command executes successfully, we will end up with the following
|
|
cluster state:
|
|
```
|
|
$ maxctrl list servers
|
|
┌─────────┬─────────────────┬──────┬─────────────┬─────────────────┐
|
|
│ Server │ Address │ Port │ Connections │ State │
|
|
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
|
|
│ server1 │ 192.168.121.51 │ 3306 │ 0 │ Master, Running │
|
|
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
|
|
│ server2 │ 192.168.121.190 │ 3306 │ 0 │ Slave, Running │
|
|
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
|
|
│ server3 │ 192.168.121.112 │ 3306 │ 0 │ Slave, Running │
|
|
├─────────┼─────────────────┼──────┼─────────────┼─────────────────┤
|
|
│ server4 │ 192.168.121.201 │ 3306 │ 0 │ Slave, Running │
|
|
└─────────┴─────────────────┴──────┴─────────────┴─────────────────┘
|
|
```
|