Add Keepalived tutorial
@@ -37,6 +37,7 @@ These tutorials are for specific use cases and module combinations.

- [Administration Tutorial](Tutorials/Administration-Tutorial.md)
- [Avro Router Tutorial](Tutorials/Avrorouter-Tutorial.md)
- [Failover with Keepalived](Tutorials/Failover-with-Keepalived.md)
- [Filter Tutorial](Tutorials/Filter-Tutorial.md)
- [Galera Cluster Connection Routing Tutorial](Tutorials/Galera-Cluster-Connection-Routing-Tutorial.md)
- [Galera Cluster Read Write Splitting Tutorial](Tutorials/Galera-Cluster-Read-Write-Splitting-Tutorial.md)

190
Documentation/Tutorials/Failover-with-Keepalived.md
Normal file
@@ -0,0 +1,190 @@

# Failover with Keepalived

## Introduction

[Keepalived](http://www.keepalived.org/index.html) is routing software for
load balancing and high availability. It has several applications, but for this
tutorial the goal is to set up a simple IP failover between two servers running
MaxScale. If the main server fails, the backup machine takes over and receives
any new connections. The Keepalived settings used in this tutorial follow the
example given in [Simple keepalived failover setup on Ubuntu 14.04](
https://raymii.org/s/tutorials/Keepalived-Simple-IP-failover-on-Ubuntu.html).

Two hosts and one client machine are used, all in the same LAN. The hosts run
MaxScale and Keepalived. The backend servers may be running on one of the hosts,
e.g. in Docker containers, or on separate machines for a more realistic setup.
Clients connect to the virtual IP (VIP), which is claimed by the current master
host.



Once configured and running, the different Keepalived nodes continuously
broadcast their status to the network and listen for each other. If a node does
not receive a status message from another node with a higher priority than
itself, it will claim the VIP, effectively becoming the master. Thus, a node can
be put online or removed by starting and stopping the Keepalived service.

If the current master node is removed (e.g. by stopping the service or pulling
the network cable) the remaining nodes will quickly elect a new master and
future traffic to the VIP will be directed to that node. Any connections to the
old master node will naturally break. If the old master comes back online, it
will again claim the VIP, breaking any connections to the backup machine.

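
The election outcome described above can be illustrated with a small sketch.
It is a simplified model and not the VRRP protocol itself: it only captures
that, among the nodes still advertising, the one with the highest priority
ends up holding the VIP. The host name `maxscale1` is hypothetical and ties
between equal priorities are not modelled.

```bash
#!/bin/bash
# Simplified model of the election, not the VRRP protocol itself.
# Input (stdin): one "name priority" pair per line for each live node.
# Output: the name of the node with the highest priority.
elect_master() {
    sort -k2,2nr | head -n 1 | awk '{ print $1 }'
}

# Priorities are taken from the example configurations in this tutorial.
printf '%s\n' "maxscale1 150" "maxscale2 100" | elect_master   # both nodes alive
printf '%s\n' "maxscale2 100" | elect_master                   # primary removed
```

With both nodes alive the sketch prints `maxscale1`; once the primary is
removed from the input it prints `maxscale2`.
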
MaxScale has no knowledge of this event happening. Both MaxScale instances are
running normally, monitoring the backend servers and listening for client
connections. Since clients are connecting through the VIP, only the machine
claiming the VIP will receive incoming connections. The connections between
MaxScale and the backends are using real IPs and are unaffected by the VIP.

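
To see which host currently holds the VIP, one option is to look for the
address in the output of `ip -o -4 addr show`. The helper below is a minimal
sketch: the function name `holds_vip` is made up for this example, and it
assumes the one-line-per-address format printed by the `-o` option.

```bash
#!/bin/bash
# Succeeds (exit 0) if the given address appears in `ip -o -4 addr show`
# style output read from stdin, fails (exit 1) otherwise.
# In that format the fourth field is "address/prefix".
holds_vip() {
    local vip="$1"
    awk -v vip="$vip" '
        { split($4, parts, "/"); if (parts[1] == vip) found = 1 }
        END { exit (found ? 0 : 1) }'
}
```

On a host, it could be used as
`ip -o -4 addr show | holds_vip 192.168.1.123 && echo "this host holds the VIP"`,
where the address is the VIP used in the example configurations of this
tutorial.
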
## Configuration

MaxScale does not require any specific configuration to work with Keepalived in
this simple setup; it only needs to be running on both hosts. The MaxScale
configurations should be similar to the extent that both look identical to
connecting clients. In practice the listening ports and related services should
be the same. Setting the service-level setting *version_string* to different
values on the MaxScale nodes is recommended, as it will be printed to any
connecting clients, indicating which node was connected to.

Keepalived requires specific setups on both machines. On the **primary host**,
the */etc/keepalived/keepalived.conf*-file should be as follows.

```
vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 150
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass mypass
    }
    virtual_ipaddress {
        192.168.1.123
    }
}
```

The *state* must be MASTER on both hosts. *virtual_router_id* and *auth_pass*
must be identical on all hosts. The *interface* defines the network interface
used. This depends on the system, but often the correct value is *eth0*,
*enp0s12f3* or similar. *priority* defines the voting strength between different
Keepalived instances when negotiating which one should be the master. The
instances should have different values of priority. In this example, the backup
host(s) could have priority 149, 148 and so on. *advert_int* is the interval
at which a host "advertises" its existence to the other Keepalived hosts. One
second is a reasonable value.

*virtual_ipaddress* (VIP) is the IP the different Keepalived hosts try to claim
and must be identical between the hosts. For IP negotiation to work, the VIP
must be in the local network address space and unclaimed by any other machine
in the LAN. An example *keepalived.conf*-file for a **backup host** is listed
below.

```
vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass mypass
    }
    virtual_ipaddress {
        192.168.1.123
    }
}
```

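
The rules above (identical *virtual_router_id*, *auth_pass* and VIP, different
*priority*) can be checked mechanically before starting the service. The
following is a rough sketch, not a Keepalived tool: the function names are
made up for this example, and the parsing assumes the simple layout used in
this tutorial, with one `key value` pair per line and a single address inside
the *virtual_ipaddress* block.

```bash
#!/bin/bash
# Print the value of a simple "key value" line from a keepalived.conf file.
conf_value() {
    awk -v key="$2" '$1 == key { print $2; exit }' "$1"
}

# Print the first address listed inside the virtual_ipaddress block.
conf_vip() {
    awk '$1 == "virtual_ipaddress" { getline; print $1; exit }' "$1"
}

# Compare two configuration files against the rules stated in the text.
# Prints every violation found; returns 0 if consistent, 1 otherwise.
check_pair() {
    local a="$1" b="$2" key status=0
    for key in virtual_router_id auth_pass; do
        if [ "$(conf_value "$a" "$key")" != "$(conf_value "$b" "$key")" ]; then
            echo "$key differs between $a and $b"
            status=1
        fi
    done
    if [ "$(conf_vip "$a")" != "$(conf_vip "$b")" ]; then
        echo "virtual_ipaddress differs between $a and $b"
        status=1
    fi
    if [ "$(conf_value "$a" priority)" = "$(conf_value "$b" priority)" ]; then
        echo "priority should differ between $a and $b"
        status=1
    fi
    return "$status"
}
```

With copies of both files on one machine, `check_pair primary.conf backup.conf`
(hypothetical file names) prints nothing and returns 0 when the two example
configurations above are used.
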
Once the Keepalived service is running, recent log entries can be printed with
the command `service keepalived status`.

```
Aug 11 10:27:59 maxscale2 Keepalived_vrrp[27369]: VRRP_Instance(VI_1) Received higher prio advert
Aug 11 10:27:59 maxscale2 Keepalived_vrrp[27369]: VRRP_Instance(VI_1) Entering BACKUP STATE
Aug 11 10:27:59 maxscale2 Keepalived_vrrp[27369]: VRRP_Instance(VI_1) removing protocol VIPs.
```

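
When following a failover test, it can be handy to reduce such log output to
the most recent state transition. The filter below is a small sketch that
assumes the `Entering <STATE> STATE` message format shown in the log excerpt
above; other Keepalived versions may word these messages differently. The
function name `last_vrrp_state` is made up for this example.

```bash
#!/bin/bash
# Read Keepalived log lines from stdin and print "<instance> <state>" for
# the last "Entering ... STATE" message, e.g. "VI_1 BACKUP".
# Prints nothing if no such message is present.
last_vrrp_state() {
    sed -n 's/.*VRRP_Instance(\([^)]*\)) Entering \([A-Z]*\) STATE.*/\1 \2/p' |
        tail -n 1
}
```

For instance, `service keepalived status | last_vrrp_state` would print
`VI_1 BACKUP` for the log excerpt above.
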
## MaxScale health check

So far, none of this tutorial has been MaxScale-specific and the health of the
MaxScale process has been ignored. To ensure that MaxScale is running on the
current master host, a *check script* should be set. Keepalived runs the script
regularly and if the script returns an error value, the Keepalived node
assumes that it has failed, stops broadcasting its state and relinquishes the
VIP. This allows another node to take the master status and claim the VIP. To
define a check script, modify the configuration as follows. The example is for
the primary node. See [Keepalived Check and Notify Scripts](
https://tobrunet.ch/2013/07/keepalived-check-and-notify-scripts/) for more
information.

```
vrrp_script chk_myscript {
    script "/home/scripts/is_maxscale_running.sh"
    interval 2 # check every 2 seconds
    fall 2 # require 2 failures for KO
    rise 2 # require 2 successes for OK
}

vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 150
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass mypass
    }
    virtual_ipaddress {
        192.168.1.123
    }
    track_script {
        chk_myscript
    }
}
```

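
The effect of the *fall* and *rise* settings can be illustrated with a small
simulation. This is a simplified model of the counters as described by the
comments in the configuration above, not Keepalived's implementation: it
assumes the script starts in the OK state and that only consecutive results
count. The function name `simulate_checks` is made up for this example.

```bash
#!/bin/bash
# Simplified model of the fall/rise counters, not Keepalived's own code.
# Arguments: fall threshold, rise threshold, then a sequence of check
# script exit codes. Prints the resulting state (OK or KO) after each check.
simulate_checks() {
    local fall="$1" rise="$2" state="OK" fails=0 passes=0 code
    shift 2
    for code in "$@"; do
        if [ "$code" -eq 0 ]; then
            passes=$((passes + 1)); fails=0
            # Recover only after "rise" consecutive successes.
            if [ "$state" = "KO" ] && [ "$passes" -ge "$rise" ]; then state="OK"; fi
        else
            fails=$((fails + 1)); passes=0
            # Fail only after "fall" consecutive failures.
            if [ "$state" = "OK" ] && [ "$fails" -ge "$fall" ]; then state="KO"; fi
        fi
        echo "$state"
    done
}

# fall=2, rise=2: a single failure is tolerated, two in a row are not.
simulate_checks 2 2 0 3 0 3 3 0 0
```

The call prints `OK OK OK OK KO KO OK`, one state per line. With the example
values, a MaxScale failure would therefore be acted on after roughly two
failed checks, that is, about four seconds at an interval of two seconds.
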
An example script, *is_maxscale_running.sh*, is listed below. The script uses
MaxAdmin to try to contact the locally running MaxScale and request a server
list, then checks that the list has at least some expected elements. The timeout
command ensures the MaxAdmin call exits in reasonable time. The script detects
if MaxScale has crashed, is stuck or is totally overburdened and no longer
responds to connections.

```
#!/bin/bash
# Health check for Keepalived: exit 0 if the local MaxScale responds and
# lists the expected servers, exit 3 otherwise.
fileName="maxadmin_output.txt"
# -f avoids an error message when the file does not exist yet.
rm -f "$fileName"
# Give MaxAdmin at most two seconds to return the server list.
timeout 2s maxadmin list servers > "$fileName"
to_result=$?
if [ "$to_result" -ge 1 ]
then
    echo "Timed out or error, timeout returned $to_result"
    exit 3
else
    echo "MaxAdmin success, rval is $to_result"
    echo "Checking maxadmin output sanity"
    # Both expected server names must appear in the output.
    grep1=$(grep server1 "$fileName")
    grep2=$(grep server2 "$fileName")

    if [ "$grep1" ] && [ "$grep2" ]
    then
        echo "All is fine"
        exit 0
    else
        echo "Something is wrong"
        exit 3
    fi
fi
```

If the check script fails, Keepalived logs the transition to the FAULT state
and the removal of the VIP:

```
Aug 11 10:51:56 maxscale2 Keepalived_vrrp[20257]: VRRP_Script(chk_myscript) failed
Aug 11 10:51:57 maxscale2 Keepalived_vrrp[20257]: VRRP_Instance(VI_1) Entering FAULT STATE
Aug 11 10:51:57 maxscale2 Keepalived_vrrp[20257]: VRRP_Instance(VI_1) removing protocol VIPs.
Aug 11 10:51:57 maxscale2 Keepalived_vrrp[20257]: VRRP_Instance(VI_1) Now in FAULT state
```

BIN
Documentation/Tutorials/images/Keepalived.png
Normal file
Binary file not shown. Size: 22 KiB