# Data archiving with Mqfilter and Tee filters

This tutorial gives a quick look into how you can combine various filters to create
systems for archiving data for analysis. The aim of this tutorial is to show
what can be done with MariaDB MaxScale's filters rather than demonstrate a proven method
of archiving data. For this tutorial you will need two MariaDB servers, one for
archiving the data and one for actual use, a RabbitMQ server and a MariaDB MaxScale server.
For testing purposes some of these can be located on the same server but for actual
use, an HA solution is recommended.

The use case for this tutorial is a production system with one main server to which all
queries are routed and an archive server to which only INSERT, UPDATE and DELETE statements
are routed. The queries routed to the archive server are also transformed into a canonical
format and sent to a RabbitMQ broker for analysis. This setup allows us to control what we
send to the archive server and could possibly allow us to filter out DELETE statements completely,
making the archive server a true archive of all data.

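To make "canonical format" more concrete: the idea is that literal values are stripped from a statement so that statements with the same shape compare equal. The following Python sketch only illustrates that idea. It assumes that the canonical form replaces string and numeric literals with `?`; the `canonicalize` helper and its regular expressions are hypothetical and are not the code MariaDB MaxScale uses internally.

```python
import re

# Hypothetical illustration only: MaxScale produces the canonical form
# internally. This sketch assumes literals are replaced with "?" and
# handles single-quoted strings and plain decimal numbers, nothing more.
_STRING_LITERAL = re.compile(r"'(?:[^'\\]|\\.)*'")
_NUMBER_LITERAL = re.compile(r"\b\d+(?:\.\d+)?\b")


def canonicalize(statement: str) -> str:
    """Replace quoted strings and numbers in a SQL statement with '?'."""
    without_strings = _STRING_LITERAL.sub("?", statement)
    return _NUMBER_LITERAL.sub("?", without_strings)


if __name__ == "__main__":
    for sql in (
        "insert into test.t1 values(1)",
        "insert into test.t1 values(2)",
        "update test.t1 set name = 'alice' where id = 42",
    ):
        print(canonicalize(sql))
```

Under these assumptions the two INSERT statements reduce to the same text, which is what makes the canonical form convenient for analysis.
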
## Setting up MariaDB MaxScale

The installation of MariaDB MaxScale is covered in the Installation chapter of the [MariaDB MaxScale Tutorial](MaxScale-Tutorial.md).

## Setting up the MariaDB servers

Since the archive server will not replicate from the main server, we don't need to
set up replication between the two. The only thing we need to do is to create the
users we will use for monitoring and authentication.

The process of creating monitoring and authentication users for MariaDB MaxScale is described
in the Creating Database Users section of the
[MariaDB MaxScale Tutorial](MaxScale-Tutorial.md#creating-database-users).

## Setting up RabbitMQ server

To set up the RabbitMQ server, follow the instructions for your OS on the [RabbitMQ website](https://www.rabbitmq.com/download.html).
Useful documentation about access rights can be found on the [Access Control](https://www.rabbitmq.com/access-control.html)
page and for UNIX systems the [`rabbitmqctl` manpage](https://www.rabbitmq.com/man/rabbitmqctl.1.man.html)
has all the needed commands to manage your installation of RabbitMQ.

For this tutorial, we will use a RabbitMQ server installed on CentOS 7 from
the RPM packages. Since CentOS 7 doesn't have the RabbitMQ server in the default
repositories, we will need two extra repositories: the EPEL repository and the Erlang repository.

* [EPEL repositories](https://fedoraproject.org/wiki/EPEL)
* [Erlang repositories](https://www.erlang-solutions.com/resources/download.html)

First we add the EPEL and Erlang repositories for CentOS 7, after which we install the RabbitMQ server:

```
sudo yum -y install https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
sudo yum -y install http://packages.erlang-solutions.com/erlang-solutions-1.0-1.noarch.rpm
sudo yum -y install https://www.rabbitmq.com/releases/rabbitmq-server/v3.5.6/rabbitmq-server-3.5.6-1.noarch.rpm
```

The next step is to start the RabbitMQ server and configure it.

```
sudo systemctl start rabbitmq-server
```

We need to create the `msguser` account which we will use to relay the messages
to the broker. This can be done with the `rabbitmqctl` command.

```
sudo rabbitmqctl add_user "msguser" "msgpwd"
sudo rabbitmqctl set_permissions "msguser" ".*" ".*" ".*"
```

This creates the `msguser` account with `msgpwd` as the password and grants configure, write and read
permissions on the default virtual host `/`.

## Configuration file creation

The final step before testing is the creation of the `maxscale.cnf` file. We start off by adding the two
servers to the configuration file.

```
[production-1]
type=server
address=192.168.0.200
port=3306
protocol=MySQLBackend

[archive-1]
type=server
address=192.168.0.201
port=3000
protocol=MySQLBackend
```

After we have defined the `production-1` and `archive-1` servers, we need a monitor
module for those servers. This module will detect if connectivity to the servers
is lost and notify MariaDB MaxScale of the changed server states.

```
[MySQL Monitor]
type=monitor
module=mysqlmon
servers=production-1, archive-1
user=maxuser
passwd=maxpwd
monitor_interval=5000
```

The monitor will use the user `maxuser` with the password `maxpwd` to connect to
the servers and query them for their state. In the `servers` parameter we have
listed both the `production-1` and `archive-1` servers. All objects in the
MariaDB MaxScale configuration file are referred to by their section names. Here the section
names of the servers are used in the `servers` parameter. The `monitor_interval`
parameter controls how often the monitor will poll the servers for status. For
this tutorial, we've set it to 5000 milliseconds.

Next we configure the Production and Archive services. The Production service will
be the main access point to the system.

```
[Production]
type=service
router=readconnroute
servers=production-1
user=maxuser
passwd=maxpwd
filters=Tee

[Archive]
type=service
router=readconnroute
servers=archive-1
user=maxuser
passwd=maxpwd
filters=MQ Filter
```

The `filters` parameters for the services refer to the filters we will be creating next.
The Production service will use the Tee filter to duplicate INSERT, UPDATE and DELETE
statements to the Archive service. The Archive service will use the MQ Filter to send
the canonical versions of the statements it receives to the RabbitMQ broker.
The Production service will use the `production-1` server and the Archive service will
use the `archive-1` server. Both services use the `maxuser` user with the `maxpwd` password.

Next we will configure the listeners for these two services.

```
[Production Listener]
type=listener
service=Production
protocol=MySQLClient
port=4000

[Archive Listener]
type=listener
service=Archive
protocol=MySQLClient
port=4001
```

The `port` parameter controls which port the listener listens on, which is also the port
clients connect to. The `service` parameter tells which service the listener belongs to.

After the services and their listeners are configured we will configure the two filters we'll use. We
begin with the Tee filter.

```
[Tee]
type=filter
module=tee
service=Archive
match=\(insert\)\|\(update\)\|\(delete\)
```

The `service` parameter controls which service we want to duplicate the statements to.
This needs to be set to the section name of the archive service: `Archive`. The `match`
parameter is a regular expression which, if matched, causes the statement to be duplicated.
Here we want a regular expression which will match the INSERT, UPDATE and DELETE
statements.

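To get a feel for which statements this pattern selects, it can be tried outside MariaDB MaxScale. The sketch below is an approximation in Python: the `match` value escapes its grouping and alternation characters, whereas Python's `re` module writes them unescaped, so `PATTERN` is a hand-translated equivalent. It also assumes the filter looks for the pattern anywhere in the statement text and treats it as case sensitive; both are assumptions to check against the Tee filter documentation for your version. The `would_duplicate` helper is hypothetical. Note that under these assumptions an unanchored pattern also selects a SELECT that merely mentions a table named `updates`.

```python
import re

# Hand-translated equivalent of match=\(insert\)\|\(update\)\|\(delete\).
PATTERN = re.compile(r"(insert)|(update)|(delete)")


def would_duplicate(statement: str) -> bool:
    """Return True if the pattern occurs anywhere in the statement."""
    return PATTERN.search(statement) is not None


if __name__ == "__main__":
    for sql in (
        "insert into test.t1 values(1)",
        "update test.t1 set id = 0",
        "select * from test.t1",
        "select * from test.updates",      # matches: the pattern is unanchored
        "INSERT INTO test.t1 VALUES (3)",  # no match in this case-sensitive sketch
    ):
        print(would_duplicate(sql), sql)
```
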
After the Tee filter is configured, we will configure the Mqfilter to communicate with
the RabbitMQ server we've set up.

```
[MQ Filter]
type=filter
module=mqfilter
hostname=192.168.0.201
port=5672
username=msguser
password=msgpwd
exchange=msg-ex-1
queue=msg-queue-1
key=MaxScale
```

The `hostname` is the address and the `port` is the port of the RabbitMQ server. For
`username` and `password` we will use the `msguser` and `msgpwd` credentials we
created earlier. The `exchange` is the name of the RabbitMQ exchange we're using and
the `key` is the key used for all the sent messages. The messages will be stored
in the queue named by the `queue` parameter, `msg-queue-1`.

The last thing to add is the administration interface.

```
[MaxAdmin Service]
type=service
router=cli

[MaxAdmin Listener]
type=listener
service=MaxAdmin Service
protocol=maxscaled
socket=default
```

Now we have created the MariaDB MaxScale configuration file and all we need to do is to save
it in `/etc/maxscale.cnf`, start MariaDB MaxScale and test that it works. The testing will
be done in the next section.

Here is the complete configuration file.

```
# The production and archive servers
[production-1]
type=server
address=192.168.0.200
port=3306
protocol=MySQLBackend

[archive-1]
type=server
address=192.168.0.201
port=3000
protocol=MySQLBackend

# MySQL server monitor
[MySQL Monitor]
type=monitor
module=mysqlmon
servers=production-1, archive-1
user=maxuser
passwd=maxpwd
monitor_interval=5000

# Production service
[Production]
type=service
router=readconnroute
servers=production-1
user=maxuser
passwd=maxpwd
filters=Tee

# Archive service
[Archive]
type=service
router=readconnroute
servers=archive-1
user=maxuser
passwd=maxpwd
filters=MQ Filter

# Listeners for the services
[Production Listener]
type=listener
service=Production
protocol=MySQLClient
port=4000

[Archive Listener]
type=listener
service=Archive
protocol=MySQLClient
port=4001

# Tee filter to duplicate insert, update and delete
# statements to the archive server
[Tee]
type=filter
module=tee
service=Archive
match=\(insert\)\|\(update\)\|\(delete\)

# Mqfilter to log the canonical versions of the archive
# server queries to a RabbitMQ broker
[MQ Filter]
type=filter
module=mqfilter
hostname=192.168.0.201
port=5672
username=msguser
password=msgpwd
exchange=msg-ex-1
queue=msg-queue-1
key=MaxScale

# Administration interface setup
[MaxAdmin Service]
type=service
router=cli

[MaxAdmin Listener]
type=listener
service=MaxAdmin Service
protocol=maxscaled
socket=default
```

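Because every object in this file is referenced by its section name, a misspelled name in a `servers`, `filters` or `service` parameter is an easy mistake to make. The Python sketch below shows one way such references could be checked before starting MariaDB MaxScale. It is not a MaxScale tool: the `missing_references` helper is hypothetical, `SAMPLE` is a trimmed copy of the configuration above, and the value separators (commas for `servers`, `|` for a chain of `filters`) are assumptions.

```python
import configparser

# Trimmed copy of the tutorial configuration, used only as sample input.
SAMPLE = """
[production-1]
type=server

[archive-1]
type=server

[Production]
type=service
servers=production-1
filters=Tee

[Archive]
type=service
servers=archive-1
filters=MQ Filter

[Production Listener]
type=listener
service=Production

[Tee]
type=filter
service=Archive

[MQ Filter]
type=filter
"""

# Parameters whose values name other sections. The separators are an
# assumption: commas for `servers`, `|` for a chain of `filters`.
REFERENCE_KEYS = {"servers": ",", "filters": "|", "service": None}


def missing_references(text: str) -> list:
    """Return 'section.key -> name' for each reference to an unknown section."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    problems = []
    for section in parser.sections():
        for key, separator in REFERENCE_KEYS.items():
            if key not in parser[section]:
                continue
            value = parser[section][key]
            names = value.split(separator) if separator else [value]
            for name in (n.strip() for n in names):
                if name and not parser.has_section(name):
                    problems.append(f"{section}.{key} -> {name}")
    return problems


if __name__ == "__main__":
    print(missing_references(SAMPLE))
```

To check a real file, the contents of `/etc/maxscale.cnf` could be read and passed to the function in place of `SAMPLE`.
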
## Testing the setup

Now that we have created the configuration file and prepared the RabbitMQ server
and the database servers, we can start testing the setup. We do that by starting
MariaDB MaxScale:

```
sudo systemctl start maxscale
```

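Once MariaDB MaxScale is running, a quick sanity check is to confirm that the two listeners accept TCP connections. The following Python sketch does only that; it does not speak the MySQL protocol. The `port_open` helper is hypothetical, and the sketch assumes it is run on the MaxScale host and that the listeners on ports 4000 and 4001 are reachable through the loopback address.

```python
import socket


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Assumption: run on the MaxScale host; adjust the address otherwise.
    for name, port in (("Production Listener", 4000), ("Archive Listener", 4001)):
        state = "accepting connections" if port_open("127.0.0.1", port) else "not reachable"
        print(f"{name} (port {port}): {state}")
```
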
We can see the state of the two servers with MaxAdmin:

```
sudo maxadmin list servers

Servers.
-------------------+-----------------+-------+-------------+--------------------
Server             | Address         | Port  | Connections | Status
-------------------+-----------------+-------+-------------+--------------------
production-1       | 192.168.0.200   |  3306 |           0 | Running
archive-1          | 192.168.0.201   |  3000 |           0 | Running
-------------------+-----------------+-------+-------------+--------------------
```

We can also check the queue status on the RabbitMQ server:

```
sudo rabbitmqctl list_queues

Listing queues ...
```

If we create a connection to the Production service on port 4000 and execute
a set of data-modifying statements, we should see an equal number of statements
being sent to the RabbitMQ server:

```
MariaDB [(none)]> insert into test.t1 values(1);
Query OK, 1 row affected (0.09 sec)

MariaDB [(none)]> insert into test.t1 values(2);
Query OK, 1 row affected (0.07 sec)

MariaDB [(none)]> update test.t1 set id = 0;
Query OK, 2 rows affected (0.05 sec)
Rows matched: 2  Changed: 2  Warnings: 0
```

On the RabbitMQ server we can see the new queue `msg-queue-1` with three new messages in it.

```
sudo rabbitmqctl list_queues

Listing queues ...
msg-queue-1 3
```

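If this check is to be scripted, the listing can be turned into a mapping of queue names to message counts. The Python sketch below assumes the output layout shown above, that is, one `name count` pair per line after a banner line; other RabbitMQ versions may print additional header lines, which this hypothetical `parse_list_queues` helper simply skips.

```python
def parse_list_queues(output: str) -> dict:
    """Map queue names to message counts from `rabbitmqctl list_queues` output."""
    counts = {}
    for line in output.splitlines():
        parts = line.rsplit(None, 1)
        # Keep only "<name> <integer>" rows; banner lines are skipped.
        if len(parts) == 2 and parts[1].isdigit():
            counts[parts[0]] = int(parts[1])
    return counts


if __name__ == "__main__":
    sample = "Listing queues ...\nmsg-queue-1\t3\n"
    queues = parse_list_queues(sample)
    print(queues)
    # Three statements were executed, so three messages are expected.
    print(queues.get("msg-queue-1") == 3)
```
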
We can also see the data on the archive server:

```
MariaDB [(none)]> select * from test.t1;
+------+
| id   |
+------+
|    0 |
|    0 |
+------+
2 rows in set (0.00 sec)
```

To read the data from RabbitMQ, we can use the RabbitMQ Consumer tool
included in the MariaDB MaxScale source. For a tutorial on how to use this tool,
please read [RabbitMQ Consumer Client](../Filters/RabbitMQ-Consumer-Client.md).