Added test cases that verify that the functionality works as
expected. Also made Mariadb_nodes::change_master less verbose when one of
the nodes is down.
The test checks that failover works even when the master of the monitored
cluster is a slave to an external master. The test also verifies that the
servers do not get unexpected status labels.
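Roughly how the external-master part can be arranged, as a minimal sketch;
the host names, credentials and GTID option are placeholders, not the actual
test code.

    // Make the monitored cluster's master replicate from an external master
    // before the failover checks; host names, credentials and the GTID option
    // are placeholders.
    #include <mysql.h>
    #include <cstdio>

    int main()
    {
        MYSQL* master = mysql_init(nullptr);
        if (!mysql_real_connect(master, "node-000", "repl", "repl", nullptr, 3306,
                                nullptr, 0))
        {
            fprintf(stderr, "Connect failed: %s\n", mysql_error(master));
            return 1;
        }
        // The cluster master is now both a master (for the monitored cluster)
        // and a slave (to the external server).
        mysql_query(master,
                    "CHANGE MASTER TO MASTER_HOST='external-host', MASTER_PORT=3306, "
                    "MASTER_USER='repl', MASTER_PASSWORD='repl', MASTER_USE_GTID=slave_pos");
        mysql_query(master, "START SLAVE");
        mysql_close(master);
        // ... stop the cluster master, wait for the monitor, and check that a
        // slave is promoted and that no server gets an unexpected status label ...
        return 0;
    }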
Tests that local_address is taken into account. However, at the time
of writing the maxscale VM does not have two usable IP addresses, so
we only test that explicitly specifying an IP address does not break
things.
Locally it has been confirmed that this indeed works the way it is
supposed to.
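As a minimal sanity-check sketch of what "does not break things" amounts to,
a query through the MaxScale listener should still succeed with local_address
set; host, port and credentials below are placeholders.

    // With local_address explicitly set in the MaxScale configuration, a query
    // through the listener should still succeed; host, port and credentials
    // are placeholders.
    #include <mysql.h>
    #include <cstdio>

    int main()
    {
        MYSQL* conn = mysql_init(nullptr);
        if (!mysql_real_connect(conn, "maxscale-host", "maxuser", "maxpwd", nullptr,
                                4006, nullptr, 0)
            || mysql_query(conn, "SELECT 1") != 0)
        {
            fprintf(stderr, "local_address broke routing: %s\n", mysql_error(conn));
            return 1;
        }
        mysql_free_result(mysql_store_result(conn));
        mysql_close(conn);
        return 0;
    }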
- Start 4 threads where each thread sits in a loop and performs
20% updates and 80% selects. Each thread has a table of its own.
- The main thread executes the following in a loop.
- Perform a switchover from the current master to the next node (simply
  (current + 1) % number of nodes).
- Keep on doing that for 1.5 minutes.
The expectation is that the switchover will succeed, i.e. that after each
operation there is a new master (a sketch of the workload and loop follows).
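A hedged sketch of the stress loop described above; the host names,
credentials and the maxadmin module-command syntax are assumptions rather
than the real test code, which drives everything through the test framework.

    // Four worker threads run ~20% updates / ~80% selects on their own tables
    // while the main thread rotates the master with switchover. Host names,
    // credentials and the maxadmin syntax are assumptions.
    #include <mysql.h>
    #include <atomic>
    #include <chrono>
    #include <cstdlib>
    #include <string>
    #include <thread>
    #include <vector>

    static std::atomic<bool> running{true};

    static void worker(int id)
    {
        MYSQL* conn = mysql_init(nullptr);
        if (!mysql_real_connect(conn, "maxscale-host", "maxuser", "maxpwd", "test",
                                4006, nullptr, 0))
        {
            return;
        }
        // One table per thread.
        std::string table = "stress_" + std::to_string(id);
        mysql_query(conn, ("CREATE TABLE IF NOT EXISTS " + table
                           + " (id INT PRIMARY KEY, v INT)").c_str());
        while (running)
        {
            if (rand() % 5 == 0)    // ~20% updates
            {
                mysql_query(conn, ("REPLACE INTO " + table + " VALUES (1, 1)").c_str());
            }
            else                    // ~80% selects
            {
                if (mysql_query(conn, ("SELECT v FROM " + table).c_str()) == 0)
                {
                    mysql_free_result(mysql_store_result(conn));
                }
            }
        }
        mysql_close(conn);
    }

    int main()
    {
        std::vector<std::thread> threads;
        for (int i = 0; i < 4; i++)
        {
            threads.emplace_back(worker, i);
        }

        const int n_nodes = 4;
        int master = 0;
        auto end = std::chrono::steady_clock::now() + std::chrono::seconds(90);
        while (std::chrono::steady_clock::now() < end)
        {
            int next = (master + 1) % n_nodes;   // the next node % all nodes
            // The module-command syntax is an assumption; the real test drives
            // maxadmin through the framework's ssh helper.
            std::string cmd = "maxadmin call command mysqlmon switchover MySQL-Monitor"
                              " server" + std::to_string(next + 1)
                              + " server" + std::to_string(master + 1);
            system(cmd.c_str());
            master = next;
            std::this_thread::sleep_for(std::chrono::seconds(5));
            // ... verify that the new master now has the Master status ...
        }

        running = false;
        for (auto& t : threads)
        {
            t.join();
        }
        return 0;
    }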
- Start 4 threads where each thread sits in a loop and performs
20% updates and 80% selects. Each thread has a table of its own.
- The main thread executes the following in a loop.
- Take down the current master and wait a while (failover is assumed
  to happen).
- Bring the old master node back up and wait a while.
- Keep on doing that for 1.5 minutes.
At the end check that:
- There is one 'Master'.
- The other nodes are either
  - 'Slave', or
  - 'Running', in which case it is verified that the reason is that the
    node could not be rejoined.
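The failover variant differs from the switchover sketch above only in how the
master changes; a minimal sketch of the main loop, with the node handling
left as placeholder comments since the real test uses the framework's
block/unblock helpers.

    // The failover stress loop; worker threads are the same as in the
    // switchover sketch. The block/unblock steps are placeholders for the
    // framework's node helpers.
    #include <chrono>
    #include <thread>

    void failover_loop()
    {
        auto end = std::chrono::steady_clock::now() + std::chrono::seconds(90);
        while (std::chrono::steady_clock::now() < end)
        {
            // ... take down the current master (block its port or stop mysqld) ...
            std::this_thread::sleep_for(std::chrono::seconds(10));  // failover assumed to happen
            // ... bring the old master node back up ...
            std::this_thread::sleep_for(std::chrono::seconds(10));
        }
        // Final check: exactly one server is 'Master'; every other server is
        // either 'Slave' or 'Running', and 'Running' is accepted only when the
        // node could not be rejoined.
    }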
The tests now reset the replication state using queries and a switchover
instead of calling fix_replication(). The results are checked, so these tests
now effectively test switchover as well.
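Roughly the kind of queries the reset involves, as a sketch; the master host,
replication credentials and GTID option are assumptions.

    // Reset a node's replication state with plain queries instead of
    // fix_replication(); credentials and the GTID option are assumptions.
    #include <mysql.h>
    #include <cstdio>

    void reset_slave(MYSQL* node, const char* master_host)
    {
        mysql_query(node, "STOP ALL SLAVES");
        mysql_query(node, "RESET SLAVE ALL");
        char query[512];
        snprintf(query, sizeof(query),
                 "CHANGE MASTER TO MASTER_HOST='%s', MASTER_PORT=3306, "
                 "MASTER_USER='repl', MASTER_PASSWORD='repl', MASTER_USE_GTID=slave_pos",
                 master_host);
        mysql_query(node, query);
        mysql_query(node, "START SLAVE");
    }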
Also, reduce printing when verbose is on for any test using the get_output()
function in fail_switch_rejoin_common.cpp.
auto_failover=true
auto_rejoin=false
With these settings, the test does the following:
- Regular master-slave setup
- Create a table, insert some data
- Sync all slaves
- Stop a slave
- Insert some more data
- Sync remaining slaves
- Stop the master
- Expect the failover mechanism to pick a new master (server2)
- Bring up the slave
- Perform a switchover from server2 to server4; this is expected to fail.
Currently it does fail, but only due to a timeout:
[mysqlmon] MASTER_GTID_WAIT() timed out on slave 'server4'.
There should be some check that ensures the failure is detected faster than
that.
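A sketch of how the expected failure could be asserted; the maxadmin
module-command syntax is an assumption, and so is relying on the exit code
(the real test inspects the command output).

    // Assert that the switchover to the stopped server4 is refused. Both the
    // maxadmin syntax and the exit-code check are assumptions.
    #include <cstdio>
    #include <cstdlib>

    int main()
    {
        int rc = system("maxadmin call command mysqlmon switchover "
                        "MySQL-Monitor server4 server2");
        if (rc == 0)
        {
            fprintf(stderr, "Switchover to a stopped slave unexpectedly succeeded\n");
            return 1;
        }
        printf("Switchover failed as expected\n");
        return 0;
    }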
This test tests the following:
- Regular master-slave setup
- Create a table, insert some data
- Sync all slaves
- Stop a slave
- Insert some more data
- Sync remaining slaves
- Stop the master
- Expect the failover mechanism to pick a new master
- Bring up the slave
- Expect the slave to be rejoined
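A sketch of the rejoin check: after the old node is brought back up, it
should again be a replicating slave of the new master. Treating the SHOW
SLAVE STATUS columns as fixed indexes is a simplification of what the real
test does.

    // Check that the restarted node is again a replicating slave.
    #include <mysql.h>
    #include <cstring>

    bool is_replicating(MYSQL* node)
    {
        bool ok = false;
        if (mysql_query(node, "SHOW SLAVE STATUS") == 0)
        {
            if (MYSQL_RES* res = mysql_store_result(node))
            {
                if (MYSQL_ROW row = mysql_fetch_row(res))
                {
                    // Columns 10 and 11 are Slave_IO_Running and Slave_SQL_Running.
                    ok = strcmp(row[10], "Yes") == 0 && strcmp(row[11], "Yes") == 0;
                }
                mysql_free_result(res);
            }
        }
        return ok;
    }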
- The test starts with the usual setup of 1 master and 3 slaves.
- Then the master is taken down and it is checked that the failover
mechanism promotes some slave to master.
- This is continued until there is a single master left (no
slaves).
The same test now has two versions. In the automatic version, failover is
triggered automatically by the monitor; in the manual version it is started
with maxadmin. The tests are otherwise identical.
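A sketch of how the manual version might trigger failover; the module-command
syntax and monitor name are assumptions.

    // Trigger failover by hand instead of waiting for auto_failover.
    #include <cstdlib>

    void trigger_manual_failover()
    {
        system("maxadmin call command mysqlmon failover MySQL-Monitor");
    }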
Use a larger BLOB type for mxs812_1, as the inserted value exceeds the normal
BLOB size.
Add a baseline check to bulk_insert to verify that direct connections
work (at the moment they don't; this needs investigation).
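A sketch of what such a baseline could look like, assuming the test exercises
the MariaDB Connector/C bulk API (STMT_ATTR_ARRAY_SIZE); host, credentials
and table name are placeholders.

    // Run the same kind of bulk insert directly against a backend (bypassing
    // MaxScale) to confirm the connector-side bulk API works on its own.
    #include <mysql.h>
    #include <cstring>

    bool direct_bulk_insert(const char* backend_host)
    {
        MYSQL* conn = mysql_init(nullptr);
        if (!mysql_real_connect(conn, backend_host, "maxuser", "maxpwd", "test",
                                3306, nullptr, 0))
        {
            return false;
        }
        mysql_query(conn, "CREATE OR REPLACE TABLE t1 (id INT)");

        MYSQL_STMT* stmt = mysql_stmt_init(conn);
        const char* sql = "INSERT INTO t1 VALUES (?)";
        mysql_stmt_prepare(stmt, sql, strlen(sql));

        int values[] = {1, 2, 3, 4, 5};
        unsigned int array_size = 5;
        MYSQL_BIND bind{};
        bind.buffer_type = MYSQL_TYPE_LONG;
        bind.buffer = values;
        // MariaDB Connector/C bulk execution: bind the whole array at once.
        mysql_stmt_attr_set(stmt, STMT_ATTR_ARRAY_SIZE, &array_size);
        mysql_stmt_bind_param(stmt, &bind);
        bool ok = mysql_stmt_execute(stmt) == 0;

        mysql_stmt_close(stmt);
        mysql_close(conn);
        return ok;
    }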
Updated parameter names in failover_mysqlmon_mrm.
Modified avro_alter to prevent data type conversion errors.
The test is composed of a few parts.
1: Test that failover happens on master failure.
2: Test that a server with the slave SQL thread stopped is not promoted.
3: Test that a server with log_slave_updates=1 is promoted before the others.
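A sketch of how the preconditions for parts 2 and 3 can be arranged; the node
handles are placeholders for connections to the chosen candidate servers.

    // Arrange the preconditions for parts 2 and 3.
    #include <mysql.h>

    void prepare_candidates(MYSQL* stopped_sql_thread_node, MYSQL* preferred_node)
    {
        // Part 2: a node whose slave SQL thread is stopped must not be promoted.
        mysql_query(stopped_sql_thread_node, "STOP SLAVE SQL_THREAD");

        // Part 3: log_slave_updates is a startup option set in the server
        // configuration, so here it is only verified; the node that has it
        // enabled should win the promotion over the others.
        if (mysql_query(preferred_node, "SELECT @@log_slave_updates") == 0)
        {
            mysql_free_result(mysql_store_result(preferred_node));
        }
    }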
The MaxCtrl test suite is now a part of the regression test suite. The
cluster tests are expected to fail as that is yet to be implemented.
Also fixed the return value of TestConnections::ssh_maxscale.