121 Commits

Author SHA1 Message Date
Esa Korhonen
f3fbb297a4 MXS-1937 Enable server events on new master during switchover and failover
Combined some of the shared code in enable/disable_events(). Also disables
events on a joining standalone server.
2018-09-05 15:54:08 +03:00
Esa Korhonen
7e6ce2d13f MXS-1937 Disable server events on master during switchover
The feature is behind a config setting.
2018-09-05 15:54:08 +03:00
Esa Korhonen
7cd1cfdb80 Relay log waiting is part of failover_prepare
Since the servers are not modified before or during the wait, the waiting
can be done in the preparation method. This simplifies the actual failover
somewhat, and allows the monitor to keep running normally while waiting for
the log to clear.
2018-08-30 17:07:34 +03:00
Esa Korhonen
c39177bc8d Relay log clear supports multiple slave connections
Now waits for the relay log of the correct slave connection.
2018-08-29 17:07:52 +03:00
Esa Korhonen
85d8a85cde Update master failure detection from slaves
The detection now works with multiple slave connections.
2018-08-29 17:07:52 +03:00
Markus Mäkelä
91ab59530f
Use pending status in external master checks
When the replication status from the external master is checked, the
pending status must be used. This makes sure that the SlaveStatusArray and
the server state are sync.

Also extended the message that was logged when the external master was
lost. By adding the network address there, it makes it easier to see where
the server was replicating from if only the log file is available.
2018-08-23 15:46:45 +03:00
Esa Korhonen
61bb172033 Cleanup failover/switchover
Replication settings warnings are printed once more. Changed some
parameter names to be more consistent within the monitor.
2018-08-23 10:28:47 +03:00
Esa Korhonen
f2dfd39f79 Clean up JSON diagnostics
Now prints all slave connections.
2018-08-22 12:23:28 +03:00
Esa Korhonen
3777da96bd Miscellaneous cleanup
Removes needless status assignments and unused code. Moves and modifies some comments.
2018-08-22 12:11:33 +03:00
Johan Wikman
3f53eddbde MXS-2020 Replace ss[_info]_dassert with mxb_assert[_message] 2018-08-22 11:34:59 +03:00
Esa Korhonen
03cefcc4ac MXS-2012 Write replication lag to SERVER
Allows routers to read the value.
2018-08-21 11:51:10 +03:00
Esa Korhonen
1c508cd413 MXS-2012 Read and print Seconds_Behind_Master
Replaces the old replication lag detection.
2018-08-21 11:51:10 +03:00
Johan Wikman
b408894f6d MXS-2004 Remove dependency of maxscale/thread.h 2018-08-13 13:38:39 +03:00
Esa Korhonen
681c456bd7 Separate unknown server version from old versions
This allows better failover support detection.
2018-08-13 11:30:21 +03:00
Esa Korhonen
b7c94abb34 Keep track of previously observed slave connections
This reduces the ambiguity of server id:s in the slave status contents.
If a slave connection has been seen properly connected at an earlier time,
it can be trusted to report the correct master server id. This also
fixes some wrong status assignment edge cases with the SERVER_WAS_SLAVE-bit.
The bit will be removed in a later commit.

Even this does not solve the situation when MaxScale is started with
some servers down.
2018-08-09 20:39:19 +03:00
Esa Korhonen
17c84a22c7 Refactor preparations to failover
The two operations are quite similar so the code should look
similar as well and use shared functions.
2018-08-07 16:33:56 +03:00
Esa Korhonen
0a81f78442 Use unique pointer instead of auto-pointer 2018-08-06 13:24:05 +03:00
Esa Korhonen
c0bd5ca3a1 MXS-1905 Switchover if master is low on disk space
Required quite a bit of refactoring.
2018-08-06 13:24:05 +03:00
Esa Korhonen
836db54800 Clean up server status printing
Uses mostly the status functions for reading the flags. Strickly
speaking this breaks the REST API since in some cases (status combinations)
the printed string is different from what was printed before.
2018-08-02 10:42:12 +03:00
Esa Korhonen
1e33ab69f2 Rename server_is_running() to server_is_usable()
The previous name was misleading. The new server_is_running() only
checks for the running bit so that a server is always either running
or down.
2018-07-31 14:53:56 +03:00
Esa Korhonen
18bfca0533 Define inline functions for status variables
The functions are used in MariaDB Monitor.
2018-07-27 11:20:23 +03:00
Esa Korhonen
fbce38878b Turn server status macros to functions 2018-07-25 11:19:47 +03:00
Esa Korhonen
c9570ff616 Check failover applicability to the cluster every turn
This should give an advance warning if a user tries to activate auto_failover
on a cluster which does not support it.
2018-07-20 15:33:47 +03:00
Markus Mäkelä
0750e93eeb
Fix verify_master_failure
The master failure verification would not work if the slaves did not have
a state change since MaxScale had started. This can be fixed by treating
the startup of MaxScale as an event of sorts.
2018-07-17 11:52:19 +03:00
Esa Korhonen
936bcde135 Remove old "detect_standalone_master"-feature, update documentation
The auto_failover is a more reliable solution and should be used instead. Several
unused parameters were removed, although they can still be defined in the config
file. Updated documentation on the relevant parts.
2018-07-16 15:58:16 +03:00
Esa Korhonen
9525d3507b Run manual commands without stopping the monitor
The command is saved in a function object which is read by the monitor
thread. This way, manual and automatic cluster modification commands are
ran in the same step of a monitor cycle.

This update required several modifications in related code.
2018-06-28 16:56:41 +03:00
Esa Korhonen
6bf10904d7 MXS-1845 Only rebuild topology when required
The monitor now detects when a server has changed such that a replication
graph rebuild is needed and only then rebuilds the graph and detects
cycles and master.

Also, some old code is no longer called in the monitor cycle. It will be
removed in later commits. Refactored some of the related functions.
2018-06-28 16:56:41 +03:00
Johan Wikman
cc0299aee6 Update change date of 2.3 2018-06-25 10:07:52 +03:00
Esa Korhonen
8bd9e1d473 Check monitor permissions when reconnecting to server
Previously, the permissions would only be checked at monitor start.
Now, the permissions are checked if [Auth Error] is on or server
was reconnected.
2018-06-20 17:54:46 +03:00
Esa Korhonen
019d62bbb8 MXS-1886 Better auto-rejoin error description and tolerance
Contains changes from commit 09df01752812444c6e7c409a8957d292f7de63cf
adapted to the 2.3 branch.
2018-06-18 16:35:28 +03:00
Esa Korhonen
d3e9cc9a4f MXS-1886 Auto-failover error tolerance
Contains changes from commit 9e68d8ec3ddf1621f533067021c4b3042f695e80
adapted to the 2.3 branch.
2018-06-18 16:35:28 +03:00
Esa Korhonen
2f987d0b10 MXS-1845 Only select a master if current master is no longer usable
The purpose is to make the selected master server sticky. The master is reselected only
if the current master is no longer a valid master.
2018-06-18 11:06:58 +03:00
Esa Korhonen
5324a1bdaa MXS-1845 Assign server roles
Assign server roles (master, slave, relay master, slave of external master)
for a graph with possibly multiple paths to a slave server.
2018-06-13 17:38:53 +03:00
Esa Korhonen
3f82c25c62 MXS-1845 New algorithm for finding the master server
Not yet used, as more is needed to replace the old code. The
algorithm is based on counting the total number of slave nodes
a server has, possibly in multiple layers and/or cycles.
2018-06-13 17:38:33 +03:00
Esa Korhonen
2481de260f Move monitor-dependent code in MariaDBServer to MariaDBMonitor
Removes Monitor-dependency from the MariaDBServer-class.
2018-06-06 22:28:38 +03:00
Johan Wikman
f600b3a769 MXS-1775 Check disk space in MariaDBMonitor 2018-06-06 15:25:57 +03:00
Esa Korhonen
37841183b3 Cleanup server.h
Renamed, rearranged and clarified status bits. Removed unused macros.
2018-06-01 14:29:51 +03:00
Esa Korhonen
4d7aff4ab9 MXS-1845 Find strongly connected components with multiple slave connections
Rewrote the algorithm for clarity.
2018-06-01 14:04:50 +03:00
Esa Korhonen
3ec449339f Only write to SERVER->status at the end of a monitoring loop
This makes the code clearer and reduces race conditions, as the monitor
could be writing SERVER->status while a router is reading it. Also,
the time during which the SERVER struct is locked drops to a fraction.
2018-05-23 16:19:08 +03:00
Esa Korhonen
a31549221b MXS-1845 Print all slave connections at 'show monitors'
The format has changed.
2018-05-18 13:48:08 +03:00
Esa Korhonen
45da5a08d9 MXS-1845 Cleanup monitor backend updating
Now detects errors and prints them.
2018-05-18 13:48:08 +03:00
Esa Korhonen
b71710e066 MXS-1865 Combine server version and type
Binlog server is now handled better.
2018-05-16 13:55:45 +03:00
Esa Korhonen
b29bae6e84 MXS-1865 Update server version only when (re)connecting
Updating it every iteration is needless.
2018-05-16 13:55:45 +03:00
Esa Korhonen
c6b8277281 Cleanup server status querying
No functional changes except that status changes are no longer logged when
querying. It was bugged to begin with.
2018-05-14 11:32:02 +03:00
Esa Korhonen
df4454027a Clean up monitor initialization and destruction
Since monitors are now freed at MaxScale exit, the server data should be freed. Also,
gtid domain variables are now initialized with a common constant.
2018-05-14 11:32:02 +03:00
Esa Korhonen
b92284afc4 Move server querying to MariaDBServer
The query functions still require the base monitor struct because of
mon_ping_or_connect_to_db().
2018-05-14 11:11:08 +03:00
Esa Korhonen
370b3be576 MXS-1703 Support "Preparing" in Slave_IO_Running
Is interpreted as "Connecting".
2018-05-09 12:49:23 +03:00
Esa Korhonen
0c5af4b13f Merge branch '2.2' into develop 2018-05-08 14:10:52 +03:00
Markus Mäkelä
3030799ae1
Fix GTID updating for slaves
The updating of GTIDs was only considered successful if both the current
GTID position and binlog GTID positions were non-empty. If a slave has no
binlogged events, the GTID update would always fail.

This change in behavior caused the mysqlmon_failover_auto and
mysqlmon_failver_manual tests to break. The test disabled the binary log
on one of the servers which caused it to be left out from the rejoining
process.
2018-05-03 09:46:45 +03:00
Esa Korhonen
5d010ff712 Cleanup SERVER struct
Removed one unused field. Rearranged others, clarified comments.
2018-04-27 10:48:56 +03:00