Commit Graph

443 Commits

Author SHA1 Message Date
a398da58a4 Add sleep to execute_cmd_time_limit
If the query fails instantly, the retries end up busy-looping. Now
each try is at least one second.
2018-10-04 20:19:57 +03:00
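The busy-loop fix described in this commit can be illustrated with a minimal sketch. This is not the monitor's C++ code: the function name `execute_with_time_limit`, its parameters, and the injectable `clock`/`sleep` arguments are all hypothetical, chosen only to show how padding each failed try to a minimum duration keeps instant failures from spinning.

```python
import time


def execute_with_time_limit(run_query, time_limit_s, min_try_s=1.0,
                            clock=time.monotonic, sleep=time.sleep):
    """Retry run_query() until it succeeds or time_limit_s is used up.

    Hypothetical sketch. Returns a (succeeded, attempts) tuple.
    """
    deadline = clock() + time_limit_s
    attempts = 0
    while True:
        try_start = clock()
        attempts += 1
        if run_query():
            return True, attempts
        now = clock()
        if now >= deadline:
            return False, attempts
        # Pad a fast failure so the whole try lasts at least min_try_s,
        # but never sleep past the overall deadline.
        pause = min(min_try_s - (now - try_start), deadline - now)
        if pause > 0:
            sleep(pause)
        if clock() >= deadline:
            return False, attempts
```

With a query that fails instantly and a three-second limit, this sketch makes three attempts spaced one second apart instead of looping as fast as the CPU allows.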
707506feae A slave must have running slaves to be a relay master
This prevents some questionable status assignments, but also means that
the Relay Master status can be lost if a slave goes down. This is
contrary to Master status which is not lost if slaves go down. Fixes
mxs1961_standalone_rejoin.
2018-10-04 20:16:29 +03:00
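The rule stated in this commit can be sketched as a small predicate. The `Server` type, its fields, and `is_relay_master` are hypothetical names used for illustration, and "running slave" is assumed here to mean a slave that is up and replicating; the monitor's actual checks may differ.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Server:
    # Hypothetical, simplified view of a monitored server.
    name: str
    is_running: bool = True
    is_replicating: bool = False  # True if it replicates from some master
    slaves: List["Server"] = field(default_factory=list)


def is_relay_master(server):
    """A relay master is itself a running slave with at least one running slave."""
    if not (server.is_running and server.is_replicating):
        return False
    return any(s.is_running and s.is_replicating for s in server.slaves)
```

Under this rule, a server in the middle of a chain loses the relay master role as soon as its last running slave goes down, which matches the trade-off the commit message describes.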
374ae2fc9b Only redirect usable slaves
Prevents pointless retrying/waiting when redirecting slaves.
2018-10-04 20:11:12 +03:00
80c731f02a Fix verify_master_failure
The log message had changed, so the test was changed to match. Also, the remaining
delay is now printed.
2018-10-04 13:38:10 +03:00
86ae0c3e4d MXS-1845 Remove unneeded code & cleanup 2018-10-04 13:38:10 +03:00
a1b3a005dd MXS-1845 Relax cluster operation support requirements
Support for more complicated topologies is quite close to completion and
in any case the function was too aggressive.
2018-10-04 13:09:28 +03:00
30eb21914f MXS-1845 Switchover cleanup
Several small changes:
Binlog is flushed at the end of old master demotion.
Only new master is required to catch up to old master.
Use the same replication check method as failover.
2018-10-04 11:45:33 +03:00
49e85d9a28 MXS-1845 Add demotion code
The master demotion in switchover now uses query retrying with
the switchover time limit.
2018-10-04 11:45:33 +03:00
a4747f5b03 Revert the last commit, and an additional fix to the
"Fix code for warnings:" commit.
2018-10-03 17:22:10 +03:00
268e689dc5 Fix code for warnings: unused-but-set-variable and warn_unused_result. 2018-10-03 16:33:24 +03:00
d14b9bfe43 MXS-1845 Cluster stabilization rewrite
No longer writes events to the master, as this creates problems if the
promoted server was not the overall master. Instead, the slave status
output is inspected.
2018-10-02 11:09:16 +03:00
1ca5d02abb MXS-1845 Add redirection code
Should work with multimaster replication.
2018-10-02 11:09:16 +03:00
6b8443aba6 MXS-1845 Complete server promotion code
Now copies slave connections from the previous master. Promotion
code taken into use.
2018-10-01 18:06:39 +03:00
c65edd1298 Enhance StopWatch
Clean up, comments and enhancements. StopWatch lap() didn't mean lap-time, but elapsed time. Changed meaning to lap-time and added split() for split-time.
2018-10-01 09:30:24 +03:00
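The lap-time versus split-time distinction in this commit can be shown with a minimal sketch. It is not the maxbase `StopWatch` class: the Python class below, its constructor argument, and its private fields are assumptions. It only mirrors the semantics the message describes, where `lap()` measures from the previous lap and `split()` measures from the start.

```python
import time


class StopWatch:
    """Hypothetical stopwatch illustrating lap-time vs split-time."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._start = clock()
        self._lap_start = self._start

    def split(self):
        """Return the time elapsed since the stopwatch was started."""
        return self._clock() - self._start

    def lap(self):
        """Return the time since the previous lap() call (or the start),
        then begin a new lap."""
        now = self._clock()
        lap_time = now - self._lap_start
        self._lap_start = now
        return lap_time
```

If `lap()` is called 3 seconds and then 5 seconds after the start, it returns 3 and then 2, while `split()` at the 5-second mark returns 5.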
fe81b399b2 Use maxbase time and clock classes instead of std::chrono 2018-09-27 17:04:59 +03:00
05d18e81ae Use string instead of stringstream
Most of the monitor was already using string for formatted printing.
2018-09-27 16:46:59 +03:00
dd9ff27743 MXS-1845 Rewrite server promotion code
In progress, does not yet overwrite existing code.

The new promotion mechanism automatically retries queries which timed out. It also
handles multimaster situations correctly.
2018-09-26 13:20:29 +03:00
c58041d4fb Format MariaDBMonitor source
Some parts were manually edited for better results. No functional changes.
2018-09-26 12:49:14 +03:00
bfb1c3f1b3 MXS-1944 Store switchover parameters in an object 2018-09-26 12:42:26 +03:00
c20a17238b MXS-1944 Store failover parameters in an object
Several of the parameters are passed on from function to function. Having them all
in an object cleans things up and makes adding more data easier.
2018-09-26 12:26:35 +03:00
dfc10b109d MXS-1937 Add failover/switchover with server events test
The test adds a scheduled server event, then does failover, rejoin and
switchover and checks that the event is manipulated correctly. Also includes
a change to the monitor to fix an invalid ALTER EVENT query when the event
definer has wildcard host.
2018-09-24 10:43:17 +03:00
a3adcea524 MXS-1712 Cleanup reset-replication
Now logs messages explaining what has been done. Scheduled events are
disabled/enabled during the operation. Redirection of slaves is done at
the end similar to failover/switchover.
2018-09-21 10:54:36 +03:00
71ffef5708 Partially revert 4ba011266843857bbd3201e5b925a47e88e1808f
Add back leading operator enforcement.
2018-09-20 15:57:30 +03:00
02ac394e38 Cleanup slave status handling
Further reduce direct indexing of the slave status array to improve compatibility with
multimaster replication.
2018-09-19 13:37:24 +03:00
eeb61216de Add MariaDBMonitor Gtid unit test
Tests the class with different inputs. Also fixes a bug found by the test.
2018-09-14 17:31:00 +03:00
56c84541df MXS-1712 Add reset replication to MariaDB Monitor
The 'reset_replication' module command deletes all slave connections and binlogs,
sets gtid to sequence 0 and restarts replication from the given master. Should
only be used if the GTIDs are incompatible but the actual data is known to be in sync.
2018-09-14 17:15:05 +03:00
cb54880b99 MXS-1937 Cleanup event handling
Event handling is now enabled by default. If the monitor cannot query the EVENTS
table (most likely because of missing credentials), print an error suggesting to
turn the feature off.

When disabling events on a rejoining standalone server (likely a former master),
disable binlog event recording for the session. This prevents the ALTER EVENT
queries from generating binlog events.

Also added documentation and combined similar parts in the code.
2018-09-14 16:54:24 +03:00
12092d1a90 test_cycle_find.cc: Initialize the log 2018-09-12 14:23:09 +03:00
ad71655a36 MXS-2036 Redirect slaves with stopped SQL threads
This is somewhat questionable, as the slaves won't be able to really
replicate from the new master. However, not doing this causes the wrong
master to be selected after failover unless the new master has a majority
of slaves under it.
2018-09-11 10:27:31 +03:00
108638b0cf Format with Uncrustify 0.67 2018-09-10 13:31:39 +03:00
d11c78ad80 Format all sources with Uncrustify
Formatted all sources and manually tuned some files to make the code look
neater.
2018-09-10 13:22:49 +03:00
c447e5cf15 Uncrustify maxscale
See script directory for method. The script to run in the top level
MaxScale directory is called maxscale-uncrustify.sh, which uses
another script, list-src, from the same directory (so you need to set
your PATH). The uncrustify version was 0.66.
2018-09-09 22:26:19 +03:00
f3fbb297a4 MXS-1937 Enable server events on new master during switchover and failover
Combined some of the shared code in enable/disable_events(). Also disables
events on a joining standalone server.
2018-09-05 15:54:08 +03:00
7e6ce2d13f MXS-1937 Disable server events on master during switchover
The feature is behind a config setting.
2018-09-05 15:54:08 +03:00
7cd1cfdb80 Relay log waiting is part of failover_prepare
Since the servers are not modified before or during the wait, the waiting
can be done in the preparation method. This simplifies the actual failover
somewhat, and allows the monitor to keep running normally while waiting for
the log to clear.
2018-08-30 17:07:34 +03:00
c39177bc8d Relay log clear supports multiple slave connections
Now waits for the relay log of the correct slave connection.
2018-08-29 17:07:52 +03:00
85d8a85cde Update master failure detection from slaves
The detection now works with multiple slave connections.
2018-08-29 17:07:52 +03:00
a593d00c65 Simplify failed master detection
No longer depends on monitor events, since the other operations do not
depend on them either. The active_event/new_event detection was removed, as these
only protect against a rare situation. A similar feature which
protects all the cluster modifications will be implemented later.
2018-08-29 17:04:05 +03:00
9e566bc619 A master that is down with no running slaves can be replaced
This should be a more general way to detect situations where a DBA
or another MaxScale performs a failover.
2018-08-29 16:41:20 +03:00
91ab59530f Use pending status in external master checks
When the replication status from the external master is checked, the
pending status must be used. This makes sure that the SlaveStatusArray and
the server state are in sync.

Also extended the message that was logged when the external master was
lost. By adding the network address there, it makes it easier to see where
the server was replicating from if only the log file is available.
2018-08-23 15:46:45 +03:00
61bb172033 Cleanup failover/switchover
Replication settings warnings are printed once more. Changed some
parameter names to be more consistent within the monitor.
2018-08-23 10:28:47 +03:00
f2dfd39f79 Clean up JSON diagnostics
Now prints all slave connections.
2018-08-22 12:23:28 +03:00
916b72a733 Clean up loops
Changed many of the iterator loops to range loops.
2018-08-22 12:22:45 +03:00
3777da96bd Miscellaneous cleanup
Removes needless status assignments and unused code. Moves and modifies some comments.
2018-08-22 12:11:33 +03:00
ab9a9f92cb MXS-2020 Remove maxscale/debug.h
- Removed from all files.
- maxbase/assert.h included where necessary.
2018-08-22 11:35:35 +03:00
3f53eddbde MXS-2020 Replace ss[_info]_dassert with mxb_assert[_message] 2018-08-22 11:34:59 +03:00
24ab3c099c Move the top-of-file "#pragma once" to after the following comment (swap them). If the comment is a BPL, update it to the latest one 2018-08-21 13:13:15 +03:00
03cefcc4ac MXS-2012 Write replication lag to SERVER
Allows routers to read the value.
2018-08-21 11:51:10 +03:00
44a57dbefd Master can be a slave
This is possible if read_only is turned ON on the master and there is
no alternative master to swap to.
2018-08-21 11:51:10 +03:00
1c508cd413 MXS-2012 Read and print Seconds_Behind_Master
Replaces the old replication lag detection.
2018-08-21 11:51:10 +03:00