MaxScale

Author	SHA1	Message	Date
Esa Korhonen	040562f718	MXS-2342 Run MariaDBMonitor diagnostics concurrent with the monitor loop This fixes some situations where MaxAdmin/MaxCtrl would block and wait until a monitor operation or tick is complete. This also fixes a deadlock caused by calling monitor diagnostics inside a monitor script. Concurrency is enabled by adding one mutex per server object to protect array-like fields from concurrent reading/writing.	2019-03-12 10:50:16 +02:00
Esa Korhonen	4fd4b726a1	MXS-2325 Only enable events that were enabled on the master The monitor now continuously updates a list of enabled server events. When promoting a new master in failover/switchover, only events that were enabled on the previous master are enabled on the new. This avoids enabling events that may have been disabled on the master yet stayed in the SLAVESIDE_DISABLED- state on the slave. In the case of reset-replication command, events on the new master are only enabled if the monitor had a master when the command was launched. Otherwise all events remain disabled.	2019-03-04 16:00:07 +02:00
Esa Korhonen	fb52e565fe	Store capabilities of monitored server Checking the version number in various places in the code gets confusing. It's better to check the version in one place and record the relevant data.	2018-11-21 17:36:52 +02:00
Esa Korhonen	1a046bd453	MXS-2168 Add 'assume_unique_hostnames'-setting to MariaDBMonitor Adds the setting and takes it into use during replication graph creation and the most important checks.	2018-11-21 10:30:11 +02:00
Esa Korhonen	6a1cfddb43	MXS-2158 Clean up gtid updating during rejoin Error messages from update_gtids() are now printed. can_replicate_from() no longer updates gtid:s.	2018-11-16 12:56:24 +02:00
Esa Korhonen	14e38e4e08	MXS-2158 Return true if update_gtids() succeeds, even if no data is returned Previously, if the server had no gtid:s, the method would fail leading to a confusing error message. This could even totally stop the monitor from working if a recent server version (10.X) did not have any gtid events.	2018-11-14 10:56:42 +02:00
Esa Korhonen	fb3ccc94d6	MXS-1944 Cleanup function parameters The naming and ordering is now a bit more consistent between promote() and demote().	2018-11-07 12:55:59 +02:00
Esa Korhonen	e4e2235297	MXS-1944 Use time limited methods in rejoin Uses switchover time limit, since the typical rejoin of a standalone server is somewhat similar to a switchover.	2018-11-07 12:55:59 +02:00
Esa Korhonen	a4ce4e4613	MariaDBServer no longer uses ClusterOperation The functions in the server class now only use the general parameters object.	2018-11-07 12:55:59 +02:00
Esa Korhonen	90e6ff078a	Divide ClusterOperation to two types The main class was getting unwieldy and too general. Dividing the fields helps adding support for other operation types. This commit leaves most data duplicated, later commits clean up the affected code.	2018-11-07 12:55:59 +02:00
Esa Korhonen	0cf8ea43f7	Redirect slaves of promotion target This affects situations where the promoted server is a relay or multimaster group member.	2018-10-16 16:09:38 +03:00
Esa Korhonen	2f1512a22d	Cleanup slave connection removal during promotion/demotion The removing and slave status updating is now separated to a function. As the MariaDBServer object now contains the updated slave connections, keeping track of removed connections is no longer required.	2018-10-09 14:29:49 +03:00
Esa Korhonen	c10fab977d	Cleanup slave connection copy & merge The two cases are now separated. In switchover, the promotion and demotion targets can swap connections between each other without worry. In failover, the two connection lists must be merged semi-intelligently. The slave connections of the two servers are now saved to the operation descriptor object at the start of the operation. This allows slave status updating during the operation.	2018-10-09 14:29:49 +03:00
Esa Korhonen	4d6f961695	Cleanup mariadbserver.hh	2018-10-09 14:29:49 +03:00
Esa Korhonen	68d65682b5	Reorganize MariaDBServer code The server-class keeps growing, so the additional classes are moved out of the main class file.	2018-10-09 14:29:49 +03:00
Esa Korhonen	30eb21914f	MXS-1845 Switchover cleanup Several small changes: Binlog is flushed at the end of old master demotion. Only new master is required to catch up to old master. Use the same replication check method as failover.	2018-10-04 11:45:33 +03:00
Esa Korhonen	49e85d9a28	MXS-1845 Add demotion code The master demotion in switchover now uses query retrying with the switchover time limit.	2018-10-04 11:45:33 +03:00
Esa Korhonen	d14b9bfe43	MXS-1845 Cluster stabilization rewrite No longer writes events to the master, as this creates problems if the promoted server was not the overall master. Instead, the slave status output is inspected.	2018-10-02 11:09:16 +03:00
Esa Korhonen	1ca5d02abb	MXS-1845 Add redirection code Should work with multimaster replication.	2018-10-02 11:09:16 +03:00
Esa Korhonen	6b8443aba6	MXS-1845 Complete server promotion code Now copies slave connections from the previous master. Promotion code taken into use.	2018-10-01 18:06:39 +03:00
Esa Korhonen	fe81b399b2	Use maxbase time and clock classes instead of std::chrono	2018-09-27 17:04:59 +03:00
Esa Korhonen	dd9ff27743	MXS-1845 Rewrite server promotion code In progress, does not yet overwrite existing code. The new promotion mechanism automatically retries queries which timed out. It also handles multimaster situations correctly.	2018-09-26 13:20:29 +03:00
Esa Korhonen	c20a17238b	MXS-1944 Store failover parameters in an object Several of the parameters are passed on from function to function. Having them all in an object cleans things up and makes adding more data easier.	2018-09-26 12:26:35 +03:00
Esa Korhonen	02ac394e38	Cleanup slave status handling Further reduce direct indexing slave status array to improve compatibility with multimaster replication.	2018-09-19 13:37:24 +03:00
Esa Korhonen	56c84541df	MXS-1712 Add reset replication to MariaDB Monitor The 'reset_replication' module command deletes all slave connections and binlogs, sets gtid to sequence 0 and restarts replication from the given master. Should be only used if gtid:s are incompatible but the actual data is known to be in sync.	2018-09-14 17:15:05 +03:00
Esa Korhonen	cb54880b99	MXS-1937 Cleanup event handling Event handling is now enabled by default. If the monitor cannot query the EVENTS- table (most likely because of missing credentials), print an error suggesting to turn the feature off. When disabling events on a rejoining standalone server (likely a former master), disable binlog event recording for the session. This prevents the ALTER EVENT queries from generating binlog events. Also added documentation and combined similar parts in the code.	2018-09-14 16:54:24 +03:00
Markus Mäkelä	d11c78ad80	Format all sources with Uncrustify Formatted all sources and manually tuned some files to make the code look neater.	2018-09-10 13:22:49 +03:00
Niclas Antti	c447e5cf15	Uncrustify maxscale See script directory for method. The script to run in the top level MaxScale directory is called maxscale-uncrustify.sh, which uses another script, list-src, from the same directory (so you need to set your PATH). The uncrustify version was 0.66.	2018-09-09 22:26:19 +03:00
Esa Korhonen	f3fbb297a4	MXS-1937 Enable server events on new master during switchover and failover Combined some of the shared code in enable/disable_events(). Also disables events on a joining standalone server.	2018-09-05 15:54:08 +03:00
Esa Korhonen	7e6ce2d13f	MXS-1937 Disable server events on master during switchover The feature is behind a config setting.	2018-09-05 15:54:08 +03:00
Esa Korhonen	7cd1cfdb80	Relay log waiting is part of failover_prepare Since the servers are not modified before or during the wait, the waiting can be done in the preparation method. This simplifies the actual failover somewhat, and allows the monitor to keep running normally while waiting for the log to clear.	2018-08-30 17:07:34 +03:00
Esa Korhonen	c39177bc8d	Relay log clear supports multiple slave connections Now waits for the relay log of the correct slave connection.	2018-08-29 17:07:52 +03:00
Esa Korhonen	85d8a85cde	Update master failure detection from slaves The detection now works with multiple slave connections.	2018-08-29 17:07:52 +03:00
Esa Korhonen	9e566bc619	A master that is down with no running slaves can be replaced This should be a more general way to detect situations where a DBA or another MaxScale performs a failover.	2018-08-29 16:41:20 +03:00
Markus Mäkelä	91ab59530f	Use pending status in external master checks When the replication status from the external master is checked, the pending status must be used. This makes sure that the SlaveStatusArray and the server state are in sync. Also extended the message that was logged when the external master was lost. By adding the network address there, it makes it easier to see where the server was replicating from if only the log file is available.	2018-08-23 15:46:45 +03:00
Esa Korhonen	61bb172033	Cleanup failover/switchover Replication settings warnings are printed once more. Changed some parameter names to be more consistent within the monitor.	2018-08-23 10:28:47 +03:00
Esa Korhonen	f2dfd39f79	Clean up JSON diagnostics Now prints all slave connections.	2018-08-22 12:23:28 +03:00
Esa Korhonen	3777da96bd	Miscellaneous cleanup Removes needless status assignments and unused code. Moves and modifies some comments.	2018-08-22 12:11:33 +03:00
Niclas Antti	24ab3c099c	Move top of the file "#pragma once" to after the following comment (swap them). If the comment is a BPL update it to the latest one	2018-08-21 13:13:15 +03:00
Esa Korhonen	03cefcc4ac	MXS-2012 Write replication lag to SERVER Allows routers to read the value.	2018-08-21 11:51:10 +03:00
Esa Korhonen	1c508cd413	MXS-2012 Read and print Seconds_Behind_Master Replaces the old replication lag detection.	2018-08-21 11:51:10 +03:00
Esa Korhonen	681c456bd7	Separate unknown server version from old versions This allows better failover support detection.	2018-08-13 11:30:21 +03:00
Esa Korhonen	b7c94abb34	Keep track of previously observed slave connections This reduces the ambiguity of server id:s in the slave status contents. If a slave connection has been seen properly connected at an earlier time, it can be trusted to report the correct master server id. This also fixes some wrong status assignment edge cases with the SERVER_WAS_SLAVE-bit. The bit will be removed in a later commit. Even this does not solve the situation when MaxScale is started with some servers down.	2018-08-09 20:39:19 +03:00
Esa Korhonen	17c84a22c7	Refactor preparations to failover The two operations are quite similar so the code should look similar as well and use shared functions.	2018-08-07 16:33:56 +03:00
Esa Korhonen	0a81f78442	Use unique pointer instead of auto-pointer	2018-08-06 13:24:05 +03:00
Esa Korhonen	c0bd5ca3a1	MXS-1905 Switchover if master is low on disk space Required quite a bit of refactoring.	2018-08-06 13:24:05 +03:00
Esa Korhonen	1e33ab69f2	Rename server_is_running() to server_is_usable() The previous name was misleading. The new server_is_running() only checks for the running bit so that a server is always either running or down.	2018-07-31 14:53:56 +03:00
Esa Korhonen	c9570ff616	Check failover applicability to the cluster every turn This should give an advance warning if a user tries to activate auto_failover on a cluster which does not support it.	2018-07-20 15:33:47 +03:00
Esa Korhonen	936bcde135	Remove old "detect_standalone_master"-feature, update documentation The auto_failover is a more reliable solution and should be used instead. Several unused parameters were removed, although they can still be defined in the config file. Updated documentation on the relevant parts.	2018-07-16 15:58:16 +03:00
Markus Mäkelä	77a1417479	Replace TR1 headers with standard headers Now that the C++11 standard is the default one, we can remove the TR1 headers and classes.	2018-07-11 14:08:46 +03:00

86 Commits