MaxScale

Author	SHA1	Message	Date
Esa Korhonen	c2c898ee93	Fix formatting in MariaDB Monitor	2018-01-16 13:27:20 +02:00
Esa Korhonen	ff2ad05d0a	Add manual rejoin command to MariaDB Monitor The rejoin command takes as parameters the monitor name and name of server to rejoin. This change required refactoring the rejoin code.	2018-01-16 13:20:35 +02:00
Esa Korhonen	5f4db64ac7	Better timing for switchover, check slaves for IO/SQL errors Time elapsed is now properly tracked during a switchover. After slave redirection, an event is added to the master. Then, the slaves are queried repeatedly until they advance to the newest event. I/O and SQL errors are also detected.	2018-01-08 15:23:25 +02:00
Esa Korhonen	047c08f577	MXS-1588: Wait on all slaves during switchover During switchover, MASTER_GTID_WAIT is now called on all slaves. This causes switchover to complete more slowly than before but is safer if log_slave_updates is not enabled on the new master server. Also, read_only is disabled on the demoted server if waiting on slaves or promotion fails. This should effectively cancel the failover for the old master.	2018-01-03 12:52:33 +02:00
Johan Wikman	9558addbfe	Update module name for mariadbmon	2018-01-02 11:01:28 +02:00
Johan Wikman	d4f9cb661f	MXS-1587 Rename mysqlmon to mariadbmon 'mysqlmon' is still accepted but 'mariadbmon' is loaded instead. This is done at runtime instead of e.g. by using a symbolic link, so that a warning can be logged. The warning is logged and the translation of the module name is made by the code that loads the modules so that it's easy to do the same thing for other modules as well. In a subsequent commit the documentation is updated.	2017-12-27 11:22:27 +02:00
Esa Korhonen	3ccd6eed28	MXS-1588 Fix switchover Change the ordering of the two flushes such that FLUSH LOGS comes last. This seems to make sure gtids are updated to the newest values before the MASTER_GTID_WAIT call. Without this fix, switchover does complete successfully, but some of the slaves may not be able to replicate because they do not have the same events as the new master. The exact reason for this is still unclear.	2017-12-22 13:35:36 +02:00
Markus Mäkelä	79652301d8	Fix split of mysqlmon sources For some reason, the source code of mysqlmon was split into C and C++ sources. This caused problems by effectively discarding all changes from 2.1 that are merged into 2.2. This commit merges the changes that were added to the wrong file into the correct file.	2017-12-21 16:24:03 +02:00
Esa Korhonen	7ed9172496	MXS-1533: Rejoin also when the old master cannot be connected Previously, the rejoin would only be run on servers with a connected slave I/O thread. This patch also runs the rejoin on slaves which cannot connect to a downed old master while the master hostname or port differs from the current cluster master server.	2017-12-18 13:04:47 +02:00
Johan Wikman	b6e983c0b5	Change default value of detect_standalone_master The default value was changed from false to true.	2017-12-13 13:18:44 +02:00
Markus Mäkelä	79afaa447e	Merge branch '2.1' into 2.2	2017-12-12 13:23:02 +02:00
Esa Korhonen	b29a8eb4f8	MXS-1513: Flush logs before tables during switchover_demote_master In this order, the new binary log file will have 1 event instead of 0. In total, only 1 event is added.	2017-12-08 12:28:50 +02:00
Esa Korhonen	c6daf8c26b	MXS-1513: More accurate error messages The failed query is now printed.	2017-12-07 10:54:30 +02:00
Esa Korhonen	b2bc087508	MXS-1491: Print failcount Failcount is now printed in "show monitors". Also, when master goes down but failover does not yet happen because failcount > 1, a message is logged.	2017-12-07 10:05:45 +02:00
Esa Korhonen	046ed5c93d	MXS-1513: mysql_mon.cc formatting changes Ran astyle, cut some long lines.	2017-12-04 13:53:16 +02:00
Esa Korhonen	c0ab80e459	MXS-1513: Cleanup some messages, change function names No real functionality changes.	2017-12-04 12:53:09 +02:00
Esa Korhonen	45834a89b5	MXS-1533: Rename "auto_join" to "auto_rejoin"	2017-12-04 09:59:29 +02:00
Esa Korhonen	508ce3a703	MXS-1491: Failover can be executed manually Also, renamed config setting "failover" to "auto_failover". Removed setting "switchover" as it is now always enabled.	2017-12-04 09:41:00 +02:00
Esa Korhonen	90f6d78a58	MXS-1533: Add automatic join feature When enabled, the monitor will redirect servers to replicate from the current master. Standalone servers and servers replicating from a slave are redirected.	2017-12-04 09:37:16 +02:00
Esa Korhonen	4cb50f48ad	MXS-1533: Fix relay master identification and root master detection Relay master servers must now have a running slave. Also, fix cluster master detection in get_replication_tree().	2017-11-30 16:16:19 +02:00
Markus Mäkelä	d5d41349ae	MXS-1509: Add `ignore_external_masters` parameter The new parameter allows ignoring of master servers that are external to the monitor configuration. This allows sub-trees of the actual replication tree to be used as fully fledged replication trees.	2017-11-30 12:39:00 +02:00
Esa Korhonen	23cd294dad	MXS-1533: Better handling of multi-domain gtids If the gtid_domain_pos of the master is ever modified, gtid-variables will have multiple domains. Generally, we are only interested in the most recent domain. This is tracked in gtid_domain_id:s and the value of the master is used for filtering the correct domain from all gtid-values. Also, use gtid_current_pos instead of gtid_slave_pos. The advantage of current_pos is that the same variable works also for master servers. The gtid-handling is now more thorough and detects some weird situations.	2017-11-27 16:31:13 +02:00
Esa Korhonen	15e330e127	MXS-1533: Save gtid_domain_id and server version to MySqlServerInfo Cleaned the surrounding code, as it was querying server version twice.	2017-11-27 16:30:43 +02:00
Esa Korhonen	dc2286c774	MXS-1513: Obey user-given new master server If given a readily selected master server, switchover will use it as the new master. If the given server is invalid, nothing will happen and an error is returned.	2017-11-23 13:37:42 +02:00
Markus Mäkelä	396b81f336	Fix in-source builds The internal header directory conflicted with in-source builds causing a build failure. This is fixed by renaming the internal header directory to something other than maxscale. The renaming pointed out a few problems in a couple of source files that appeared to include internal headers when the headers were in fact public headers. Fixed maxctrl in-source builds by making the copying of the sources optional.	2017-11-22 18:40:18 +02:00
Esa Korhonen	8077d97e25	MXS-1513: Work around the backend_read_timeout-setting The setting limits the maximum time a MASTER_GTID_WAIT-function can wait. To work around this limitation, the function is now called in a loop such that the total timeout is approximately equal to the requested timeout.	2017-11-20 12:31:11 +02:00
Esa Korhonen	59616b5f3e	MXS-1490: Cleanup error printing, add json error Slave redirection is a special case, as there the total failure is only known after all redirects have been attempted. In the failure case, all errors from connections are gathered to one message.	2017-11-20 11:33:47 +02:00
Esa Korhonen	84d1ea0bff	MXS-1490: Perform failover only after failcount monitor loops The same failcount variable is used for the detect_standalone_master feature.	2017-11-17 10:12:33 +02:00
Esa Korhonen	b63c6504a3	MXS-1513: Switchover script First version of switchover script. Unsafe to run as it has no timeouts for most queries. Also, removed code launching the previous switchover_script.	2017-11-16 10:51:12 +02:00
Markus Mäkelä	f41111b4bd	MXS-1517: Retain stale master bit even on master failure If a server goes down and it has the stale master bit enabled, all other bits for the server are cleared. This allows failed masters that have been replaced to be first detected and then reintroduced into the replication topology.	2017-11-14 16:53:09 +02:00
Esa Korhonen	3a13469691	MXS-1490 Fix bug with gtid_io_pos change check The conditional was opposite to intention.	2017-11-08 10:46:51 +02:00
Esa Korhonen	a1a5947d61	MXS-1490: Parse Gtid_IO_Pos only when using Gtid First check "Using_Gtid", as that should be always valid. If set to "Slave_Pos", parse "Gtid_IO_Pos".	2017-11-08 10:46:51 +02:00
Johan Wikman	4cf01fa88f	Remove 'failover_script' parameter As the failover is now internal to MySQL Monitor, no failover script parameter is needed.	2017-11-07 16:05:44 +02:00
Markus Mäkelä	dce073a684	MXS-1496: Don't assign slave status for masters The slave and stale slave status bits should be cleared from a master if it still has them. Also used the correct functions to manipulate the bits instead of directly setting them in the monitor.	2017-11-07 15:52:28 +02:00
Esa Korhonen	84e95cee96	MXS-1490: Query gtid_slave_pos only during failover The value of the global gtid_slave_pos is only needed during failover, so querying it every monitor loop is unnecessary. The value is now only requested when deciding on a new master server or when waiting for the selected promotion target to clear its relay logs. Also, when waiting for the logs to clear, gtid_io_pos must stay constant or failover is cancelled. Io_pos advancing indicates that the server is still receiving events from the old master.	2017-11-07 13:09:51 +02:00
Esa Korhonen	0bb54511b7	MXS-1490: Query binlog & gtid settings, read @@gtid_slave_pos The Gtid_Slave_Pos returned by SHOW ALL SLAVES STATUS is not quite reliable (MDEV-14182) so the variable version is used instead. Added a convenience function for querying a single row of values. Also, gtid_strict_mode, log_bin and log_slave_updates are now queried during failover. The first only causes a warning message if disabled, the last two affect new master selection.	2017-11-06 12:23:35 +02:00
Johan Wikman	2115ad7911	Make lines <= 110 chars long	2017-11-02 09:29:24 +02:00
Esa Korhonen	e79a95cd96	MXS-1490: Parse Gtid-strings with multiple triplets Gtid_Slave_Pos may contain multiple triplets even with single-source replication if the domain has changed at some point. For failover, we only need to know the current domain values, so the gtid-parsing now accepts an optional domain parameter. The Gtid-class still only stores one triplet of values. When parsing the Show Slave Status result, Gtid_IO_Pos is parsed first. The resulting domain is then read from Gtid_Slave_Pos.	2017-11-01 14:43:13 +02:00
Esa Korhonen	0f2c1ff7d6	MXS-1490: Wait for a slave to clear relay logs before promotion When selecting the new master server, Gtid_IO_Pos is checked to select the slave with the latest event in its relay log. If there is a tie, the slave that has processed the most events wins. It's possible that the winning slave has unprocessed events. In this case, failover waits for the slave to complete processing the log. The maximum wait is defined in the monitor parameter "failover_timeout", defaulting to 90 seconds. If time runs out, failover ends in failure. The Gtid struct was separated into its own definition to make gtids easier to handle.	2017-10-31 18:27:16 +02:00
Esa Korhonen	41cd0cd6d7	MXS-1490 Separate SlaveStatus information to its own class The SlaveStatus info is now in a separate class, although it's still embedded in the MYSQL_SERVER_INFO-class. Both classes now use strings instead of char*:s.	2017-10-30 10:33:41 +02:00
Markus Mäkelä	c7c670930c	MXS-1493: Check that master appears dead before verifying it Before the verification of the master's failure is done, the master must first appear to have failed.	2017-10-27 15:31:46 +03:00
Markus Mäkelä	0bc439641a	Add helper function for reading values by field name The helper function provides map-like access to row values. This is used to retrieve the values for all MariaDB 10.0+ versions as there are differences in the returned results between 10.1 and 10.2.	2017-10-27 15:31:46 +03:00
Markus Mäkelä	2d1e5f46fa	Remove use of timestamps in failover code Using timestamps to detect whether MaxScale was active or passive can cause problems if multiple events happen at the same time. This can be avoided by separating events into actively observed and passively observed events. This clarifies the logic by removing the ambiguity of timestamps. As the monitoring threads are separate from the worker threads, it is prudent to use atomic operations to modify and read the state of MaxScale. This will impose a happens-before relation between MaxScale being set into passive mode and events being classified as being passively observed.	2017-10-27 15:31:46 +03:00
Markus Mäkelä	52473c379b	Extract Gtid_Slave_Pos in mysqlmon The string form value of Gtid_Slave_Pos is extracted into different integer components.	2017-10-27 15:31:46 +03:00
Markus Mäkelä	0be39b8545	MXS-1493: Improve master failure detection The master failure can now be verified by checking when the slaves are connected to the master. If the slaves do not receive any events from the master, the connections are considered as down after a configurable limit. Added two parameters for controlling whether the check is done and for how long the monitor waits before doing the failover.	2017-10-27 15:31:18 +03:00
Markus Mäkelä	26b47d0b90	MXS-1493: Collect slave heartbeats The slave heartbeat count and period are collected from the SHOW ALL SLAVES STATUS output. This, in addition to the relay log position, is used to calculate the point in time when a slave has last interacted with the master. By using this timestamp, the monitor can enforce a minimum "timeout" for the master before a failover is performed.	2017-10-27 15:30:38 +03:00
Esa Korhonen	48a15368d0	MXS-1490-1492: First version of failover script Works in ideal situations and can be tested. Does not consider relay log and only checks that commands were received by a backend. Work in progress.	2017-10-27 10:54:50 +03:00
Markus Mäkelä	114ea49e10	MXS-1494: Add missing replication credentials parameters The parameters weren't added to the list of module parameters.	2017-10-26 17:37:02 +03:00
Esa Korhonen	63c7550196	MXS-1490 Prepare for failover functionality addition Moved mon_process_failover() from monitor.cc to mysql_mon.cc. Renamed some functions and variables related to previous failover functionality to avoid confusion.	2017-10-25 12:24:29 +03:00
Markus Mäkelä	554ae642d7	MXS-1495: Add failover sanity check The sanity check disables the failover functionality if a server is configured to replicate from more than one source.	2017-10-24 23:45:23 +03:00
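
The multi-domain GTID parsing and promotion-target selection described in commits e79a95cd96 and 0f2c1ff7d6 can be illustrated with a minimal Python sketch. This is not the MaxScale implementation (which lives in the monitor's C++ sources); the names `Gtid`, `parse_gtid` and `pick_promotion_target` are hypothetical, and the sketch assumes MariaDB's `domain-server_id-sequence` GTID text format and that all candidate slaves replicate in the same domain.

```python
from typing import Dict, NamedTuple, Optional, Tuple


class Gtid(NamedTuple):
    """One GTID triplet (hypothetical helper type, not MaxScale's Gtid class)."""
    domain: int
    server_id: int
    sequence: int


def parse_gtid(text: str, domain: Optional[int] = None) -> Optional[Gtid]:
    """Return the triplet for `domain`, or the first triplet if no domain is given.

    `text` may hold several comma-separated triplets, as Gtid_Slave_Pos does
    after the replication domain has changed.
    """
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        fields = part.split("-")
        if len(fields) != 3 or not all(f.isdecimal() for f in fields):
            return None  # malformed input: treat the whole string as unusable
        gtid = Gtid(*map(int, fields))
        if domain is None or gtid.domain == domain:
            return gtid
    return None


def pick_promotion_target(slaves: Dict[str, Tuple[str, str]]) -> Optional[str]:
    """Pick the slave with the latest relay log event; ties go to the slave
    that has processed the most events.

    `slaves` maps a server name to (Gtid_IO_Pos, Gtid_Slave_Pos) strings.
    Assumption: every slave is in the same domain, so sequences are comparable.
    """
    best_name, best_key = None, None
    for name, (io_pos, slave_pos) in slaves.items():
        received = parse_gtid(io_pos)  # Gtid_IO_Pos is parsed first
        if received is None:
            continue
        # The domain found in Gtid_IO_Pos selects the triplet from Gtid_Slave_Pos.
        processed = parse_gtid(slave_pos, received.domain)
        if processed is None:
            continue
        key = (received.sequence, processed.sequence)
        if best_key is None or key > best_key:
            best_name, best_key = name, key
    return best_name
```

If the chosen slave's processed sequence is behind its received sequence, the commit message states that failover waits for the relay log to clear, up to `failover_timeout`; that waiting step is left out of this sketch.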


446 Commits