Commit Graph

862 Commits

Author SHA1 Message Date
7a5f11b752 Fix wrong check for wsrep_ready
wsrep_ready was checked for ON/YES/1/true, but it has to be checked for OFF/NO/0/false because we are removing nodes, not joining them.
2019-04-25 07:45:09 +03:00
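A minimal sketch of the corrected check described in the commit above, written in Python rather than taken from the monitor's source. The helper name `is_not_ready`, the case-insensitive comparison and the whitespace stripping are assumptions for illustration; only the OFF/NO/0/false value set comes from the commit message.

```python
# Values that mark a node as NOT ready (taken from the commit message).
NOT_READY_VALUES = {"off", "no", "0", "false"}


def is_not_ready(wsrep_ready: str) -> bool:
    """Return True if the wsrep_ready value says the node is not ready.

    Hypothetical helper: case-insensitive matching and whitespace
    stripping are assumptions, not confirmed monitor behaviour.
    """
    return wsrep_ready.strip().lower() in NOT_READY_VALUES
```

A node would be a candidate for removal when `is_not_ready(value)` is true, which is the opposite of the ON/YES/1/true test used when joining.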
9f7a7e473e Enable galeramon to track wsrep_desync, wsrep_ready, wsrep_sst_donor_rejects_queries and wsrep_reject_queries 2019-04-25 07:45:09 +03:00
a4c6f3542a MXS-2315: Tokenized CS version extraction
The STL regex implementations have proven to be unreliable on older
systems and replacing the regex with hand-written code for version
extraction is less prone to break.
2019-04-17 11:17:33 +03:00
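The idea of replacing a regex with hand-written tokenizing can be sketched as follows. This is an illustrative Python version, not the code from the commit: the function name `parse_version` and the assumed input shape (dotted numbers with an optional trailing suffix) are hypothetical.

```python
def parse_version(text: str) -> tuple:
    """Extract the leading numeric components of a dotted version string.

    Hypothetical sketch: splits on '.', keeps the leading ASCII digits of
    each token, and stops at the first token that carries a suffix or has
    no digits at all.
    """
    parts = []
    for token in text.strip().split("."):
        digits = ""
        for ch in token:
            if ch in "0123456789":
                digits += ch
            else:
                break
        if not digits:
            break  # token does not start with a number
        parts.append(int(digits))
        if len(digits) != len(token):
            break  # a trailing suffix ends the numeric part
    return tuple(parts)
```

Because the result is a tuple of integers, two versions can be compared directly, for example `parse_version(found) >= (1, 2, 0)`, where the threshold is only an example value.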
2ca9337da1 Merge branch '2.2' into 2.3 2019-04-16 16:34:57 +03:00
5ba305c2c1 MXS-2426 Do not permanently disable automatic cluster operations when they fail
The operations are now disabled only for "failcount" monitor ticks. Also turns some related log
messages into notices.
2019-04-16 11:26:34 +03:00
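The temporary disabling described above can be pictured as a countdown over monitor ticks. The sketch below is an assumption about the general shape of such logic, not the monitor's implementation; the class and method names are hypothetical.

```python
class AutoOperationGuard:
    """Hypothetical guard: blocks automatic operations for `failcount`
    monitor ticks after a failed operation, then allows them again."""

    def __init__(self, failcount: int):
        self.failcount = failcount
        self.ticks_left = 0  # 0 means automatic operations are allowed

    def on_failure(self) -> None:
        # A failed operation starts (or restarts) the countdown.
        self.ticks_left = self.failcount

    def tick(self) -> None:
        # Called once per monitor tick.
        if self.ticks_left > 0:
            self.ticks_left -= 1

    def enabled(self) -> bool:
        return self.ticks_left == 0
```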
f8a22d0ac0 MXS-2344 Add setting for enabling SSL for replication
If the monitor setting "replication_master_ssl" is set to on, any CHANGE MASTER TO command
will have MASTER_SSL=1. If set to off or unset, MASTER_SSL is left unchanged to match existing
behaviour.
2019-04-15 19:15:45 +03:00
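A rough sketch of how such a setting could affect the generated command. The function name `change_master_cmd` and its parameters are hypothetical, and the command is reduced to host and port for brevity; the real monitor sets more options. No quoting or escaping is done here, so this is for illustration only.

```python
def change_master_cmd(host: str, port: int, replication_master_ssl: bool = False) -> str:
    """Build a simplified CHANGE MASTER TO command (illustrative only)."""
    options = [f"MASTER_HOST = '{host}'", f"MASTER_PORT = {port}"]
    if replication_master_ssl:
        # Added only when the setting is on; otherwise MASTER_SSL is not
        # mentioned, so the server keeps whatever value it already had.
        options.append("MASTER_SSL = 1")
    return "CHANGE MASTER TO " + ", ".join(options) + ";"
```

Leaving the option out entirely when the setting is off, instead of emitting `MASTER_SSL = 0`, is what keeps the existing behaviour unchanged.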
74c888316e Fix csmon version check
The version check still assumed that 1.1.7 has the required functionality.
2019-03-25 18:48:26 +02:00
5cdba97ec7 Merge commit '216eb904c557509ea5a3216e68e274df957ab807' into 2.3 2019-03-22 10:48:31 +02:00
216eb904c5 MXS-1991 Allow replication_user and replication_password to be set at runtime
Because runtime changes are performed one at a time, adding replication credentials
to a mariadbmon which didn't have any would cause an error to be printed, and
the monitor would not start.

This is now fixed by allowing replication_user without replication_password. This
is not an ideal solution as a configuration file with only replication_user would be
accepted. Also, when adding the credentials to a monitor, replication_user must be
given first to avoid the error.
2019-03-21 17:06:24 +02:00
040562f718 MXS-2342 Run MariaDBMonitor diagnostics concurrent with the monitor loop
This fixes some situations where MaxAdmin/MaxCtrl would block and wait
until a monitor operation or tick is complete. This also fixes a deadlock
caused by calling monitor diagnostics inside a monitor script.

Concurrency is enabled by adding one mutex per server object to protect
array-like fields from concurrent reading/writing.
2019-03-12 10:50:16 +02:00
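The per-server mutex approach can be sketched like this. It is a Python illustration of the general pattern (one lock per server object, copies handed out under the lock), not the monitor's C++ code; the class and field names are hypothetical.

```python
import threading


class ServerStatus:
    """Hypothetical server object with one lock guarding its list field."""

    def __init__(self, name: str):
        self.name = name
        self._lock = threading.Lock()
        self._slave_connections = []

    def update_connections(self, connections) -> None:
        # Called from the monitor loop.
        with self._lock:
            self._slave_connections = list(connections)

    def diagnostics(self) -> dict:
        # Called from an admin thread; returns a copy so the caller
        # never reads the list while the monitor loop rewrites it.
        with self._lock:
            return {
                "name": self.name,
                "slave_connections": list(self._slave_connections),
            }
```

Since the lock is held only while copying, a diagnostics call does not have to wait for a whole monitor tick to finish.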
c8078c99e5 MXS-2325 Fix Debian 8 compilation 2019-03-11 14:39:02 +02:00
50f588db3e MXS-2370 Clarify query timeout warning message
The message now more clearly states if the failure was due to timeout or
a different Connector-C error.
2019-03-11 13:20:50 +02:00
6332f0876b Merge remote-tracking branch 'origin/2.3' into 2.3 2019-03-05 04:59:26 +02:00
4fd4b726a1 MXS-2325 Only enable events that were enabled on the master
The monitor now continuously updates a list of enabled server events. When
promoting a new master in failover/switchover, only events that were enabled
on the previous master are enabled on the new. This avoids enabling events
that may have been disabled on the master yet stayed in the SLAVESIDE_DISABLED
state on the slave.

In the case of reset-replication command, events on the new master are only
enabled if the monitor had a master when the command was launched. Otherwise
all events remain disabled.
2019-03-04 16:00:07 +02:00
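The selection rule for events can be sketched as a simple filter. This is an illustrative Python version under assumed data shapes: the function name and the dictionary mapping event names to status strings are hypothetical.

```python
def events_to_enable(master_enabled: set, new_master_events: dict) -> list:
    """Pick the events to enable on a promoted server.

    master_enabled:    names of events that were enabled on the old master
                       (an empty set if the monitor had no master).
    new_master_events: event name -> status string on the promoted server.
    """
    return sorted(
        name
        for name, status in new_master_events.items()
        if status == "SLAVESIDE_DISABLED" and name in master_enabled
    )
```

With an empty `master_enabled` set the result is empty, which matches the reset-replication case where all events remain disabled.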
7904cdaefb Fix assume_unique_hostnames
It was always set to true when the servers were created.
2019-03-04 08:53:12 +02:00
5e19d1d044 MXS-2315: Use BRE with std::regex
The default ECMAScript syntax appears to be broken on CentOS 7 which
effectively prevents its use in most cases. A more reliable alternative
would be to use the bundled PCRE2 library but the basic POSIX regular
expressions seem to work.
2019-02-06 12:11:06 +02:00
8cef8b9472 Compile MariaDBMonitor unit tests only if flag is set 2019-01-15 15:44:21 +02:00
8c437c6440 Fix crash in galeramon
A std::string was assigned a null value, which causes a crash.
2019-01-04 12:00:48 +02:00
f87ff431c1 Merge branch '2.2' into 2.3 2018-11-27 11:46:47 +02:00
f41caae5c7 MXS-2175: Fix available_when_donor
If a Galera cluster drops down to a single node, the last node would not
be considered valid. During the failure of the second to last node, the
master would also temporarily lose the master status.

The behavior was changed to always keep the cluster UUID until the cluster
size drops down to zero. This guarantees that the same cluster is used as
long as possible.
2018-11-27 09:22:39 +02:00
b47d4a4105 Use max_statement_time
Used only with supporting server versions. Using the time limit ensures that
the server interrupts the query at the same point Connector-C would cut the
connection. This prevents lingering queries.

Also, cleans up some associated error messages.
2018-11-23 15:15:44 +02:00
0a5a3309e0 Add missing quotes when printing server names
Some of the log messages didn't have the quotes.
2018-11-23 14:02:09 +02:00
fb52e565fe Store capabilities of monitored server
Checking the version number in various places in the code gets confusing.
It's better to check the version in one place and record the relevant data.
2018-11-21 17:36:52 +02:00
01628dd0de Cleanup server version updating 2018-11-21 17:36:52 +02:00
5d4775cac1 MXS-2168 Update test_cycle_find
The test now uses both server IDs and hostname:port combinations.
2018-11-21 10:40:21 +02:00
90da3a4d90 Remove MXS_MONITORED_SERVER mapping from MariaDBMon
The mapping was rarely used.
2018-11-21 10:30:11 +02:00
1a046bd453 MXS-2168 Add 'assume_unique_hostnames'-setting to MariaDBMonitor
Adds the setting and takes it into use during replication graph creation
and the most important checks.
2018-11-21 10:30:11 +02:00
bba0bc0f31 MXS-2158 Relax requirements for manual rejoin
The operation is now allowed even if the rejoining server has empty GTIDs.
Auto-rejoin keeps the safeties on.
2018-11-16 13:03:30 +02:00
6a1cfddb43 MXS-2158 Clean up gtid updating during rejoin
Error messages from update_gtids() are now printed. can_replicate_from()
no longer updates GTIDs.
2018-11-16 12:56:24 +02:00
a377a9fc5a Add gtid event in reset-replication
Adds a "FLUSH TABLES" command at the end so that the new master has a non-empty
gtid_binlog_pos after the operation.
2018-11-14 11:01:48 +02:00
14e38e4e08 MXS-2158 Return true if update_gtids() succeeds, even if no data is returned
Previously, if the server had no GTIDs, the method would fail, leading to
a confusing error message. This could even totally stop the monitor from working
if a recent server version (10.X) did not have any gtid events.
2018-11-14 10:56:42 +02:00
ecc7442358 Detect manual commands faster
Previously, MariaDBMonitor would wait until the next monitor interval before detecting
a new manual command. The commands are now checked every 100 ms.
2018-11-08 19:12:00 +02:00
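One way to picture the faster detection is a wait that is split into 100 ms slices, with a check for pending commands between slices. The sketch below is an assumption about the general technique, not the monitor's code; the function name, its parameters and the injectable `sleep` are hypothetical.

```python
import time


def wait_for_next_tick(interval_ms, command_pending, sleep=time.sleep, poll_ms=100):
    """Wait up to interval_ms, returning early if a manual command shows up.

    command_pending is a zero-argument callable; the return value is True
    if a command was detected and False if the full interval elapsed.
    """
    waited = 0
    while waited < interval_ms:
        if command_pending():
            return True
        step = min(poll_ms, interval_ms - waited)
        sleep(step / 1000.0)  # sleep takes seconds
        waited += step
    return command_pending()
```

With a monitor interval of several seconds, a manual command is noticed after at most one slice instead of after the whole interval.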
fb3ccc94d6 MXS-1944 Cleanup function parameters
The naming and ordering are now a bit more consistent between promote() and demote().
2018-11-07 12:55:59 +02:00
e4e2235297 MXS-1944 Use time limited methods in rejoin
Uses switchover time limit, since the typical rejoin of a standalone server
is somewhat similar to a switchover.
2018-11-07 12:55:59 +02:00
184e187732 Different cluster operations use different parameter types
Only the parameters used by all operations are in the common class.
2018-11-07 12:55:59 +02:00
a4ce4e4613 MariaDBServer no longer uses ClusterOperation
The functions in the server class now only use the general parameters object.
2018-11-07 12:55:59 +02:00
8877e7180b Continue separation of ClusterOperation elements 2018-11-07 12:55:59 +02:00
90e6ff078a Divide ClusterOperation to two types
The main class was getting unwieldy and too general. Dividing the fields
helps with adding support for other operation types.

This commit leaves most data duplicated; later commits clean up the affected code.
2018-11-07 12:55:59 +02:00
11a756a028 Detect undefined references at link time
Instruct the linker to make sure all symbols are resolved at link time.
2018-11-06 21:34:28 +02:00
c5a54d2fe9 Disallow switchover promotion of a server low on disk space
This protects the user and also prevents a never-ending series of
automatic switchovers in the case when all servers are low.
2018-11-06 11:44:50 +02:00
b1c469259c MXS-1467: Create csmon
Added the csmon monitor which supports both old and new ColumnStore
servers. As older server versions aren't able to express their role, the
master needs to be designated by the user. When a ColumnStore version is
released that supports the mcsSystemPrimary() function, the master can be
automatically found.
2018-11-05 23:24:06 +02:00
84d45447fc Fix formatting 2018-11-02 11:27:37 +02:00
f34ca0d473 Fix peculiar wrapping 2018-11-01 12:39:18 +02:00
cab40c54e4 Merge branch '2.2' into 2.3 2018-11-01 10:52:42 +02:00
e1dedfb678 Update galeramon.c (#183)
* Update galeramon.c

support wsrep_sst_method "xtrabackup-v2" for available_when_donor maxscale option

* reformat line to fit <=110 chars / support xtrabackup-v2 sst method
2018-10-31 16:00:26 +02:00
906d8cee5b Format all files
Formatted all files with uncrustify.
2018-10-31 09:46:02 +02:00
36b666898c Fix connection merging
The conditional was inverted.
2018-10-16 16:09:38 +03:00
2d61b78439 Fix low disk space maintenance
The setting didn't work because the code updated a status flag which
would be overwritten before being read. Also, promotion code now checks
that the server is not in maintenance.
2018-10-16 16:09:38 +03:00
0c203fa02d Don't redirect duplicate connections
The redirection method checks if a slave connection to the redirection
target already exists. If so, the connection is not modified. Also, failover
better detects duplicate connections during promotion.
2018-10-16 16:09:38 +03:00
e930270b9c Use copy when checking removed connections
The function modifies the reference parameter contents indirectly.
2018-10-16 16:09:38 +03:00