137 Commits

Author SHA1 Message Date
Esa Korhonen
6b14479b6c MXS-2271 Rename MXS_MONITORED_SERVER to MonitorServer 2019-03-19 13:32:38 +02:00
Esa Korhonen
14b4fa632a MXS-2271 Move Monitor inside maxscale-namespace
Rearranged monitor.cc by namespace.
2019-03-15 12:57:35 +02:00
Esa Korhonen
f05a2317d9 Merge branch '2.3' into develop 2019-03-12 11:22:34 +02:00
Esa Korhonen
040562f718 MXS-2342 Run MariaDBMonitor diagnostics concurrent with the monitor loop
This fixes some situations where MaxAdmin/MaxCtrl would block and wait
until a monitor operation or tick is complete. This also fixes a deadlock
caused by calling monitor diagnostics inside a monitor script.

Concurrency is enabled by adding one mutex per server object to protect
array-like fields from concurrent reading/writing.
2019-03-12 10:50:16 +02:00
Esa Korhonen
a8949b2560 MXS-2271 Move free monitor functions into classes
Functions are divided to MonitorManager, Monitor, or the monitored
server.
2019-03-12 10:29:55 +02:00
Esa Korhonen
8abe1c8448 Merge branch '2.3' into develop 2019-03-11 15:46:18 +02:00
Esa Korhonen
50f588db3e MXS-2370 Clarify query timeout warning message
The message now more clearly states if the failure was due to timeout or
a different Connector-C error.
2019-03-11 13:20:50 +02:00
Esa Korhonen
83fc3b1bc2 Merge branch '2.3' into develop 2019-03-04 17:43:53 +02:00
Esa Korhonen
4fd4b726a1 MXS-2325 Only enable events that were enabled on the master
The monitor now continuously updates a list of enabled server events. When
promoting a new master in failover/switchover, only events that were enabled
on the previous master are enabled on the new. This avoids enabling events
that may have been disabled on the master yet stayed in the SLAVESIDE_DISABLED-
state on the slave.

In the case of reset-replication command, events on the new master are only
enabled if the monitor had a master when the command was launched. Otherwise
all events remain disabled.
2019-03-04 16:00:07 +02:00
Esa Korhonen
17fc2ba88a Store error status into QueryResult object
The QueryResult-object remembers if a conversion failed. This makes checking
for errors more convenient, as just one check per row is required. The conversion
functions always return a valid value.
2019-01-22 15:34:19 +02:00
Esa Korhonen
b4d91d4b9a Move query result helper class to maxsql
Added some asserts to ensure the class is used correctly.
2019-01-14 10:43:17 +02:00
Esa Korhonen
cf9ac9004e Fix compilation error
A variable had the same name as a type, which confused some compilers.
2019-01-04 10:40:11 +02:00
Esa Korhonen
40485d746c MXS-2220 Change server name to constant string 2019-01-03 12:13:15 +02:00
Esa Korhonen
09aa54720d MXS-2220 Read server version using public methods
Version related fields have been removed from the public class.
2019-01-03 11:23:14 +02:00
Esa Korhonen
eacf88f6a5 MXS-2220 Add server version and type information struct
The old fields are still used.
2018-12-19 13:18:16 +02:00
Esa Korhonen
6209d737e9 MXS-2220 server_alloc returns internal type
Also adds default initializers to SERVER fields.
2018-12-14 10:31:57 +02:00
Esa Korhonen
3e5818fcb6 MXS-2205 Convert mysql_utils.h to .hh 2018-12-03 14:05:21 +02:00
Esa Korhonen
0c7e737eb7 Move some mysql/mariadb utilities to maxutils
Can be used in system tests later on.
2018-11-30 13:03:37 +02:00
Esa Korhonen
a1e1ac0012 Move string_printf to maxbase
Can be used in tests.
2018-11-29 12:21:40 +02:00
Johan Wikman
55a39268c6 MXS-2163 Make MariaDBMon recognize Clustrix
To allow MariaDBMon to be used with Clustrix we need to handle
Clustrix separately as its apparent version is 5.0.45, which is
lower than what MariaDBMon supports. Further, we must ensure that
Clustrix does not query the slave status as there are no slaves
in the M/S sense in a Clustrix cluster.

NOTE: Once there is a specific Clustrix monitor, this code should
be removed.
2018-11-28 15:27:23 +02:00
Esa Korhonen
b47d4a4105 Use max_statement_time
Used only with supporting server versions. Using the time limit ensures that
the server interrupts the query at the same point Connector-C would cut the
connection. This prevents lingering queries.

Also, cleans up some associated error messages.
2018-11-23 15:15:44 +02:00
Esa Korhonen
0a5a3309e0 Add missing quotes when printing server names
Some of the log messages didn't have the quotes.
2018-11-23 14:02:09 +02:00
Esa Korhonen
fb52e565fe Store capabilities of monitored server
Checking the version number in various places in the code gets confusing.
It's better to check the version in one place and record the relevant data.
2018-11-21 17:36:52 +02:00
Esa Korhonen
01628dd0de Cleanup server version updating 2018-11-21 17:36:52 +02:00
Esa Korhonen
1a046bd453 MXS-2168 Add 'assume_unique_hostnames'-setting to MariaDBMonitor
Adds the setting and takes it into use during replication graph creation
and the most important checks.
2018-11-21 10:30:11 +02:00
Esa Korhonen
6a1cfddb43 MXS-2158 Clean up gtid updating during rejoin
Error messages from update_gtids() are now printed. can_replicate_from()
no longer updates gtid:s.
2018-11-16 12:56:24 +02:00
Esa Korhonen
14e38e4e08 MXS-2158 Return true if update_gtids() succeeds, even if no data is returned
Previously, if the server had no gtid:s, the method would fail leading to
a confusing error message. This could even totally stop the monitor from working
if a recent server version (10.X) did not have any gtid events.
2018-11-14 10:56:42 +02:00
Esa Korhonen
fb3ccc94d6 MXS-1944 Cleanup function parameters
The naming and ordering is now a bit more consistent between promote() and demote().
2018-11-07 12:55:59 +02:00
Esa Korhonen
e4e2235297 MXS-1944 Use time limited methods in rejoin
Uses switchover time limit, since the typical rejoin of a standalone server
is somewhat similar to a switchover.
2018-11-07 12:55:59 +02:00
Esa Korhonen
a4ce4e4613 MariaDBServer no longer uses ClusterOperation
The functions in the server class now only use the general parameters object.
2018-11-07 12:55:59 +02:00
Esa Korhonen
8877e7180b Continue separation of ClusterOperation elements 2018-11-07 12:55:59 +02:00
Esa Korhonen
90e6ff078a Divide ClusterOperation to two types
The main class was getting unwieldly and too general. Dividing the fields
helps adding support for other operation types.

This commit leaves most data duplicated, later commits clean up the affected code.
2018-11-07 12:55:59 +02:00
Esa Korhonen
c5a54d2fe9 Disallow switchover promotion of a server low on disk space
This protects the user and also prevents a neverending series of
automatic switchovers in the case when all servers are low.
2018-11-06 11:44:50 +02:00
Markus Mäkelä
906d8cee5b
Format all files
Formatted all files with uncrustify.
2018-10-31 09:46:02 +02:00
Esa Korhonen
36b666898c Fix connection merging
The conditional was inverted.
2018-10-16 16:09:38 +03:00
Esa Korhonen
2d61b78439 Fix low disk space maintenance
The setting didn't work because the code updated a status flag which
would be overwritten before being read. Also, promotion code now checks
that the server is not in maintenance.
2018-10-16 16:09:38 +03:00
Esa Korhonen
0c203fa02d Don't redirect duplicate connections
The redirection method checks if a slave connection to the redirection
target already exists. If so, the connection is not modified. Also, failover
better detects duplicate connections during promotion.
2018-10-16 16:09:38 +03:00
Esa Korhonen
e930270b9c Use copy when checking removed connections
The function modifies the reference parameter contents indirectly.
2018-10-16 16:09:38 +03:00
Esa Korhonen
f554ef770b Allow switchover for arbitrary topologies
The demoted server no longer needs to be the master.
2018-10-16 16:09:38 +03:00
Esa Korhonen
0cf8ea43f7 Redirect slaves of promotion target
This affects situations where the promoted server is a relay or multimaster
group member.
2018-10-16 16:09:38 +03:00
Esa Korhonen
d0444ff054 SlaveStatus::to_short_string() uses member field
The owner server name is now stored in a field.
2018-10-10 17:26:48 +03:00
Esa Korhonen
2f1512a22d Cleanup slave connection removal during promotion/demotion
The removing and slave status updating is now separated to a function.
As the MariaDBServer object now contains the updated slave connections,
keeping track of removed connections is no longer required.
2018-10-09 14:29:49 +03:00
Esa Korhonen
c10fab977d Cleanup slave connection copy & merge
The two cases are now separated. In switchover, the promotion and
demotion targets can swap connections between each other without worry.
In failover, the two connection lists must be merged semi-intelligently.

The slave connections of the two servers are now saved to the operation
descriptor object at the start of the operation. This allows slave status
updating during the operation.
2018-10-09 14:29:49 +03:00
Esa Korhonen
4d6f961695 Cleanup mariadbserver.hh 2018-10-09 14:29:49 +03:00
Esa Korhonen
68d65682b5 Reorganize MariaDBServer code
The server-class keeps growing, so the additional classes are moved out of
the main class file.
2018-10-09 14:29:49 +03:00
Esa Korhonen
a398da58a4 Add sleep to execute_cmd_time_limit
If the query fails instantly, the retries end up busy-looping. Now
each try is at least one second.
2018-10-04 20:19:57 +03:00
Esa Korhonen
30eb21914f MXS-1845 Switchover cleanup
Several small changes:
Binlog is flushed at the end of old master demotion.
Only new master is required to catch up to old master.
Use the same replication check method as failover.
2018-10-04 11:45:33 +03:00
Esa Korhonen
49e85d9a28 MXS-1845 Add demotion code
The master demotion in switchover now uses query retrying with
the switchover time limit.
2018-10-04 11:45:33 +03:00
Esa Korhonen
d14b9bfe43 MXS-1845 Cluster stabilization rewrite
No longer writes events to the master, as this creates problems if the
promoted server was not the overall master. Instead, the slave status
output is inspected.
2018-10-02 11:09:16 +03:00
Esa Korhonen
1ca5d02abb MXS-1845 Add redirection code
Should work with multimaster replication.
2018-10-02 11:09:16 +03:00