343 Commits

Author SHA1 Message Date
b33f464eea MXS-1506: Make heartbeat reads atomic
The old hkheartbeat variable was changed to the mxs_clock() function that
simply wraps an atomic load of the variable. This allows it to be
correctly read by MaxScale as well as opening up the possibility of
converting the value load to a relaxed memory order read.

Renamed the header and associated macros. Removed inclusion of the
heartbeat header from the housekeeper header and added it to the files
that were missing it.
2018-04-10 15:29:29 +03:00
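
A minimal sketch of the idea behind this change, assuming C++11 std::atomic rather than whatever atomic primitives the project itself uses. The names heartbeat_ticks, heartbeat_tick() and clock_now() are hypothetical stand-ins, not the actual MaxScale symbols.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical stand-in for the heartbeat counter; the real variable
// and its type may differ.
static std::atomic<int64_t> heartbeat_ticks{0};

// Called by a (hypothetical) housekeeper thread to advance the clock.
inline void heartbeat_tick()
{
    heartbeat_ticks.fetch_add(1, std::memory_order_relaxed);
}

// Accessor in the spirit of mxs_clock(): readers never touch the
// variable directly, they go through an atomic load. The load uses the
// default sequentially consistent order here; since all readers go
// through this one function, it could later be switched to
// std::memory_order_relaxed in a single place.
inline int64_t clock_now()
{
    return heartbeat_ticks.load();
}
```

Wrapping the load in a function keeps the memory-order decision in one spot, which is presumably what makes the relaxed-read conversion mentioned above straightforward.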
434a71bc4d Replace std::to_string with std::stringstream
The GCC 4.4.7 implementation of std::to_string only has overloads for long
long int, long long unsigned int and long double. It's better to use
std::stringstream with this version of GCC to avoid having to write custom
code.
2018-04-09 22:35:40 +03:00
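
The replacement pattern can be sketched roughly as follows. The helper name to_str() is hypothetical; the commit does not say whether a helper was introduced or the stream was used inline.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: converts any streamable value to a string
// without relying on std::to_string, so a plain int argument is not
// ambiguous on compilers that only provide the long long and
// long double overloads.
template<class T>
std::string to_str(const T& value)
{
    std::stringstream ss;
    ss << value;
    return ss.str();
}
```

For example, to_str(42) yields "42" regardless of which std::to_string overloads the standard library ships.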
7f06c02f89 MXS-1703 Reduce use of MXS_MONITORED_SERVER in main loop
The base class should only be used with general monitor functions
and macros.
2018-04-09 14:57:16 +03:00
54121ed98b MXS-1703 Cleanup monitor interval waiting
The monitor now measures how long it should wait to get one interval.
2018-04-09 10:56:31 +03:00
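
One plausible reading of this change, sketched under the assumption that the monitor subtracts the time already spent in the loop iteration from the configured interval. The function name and the millisecond units are assumptions, not taken from the commit.

```cpp
#include <cstdint>

// Hypothetical helper: given the configured interval and the time the
// current iteration has already taken, return how long the monitor
// should still sleep. Never returns a negative value.
int64_t remaining_wait_ms(int64_t interval_ms, int64_t elapsed_ms)
{
    int64_t remaining = interval_ms - elapsed_ms;
    return remaining > 0 ? remaining : 0;
}
```

With a 2000 ms interval and 350 ms of work, the loop would sleep for 1650 ms; if the work overran the interval, it would not sleep at all.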
8b642dbb5e MXS-1703 Rename typedefs in preparation for more changes
Also moved some code around.
2018-04-09 10:55:55 +03:00
71004a0ebc MXS-1703 Rearrange functions
Some functions were moved to class methods, others were moved to a different file.
All MariaDBMonitor fields are now private. Cleaned header a bit.
2018-04-09 10:54:34 +03:00
147355bbdb MXS-1744 Use gtid querying instead of MASTER_GTID_WAIT when waiting for catchup
MASTER_GTID_WAIT uses gtid_slave_pos when comparing to the target gtid. This creates
problems with multi-domain gtids. It's simpler to just query the server for its
gtids repeatedly. Also, the method is now in MariaDBServer.
2018-04-09 10:51:16 +03:00
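
A sketch of the kind of check a polling loop could perform after each gtid query, assuming "caught up" means that every domain in the target has been reached. The DomainSeq type and has_caught_up() are hypothetical and ignore server ids for brevity; the actual comparison in MariaDBServer may differ.

```cpp
#include <cstdint>
#include <map>

// Hypothetical representation: gtid domain id -> latest sequence number.
typedef std::map<uint32_t, uint64_t> DomainSeq;

// Returns true if 'current' has reached 'target' in every domain that
// the target mentions. A domain missing from 'current' counts as not
// caught up.
bool has_caught_up(const DomainSeq& current, const DomainSeq& target)
{
    for (DomainSeq::const_iterator it = target.begin(); it != target.end(); ++it)
    {
        DomainSeq::const_iterator found = current.find(it->first);
        if (found == current.end() || found->second < it->second)
        {
            return false;
        }
    }
    return true;
}
```

The caller would query the server for its gtids, rebuild the map and call has_caught_up() repeatedly until it returns true or a timeout expires.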
174db469f3 MXS-1744 Rename and clean up gtid code, move to separate file
Renamed the two gtid-classes to better match MariaDB documentation. One
domain-server-sequence combination is now a "Gtid", as it identifies
a transaction. A "GtidList" contains a list of Gtids for handling multi-domain
server variables composed of multiple comma-separated Gtids.

Removed unused methods and renamed some existing ones. Moved the Gtid-code
to its own file.
2018-04-06 10:23:02 +03:00
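
The split between a single triplet and a comma-separated list can be illustrated as below. This is only a sketch: the struct layout, field widths and the parse_gtid() / parse_gtid_list() functions are assumptions and do not reproduce the classes in the commit.

```cpp
#include <cstdint>
#include <cstdio>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical single gtid: one domain-server-sequence triplet.
struct Gtid
{
    uint32_t domain;
    uint64_t server_id;
    uint64_t sequence;
};

// Parses one "domain-server-sequence" triplet, e.g. "0-1-100".
// Returns false if the text does not consist of exactly three numbers.
bool parse_gtid(const std::string& text, Gtid* out)
{
    unsigned long long d = 0, s = 0, q = 0;
    char trailing = 0;
    // The extra %c detects trailing garbage: a clean triplet assigns 3.
    if (std::sscanf(text.c_str(), "%llu-%llu-%llu%c", &d, &s, &q, &trailing) != 3)
    {
        return false;
    }
    out->domain = static_cast<uint32_t>(d);
    out->server_id = s;
    out->sequence = q;
    return true;
}

// Parses a comma-separated list such as "0-1-100,1-2-200".
// On failure 'out' is left unchanged.
bool parse_gtid_list(const std::string& text, std::vector<Gtid>* out)
{
    std::vector<Gtid> result;
    std::stringstream ss(text);
    std::string item;
    while (std::getline(ss, item, ','))
    {
        Gtid gtid;
        if (!parse_gtid(item, &gtid))
        {
            return false;
        }
        result.push_back(gtid);
    }
    if (result.empty())
    {
        return false;
    }
    out->swap(result);
    return true;
}
```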
e43678bed9 MXS-1744 Take new Gtid-class into use
Also cleaned up mariadbserver a bit.
2018-04-06 10:16:29 +03:00
aaa8c92886 MXS-1744 Implement a Gtid-class which can store multi-domain gtids
The operations between Gtids are now more complicated, so the class implements
them instead of the monitor. The old Gtid-container has been renamed
GtidTriplet, and only stores the values for one triplet.
2018-04-04 15:53:42 +03:00
923de851f9 MXS-1703 Move more functions to MariaDBServer
Also, the QueryResult integer reading method now only reads non-negative integers
since the server rarely returns negative values. This frees negative values for
indicating parsing errors.

Gtid-class was moved back to utility.hh/.cc because the QueryResult-class requires it.
2018-03-28 13:47:07 +03:00
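
The convention of reserving negative return values for errors might look like the following. The function name and the choice of -1 as the error value are assumptions; the QueryResult method itself is not shown in the commit message.

```cpp
#include <cerrno>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper: parses a non-negative decimal integer.
// Returns the value (>= 0) on success and -1 on any parsing error,
// including negative input, trailing characters and overflow.
int64_t parse_non_negative(const char* text)
{
    if (text == NULL || *text == '\0')
    {
        return -1;
    }
    errno = 0;
    char* end = NULL;
    long long value = std::strtoll(text, &end, 10);
    if (errno != 0 || *end != '\0' || value < 0)
    {
        return -1;
    }
    return value;
}
```

A caller can then treat any negative result as "could not parse" without a separate status flag.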
f7ea12b8e4 Merge branch '2.2' into develop 2018-03-28 13:24:54 +03:00
7209080236 MXS-1747 Improve error messages of rejoin operations
Now states which query caused the error.
2018-03-28 12:39:10 +03:00
6c32c7421b MXS-1746 Query global gtid_domain_id instead of session-specific value
The monitor queried the session-specific domain id, which does not follow the global
value while the session is alive. This caused the monitor to follow the wrong gtid
domain if the domain was changed after MaxScale was started. This patch modifies the
query to read the global value instead. Even this is not fool-proof, as existing
sessions can issue writes with the old domain, confusing the gtid-parsing.
2018-03-28 12:23:57 +03:00
279fbf0fbe Fix crash in monitor diagnostics
A const_cast was missing, causing an endless loop.
2018-03-27 13:29:20 +03:00
1d5128fc5b MXS-1703 Move do_show_slave_status() to MariaDBServer
Also took into use the QueryResult helper class.
2018-03-27 12:43:36 +03:00
a4a5641f5b MXS-1703 Add convenience function + class for querying and storing results
An object of the class is returned as an auto_ptr to simplify memory management.
2018-03-26 15:43:54 +03:00
a350de1275 MXS-1703 Move MariaDBServer to its own file
generate_master_gtid_wait_cmd() is now a method of the Gtid-class.
2018-03-23 13:13:01 +02:00
e5dddf5f74 MXS-1703 Clean up monitor main loop function
Several blocks have been moved to their own functions to shorten
the main function.
2018-03-22 15:45:25 +02:00
aa035e4623 MXS-1703 Fix compile error on gcc 4.4
The base class pointer cannot be const since the assignment operator
is used with the vector class. This does not seem to be an issue with
newer compilers.
2018-03-22 14:28:06 +02:00
0b5601a6c8 MXS-1703 Rename MySqlServerInfo, prepare to use it as the primary class
Renamed to MariaDBServer. The objects have a pointer to the underlying
MXS_MONITORED_SERVER. The purpose is to have the monitor mainly use
MariaDBServer instead of the current mix of MXS_MONITORED_SERVER* and
MySqlServerInfo and to simplify the mapping between the two. Also,
many methods can be moved to the MariaDBServer class later on.

Some functions have been converted from MXS_MONITORED_SERVER* to
MariaDBServer.
2018-03-21 15:35:36 +02:00
d03787b2c9 Merge branch '2.2' into develop 2018-03-21 15:33:15 +02:00
bd8b6dbc6f MXS-1722 Add better error messages to switchover_demote_master()
The error messages should now be a bit more reliable.
2018-03-21 15:04:39 +02:00
5112cb4cfc MXS-1703: Stop monitor before reading class data
In theory, the value of m_master could change between reading it to
local variable and stopping monitor. To be on the safe side, stop the
monitor first.
2018-03-20 15:46:06 +02:00
09a22b26c8 MXS-1703 Fix uninitialized pointer in manual switchover
If the current master was given by the user, MaxScale would crash.
2018-03-20 14:30:15 +02:00
1bef791572 MXS-1703 Miscellaneous cleanup
1. Move some remaining class data private.
2. Linebreak long lines.
3. Move current master autoselection inside class method.
4. Remove single-use constant #defines.
5. Monitor status is only written inside loop.
2018-03-19 16:08:28 +02:00
4a6fc6b1c8 MXS-1703 Rearrange functions and methods
Lots of cleanup, but mostly distributing functions/methods to correct files.
2018-03-16 18:35:17 +02:00
2be576da31 MXS-1703 Fix refactoring error in get_replication_tree()
Refactoring and removing explicit class pointer caused a local variable
to mix with a class field. This fix renames the local variable.
2018-03-16 14:31:40 +02:00
6afd57122d Merge branch '2.2' into develop 2018-03-16 12:39:55 +02:00
2178667245 MXS-1679 Check for existence of master before continuing failover checks
Seems to fix the issue with MaxScale detecting an old master down event.
2018-03-16 11:26:58 +02:00
d32db326e4 MXS-1703 Manual switchover, failover, rejoin to class methods
This allows privatising several public methods. Also, cleaned up
monitor start and stop a bit.
2018-03-15 13:45:14 +02:00
51188123c8 MXS-1703 Move cluster discovery code to a separate file
Attempting to break the large main file into smaller chunks.
2018-03-14 17:52:15 +02:00
693854bd15 MXS-1703 Move most fields/methods to private 2018-03-14 15:08:53 +02:00
5aeac621f9 MXS-1703 Most functions now moved to class methods
Cluster discovery functions still remain.
2018-03-14 15:08:53 +02:00
fb55ea6015 MXS-1703 Move monitor main loop + other entrypoint contents to class methods 2018-03-14 15:08:53 +02:00
ec1a4de480 MXS-1703 Some miscellaneous functions moved to class 2018-03-13 16:09:14 +02:00
dad6a4f9bf Merge branch '2.2' into develop 2018-03-13 11:26:41 +02:00
b982458497 MXS-1679 Add more accurate error printing
The reason for rejoin failing should now be clearer.
2018-03-12 17:16:54 +02:00
5a62adc63e MXS-1678: Detect broken replication with Last_IO_Errno
This commit introduces changes that fix the relay master detection that
was broken by the merge from 2.1 into 2.2 by commit
1ecd791887994209eb29e56e1271f8c407cd0cdf.

In 2.2, the master server ID is used to detect whether a slave is actually
replicating from a master. The value is still displayed even if the slave
is not actively replicating from a master. The commit in 2.1 causes this
value to be stored unconditionally if it is available. By checking the
value of Last_IO_Errno and comparing it to a list of known error codes, we
know whether the slave is replicating properly.

The slave detection in 2.2 correctly identifies a broken slave with a
stopped IO thread. Due to this, the test case must be modified to check
that the relay master is not a slave if the IO thread is stopped.
2018-03-12 14:55:54 +02:00
69383c0943 Merge branch '2.2' into develop 2018-03-12 14:38:37 +02:00
6a8effaea1 MXS-1703: Move more functions to class methods 2018-03-12 10:58:11 +02:00
885d0af50f Merge branch '2.2' into develop 2018-03-09 21:00:16 +02:00
f7b284bbb7 Check IO thread status when verifying master failure
When MaxScale thinks that the master has failed, it tries to verify it by
seeing if the slave server is receiving events. There was a missing IO
thread status check in the slave_receiving_events function which caused
the failover to wait until the verification timed out.

The relay master detection logic also lacked a check for the slave SQL
thread status. The code should check the state of the SQL thread to
determine whether the server is actually a functional slave to a master.
2018-03-09 20:53:56 +02:00
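
The combined thread check described above could be sketched like this. Slave_IO_Running and Slave_SQL_Running are columns of SHOW SLAVE STATUS; the struct and function names are hypothetical and the real code may apply further conditions.

```cpp
#include <string>

// Hypothetical subset of one SHOW SLAVE STATUS row.
struct SlaveStatus
{
    std::string slave_io_running;   // e.g. "Yes", "No" or "Connecting"
    std::string slave_sql_running;  // "Yes" or "No"
};

// A server is treated as a functional slave only when both replication
// threads are running. An IO thread that is still "Connecting" is not
// receiving events, so it does not count.
bool is_functional_slave(const SlaveStatus& status)
{
    return status.slave_io_running == "Yes"
           && status.slave_sql_running == "Yes";
}
```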
d443e22d1b Merge branch '2.2.3' into 2.2 2018-03-09 20:50:01 +02:00
f4c7a4700a Disable fix to MXS-1678 in 2.2.3
The fix causes a regression in the failover functionality as there is a
dependency between the slave's master ID and how the failover
performs. This dependency should not exist but fixing it causes a problem
with the mysqlmon_rejoin_bad2 test.
2018-03-08 21:03:52 +02:00
b18207282d MXS-1703: Remove unused functions & old comments
Made obsolete with previous refactoring.
2018-03-07 14:18:09 +02:00
728817b001 MXS-1703: Move fail/switch/rejoin functions to class
Most of the related functions are now class methods in a dedicated file.
2018-03-07 13:26:31 +02:00
87911dc6b0 MXS-1703: refactor startup & shutdown
Startup is now done in a static method. The constructor initializes some values.
Config parameters are loaded in a separate method. Some things still need
looking into.
2018-03-07 13:26:24 +02:00
7a505d9976 MXS-1703: Move replication manipulation functions to a separate file
Refactoring continues. This update moves some of the replication manipulation
functions to a separate file and turns them into class methods.
2018-03-07 13:26:14 +02:00
173f44b351 MXS-1703: Move additional classes to separate file
Also use stl containers in monitor definition.
2018-03-07 13:26:02 +02:00