Commit Graph

73 Commits

Author SHA1 Message Date
790d90f229 Update 2.3.16 Change Date 2020-01-15 11:08:51 +02:00
df6c56e7ca Update 2.3.13 Change Date 2019-10-29 12:51:31 +02:00
0a5a3309e0 Add missing quotes when printing server names
Some of the log messages didn't have the quotes.
2018-11-23 14:02:09 +02:00
fb52e565fe Store capabilities of monitored server
Checking the version number in various places in the code gets confusing.
It's better to check the version in one place and record the relevant data.
2018-11-21 17:36:52 +02:00
1a046bd453 MXS-2168 Add 'assume_unique_hostnames'-setting to MariaDBMonitor
Adds the setting and uses it during replication graph creation
and the most important checks.
2018-11-21 10:30:11 +02:00
2d61b78439 Fix low disk space maintenance
The setting didn't work because the code updated a status flag which
would be overwritten before being read. Also, promotion code now checks
that the server is not in maintenance.
2018-10-16 16:09:38 +03:00
f2067fcf7c Monitor cleanup
Removes unused code, compacts lines, moves code.
No functional changes.
2018-10-11 11:39:05 +03:00
707506feae A slave must have running slaves to be a relay master
This prevents some questionable status assignments, but also means that
the Relay Master status can be lost if a slave goes down. This is
contrary to Master status which is not lost if slaves go down. Fixes
mxs1961_standalone_rejoin.
2018-10-04 20:16:29 +03:00
05d18e81ae Use string instead of stringstream
Most of the monitor was already using string for formatted printing.
2018-09-27 16:46:59 +03:00
71ffef5708 Partially revert 4ba011266843857bbd3201e5b925a47e88e1808f
Add back leading operator enforcement.
2018-09-20 15:57:30 +03:00
108638b0cf Format with Uncrustify 0.67 2018-09-10 13:31:39 +03:00
d11c78ad80 Format all sources with Uncrustify
Formatted all sources and manually tuned some files to make the code look
neater.
2018-09-10 13:22:49 +03:00
c447e5cf15 Uncrustify maxscale
See script directory for method. The script to run in the top level
MaxScale directory is called maxscale-uncrustify.sh, which uses
another script, list-src, from the same directory (so you need to set
your PATH). The uncrustify version was 0.66.
2018-09-09 22:26:19 +03:00
9e566bc619 A master that is down with no running slaves can be replaced
This should be a more general way to detect situations where a DBA
or another MaxScale performs a failover.
2018-08-29 16:41:20 +03:00
916b72a733 Clean up loops
Changed many of the iterator loops to range loops.
2018-08-22 12:22:45 +03:00
3f53eddbde MXS-2020 Replace ss[_info]_dassert with mxb_assert[_message] 2018-08-22 11:34:59 +03:00
03cefcc4ac MXS-2012 Write replication lag to SERVER
Allows routers to read the value.
2018-08-21 11:51:10 +03:00
44a57dbefd Master can be a slave
This is possible if read_only is turned ON on the master and there is
no alternative master to swap to.
2018-08-21 11:51:10 +03:00
0d762b2019 MXS-2012 Remove old replication lag detection
The method was quite disruptive as it continuously wrote to the database.
Also removed the MaxScale ID global, as it wasn't used for anything else.
2018-08-21 11:51:10 +03:00
8f257a51fe MXS-2008 Remove unused headers from worker.hh
Add corresponding headers to files that depended on those headers.
2018-08-20 11:15:14 +03:00
e5a90d63e1 Remove SERVER_WAS_SLAVE status bit
Was unused due to MariaDBMonitor changes.
2018-08-13 11:30:20 +03:00
17c84a22c7 Refactor preparations to failover
The two operations are quite similar so the code should look
similar as well and use shared functions.
2018-08-07 16:33:56 +03:00
836db54800 Clean up server status printing
Mostly uses the status functions for reading the flags. Strictly
speaking this breaks the REST API since in some cases (status combinations)
the printed string is different from what was printed before.
2018-08-02 10:42:12 +03:00
1e33ab69f2 Rename server_is_running() to server_is_usable()
The previous name was misleading. The new server_is_running() only
checks for the running bit so that a server is always either running
or down.
2018-07-31 14:53:56 +03:00
89dfc80f86 Better tracking for slave status bits
The monitor can now differentiate between slaves with a running
series of slave connections to the master and slaves with broken
links. Both still get the SERVER_SLAVE flag if 'detect_stale_slave'
is on.

Also, relay servers must be running.
2018-07-31 14:53:29 +03:00
27084f1368 Handle the situation where the previous master is reselected 2018-07-23 12:17:00 +03:00
382a017518 A master which is down for longer than failcount is considered an invalid master
If auto_failover is disabled and an alternative master exists, the
monitor will swap the master. This may break replication, but the
situation requires that the DBA has set up a cluster with multiple
masters.
2018-07-20 15:47:23 +03:00
e0361e335f Fix relay master assignment
The relay master status was assigned to a server based on the last known
replication status of the slaves that have at some point replicated from
it. This can cause false positives and the relay master status is assigned
to servers that have never been observed to act as relay masters.
2018-07-17 11:52:20 +03:00
bded99aea3 Assign slave status even if no master is available
The master validity check now checks if the master is down. This requires
that the slave status is assigned even if no master is available.

The failover precondition is also fulfilled as long as one valid promotion
candidate is found. Previously a slave that didn't use GTID replication
appeared to prevent failover.
2018-07-17 11:52:18 +03:00
f2e0bf3caa Factor out functions
The topology update is now in a method. Also, the m_master-field
is only written inside a method so that the cycle info is always
updated.
2018-07-16 15:58:16 +03:00
936bcde135 Remove old "detect_standalone_master"-feature, update documentation
The auto_failover is a more reliable solution and should be used instead. Several
unused parameters were removed, although they can still be defined in the config
file. Updated documentation on the relevant parts.
2018-07-16 15:58:16 +03:00
9d94230237 Assign status bits only for running servers
Previously, the status bits were assigned only for running servers. Due
to the changes done in the monitoring algorithm, the slave and master
status bits are assigned to servers that are down. This change broke a
number of tests and deviates from previous behavior.

To keep the old behavior and to fix the test, the status bits are not
assigned to servers that are down.
2018-07-09 12:10:36 +03:00
03491a45f0 Remove old code
The functionality is elsewhere.
2018-07-03 10:32:06 +03:00
960d08a36a Code cleanup
Removed unused code.
2018-06-29 10:54:34 +03:00
9525d3507b Run manual commands without stopping the monitor
The command is saved in a function object which is read by the monitor
thread. This way, manual and automatic cluster modification commands are
run in the same step of a monitor cycle.

This update required several modifications in related code.
2018-06-28 16:56:41 +03:00
6bf10904d7 MXS-1845 Only rebuild topology when required
The monitor now detects when a server has changed such that a replication
graph rebuild is needed and only then rebuilds the graph and detects
cycles and master.

Also, some old code is no longer called in the monitor cycle. It will be
removed in later commits. Refactored some of the related functions.
2018-06-28 16:56:41 +03:00
cc0299aee6 Update change date of 2.3 2018-06-25 10:07:52 +03:00
2f987d0b10 MXS-1845 Only select a master if current master is no longer usable
The purpose is to make the selected master server sticky. The master is reselected only
if the current master is no longer a valid master.
2018-06-18 11:06:58 +03:00
5324a1bdaa MXS-1845 Assign server roles
Assign server roles (master, slave, relay master, slave of external master)
for a graph with possibly multiple paths to a slave server.
2018-06-13 17:38:53 +03:00
3f82c25c62 MXS-1845 New algorithm for finding the master server
Not yet used, as more is needed to replace the old code. The
algorithm is based on counting the total number of slave nodes
a server has, possibly in multiple layers and/or cycles.
2018-06-13 17:38:33 +03:00
9a0445cd4e MXS-1845 Save cycle members
The saved data may be useful later on. Also includes some cleanup.
2018-06-05 16:25:04 +03:00
37841183b3 Cleanup server.h
Renamed, rearranged and clarified status bits. Removed unused macros.
2018-06-01 14:29:51 +03:00
4d7aff4ab9 MXS-1845 Find strongly connected components with multiple slave connections
Rewrote the algorithm for clarity.
2018-06-01 14:04:50 +03:00
a82c5911e5 MXS-1775 Rename m_monitor_base to m_monitor
To make it compatible with how the variable is named
in maxscale::MonitorInstance.
2018-06-01 13:48:15 +03:00
f8940d4a2a Use 64bits for server status flags
Monitor journal still uses 32bits.
2018-05-23 16:19:08 +03:00
3ec449339f Only write to SERVER->status at the end of a monitoring loop
This makes the code clearer and reduces race conditions, as the monitor
could be writing SERVER->status while a router is reading it. Also,
the time during which the SERVER struct is locked drops to a fraction.
2018-05-23 16:19:08 +03:00
b29bae6e84 MXS-1865 Update server version only when (re)connecting
Updating it every iteration is needless.
2018-05-16 13:55:45 +03:00
b92284afc4 Move server querying to MariaDBServer
The query functions still require the base monitor struct because of
mon_ping_or_connect_to_db().
2018-05-14 11:11:08 +03:00
10b2b4ac37 MXS-1703 Use monitor-specific array instead of linked list
Also starting cleanup of server specific monitor code.
2018-05-07 13:51:27 +03:00
b44f2cfa36 Remove SERVER->slaves field
The field was only used by MariaDB-Monitor. A later commit will add equivalent
information to the monitor diagnostics function.
2018-05-07 13:51:06 +03:00