Commit Graph

8332 Commits

Author SHA1 Message Date
236e906d88 Revert "Turn MariaDB Monitor struct to class with public fields"
This reverts commit cb6f70119d9857b277306e9af5881fe29c574a32.
2018-02-24 15:37:50 +02:00
13661ab4a6 Revert "MariaDB Monitor: Move additional classes to separate file"
This reverts commit ff55106610881d55db88eca9e2ef6a056cbc8d51.
2018-02-24 15:35:36 +02:00
e721733434 Revert "MariaDBMon: Move replication manipulation functions to a separate file"
This reverts commit 8cdd23dda2add6486abb685834def94c72a09b6c.
2018-02-24 15:35:02 +02:00
8cdd23dda2 MariaDBMon: Move replication manipulation functions to a separate file
Refactoring continues. This update moves some of the replication manipulation
functions to a separate file and turns them into class methods.
2018-02-22 10:51:52 +02:00
e385e6fcd4 Merge branch '2.2' into develop 2018-02-22 10:13:55 +02:00
03eb30fbc6 Check SHOW DATABASES privilege on startup
MySQLAuth requires the SHOW DATABASES privilege to see all the databases
so it should be checked that the current user has the permission. A
missing permission will cause errors that are hard to resolve.
2018-02-22 10:06:29 +02:00
6e9e83ccaf MXS-1674 Change load granularity to 1 second
With a granularity of 1 second, the load will from a human
perspective reflect the current situation. That also means
that the maxadmin output shows "natural" steps; 1s, 1m and 1h.
2018-02-21 13:05:58 +02:00
ff55106610 MariaDB Monitor: Move additional classes to separate file
Also use stl containers in monitor definition.
2018-02-21 12:24:24 +02:00
cb6f70119d Turn MariaDB Monitor struct to class with public fields
Allows using std::string for strings. Also, cleanup.
2018-02-21 11:00:42 +02:00
1ecd791887 MXS-1678: Store master_id even when IO thread is stopped
When the IO thread of a relay master is stopped, the knowledge that it is
not a real master but a relay master is lost. To prevent this loss of
information, the master server's server_id value should always be stored
if it is available.
2018-02-21 09:35:42 +02:00
f3e00431de Fix MXS-1418 regression
If a server is removed from a service, readconnroute will not verify that
the server it is connected to is still the same root master. This fixes
the regression of MXS-1418.
2018-02-20 15:35:52 +02:00
60d57aee61 Compile mariadbmon.h as C++ 2018-02-20 11:14:21 +02:00
b9d80f6061 Use dedicated header in NDBClusterMon
NDBClusterMonitor used the MariaDBMonitor header instead of its own.
2018-02-20 11:09:04 +02:00
fd4fd4eead MXS-1674 Add worker load calculation
By definition, the load is calculated using the following formula:

  L = 100 * ((T - t) / T)

where T is a time period and t the time of that period that the worker
spends in epoll_wait(). So, if there is so much work that epoll_wait()
always returns immediately, then the load is 100 and if the thread
spends the entire period in epoll_wait(), then the load is 0.

The basic idea is that the timeout given to epoll_wait() is adjusted
so that epoll_wait() will always return roughly at 10 seconds interval.
By making a note of when we are about to enter epoll_wait() and when we
return from it, we have all the information we need for calculating the
load.

Due to the nature of things, we will not be able to calculate the load
at exact 10-second boundaries, but it will be pretty close. And the load
is always calculated using the true length of the period.

We will then calculate 1 minute load by averaging the load value for 6
consecutive 10-second periods and the 1 hour load by averaging the load
value of 60 consecutive 1 minute loads.

So, while the 10-second load represents the load of the most recently
measured 10-second period (and not the load of the most recent 10
seconds), the 1 minute load and the 1 hour load represents the load of
the most recent minute and hour respectively.
2018-02-20 09:18:43 +02:00
350eaf0e90 Add complete set of atomit_store-operations 2018-02-16 15:06:24 +02:00
5665ecc591 Merge branch '2.2' into develop 2018-02-15 16:33:36 +02:00
1042b861bb MXS-1669: Fix load average tracking
The output of `show threads` could have a negative historic thread load
average that could be explained by the overflow of the signed 32-bit
integer used to count the number of samples.

The time that each thread started to process an event for a DCB used an
old value that is no longer used. Updating this to DCB::last_read retains
the 2.0 behavior.
2018-02-15 11:18:22 +02:00
754d80da75 Do not auto_rejoin if maxscale is passive 2018-02-14 17:30:02 +02:00
4853fb0c64 Prevent double free of rerouted queries
When the routing of a queued query fails, the route_stored_query function
extract the SQL from and frees the already freed buffer.
2018-02-14 14:47:03 +02:00
3b2ec4ab5a Change references from MySQL to MariaDB
A few were missed when the renaming was done. Also renamed the file to
mariadbmon.cc.
2018-02-14 14:47:03 +02:00
f388e2f838 Merge branch '2.2' into develop 2018-02-12 14:00:40 +02:00
b94f3b8792 Failover: do not check cluster stabilization if no slaves
Caused debug assert.
2018-02-12 12:37:41 +02:00
b8d3da4968 Add error tolerance to "servers_no_promotion"
Previously, if the list contained servers that were not monitored by
the monitor yet were valid servers, an error value would be returned
and the monitor failed to start.

With this update, the non-monitored servers are simply ignored when
forming the final list.

Also, added printing of the list to diagnostics.
2018-02-12 10:49:28 +02:00
faaf43ff39 Add gtid to monitor diagnostics, clean up formatting
Gtid:s are now queried every monitor loop.

dignostics() no longer prints slave related info if the server has
no slave connection.
2018-02-10 12:32:56 +02:00
e346968e0e Merge branch '2.1' into 2.2 2018-02-10 08:28:11 +02:00
b4760c5bbe MXS-1661 Introduce 'users_refresh_time'
It is now possible to explicitly specify how frequently MaxScale
may refresh the users of a service.
2018-02-09 13:33:17 +02:00
ae160f3ff2 MXS-1661 Now only the time affects the reloading of users
Now the users will be reloaded at most once during each
USERS_REFRESH_TIME period. Earlier they could be reloaded at
at most USERS_REFRESH_MAX_PER_TIME times, which in practice meant
that with repeated unauthorized login attempts they were reloaded
N times in rapid succession, without the situation being likely to
change in between.
2018-02-09 13:33:17 +02:00
b23ad6d2ef MXS-1661 Turn error into warning and suppress logging
The error regarding the refresh rate having been exceeded

    error: [RWSplit] Refresh rate limit exceeded ...

has been turned into a warning. Further, the warning will be
logged at most once per refresh period that currently is 30s.
2018-02-09 13:33:17 +02:00
816983691a MXS-1660 Turn client hostname lookup failure into a warning
This is used only in case of everything else fails and this lookup
is not unlikely to fail if the client comes from some machine on
an internal network.
2018-02-09 12:03:13 +02:00
83ce603e3e Fix CentOS 6 build failure
The __sync load builtins do not work with pointers to constant variables.
2018-02-09 09:19:46 +02:00
a0d9c7da74 External master server support for failover/switchover
If the master is replicating from an external master, the monitor will save the
host:port of the external server. During demotion, the old master stops the external
replication while the new master begins it. Also, any commands that would add
to gtid have to be omitted when an external master is in play.
2018-02-08 18:44:08 +02:00
fcde23e6fe Merge branch '2.2' into develop 2018-02-08 18:40:29 +02:00
fa37198da1 MXS-1653: Fix hang on preparation of BEGIN
When a BEGIN statement is prepared using the binary protocol, it returns a
single OK packet. Due to a bug in the code that deals with multi-statement
results and EOF packets, the response was never sent to the client.

Also added back the error messages of failed session commands to the INFO
level. This way it's still possible to see why a session command fails but
the log isn't flooded by them in normal usage.
2018-02-08 16:59:00 +02:00
a6fc2d3f88 Add missing <string> header
load_utils.cc used std::string without including the header.
2018-02-08 13:40:11 +02:00
ed28d986e9 Fix debug assertion when no master is present
If no master is present, the debug assertion would dereference a NULL
pointer.
2018-02-08 13:33:31 +02:00
2a987716a8 Build tests by default
The tests are now built by default. This should make it easier for users
to verify that they have a working MaxScale.

Also made the building of test_parse_kill conditional like the rest of the
tests.
2018-02-08 12:48:56 +02:00
b059d78a30 Fix master loss on split cluster
When four servers (A, B, C and E where E and A replicate from each other
and A is the master for B and C) form a cluster and only three of them (A,
B and C) are configured into MaxScale, a failover operation from A to B
(making B the current master) and a restart of A causes B to lose its
master status.

The following diagram illustrates the state of the cluster at the end of
the process described above.

      +----------------------+
      |        +---+         |
  +------------+ B <-+       |
+-v-+ |        +---+ |       |
| E | |              |       |
+-^-+ |  +---+     +-+-+     |
  +------+ A |     | C |     |
      |  +---+     +---+     |
      |                      |
      +----------------------+

The external server E was not correctly ignored in the replication
topology generation causing both A and B to be seen as the lowest slave
nodes in the tree. From a theoretical point of view this is the correct
interpretation as there are two distinct trees and neither of them
contains any true masters.

In practice, MaxScale should treat any servers that replicate from an
external master as root level master nodes. Doing this guarantees that they
are labeled as masters if they have slaves replicating from them.
2018-02-08 12:48:56 +02:00
2504ff19b3 MXS-1653: Fix slave session command processing
The responses of slaves that arrived before the master were always
compared to the empty value of 0x00. If the slave connection replied after
the master, the comparison was correct.

This commit introduces a map of slaves and their responses that
are handled once the master's response arrives.
2018-02-08 12:48:56 +02:00
5326c8db5c Merge branch '2.2' into develop 2018-02-08 12:48:06 +02:00
b15a460416 MXS-1659 Do not include C++ headers from C header 2018-02-08 12:45:18 +02:00
13498d8ec8 MXS-1655 Allow symbolic links to files in config directory 2018-02-07 16:43:13 +02:00
6c42221c9c MXS-1654 Prevent crash in debug mode if no master is found 2018-02-07 16:43:13 +02:00
fa8f6a5da3 Fix monitor error with empty servers_no_promotion 2018-02-07 16:11:47 +02:00
14a4008c84 MXS-1653: Remove false error message
Removed false error message about failed session commands. An error in
response to a session command is a perfectly valid result.

Also added the explicit commands that the master and slave return to the
warning that is logged when the results differ.
2018-02-07 16:07:17 +02:00
6dbec397dc Fix false debug assertion and clarify error message
The debug assertion wasn't well placed as it is perfectly possible that a
master connnection exists but it is not in use. This can be further
checked by asserting that the master is indeed closed and not in use.

Moved the original debug assertion into a separate branch that should
catch any errors in the routing logic.
2018-02-07 16:07:16 +02:00
8558ace801 Clean up ignore_external_masters code
Removed redundant checks and cleared only bits that aren't already
cleared.
2018-02-07 16:07:16 +02:00
8bf756ca56 Update root_master when using a standalone master
When detect_standalone_master is enabled, the root_master variable was not
updated after the master was changed by the standalone server detection
mechanism. This caused debug assertions to fire in addition to possibly
causing some of the ignore_external_masters logic to break.
2018-02-07 16:07:16 +02:00
1cf3de4a74 Add config parameter for excluding servers from failover
"servers_no_promotion" is a comma-separated list of servers
which cannot be chosen when selecting a new master during failover
(auto or manual), or when automatically selecting a new master
for switchover (currently disabled).

The servers in the list are redirected normally and can be promoted
by switchover when manually selecting a new master.
2018-02-07 14:07:10 +02:00
6f6c11e6a3 Ignore events for closed DCBs
If a closed DCB receives an event, it is ignored.
2018-02-06 16:36:04 +02:00
4089b6b6fd MXS-1647: Detect API version mismatch
If the API versions do not match, MaxScale will treat this as an
error. The API versioning would allow backwards compatible changes but the
functionality to handle that is not implemented in MaxScale.

Updated API versions based on changes done to module APIs in 2.2.
2018-02-06 14:51:07 +02:00