Commit Graph

4766 Commits

Author SHA1 Message Date
0bb54511b7 MXS-1490: Query binlog & gtid settings, read @@gtid_slave_pos
The Gtid_Slave_Pos returned by SHOW ALL SLAVES STATUS is not quite
reliable (MDEV-14182) so the variable version is used instead. Added
a convenience function for querying a single row of values.

Also, gtid_strict_mode, log_bin and log_slave_updates are now
queried during failover. The first only causes a warning message
if disabled, the last two affect new master selection.
2017-11-06 12:23:35 +02:00
0131841787 Fix dbfwfilter and cachetester dependencies
The two depended on the PCRE2 and Connector-C libraries, which means that
the libraries need to be built first. CMake needs to be told about this
with the add_dependencies call.
2017-11-05 19:24:56 +02:00
1287b0e595 Backport authentication fix from 2.2
The authentication code assumed that the initial request only had
authentication related data. This is not true if the client library
predicts that the authentication will succeed and it sends a query right
after it sends the authentication data.
2017-11-03 11:00:54 +02:00
2115ad7911 Make lines <= 110 chars long 2017-11-02 09:29:24 +02:00
e79a95cd96 MXS-1490: Parse Gtid-strings with multiple triplets
Gtid_Slave_Pos may contain multiple triplets even with single-source
replication if the domain has changed at some point. For failover, we
only need to know the current domain values, so the gtid-parsing now
accepts an optional domain parameter. The Gtid-class still only stores
one triplet of values.

When parsing the Show Slave Status result, Gtid_IO_Pos is parsed first.
The resulting domain is then read from Gtid_Slave_Pos.
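
A minimal sketch of the parsing described above, written in Python for brevity (this is not the MaxScale code). The function name `parse_gtid` and the tuple representation are hypothetical; only the comma-separated `domain-server_id-sequence` triplet format is taken from MariaDB.

```python
from typing import Optional, Tuple

Gtid = Tuple[int, int, int]  # (domain, server_id, sequence)


def parse_gtid(text: str, domain: Optional[int] = None) -> Optional[Gtid]:
    """Return one triplet from a comma-separated GTID string.

    With domain=None the first triplet is returned; otherwise the first
    triplet whose domain matches. Returns None if nothing matches or the
    string is malformed.
    """
    for part in text.split(","):
        fields = part.strip().split("-")
        if len(fields) != 3:
            return None
        try:
            triplet = tuple(int(f) for f in fields)
        except ValueError:
            return None
        if domain is None or triplet[0] == domain:
            return triplet
    return None


# Gtid_IO_Pos is parsed first; its domain selects the triplet in Gtid_Slave_Pos.
io_pos = parse_gtid("1-3000-42")
slave_pos = parse_gtid("0-3000-17,1-3000-40", domain=io_pos[0])
```

Here `slave_pos` holds the triplet for domain 1 only, which matches the commit's note that a single triplet is stored.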
2017-11-01 14:43:13 +02:00
0f2c1ff7d6 MXS-1490: Wait for a slave to clear relay logs before promotion
When selecting the new master server, Gtid_IO_Pos is checked to
select the slave with the latest event in relay log. If there is a
tie, the slave that has processed most events wins.

It's possible that the winning slave has unprocessed events. In
this case, failover waits for the slave to complete processing the
log. The maximum wait is defined by the monitor parameter
"failover_timeout", which defaults to 90 seconds. If time runs out,
failover ends in failure.

The Gtid struct was separated into its own definition to make GTIDs
easier to handle.
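
The selection and wait logic above could be sketched roughly as follows. This is an illustration in Python, not the monitor's implementation: `SlaveInfo`, `select_new_master`, `wait_for_relay_log` and the `refresh` callback are hypothetical names, and GTID positions are simplified to sequence numbers within a single domain.

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class SlaveInfo:
    name: str
    io_seq: int     # sequence from Gtid_IO_Pos (latest event in relay log)
    slave_seq: int  # sequence from gtid_slave_pos (latest processed event)


def select_new_master(slaves: List[SlaveInfo]) -> Optional[SlaveInfo]:
    """Latest relay log event wins; ties go to the most processed events."""
    if not slaves:
        return None
    return max(slaves, key=lambda s: (s.io_seq, s.slave_seq))


def wait_for_relay_log(refresh: Callable[[], SlaveInfo],
                       failover_timeout: float = 90.0,
                       poll_interval: float = 1.0) -> bool:
    """Poll until the slave has processed its relay log or time runs out."""
    deadline = time.monotonic() + failover_timeout
    while True:
        info = refresh()
        if info.slave_seq >= info.io_seq:
            return True
        if time.monotonic() >= deadline:
            return False  # time ran out: failover ends in failure
        time.sleep(poll_interval)
```

Comparing `(io_seq, slave_seq)` tuples keeps the tie-break in one expression: the second element only matters when the first is equal.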
2017-10-31 18:27:16 +02:00
daaf8f5c53 Merge branch '2.2' into 2.2-mrm 2017-10-31 16:24:10 +02:00
18bfc515e2 MXS-1474 Set correct default and fix typo 2017-10-31 15:09:54 +02:00
f52a0acbbe MXS-1474 Document and act in the same way
From the documentation:

   * `never`: When there is an active transaction, no data will be returned
     from the cache, but all requests will always be sent to the backend.
     The cache will be populated inside _explicitly_ read-only transactions.
     Inside transactions that are not explicitly read-only, the cache will
     be populated _until_ the first non-SELECT statement.
   * `read_only_transactions`: The cache will be used and populated inside
     _explicitly_ read-only transactions. Inside transactions that are not
     explicitly read-only, the cache will be populated, but not used
     _until_ the first non-SELECT statement.
   * `all_transactions`: The cache will be used and populated inside
     _explicitly_ read-only transactions. Inside transactions that are not
     explicitly read-only, the cache will be used and populated _until_ the
     first non-SELECT statement.
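
The three values can be read as a small decision table. The sketch below, in Python, is one possible reading of the documentation quoted above and not the filter's code; `CacheInTrxs` and `cache_action` are hypothetical names. Two cases are assumptions because the text does not spell them out: outside a transaction the cache is both used and populated, and after the first non-SELECT statement it is neither used nor populated.

```python
from enum import Enum
from typing import Tuple


class CacheInTrxs(Enum):
    NEVER = "never"
    READ_ONLY_TRANSACTIONS = "read_only_transactions"
    ALL_TRANSACTIONS = "all_transactions"


def cache_action(mode: CacheInTrxs, in_trx: bool, read_only: bool,
                 seen_non_select: bool) -> Tuple[bool, bool]:
    """Return (use_cache, populate_cache) for a SELECT statement."""
    if not in_trx:
        return True, True  # assumption: normal caching outside transactions
    if read_only:
        # Explicitly read-only: always populated, used unless mode is 'never'.
        return mode is not CacheInTrxs.NEVER, True
    if seen_non_select:
        return False, False  # assumption: "until" means both stop afterwards
    # Not explicitly read-only and no non-SELECT statement seen yet.
    return mode is CacheInTrxs.ALL_TRANSACTIONS, True
```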
2017-10-31 10:58:03 +02:00
e45ee22ec3 MXS-1474 Refactor for forthcoming changes 2017-10-31 10:58:03 +02:00
93edc230f9 MXS-1474 Use enum instead of boolean
Clearer for the reader with an explicit value indicating the desired
action, instead of a boolean whose meaning is implicit.
2017-10-31 10:58:03 +02:00
20bb825882 MXS-1474 Factor out functionality
More changes coming, so better to factor out the COM_QUERY handling.
2017-10-31 10:58:03 +02:00
cb5c22269e MXS-1474 Take 'cache_in_transactions' into account
When deciding whether the cache should be consulted or not,
the value of the configuration parameter 'cache_in_transactions'
is taken into account as well.
2017-10-31 10:58:03 +02:00
c15eaf2f36 MXS-1474 Accept 'cache_in_transactions' parameter
Only the handling of the configuration parameter.
2017-10-31 10:58:03 +02:00
3a78b716b8 Merge branch '2.2' into 2.2-mrm 2017-10-30 11:06:34 +02:00
a971aa25da Merge branch '2.1' into 2.2 2017-10-30 11:01:19 +02:00
41cd0cd6d7 MXS-1490 Separate SlaveStatus information to its own class
The SlaveStatus info is now in a separate class, although it's
still embedded in the MYSQL_SERVER_INFO-class. Both classes now
use strings instead of char* values.
2017-10-30 10:33:41 +02:00
63cbf56cb2 MXS-1500: Fix real_type values
The characters in the type weren't checked for correctness which caused
the processing to read more characters than was intended.
2017-10-30 10:25:03 +02:00
c7c670930c MXS-1493: Check that master appears dead before verifying it
Before the verification of the master's failure is done, the master must
first appear to have failed.
2017-10-27 15:31:46 +03:00
0bc439641a Add helper function for reading values by field name
The helper function provides map-like access to row values. This is used
to retrieve the values for all MariaDB 10.0+ versions as there are
differences in the returned results between 10.1 and 10.2.
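
As an illustration of map-like access, a sketch in Python (not the helper added by this commit) might pair column names with row values. `row_to_map` is a hypothetical name and the column sets below are made up for the example; they are not the actual output of any server version.

```python
from typing import Dict, Optional, Sequence


def row_to_map(field_names: Sequence[str],
               row: Sequence[Optional[str]]) -> Dict[str, Optional[str]]:
    """Pair column names with the values of one result row."""
    if len(field_names) != len(row):
        raise ValueError("field count does not match row length")
    return dict(zip(field_names, row))


# Column order and count may differ between versions; lookups by name
# are unaffected. (Illustrative columns only.)
row_a = row_to_map(["Master_Host", "Gtid_IO_Pos"], ["db1", "0-1-5"])
row_b = row_to_map(["Master_Host", "Extra", "Gtid_IO_Pos"],
                   ["db1", None, "0-1-5"])
```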
2017-10-27 15:31:46 +03:00
2d1e5f46fa Remove use of timestamps in failover code
Using timestamps to detect whether MaxScale was active or passive can
cause problems if multiple events happen at the same time. This can be
avoided by separating events into actively observed and passively observed
events. This clarifies the logic by removing the ambiguity of timestamps.

As the monitoring threads are separate from the worker threads, it is
prudent to use atomic operations to modify and read the state of
MaxScale. This will impose a happens-before relation between MaxScale
being set into passive mode and events being classified as being passively
observed.
2017-10-27 15:31:46 +03:00
52473c379b Extract Gtid_Slave_Pos in mysqlmon
The string form value of Gtid_Slave_Pos is extracted into different
integer components.
2017-10-27 15:31:46 +03:00
0be39b8545 MXS-1493: Improve master failure detection
The master failure can now be verified by checking when the slaves are
connected to the master. If the slaves do not receive any events from the
master, the connections are considered as down after a configurable limit.

Added two parameters for controlling whether the check is done and for how
long the monitor waits before doing the failover.
2017-10-27 15:31:18 +03:00
26b47d0b90 MXS-1493: Collect slave heartbeats
The slave heartbeat count and period are collected from the SHOW ALL
SLAVES STATUS output. This, in addition to the relay log position, is used
to calculate the point in time when a slave has last interacted with the
master.

By using this timestamp, the monitor can enforce a minimum "timeout" for
the master before a failover is performed.
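
One way to picture the timestamp calculation is the Python sketch below. It is an assumption-laden simplification, not the monitor's code: `SlaveActivity` and `master_timed_out` are hypothetical names, the relay log position is reduced to a single integer, and the heartbeat period is not used.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class SlaveActivity:
    heartbeats: int = 0     # Slave_received_heartbeats
    relay_log_pos: int = 0  # relay log position (simplified to one integer)
    last_seen: float = 0.0  # time at which either value last changed

    def update(self, heartbeats: int, relay_log_pos: int, now: float) -> None:
        """Record 'now' if the slave shows new heartbeats or relay log data."""
        if (heartbeats != self.heartbeats
                or relay_log_pos != self.relay_log_pos):
            self.heartbeats = heartbeats
            self.relay_log_pos = relay_log_pos
            self.last_seen = now


def master_timed_out(slaves: Iterable[SlaveActivity], now: float,
                     timeout: float) -> bool:
    """True only if no slave has heard from the master within 'timeout'.

    With no slaves nothing contradicts the failure, so this returns True.
    """
    return all(now - s.last_seen >= timeout for s in slaves)
```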
2017-10-27 15:30:38 +03:00
d9bd977c35 MXS-1499: Add missing fields to SHOW ALL SLAVES STATUS
SHOW ALL SLAVES STATUS now reports these additional fields:

- Retried_transactions
- Max_relay_log_size
- Executed_log_entries
- Slave_received_heartbeats
- Slave_heartbeat_period
- Gtid_Slave_Pos
2017-10-27 14:07:53 +02:00
48a15368d0 MXS-1490-1492: First version of failover script
Works in ideal situations and can be tested. Does not consider
relay log and only checks that commands were received by a backend.
Work in progress.
2017-10-27 10:54:50 +03:00
114ea49e10 MXS-1494: Add missing replication credentials parameters
The parameters weren't added to the list of module parameters.
2017-10-26 17:37:02 +03:00
f805716700 MXS-1497: Don't skip events with LOG_EVENT_IGNORABLE_F flag
Currently the binlog server doesn't send these event types to slaves:
- MARIADB10_START_ENCRYPTION_EVENT
- IGNORABLE_EVENT

It also skips events with LOG_EVENT_IGNORABLE_F flag.

This modification allows sending events with that flag.
2017-10-26 11:32:06 +02:00
b1f62ec1af MXS-1488: Added SHOW STATUS LIKE 'slave_received_heartbeats'
Add support for show status like 'slave_received_heartbeats' in
binlogserver.
2017-10-25 15:11:07 +02:00
63c7550196 MXS-1490 Prepare for failover functionality addition
Moved mon_process_failover() from monitor.cc to mysql_mon.cc. Renamed
some functions and variables related to previous failover functionality
to avoid confusion.
2017-10-25 12:24:29 +03:00
554ae642d7 MXS-1495: Add failover sanity check
The sanity check disables the failover functionality if a server is
configured to replicate from more than one source.
2017-10-24 23:45:23 +03:00
c3ff2aa1e9 MXS-1495: Move the MYSQL_SERVER_INFO extraction into a function
The get_server_info function takes the monitor handle and a database and
returns the corresponding MYSQL_SERVER_INFO struct. This hides a part of
the actual implementation of the info struct from the monitor code,
allowing future refactoring to be done. It also makes the code a bit more
readable.
2017-10-24 23:44:59 +03:00
95ac9d501c MXS-1494: Add replication credentials to mysqlmon
The credentials used for slave servers can now be controlled with the
replication_user and replication_password parameters.
2017-10-24 23:44:46 +03:00
75a2e190b2 Add function for updating the MYSQL_SERVER_INFO struct
The values in the MYSQL_SERVER_INFO struct can now be updated with the
update_slave_status function.

Also moved the number of configured and running slave configurations into
the info struct. This removes the need to pass output parameters.
2017-10-24 15:43:03 +03:00
efeaecaef2 MXS-1486 When there is fresh data, update the cache entry
If something is SELECTed that should be cached for some users, but not
for the current user, the cached entry is nevertheless updated.
That way the cached data will always be the last fetched value
and it is also possible to use this behaviour for explicitly
updating the cache entry.
2017-10-24 15:31:08 +03:00
3cefb53e1d Split server state and info processing into two
The MYSQL_SERVER_INFO struct is updated first and then the server status
is updated. This allows the function to be called without it affecting the
server state.
2017-10-24 15:27:36 +03:00
d6812b91a0 MXS-1485: MariaDB 10 GTID is always on for slave connections
MariaDB 10 GTID is always on for slave connections.
Remove mariadb10_slave_gtid option
2017-10-24 08:42:43 +02:00
65dc9e0d30 MXS-1484: set binlog storage to TREE mode
When mariadb10_master_gtid is on the storage of binlog file is
automatically set to TREE mode.
2017-10-23 13:57:56 +02:00
37c804e0d3 MXS-1327 Warn if debug priority enabled in release mode
Turning debug on has no effect if MaxScale has been built in
release mode. A warning will now be displayed to the user if
that is attempted.
2017-10-19 12:56:12 +03:00
9c03a785ce Fix resultset handling with binary data
When binary data was processed, it was possible that the values were
misinterpreted as OK packets which caused debug assertions to trigger.

In addition to this, readwritesplit did not handle the case when all
packets were routed individually.
2017-10-12 12:29:43 +03:00
96aadcbe83 Fix usage of partial packets when full packets are expected
The authentication phase expects full packets. If the packets aren't
complete a debug assertion would get hit. To detect this, the result of
the extracted buffer needs to be checked.
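
To show what checking for complete packets involves, here is a sketch in Python (not the MaxScale buffer code). It relies on the MySQL packet framing of a 3-byte little-endian payload length followed by a 1-byte sequence number; `split_complete_packets` is a hypothetical name.

```python
from typing import List, Tuple

HEADER_LEN = 4  # 3-byte little-endian payload length + 1-byte sequence number


def split_complete_packets(buf: bytes) -> Tuple[List[bytes], bytes]:
    """Split 'buf' into complete packets and a trailing partial remainder."""
    packets = []
    pos = 0
    while len(buf) - pos >= HEADER_LEN:
        payload_len = int.from_bytes(buf[pos:pos + 3], "little")
        end = pos + HEADER_LEN + payload_len
        if end > len(buf):
            break  # partial packet: the caller must wait for more data
        packets.append(buf[pos:end])
        pos = end
    return packets, buf[pos:]
```

A caller that expects full packets would treat an empty packet list, or a non-empty remainder, as a signal to keep reading rather than to start processing.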
2017-10-12 12:29:43 +03:00
97d0cc7482 Fix multi-statement execution in readwritesplit
A multi-statement query can return multiple resultsets in one response. To
accommodate for this, both the readwritesplit and modutil code must be
altered.

By ignoring complete resultsets in readwritesplit, the code can deduce
whether a result is complete or not.
2017-10-12 12:29:43 +03:00
d0fd65be57 Fix unintentional fallthrough
When LEAST_BEHIND_MASTER routing criteria was used, the info level logging
function would fall through to the default case. In debug builds, this
would trigger a debug assertion.
2017-10-12 12:29:43 +03:00
ca0b9de421 Add missing initialization of MySQLProtocol::collect_result
The variable was not initialized.
2017-10-12 12:29:43 +03:00
f3b0245c0b Return results as sets of packets
Returning the results of a query as a set of packets is currently more
efficient. This is mainly due to the fact that each individual packet for
single packet routing is allocated from the heap which causes a
significant loss in performance.

Took the new capability into use in readwritesplit and modified the
reply_is_complete function to work with non-contiguous results.
2017-10-12 12:29:43 +03:00
c70e1431e3 Fix OK packet status extraction in readwritesplit
As the row count and last insert ID are length-encoded integers, they need
to be handled with the correct functions.
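
For context, a length-encoded integer in the MySQL protocol takes 1, 3, 4 or 9 bytes depending on its first byte, so reading fixed-width fields misplaces everything after it. The Python sketch below decodes an OK packet payload under the 4.1 protocol layout; it is an illustration rather than the readwritesplit code, and `read_lenenc_int` and `parse_ok_payload` are hypothetical names.

```python
from typing import Tuple


def read_lenenc_int(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode a length-encoded integer; return (value, next_pos)."""
    first = buf[pos]
    if first < 0xfb:
        return first, pos + 1  # the value is the byte itself
    sizes = {0xfc: 2, 0xfd: 3, 0xfe: 8}
    if first not in sizes:
        raise ValueError("not a length-encoded integer: 0x%02x" % first)
    end = pos + 1 + sizes[first]
    if end > len(buf):
        raise ValueError("truncated length-encoded integer")
    return int.from_bytes(buf[pos + 1:end], "little"), end


def parse_ok_payload(payload: bytes) -> Tuple[int, int, int]:
    """Return (affected_rows, last_insert_id, status_flags).

    'payload' excludes the 4-byte packet header.
    """
    if payload[0] != 0x00:
        raise ValueError("not an OK packet")
    affected, pos = read_lenenc_int(payload, 1)
    insert_id, pos = read_lenenc_int(payload, pos)
    status = int.from_bytes(payload[pos:pos + 2], "little")
    return affected, insert_id, status
```

With a payload whose row count starts with `0xfc`, a naive single-byte read would report 252 rows and then take the status flags from the wrong offset.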
2017-10-12 12:29:43 +03:00
1c329b6041 Fix typo in readwritesplit comments
The comment about the static variable being returned as a reference was
missing the `return` word.
2017-10-12 12:29:43 +03:00
489520a5c0 Process backend packets only once
When the router requires statement based output, the gathering of complete
packets can be skipped as the process of splitting the complete packets
into individual packets implies that only complete packets are handled.

Also added a quicker check for stored protocol commands than a call to
protocol_get_srv_command.
2017-10-12 12:29:43 +03:00
5c9b953d69 Fix crash in backend command tracking
The backend protocol command tracking didn't check whether the session was
the dummy session. The DCB's session is always set to this value when it
is put into the persistent pool.
2017-10-12 12:29:43 +03:00
9d3fc27a3c Fix backend protocol command tracking
If a query was processed in the client protocol module when a prepared
statement was being executed by the backend module, the current command
would get overwritten. This caused a debug assertion in readwritesplit to
trigger as the result was neither a single packet nor a collected result.

The RCAP_TYPE_STMT_INPUT capability guarantees that a buffer contains a
complete packet. This information can be used to track the currently
executed command based on the buffer contents which allows asynchronicity
between the client and backend protocol. In practice this only comes into
play when routers queue queries for later execution.
2017-10-12 12:29:43 +03:00