Commit Graph

8897 Commits

Author SHA1 Message Date
551bb81929 Loosen the atomicity requirement for the passive parameter
As the passive parameter is only used by the failover and the failover can
only be initiated by the monitor, there is no true need to synchronize the
reads and writes of this parameter.

As all runtime changes are protected by the runtime lock, only partial
reads are of concern. For the supported platforms, this is not a practical
problem and it only confuses the reader when other variables are modified
without atomic operations.
2017-10-27 15:31:46 +03:00
600509be4a Fix master failure tracking
The master failure was assumed to be the only master related event for
each monitoring loop. If the master was switched by an external actor, the
monitor tracking would be out of sync.
2017-10-27 15:31:46 +03:00
c7c670930c MXS-1493: Check that master appears dead before verifying it
Before the verification of the master's failure is done, the master must
first appear to have failed.
2017-10-27 15:31:46 +03:00
0bc439641a Add helper function for reading values by field name
The helper function provides map-like access to row values. This is used
to retrieve the values for all MariaDB 10.0+ versions as there are
differences in the returned results between 10.1 and 10.2.
2017-10-27 15:31:46 +03:00
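A minimal sketch of the map-like access that 0bc439641a describes, assuming a row is available as parallel lists of field names and values. `RowView` and `get` are hypothetical names used for illustration, not the helper added by the commit.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical wrapper around one result row; not MaxScale's actual helper.
class RowView
{
public:
    RowView(const std::vector<std::string>& field_names,
            const std::vector<std::string>& values)
        : m_values(values)
    {
        // Remember the column position of every field name.
        for (std::size_t i = 0; i < field_names.size(); i++)
        {
            m_index[field_names[i]] = i;
        }
    }

    // Returns the value of the named field, or an empty string if the
    // result does not contain a column with that name.
    std::string get(const std::string& name) const
    {
        auto it = m_index.find(name);
        return it != m_index.end() && it->second < m_values.size() ?
               m_values[it->second] : std::string();
    }

private:
    std::unordered_map<std::string, std::size_t> m_index;
    std::vector<std::string> m_values;
};
```

Looking values up by name rather than by position keeps the calling code unchanged when a server version adds or reorders columns in the result.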
2d1e5f46fa Remove use of timestamps in failover code
Using timestamps to detect whether MaxScale was active or passive can
cause problems if multiple events happen at the same time. This can be
avoided by separating events into actively observed and passively observed
events. This clarifies the logic by removing the ambiguity of timestamps.

As the monitoring threads are separate from the worker threads, it is
prudent to use atomic operations to modify and read the state of
MaxScale. This will impose a happens-before relation between MaxScale
being set into passive mode and events being classified as being passively
observed.
2017-10-27 15:31:46 +03:00
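The happens-before relation mentioned in 2d1e5f46fa can be illustrated with a release store paired with an acquire load. This is only a sketch: the names `is_passive`, `set_passive` and `classify_event` are hypothetical, and the choice of release/acquire ordering is an assumption, as the commit message only says that atomic operations are used.

```cpp
#include <atomic>

// Hypothetical names; a sketch of the idea, not MaxScale's code.
enum class ObservedAs { ACTIVE, PASSIVE };

static std::atomic<bool> is_passive{false};

// Called by the thread that changes the mode of the process.
void set_passive(bool value)
{
    // Release: writes made before this store are visible to a thread
    // that later observes the new value with an acquire load.
    is_passive.store(value, std::memory_order_release);
}

// Called by a monitor thread when it detects an event.
ObservedAs classify_event()
{
    // Acquire: pairs with the release store in set_passive().
    return is_passive.load(std::memory_order_acquire) ?
           ObservedAs::PASSIVE : ObservedAs::ACTIVE;
}
```

An event is classified by the mode in effect when it is observed, so no timestamps need to be compared afterwards.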
37e64bad90 MXS-1493: Add master failure verification test 2017-10-27 15:31:46 +03:00
52473c379b Extract Gtid_Slave_Pos in mysqlmon
The string form value of Gtid_Slave_Pos is extracted into different
integer components.
2017-10-27 15:31:46 +03:00
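A MariaDB GTID has the textual form `domain-server_id-sequence`, so the extraction in 52473c379b could look roughly like the sketch below. The `Gtid` struct and `parse_gtid` function are hypothetical names, and only a single triplet is handled here, not a comma-separated list.

```cpp
#include <cinttypes>
#include <cstdint>
#include <cstdio>

// Hypothetical struct; the field names follow MariaDB's GTID terminology.
struct Gtid
{
    uint32_t domain = 0;
    uint32_t server_id = 0;
    uint64_t sequence = 0;
};

// Parses a single "domain-server_id-sequence" triplet.
// Returns false if the string does not start with three integers.
bool parse_gtid(const char* str, Gtid* out)
{
    return std::sscanf(str, "%" SCNu32 "-%" SCNu32 "-%" SCNu64,
                       &out->domain, &out->server_id, &out->sequence) == 3;
}
```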
0be39b8545 MXS-1493: Improve master failure detection
The master failure can now be verified by checking when the slaves are
connected to the master. If the slaves do not receive any events from the
master, the connections are considered as down after a configurable limit.

Added two parameters for controlling whether the check is done and for how
long the monitor waits before doing the failover.
2017-10-27 15:31:18 +03:00
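The verification rule in 0be39b8545 can be read as: the master counts as failed only when no slave has heard from it within the configured limit. A rough sketch under that reading follows; `SlaveInfo`, `latest_event` and `master_failure_verified` are hypothetical names, and the handling of an empty slave list is an assumption.

```cpp
#include <ctime>
#include <vector>

// Hypothetical per-slave record: the time the slave last received
// an event from the master.
struct SlaveInfo
{
    std::time_t latest_event;
};

// Returns true if every slave has been silent for at least `timeout`
// seconds. With no slaves there is nothing to contradict the failure,
// so this sketch returns true in that case as well.
bool master_failure_verified(const std::vector<SlaveInfo>& slaves,
                             std::time_t now, std::time_t timeout)
{
    for (const SlaveInfo& slave : slaves)
    {
        if (now - slave.latest_event < timeout)
        {
            // This slave still receives events from the master.
            return false;
        }
    }
    return true;
}
```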
26b47d0b90 MXS-1493: Collect slave heartbeats
The slave heartbeat count and period are collected from the SHOW ALL
SLAVES STATUS output. This, in addition to the relay log position, is used
to calculate the point in time when a slave has last interacted with the
master.

By using this timestamp, the monitor can enforce a minimum "timeout" for
the master before a failover is performed.
2017-10-27 15:30:38 +03:00
48a15368d0 MXS-1490-1492: First version of failover script
Works in ideal situations and can be tested. Does not consider
relay log and only checks that commands were received by a backend.
Work in progress.
2017-10-27 10:54:50 +03:00
114ea49e10 MXS-1494: Add missing replication credentials parameters
The parameters weren't added to the list of module parameters.
2017-10-26 17:37:02 +03:00
b1f62ec1af MXS-1488: Added SHOW STATUS LIKE 'slave_received_heartbeats'
Add support for show status like 'slave_received_heartbeats' in
binlogserver.
2017-10-25 15:11:07 +02:00
63c7550196 MXS-1490 Prepare for failover functionality addition
Moved mon_process_failover() from monitor.cc to mysql_mon.cc. Renamed
some functions and variables related to previous failover functionality
to avoid confusion.
2017-10-25 12:24:29 +03:00
554ae642d7 MXS-1495: Add failover sanity check
The sanity check disables the failover functionality if a server is
configured to replicate from more than one source.
2017-10-24 23:45:23 +03:00
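The sanity check in 554ae642d7 amounts to refusing failover when any server has more than one replication source. A minimal sketch, assuming the number of configured sources is already known for each server; `ServerInfo`, `n_sources` and `failover_allowed` are hypothetical names.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical summary of a monitored server.
struct ServerInfo
{
    int n_sources;  // number of sources this server is configured to replicate from
};

// Failover is allowed only if no server replicates from more than one source.
bool failover_allowed(const std::vector<ServerInfo>& servers)
{
    return std::none_of(servers.begin(), servers.end(),
                        [](const ServerInfo& s) { return s.n_sources > 1; });
}
```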
c3ff2aa1e9 MXS-1495: Move the MYSQL_SERVER_INFO extraction into a function
The get_server_info function takes the monitor handle and a database and
returns the corresponding MYSQL_SERVER_INFO struct. This hides a part of
the actual implementation of the info struct from the monitor code,
allowing future refactoring to be done. It also makes the code a bit more
readable.
2017-10-24 23:44:59 +03:00
95ac9d501c MXS-1494: Add replication credentials to mysqlmon
The credentials used for slave servers can now be controlled with the
replication_user and replication_password parameters.
2017-10-24 23:44:46 +03:00
75a2e190b2 Add function for updating the MYSQL_SERVER_INFO struct
The values in the MYSQL_SERVER_INFO struct can now be updated with the
update_slave_status function.

Also moved the number of configured and running slave configurations into
the info struct. This removes the need to pass output parameters.
2017-10-24 15:43:03 +03:00
3cefb53e1d Split server state and info processing into two
The MYSQL_SERVER_INFO struct is updated first and then the server status
is updated. This allows the function to be called without it affecting the
server state.
2017-10-24 15:27:36 +03:00
38f2b1237c Update release date 2017-10-12 12:29:43 +03:00
d45ae60943 Update release notes 2017-10-12 12:29:43 +03:00
9c03a785ce Fix resultset handling with binary data
When binary data was processed, it was possible that the values were
misinterpreted as OK packets which caused debug assertions to trigger.

In addition to this, readwritesplit did not handle the case when all
packets were routed individually.
2017-10-12 12:29:43 +03:00
96aadcbe83 Fix usage of partial packets when full packets are expected
The authentication phase expects full packets. If the packets aren't
complete a debug assertion would get hit. To detect this, the result of
the extracted buffer needs to be checked.
2017-10-12 12:29:43 +03:00
5cea0ede95 Fix hang on multi-statement query
If multiple queries that only generate OK packets were executed, the
result returned by the server would consist of a chain of OK packets. This
special case needs to be handled by the modutil_count_signal_packets.

The current implementation is very ugly as it simulates a result with at
least one resultset in it. A better implementation would hide it behind a
simple boolean return value and an internal state object.
2017-10-12 12:29:43 +03:00
05e057a703 Fix buffer length calculation in modutil_count_signal_packets
The optimization of the buffer iteration did not decrement the total
buffer length.
2017-10-12 12:29:43 +03:00
e09ae2df20 Update 2.2.0 release notes 2017-10-12 12:29:43 +03:00
97d0cc7482 Fix multi-statement execution in readwritesplit
A multi-statement query can return multiple resultsets in one response. To
accommodate this, both the readwritesplit and modutil code must be
altered.

By ignoring complete resultsets in readwritesplit, the code can deduce
whether a result is complete or not.
2017-10-12 12:29:43 +03:00
3afc89ae80 Fix debug assertion in modutil_count_signal_packets
The original offset needs to be separately tracked to assert that an OK
packet is not the first packet in the buffer. The functional offset into
the buffer is modified to reduce the need to iterate over buffers that
have already been processed.
2017-10-12 12:29:43 +03:00
426f80c8ef Fix multi-result handling in modutil_count_signal_packets
The function assumed that the buffer would not contain a trailing OK
packet that completes a multi-result response.
2017-10-12 12:29:43 +03:00
a04843da9d Add function for logging buffer contents as hex
The gwbuf_hexdump function writes the contents of the buffer into the info
log. This is quite helpful for debugging protocol-related problems.
2017-10-12 12:29:43 +03:00
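To illustrate the kind of output a hex dump such as the one in a04843da9d produces, here is a small sketch that formats raw bytes as two-digit hex values. `to_hex` is a hypothetical helper that returns a string; it is not gwbuf_hexdump, which writes to the info log.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Formats each byte as two lowercase hex digits followed by a space.
std::string to_hex(const uint8_t* data, std::size_t len)
{
    std::string out;
    char byte[4];  // two digits, a space and the terminating null

    for (std::size_t i = 0; i < len; i++)
    {
        std::snprintf(byte, sizeof(byte), "%02x ", static_cast<unsigned>(data[i]));
        out += byte;
    }

    return out;
}
```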
d0fd65be57 Fix unintentional fallthrough
When LEAST_BEHIND_MASTER routing criteria was used, the info level logging
function would fall through to the default case. In debug builds, this
would trigger a debug assertion.
2017-10-12 12:29:43 +03:00
ca0b9de421 Add missing initialization of MySQLProtocol::collect_result
The variable was not initialized.
2017-10-12 12:29:43 +03:00
f3b0245c0b Return results as sets of packets
Returning the results of a query as a set of packets is currently more
efficient. This is mainly due to the fact that each individual packet for
single packet routing is allocated from the heap which causes a
significant loss in performance.

Took the new capability into use in readwritesplit and modified the
reply_is_complete function to work with non-contiguous results.
2017-10-12 12:29:43 +03:00
0e7f592bd7 Add file names and line numbers to stacktraces
The GLIBC backtrace functionality doesn't generate file names and line
numbers in the generated stacktrace. This can be done manually by
executing a set of system commands.

Conceptually doing non-signal-safe operations in a signal handler is very
wrong but as stacktraces are only printed when something has gone horribly
wrong, there is no real need to worry about making things worse.

As a safeguard for fatal errors while the stacktrace is being generated,
it is first dumped into the standard error output of the process. This
will function even if malloc is corrupted.
2017-10-12 12:29:43 +03:00
958e9cc2b0 Fix large_insert_hang compilation failure
Added missing changes that weren't added to the last commit.
2017-10-12 12:29:43 +03:00
f2afa5380b Add OK packet processing test
Added a test case which exercises the OK packet handling in
readwritesplit.
2017-10-12 12:29:43 +03:00
c70e1431e3 Fix OK packet status extraction in readwritesplit
As the row count and last insert ID are length-encoded integers, they need
to be handled with the correct functions.
2017-10-12 12:29:43 +03:00
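In the MySQL protocol a length-encoded integer takes 1, 3, 4 or 9 bytes depending on its first byte, which is why the fields before the status flags in c70e1431e3 cannot be skipped with a fixed offset. The sketch below shows the decoding; the function names are hypothetical and are not the functions used by readwritesplit.

```cpp
#include <cstddef>
#include <cstdint>

// Size in bytes of the length-encoded integer starting at ptr.
std::size_t lenenc_size(const uint8_t* ptr)
{
    switch (*ptr)
    {
    case 0xfc: return 3;  // marker byte + 2-byte value
    case 0xfd: return 4;  // marker byte + 3-byte value
    case 0xfe: return 9;  // marker byte + 8-byte value
    default:   return 1;  // the first byte is the value itself
    }
}

// Value of the length-encoded integer starting at ptr.
uint64_t lenenc_value(const uint8_t* ptr)
{
    std::size_t size = lenenc_size(ptr);

    if (size == 1)
    {
        return *ptr;
    }

    uint64_t value = 0;

    // The value bytes follow the marker in little-endian order.
    for (std::size_t i = size - 1; i >= 1; i--)
    {
        value = (value << 8) | ptr[i];
    }

    return value;
}

// Status flags of an OK packet payload: 0x00 header, affected rows,
// last insert ID (both length-encoded), then two bytes of status flags.
uint16_t ok_packet_status(const uint8_t* payload)
{
    const uint8_t* ptr = payload + 1;  // skip the 0x00 header
    ptr += lenenc_size(ptr);           // skip affected rows
    ptr += lenenc_size(ptr);           // skip last insert ID
    return static_cast<uint16_t>(ptr[0] | (ptr[1] << 8));
}
```

The sketch takes the packet payload without the four-byte packet header and does not check the payload length, which real code would need to do.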
36dfcd4319 Add MaxCtrl test for start/stop maxscale
Added the missing test case for starting and stopping MaxScale.
2017-10-12 12:29:43 +03:00
0614b2ac00 Update MaxCtrl documentation
Fixed the usage help for each command.
2017-10-12 12:29:43 +03:00
a171b4a4ee Update Avro router documentation
2017-10-12 12:29:43 +03:00
1c329b6041 Fix typo in readwritesplit comments
The comment about the static variable being returned as a reference was
missing the `return` word.
2017-10-12 12:29:43 +03:00
621444e5e4 Fix error messages in sync_slaves
Fixed missing newlines in the error output printf calls of
sync_slaves. Changed the order in which pers_02 executes its commands to a
more correct one.
2017-10-12 12:29:43 +03:00
489520a5c0 Process backend packets only once
When the router requires statement based output, the gathering of complete
packets can be skipped as the process of splitting the complete packets
into individual packets implies that only complete packets are handled.

Also added a quicker check for stored protocol commands than a call to
protocol_get_srv_command.
2017-10-12 12:29:43 +03:00
5c9b953d69 Fix crash in backend command tracking
The backend protocol command tracking didn't check whether the session was
the dummy session. The DCB's session is always set to this value when it
is put into the persistent pool.
2017-10-12 12:29:43 +03:00
9d3fc27a3c Fix backend protocol command tracking
If a query was processed in the client protocol module when a prepared
statement was being executed by the backend module, the current command
would get overwritten. This caused a debug assertion in readwritesplit to
trigger as the result was neither a single packet nor a collected result.

The RCAP_TYPE_STMT_INPUT capability guarantees that a buffer contains a
complete packet. This information can be used to track the currently
executed command based on the buffer contents which allows asynchronicity
between the client and backend protocol. In practice this only comes into
play when routers queue queries for later execution.
2017-10-12 12:29:43 +03:00
1e2ef0be70 Order members to ensure alignment
8 + 4 + 4 ensures 16 with 8 byte alignment, which means that
'data' is certain to be 8 byte aligned. 4 + 8 + 4 might result
in something else in some funky environment.
2017-10-12 12:29:43 +03:00
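The effect of member order described in 1e2ef0be70 can be checked with `offsetof`. The two structs below are hypothetical stand-ins, not the actual MaxScale definition, and the offsets in the comments assume a typical 64-bit platform with 8-byte pointers.

```cpp
#include <cstddef>
#include <cstdint>

// The 8 + 4 + 4 order: the fixed members fill 16 bytes without padding.
struct PointerFirst
{
    void*         owner;     // 8 bytes on a 64-bit platform
    int32_t       refcount;  // 4 bytes
    uint32_t      info;      // 4 bytes
    unsigned char data[1];   // typically at offset 16, so 8-byte aligned
};

// The 4 + 8 + 4 order: padding is inserted before the pointer, and the
// offset of data is no longer a multiple of 8.
struct PointerInMiddle
{
    int32_t       refcount;  // 4 bytes, typically followed by 4 bytes of padding
    void*         owner;     // 8 bytes
    uint32_t      info;      // 4 bytes
    unsigned char data[1];   // typically at offset 20
};

constexpr std::size_t pointer_first_offset()
{
    return offsetof(PointerFirst, data);
}

constexpr std::size_t pointer_in_middle_offset()
{
    return offsetof(PointerInMiddle, data);
}
```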
4ddd9c9ec5 Fix compilation failure in readwritesplit
The debug assertion was missing a parameter.
2017-10-12 12:29:43 +03:00
0a5ade8927 Check before clearing statements stored in the session
If the session has no stored statements, there's no need to clear them.
2017-10-12 12:29:43 +03:00
5d18ab86ea Allocate shared buffer and its data in one chunk
The GWBUF shared buffer and its data are now allocated in one
chunk so that the data directly follows the shared buffer.
That way, creating a GWBUF will involve 2 and not 3 calls to
malloc and freeing one will involve 2 and not 3 calls to free.
2017-10-12 12:29:43 +03:00
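The single-chunk allocation in 5d18ab86ea can be sketched as a header and its payload sharing one malloc call. `SharedHeader` and the three functions are hypothetical names chosen for illustration; the real GWBUF structures differ.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical header that precedes the payload in the same allocation.
struct SharedHeader
{
    std::size_t size;      // payload size in bytes
    int         refcount;  // number of users of the payload
};

// Allocates the header and `size` bytes of payload with one malloc call.
SharedHeader* shared_alloc(std::size_t size)
{
    void* mem = std::malloc(sizeof(SharedHeader) + size);

    if (mem == nullptr)
    {
        return nullptr;
    }

    SharedHeader* hdr = static_cast<SharedHeader*>(mem);
    hdr->size = size;
    hdr->refcount = 1;
    return hdr;
}

// The payload directly follows the header.
unsigned char* shared_data(SharedHeader* hdr)
{
    return reinterpret_cast<unsigned char*>(hdr + 1);
}

// One call to free releases both the header and the payload.
void shared_free(SharedHeader* hdr)
{
    std::free(hdr);
}
```

Besides saving one allocation and one deallocation per buffer, this layout keeps the header and the start of the data close together in memory.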
e8f8a3bcdb Inline get_backend_from_dcb
The function is used very often so inlining it should help.
2017-10-12 12:29:43 +03:00
fbb45ead1a Add minor performance improvements to readwritesplit
The multi-statement detection did not check for the existence of
semicolons before doing the heavier processing.

Calculate the packet length only once for the result state management.
2017-10-12 12:29:43 +03:00
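The first improvement in fbb45ead1a follows a common pattern: run a cheap scan first and do the expensive work only when the scan finds something. In the sketch below, `is_multi_statement` is a hypothetical name and the loop after the `memchr` call is a simplified stand-in for the heavier processing; it ignores quoted strings and comments, which a real check must handle.

```cpp
#include <cctype>
#include <cstddef>
#include <cstring>

// Returns true if something other than whitespace follows the first
// semicolon in the query.
bool is_multi_statement(const char* sql, std::size_t len)
{
    // Cheap pre-check: most queries contain no semicolon at all.
    const char* semi = static_cast<const char*>(std::memchr(sql, ';', len));

    if (semi == nullptr)
    {
        return false;
    }

    // Stand-in for the heavier processing, reached only when needed.
    for (const char* p = semi + 1; p < sql + len; p++)
    {
        if (!std::isspace(static_cast<unsigned char>(*p)))
        {
            return true;
        }
    }

    return false;
}
```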