Commit Graph

1089 Commits

Author SHA1 Message Date
0b67ce1e82 Merge branch '2.3' into develop 2019-06-20 14:36:48 +03:00
be429c2c57 MXS-2490: Always restore original statement ID
By always restoring the ID, we are guaranteed to only store the query in
the form that it was originally sent in. This should be changed so that
the ID that the client sends can be used as-is in the backends.
2019-06-20 14:27:04 +03:00
805be70a78 Add more information to rwsplit info messages
The statement ID for all binary protocol statements and the error given to
handleError are now logged.
2019-06-20 14:27:03 +03:00
197d577aa1 Set correct replication lag state
The lag state was inverted.
2019-06-19 17:35:19 +03:00
2bd614ce8e MXS-2566: Use connection counts more often
The connection counts are now always used to pick the best servers where
the initial connections are created. This covers both master and slaves
selection. Reconnections done while routing queries still pick the "best"
server according to the slave selection criteria. This allows better
servers to be taken into use when `lazy_connect` is enabled.
2019-06-19 16:12:28 +03:00
64d25a48bd Merge commit 'a60bd376108f71fccf40001c1496f32c11137fe4' into develop 2019-06-18 15:51:17 +03:00
f6a5b59067 MXS-2563: Fix query retrying on slave failure
If one slave is executing a query while another one is executing a session
command and the one that is executing the session command fails, the
ongoing query would get retried even though the server that failed was not
executing it. If the server was executing a session command, nothing needs
to be done.
2019-06-17 14:07:52 +03:00
f8729c272e Erase trailing unexpected ERR packets
If a resultset is followed by an ERR packet that is not expected
(e.g. server is shutting down), the packet must not be sent to the
client. This allows readwritesplit to replace the failing connection with
a new one thus hiding server shutdowns from clients.
2019-06-14 15:18:02 +03:00
7dde0edb54 Clean up unexpected error handling in readwritesplit
By using the Error class, the code can be cleaned up and simplified.
2019-06-14 15:18:01 +03:00
e1b611aa06 MXS-2512 Use existing information
As an error returned by the server is now stored inside RWBackend,
irrespective of whether it is returned solely or e.g. last after
a result set, there is no need to examine the GWBUF in rws, but
we can use the information that exists.
2019-06-11 09:44:27 +03:00
4efa9dbeea Remove maxscale/alloc.h
The remaining contents were moved to maxbase/alloc.h.
2019-06-10 14:11:25 +03:00
44d1b821c3 Merge branch '2.3' into develop 2019-06-03 13:54:55 +03:00
220fea3546 MXS-2464: Retry failed session commands
If the execution of a session command fails on a master, it is retried
again. If the master is not available, the response will be returned from
one of the slaves.
2019-05-31 14:01:15 +03:00
cb089f69e6 Add read retry assertion
The retrying of a read on a slave should only be done when the failing
server is waiting for a result and it was the last server from which a
result was expected.
2019-05-31 14:01:15 +03:00
625740e69d MXS-2464: Fix crash on failed session command
If the master fails when a session command is being executed with
delayed_retry enabled, a null query would get placed into the query
queue. This change simply prevents the crash and closes the session even
though the query could be retried.
2019-05-31 14:01:15 +03:00
ee7e63a611 MXS-2464: Assert that responses are expected
A query should not be queued if no responses are expected. The code that
executes queued queries should be dead code and this assertion would catch
it.
2019-05-31 14:01:14 +03:00
81254953d1 MXS-2520: Allow master reconnection on reads
If only the master is available and a reconnection must take place, it
must be allowed to happen in all cases.
2019-05-29 18:46:33 +03:00
5828061321 Merge branch '2.3' into develop 2019-05-17 14:39:30 +03:00
96a477ec89 MXS-2490: Send error to client on unknown PS handle
If a client requests an unknown binary protocol prepared statement handle,
a custom error shows the actual ID used instead of the "empty" ID of 0
that the backend sends.
2019-05-17 14:13:44 +03:00
bf63698991 MXS-2464: Bring back the runtime query queue check
The code that checked that only non-empty queries are stored in the query
queue was left out when the query queue fix was backported to 2.3. Since
MXS-2464 is caused by a still unknown bug, the runtime check should help
figure out in which cases the problem occurs.
2019-05-17 13:03:03 +03:00
b0d8535ead Allow master changes at transaction start
When a BEGIN statement is being executed without a master connection but
when one can be created, the BEGIN statement would be treated as if a
transaction was already open. Since the statement only starts the
transaction, it is allowed to be routed to a "new" master regardless of
the transaction statem.

This fixes the failure to start a transaction when lazy_connect is
enabled.
2019-05-10 13:20:33 +03:00
418ccf861d Format routers and monitors 2019-05-10 10:31:12 +03:00
23a09a6294 MXS-2455 Use mxb::Buffer::iterator
Simplifies the code and as extra allocations etc. are only
made when info is enabled, and can thus be ignored.
2019-05-09 15:04:03 +03:00
c818b1208a MXS-2455 Recognize transaction rollbacks
All transaction rollback errors have an sql_state like "40XXX".
So, when an error reply is received we check for that and act
accordingly.
2019-05-08 10:00:50 +03:00
3dd9298b18 MXS-2456: Test transaction replay cap
Added a test that makes sure the transaction replay cap is respected. Also
improved the logging to show how many transaction replay attemps have been
done and to log if a replay is not done due to too many attempts.
2019-05-02 16:59:36 +03:00
26b2897280 MXS-2456: Cap transaction replay attempts
In most cases it is reasonable to stop attempting transaction replays
after a certain number of failed attempts. This prevents transactions from
being repeatedly replayed on the same server over and over again if, for
example, it keeps crashing.
2019-05-02 16:59:36 +03:00
ea243fd8ba MXS-2329 Use durations in readwritesplit 2019-04-30 13:02:53 +03:00
643514bbe4 Merge branch '2.3' into develop 2019-04-30 12:46:07 +03:00
dd188962cd MXS-2427 Check all hints when routing
Now considers other routing hints if first one fails. The order is inverted compared
to e.g. namedserver filter settings because of how routing hints are stored. If all hints
are unsuccessful, route to any slave.
2019-04-29 16:49:32 +03:00
01b1d469a8 MXS-2435 Handle recoverable Clustrix errors
If
- transaction replay is enabled,
- an error is returned and
- the error is one of the recoverable Clustrix errors
we will retry the transaction.

If it succeeds, then the client will not notice anything but
for a short delay.

Note that the error message is looked for irrespective of whether
the backend is Clustrix or not. However, as errors are not common
the price for doing that can probably be ignored.

However, a bigger problem is that explicit knowledge of different
backends should *not* be coded into routers.
2019-04-26 10:54:57 +03:00
d8a9405998 MXS-2435 Refactor error message extracting
Access to the error message is needed in different contexts.
Now the extraction code itself can be shared.
2019-04-26 10:54:57 +03:00
07ea6bd9ba MXS-2450: Don't discard history if it's disabled
If the session command history is not enabled, it shouldn't be discarded
when a COM_CHANGE_USER is executed.
2019-04-25 11:49:01 +03:00
24fc82e160 Move large query processing inside RWBackend
The knowledge of which function to call can be internal to RWBackend. This
make the use of the class easier as one can simply write to the backend.
2019-04-18 13:58:34 +03:00
c643f9bc8d Merge branch '2.3' into develop 2019-04-12 13:23:49 +03:00
ec890b33cd Prevent checksum mismatch on second trx replay
If a transaction replay has to be executed twice due to a failure of the
original candidate master, the query queue could contain replayed
queries. The replayed queries would be placed into the queue if a new
connection needs to be created before the transaction replay can start.
2019-04-05 13:33:16 +03:00
6421af1bb4 Backport query queue changes to 2.3
Backported the changes that convert the query queue in readwritesplit into
a proper queue. This changes combines both
5e3198f8313b7bb33df386eb35986bfae1db94a3 and
6042a53cb31046b1100743723567906c5d8208e2 into one commit.
2019-04-05 13:33:16 +03:00
2aa3515fc8 Merge commit '09cb4a885f88d30b5108d215dcdaa5163229a230' into develop 2019-04-04 14:34:17 +03:00
a217dde1f0 MXS-2419: Queue queries executed during trx replay
By storing the queries in the query queue and routing it once the
transaction replay is done, we prevent two problems:

* Multiple transaction replays would overwrite the m_interrupted_query
  buffer that was used to store any queries executed during the
  transaction replay.

* Incorrect ordering of queries when the query queue is not empty and a
  new query is executed during transaction replay.
2019-04-03 12:57:05 +03:00
2dfd7d35ac MXS-2418: Crash on trx replay when log_info is enabled
If the session starts with no master but later one becomes available, when
a transaction is started the code would unconditionally use the master's
name in a log message.
2019-04-03 12:57:05 +03:00
5242cd5ebf Readwritesplit: Graceful maintenance mode
By allowing transactions to the master to end even if the server is in
maintenance mode makes it possible to terminate connections at a known
point. This helps prevent interrupted transactions which can help reduce
errors that are visible to the clients.
2019-04-02 14:21:54 +03:00
5ee9b74770 Fix readwritesplit server selection
If a server with zero weight was chosen as the only candidate, it was
possible that the starting minimum value was smaller than the server
score. This would mean that a candidate wouldn't be chosen if the score
was too high. To preven this, the values are capped to a value smaller
than the initial minimum score.
2019-03-28 13:21:23 +02:00
d40e29d5f6 A little houskeeping.
Increasing counter sizes from int to long for averages.
Rename random functions to end with _co instead of _exclusive to
indicate range [close, open[, and to allow future suffixes oc, cc and oo.
2019-03-27 13:15:14 +02:00
74eeb64fba Don't close connections to servers being drained
The connections to servers being drained should not be closed like they
should be for servers in maintenance mode. The change in functionality
between 2.3 and develop caused the connections to be discarded if the
server was in either maintenance or drain mode.
2019-03-21 18:19:10 +02:00
9bc721afb6 Merge commit '11ee74bad327e7fb15e8388d20e7838b9e49cadf' into 2.3 2019-03-21 17:52:42 +02:00
11ee74bad3 Free the readwritesplit query queue
If the queue isn't empty when the session closes, the queue would leak.
2019-03-21 11:22:40 +02:00
6042a53cb3 Replace raw GWBUF pointers with mxs::Buffer
Now that the query queue is stored in an actual container, it is only
logical to use mxs::Buffer instead of GWBUF as the stored type.
2019-03-18 13:18:52 +02:00
5e3198f831 Replace the plain GWBUF query queue with std::deque
Using a std::deque to store the queries retains the exact state of the
object thus removing the need to parse the query again. It also removes
the need to split the queue into individual packets which makes the code
cleaner.
2019-03-18 13:18:52 +02:00
0001babd26 Clean up readwritesplit routing functions
Moved the more verbose parts of the routing code into subfunctions and
arranged it so that more relevant parts are closer to each other. Also
added the SQL statement that is being delayed to the message.
2019-03-18 13:18:52 +02:00
4bf9fa872c MXS-2313: Use servers of same rank in readwritesplit
When a readwritesplit session has a connection to a master server, servers
of the same rank as the master are used. If no master connection is
available, the server with the highest rank among all connected servers is
used. If there are no open connections, the server with the best rank is
chosen and a connection to it is made.

Connections with different rank values than what is the current rank value
of the session will be discarded. This reduces the use of server with
different ranks when the master server of a session fails. Without the
active pruning of connections, slave connections to primary clusters
without masters would remain in use even after the primary master
fails. This guarantees full switchover to a secondary cluster if a master
change occurs.
2019-03-18 13:12:59 +02:00
109702ee72 Fix replication lag calculation in readwritesplit
The value used to represent the lack of a configured replication lag was
different than was used in other parts of MaxScale.
2019-03-18 13:12:59 +02:00