Commit Graph

1080 Commits

Author SHA1 Message Date
f0f9c21d1c Merge branch '2.3' into develop 2019-01-07 10:54:42 +02:00
40485d746c MXS-2220 Change server name to constant string 2019-01-03 12:13:15 +02:00
09aa54720d MXS-2220 Read server version using public methods
Version related fields have been removed from the public class.
2019-01-03 11:23:14 +02:00
9adbd2f8f0 Cache the local server statistics object
By storing the server statistics object in side the session, the lookup
involved in getting a worker-local value is avoided. Since the lookup is
done multiple times for a single query, it is beneficial to store it in
the session.

As the worker-local value is never deleted, it is safe to store a
reference to it in the session. It is also never updated concurrently so
no atomic operations are necessary.
2019-01-03 09:37:59 +02:00
1fa3b133c7 Make keepalive ping checks more efficient
The code now only checks the need for a keepalive ping once every
keepalive interval. Reduced the number of mxs_clock calls to one so that
all servers use the same value.
2019-01-03 09:37:59 +02:00
4d0a40ef9f Add missing pointer initialization
The change from SRWBackend to RWBackend* had some side effects, namely the
missing automatic initialization into zero values.
2018-12-28 08:19:23 +02:00
5a9e84d39a Merge branch '2.3' into develop 2018-12-18 00:00:00 +02:00
35d31801bb Merge branch '2.2' into 2.3 2018-12-17 23:52:56 +02:00
6209d737e9 MXS-2220 server_alloc returns internal type
Also adds default initializers to SERVER fields.
2018-12-14 10:31:57 +02:00
20fe9b9dca MXS-2196: Rename session states
Minor renaming of the session state enum values. Also exposed the session
state stringification function in the public header and removed the
stringification macro.
2018-12-13 13:27:45 +02:00
1ad4b339dc MXS-2025 Make variables that should be private private 2018-12-11 15:11:25 +02:00
48efa6d027 MXS-2213: Clear stored PS information
The information stored for each prepared statement would not be cleared
until the end of the session. This is a problem if the sessions last for a
very long time as the stored information is unused once a COM_STMT_CLOSE
has been received.

In addition to this, the session command response maps were not cleared
correctly if all backends had processed all session commands.
2018-12-11 13:54:10 +02:00
e979a73cc0 Remove the STRSRVSTATUS macro
Use server_status() instead.
2018-12-10 15:55:07 +02:00
c0c9a9858d MXS-2197 Rename maxscale/log.h to maxscale/log.hh
In files either include maxscale/log.hh or remove include entirelly
as maxscale/ccdefs.hh includes it.
2018-12-10 12:58:17 +02:00
77477d9648 MXS-2196: Rename dcb_role_t to DCB::Role 2018-12-05 15:30:44 +02:00
bec9455a74 MXS-2205 Combine routingworker.h with routingworker.hh 2018-12-05 11:12:20 +02:00
9f721f725e MXS-2205 Convert maxscale/protocol/mysql.h to .hh 2018-12-05 11:12:20 +02:00
0d09b56f58 MXS-2025 RWBackends as a vector of unique_ptr:s
For lifetime management keep RWBackends in a vector of unique_ptrs.
RWSplitSession keeps the unique_ptrs very private, and provides a vector
of plain pointers for all other interfaces.
2018-12-05 10:23:57 +02:00
91f6f374a8 MXS-2025 Use PRWBackends in backend selection. 2018-12-05 10:23:57 +02:00
20b62a3f3d MXS-2025 Change RWBackend usage to a vector of raw ptrs.
This is essentially just a search and replace to change SRWBackend to
RWBackend* and SRWBackendList to PRWBackends, a vector of a raw
pointers. In the next few commits vector<unique_ptr<RWBackend>>
will be used for life time management.

There are a lot of diffs from the global search and replace. Only a few manual
edits had to be done.

list-src -x build | xargs sed -ri 's/SRWBackends/prwbackends/g'
list-src -x build | xargs sed -ri 's/const mxs::SRWBackend\&/const mxs::RWBackend\*/g'
list-src -x build | xargs sed -ri 's/const SRWBackend\&/const RWBackend\*/g'
list-src -x build | xargs sed -ri 's/mxs::SRWBackend\&/mxs::RWBackend\*/g'
list-src -x build | xargs sed -ri 's/mxs::SRWBackend/mxs::RWBackend\*/g'
list-src -x build | xargs sed -ri 's/SRWBackend\(\)/nullptr/g'
list-src -x build | xargs sed -ri 's/mxs::SRWBackend\&/mxs::RWBackend\*/g'
list-src -x build | xargs sed -ri 's/mxs::SRWBackend/mxs::RWBackend\*/g'
list-src -x build | xargs sed -ri 's/SRWBackend\&/RWBackend\*/g'
list-src -x build | xargs sed -ri 's/SRWBackend\b/RWBackend\*/g'
list-src -x build | xargs sed -ri 's/prwbackends/PRWBackends/g'
2018-12-05 10:23:57 +02:00
d96a7dedc5 MXS-2205 Convert maxscale/poll.h to .hh 2018-12-04 14:51:02 +02:00
ad12ff6d06 MXS-2196: Rename dcb.h to dcb.hh 2018-12-04 11:50:43 +02:00
d9ae298102 MXS-2205 Combine maxscale/server.h with maxscale/server.hh
The server-struct is still used in several .h-files.
2018-12-03 16:47:27 +02:00
756593a718 MXS-2205 Combine maxscale/router.h with maxscale/router.hh 2018-12-03 15:28:06 +02:00
97bb7e7e1a MXS-2205 Combine maxscale/modutil.h with maxscale/modutil.hh 2018-12-03 15:28:06 +02:00
3e5818fcb6 MXS-2205 Convert mysql_utils.h to .hh 2018-12-03 14:05:21 +02:00
77585bdb8c MXS-2197: Make config.h and service.h C++ headers
This is the first step into converting the other headers into C++.
2018-11-30 12:15:57 +02:00
da83551493 MXS-2189: Prevent unwanted trx replay
When a transaction is being executed on a slave and the master fails, the
transaction replay would start.
2018-11-27 12:52:45 +02:00
1abcbd64bd MXS-2187: Allow multiple transaction retries
By resetting the replay state the transaction replay can start again on a
new server. This allows the replay process work when a master server is
shutting down.
2018-11-27 12:52:44 +02:00
e6325d39fb Delay initial transaction replay
By delaying the replay for a second, we give the monitor a small chance to
adap to master failures. It'll also prevent rapid re-querying if multiple
transaction replays are supported.
2018-11-27 12:52:44 +02:00
851793cb86 Fix transaction replay debug assertion
A transaction that just completed will go through the start_trx_replay
function as from the client protocol's point of view the transaction is
still open. The debug assertion did not take this into account and would
fail if a successful commit was the last thing done on master that failed.

Also fixed the formatting.
2018-11-27 12:52:44 +02:00
842f9f1d15 Fix transaction replay timeout
The timeout would not be triggered due to the fact that the
delayed_retry_timeout wasn't inspected.
2018-11-26 09:42:12 +02:00
7bf5c07835 Ignore errors sent by servers in shutdown
When a server is stopping, it'll send an error to the client before
terminating the TCP connection. The code in readwritesplit would detect
this error and create a hangup event on the DCB. This would cause it to
appear as if the TCP connection was broken and the router would
immediately try to reconnect to the same server.

By ignoring the error and allowing the connection to die on its own, we
avoid immediately reconnecting and retrying any transactions on the
stopping server. This increases the chances that the monitor will see it
first and assign the server states correctly before the transaction replay
is attempted.
2018-11-26 09:42:12 +02:00
9f6700b329 Skip connection_keepalive during transaction_replay
When a transaction is replayed, there is no target but the routing was
"successful".
2018-11-26 09:42:11 +02:00
925670ae2f Fix false master failure log message
The message would be logged even if the session continues.
2018-11-26 09:42:11 +02:00
8b92c63248 Remove incorrect assertion
The assertion would hold true for a single worker but it can't be
guaranteed to be true on a multi-worker system where the statistics are
distributed across the workers.
2018-11-26 09:42:11 +02:00
dcf53da209 Enable connection_keepalive by default
Enabling the feature by default prevents the master connection from dying
during times when there are very little or no writes. Having a modest ping
interval of 300 seconds serves to minimize the amount of extra work that
both MaxScale and the server have to do while still keeping the
connections in good shape.
2018-11-26 09:42:11 +02:00
cab8a4bde8 MXS-2144: Treat server shutdown as a network error
If the server where a query is being executed is shutting down,
readwritesplit should treat it as an error to make retrying of the query
possible.

By treating server shutdowns as network errors, the same code path that is
used for actual network errors can be taken. This removes the need for any
extra retrying logic for this particular case.
2018-11-14 16:23:47 +02:00
370483fb4b Log slave error message on failed session command
If the master succeeds in executing a session command but the slave fails,
the error message could help explain why it failed. At the moment this is
mainly relevant for inspection of test results.
2018-11-14 16:23:46 +02:00
c32bb18862 Fix transaction replay checksum mismatches
The transaction replay could get mixed up with new queries if the client
managed to perform one while the delayed routing was taking place. A
proper way to solve this would be to cork the client DCB until the
transaction is fully replayed. As this change would be relatively more
complex compared to simply labeling queries that are being retried the
corking implementation is left for later when a more complete solution can
be designed.

This commit also adds some of the missing info logging for the transaction
replaying which makes analysis of failures easier.
2018-11-13 16:48:03 +02:00
ae0e9b359d Fix use of zero-weight servers
The servers with a zero weight would be always used over ones that have a
weight. This means that the behavior was inverted and caused the
mxs2054_hybrid_cluster test to fail in 2.3.

Also fixed a typo in the deprecation message.
2018-11-12 10:13:59 +02:00
b443bb7525 Store PS session commands with internal ID
Commit a9e236497963251f8b4afa07484b88ad97e73a03 changed where the PS ID
for a binary protocol command is replaced with the internal form. This
caused prepared statements that are also session commands to be always
routed with the external ID.

As the external ID is almost always the master's ID, the aforementioned
bug resulted in odd side-effects and the true cause of these was only
revealed when the error message sent by the slave was included in the log
messages.
2018-11-12 10:13:59 +02:00
2a6df0e724 Merge branch '2.2' into 2.3 2018-11-09 14:22:28 +02:00
a9e2364979 Fix unknown PS ID on query re-routing
If a PS command is routed multiple times, the ID will not be reverted to
the external ID in the failure cases. This prevented prepared statements
from being re-routed correctly.
2018-11-09 12:13:22 +02:00
bfc8cb4803 MXS-2151: Always log fatal master connection errors
When the connection to the master is broken, the session is not configured
to use the read-only modes and the monitor can still connect to the
server, the connection will be closed and and error is sent to the
client. To leave some trace of this problem in the MaxScale logs, a
message should always be logged when a network error occurs.
2018-11-09 00:39:32 +02:00
c899f00541 MXS-1780 Collect server response information
As the router is the only one that knows what backends a particular
statement has been sent to, it is the responsibility of the router
to keep the session bookkeeping up to date. If it doesn't we will
know what statements a session has received (provided at least some
component in the routing chain has RCAP_TYPE_STMT_INPUT capability),
but not how long their processing took. Currently only readwritesplit
does that.

All queries are stored and not just COM_QUERY as that makes the
overall bookkeeping simpler; at clientReply() time we do not need to
know whether or not to bookkeep information, we can just do it.

When session information is queried for, we report as much information
we have available.
2018-11-08 12:04:55 +02:00
3ccdb508de Fix bug in roulette wheel
Slot values were changed after the total was calculated. Fix bug
and adjust the offending code.
2018-11-08 10:50:23 +02:00
c692c864e2 MXS-2078 Take new statistics into use 2018-11-08 10:44:32 +02:00
113b1503f6 Expand readwritesplit delayed retry error message
The error now explains if the write failure was due to the
delayed_retry_timeout being reached.
2018-11-06 15:09:14 +02:00
4341c2b6e2 MXS-2142: Set causal_reads_timeout default to 10
The causal_reads_timeout default value is too long when considering the
behavioral changes that MXS-2141 introduced. With a 10 second default
value, a result is returned to the client in a reasonable amount of time.
2018-11-06 15:09:14 +02:00