The expected response counter was not decremented if a transaction replay
was started. This caused the connections to hang which in turn caused the
failure of the mxs1507_trx_stress test case.
If the slave's response differs from the master and the slave sent an
error packet, log the contents of the error. This should make it obvious
as to what caused the failure.
The assertion in routeQuery that expects there to be at least one ongoing
query would be triggered if a query was received after a master had failed
but before the session would close. To make sure the internal logic stays
consistent, the error handler should only decrement the expected response
count if the session can continue.
MySQLAuth now logs the server where the users were loaded from. As only
the initial loading of users causes a log message, it is still possible
for the source server to change without any indication of it.
The first node without a priority would be chosen as the candidate master
and the rest would be ignored. The code must check if neither of the two
nodes have priorities and if so must choose the better one.
This could end up in infinite mutual recursion if no responses are
expected. Although this does not happen now that MXS-2587 is fixed, the
code should not even be there.
If a transaction replay fails, no queries must be routed before the
connection is closed. This could happen if the client received the error
from the replay failure and closes the connection before the fake hangup
generated by the replay failure is processed.
The error was only generated for COM_STMT_EXECUTE commands when all PS
commands should trigger it. In addition, large packets would get sent two
errors upon the arrival of the trailing end.
If an error is generated while a COM_CHANGE_USER is being done, it would
always use the sequence number 1. To properly handle this case and send
the correct sequence number, the COM_CHANGE_USER progress needs to be
tracked at the session level.
The information needs to be shared between the backend and client
protocols as the final OK to the COM_CHANGE_USER, with the sequence number
3, is the one that the backend server returns. Only after this response
has been received and routed to the client can the COM_CHANGE_USER
processing stop.
If a server fails mid-resultset, there's not a lot we can do to recover
the situation. A few cases could be handled (e.g. generate an ERR if the
resultset has proceeded to the row processing stage) but these fall
outside the scope of the original issue.
As is explained in MDEV-19893:
Client reads from socket, gets the packet from 1. with seqno=0, which
it does not expect, since seqno is supposed to be incremented. Client
complains, throws tantrums and exceptions.
To cater for clients that do not expect out-of-bound messages
(i.e. server-initiated packets with seqno 0), all messages generated by
MaxScale should use at least sequence number 1.
Deep-copying prevents subsequent modifications done by the caller from
affecting the data that can be potentially stored in the write queue of
the backend's DCB.
If a COM_STMT_EXECUTE has no metadata in it and it has more than one
parameter, it must be routed to the same backend where the previous
COM_STMT_EXECUTE with the same ID was routed to. This prevents MDEV-19811
that is triggered by MaxScale routing the queries to different backends.
mxs::Buffer::iterator is not a random-access iterator and hence will have
linear cost. This is too costly to do on every packet with even moderately
sized resultsets.
By always restoring the ID, we are guaranteed to only store the query in
the form that it was originally sent in. This should be changed so that
the ID that the client sends can be used as-is in the backends.
The slave connection I/O-tread stays running if replication credentials are
wrong when connecting to master. This causes a switchover/failover timeout.
When this happens, print the error in the slave connection status as this
clarifies the problem to the user.
If one slave is executing a query while another one is executing a session
command and the one that is executing the session command fails, the
ongoing query would get retried even though the server that failed was not
executing it. If the server was executing a session command, nothing needs
to be done.
RWBackend did not expect that a resultset and an unexpected ERR packet
could be stored in the same buffer. This can happen for example if a
server shuts down immediately after the resultset is sent.
If a packet with a KILL query was followed with another packet in the same
network buffer, the code wouldn't work as it expected to receive only one
packet at a time.
By iterating over the servers and sending the master's charset we are
guaranteed a "known good" charset. This also solves the problem where a
deactivated server reference would be used as the charset and server
version source.
If the execution of a session command fails on a master, it is retried
again. If the master is not available, the response will be returned from
one of the slaves.
The retrying of a read on a slave should only be done when the failing
server is waiting for a result and it was the last server from which a
result was expected.
If the master fails when a session command is being executed with
delayed_retry enabled, a null query would get placed into the query
queue. This change simply prevents the crash and closes the session even
though the query could be retried.