The hard limit of 10 seconds is too strict when taking into account the
fact that infinite refreshes was possible before the bug was fixed. This
also makes testing a lot easier where rapid reloads are necessary.
The first node without a priority would be chosen as the candidate master
and the rest would be ignored. The code must check if neither of the two
nodes have priorities and if so must choose the better one.
Certain MariaDB connectors will use the direct execution for batching
COM_STMT_PREPARE and COM_STMT_EXECUTE execution without waiting for the
COM_STMT_PREPARE to complete. In these cases the COM_STMT_EXECUTE (and
other COM_STMT commands as well) will use the special ID 0xffffffff. When
this is detected, it should be substituted with the ID of the latest
statement that was prepared.
This fixes the test failures that stem from users being created right
after maxscale has started. This also should make startups a bit smoother
now that the default value of users_refresh_time has been fixed.
This could end up in infinite mutual recursion if no responses are
expected. Although this does not happen now that MXS-2587 is fixed, the
code should not even be there.
All COM_STMT_SEND_LONG_DATA commands and the COM_STMT_EXECUTE that follows
it must be sent to the same server. This implicitly works for masters but
with multiple slave servers the data could be sent to the wrong server.
By using the code added for MXS-2521, this problem can now be easily
solved by checking what the previous command was.
If a transaction replay fails, no queries must be routed before the
connection is closed. This could happen if the client received the error
from the replay failure and closes the connection before the fake hangup
generated by the replay failure is processed.
When fake hangup events are delivered via DCBs, the current DCB would not
be updated. This would cause error messages without a session ID which
makes failure analysis harder.
The error was only generated for COM_STMT_EXECUTE commands when all PS
commands should trigger it. In addition, large packets would get sent two
errors upon the arrival of the trailing end.
If an error is generated while a COM_CHANGE_USER is being done, it would
always use the sequence number 1. To properly handle this case and send
the correct sequence number, the COM_CHANGE_USER progress needs to be
tracked at the session level.
The information needs to be shared between the backend and client
protocols as the final OK to the COM_CHANGE_USER, with the sequence number
3, is the one that the backend server returns. Only after this response
has been received and routed to the client can the COM_CHANGE_USER
processing stop.
If a server fails mid-resultset, there's not a lot we can do to recover
the situation. A few cases could be handled (e.g. generate an ERR if the
resultset has proceeded to the row processing stage) but these fall
outside the scope of the original issue.
As is explained in MDEV-19893:
Client reads from socket, gets the packet from 1. with seqno=0, which
it does not expect, since seqno is supposed to be incremented. Client
complains, throws tantrums and exceptions.
To cater for clients that do not expect out-of-bound messages
(i.e. server-initiated packets with seqno 0), all messages generated by
MaxScale should use at least sequence number 1.
Deep-copying prevents subsequent modifications done by the caller from
affecting the data that can be potentially stored in the write queue of
the backend's DCB.
It was possible that a one-second outage that caused immediate rejection
of network connections would cause all of the query retry attempts to fail
within a very short period of time. By preventing rapid reconnections,
query_retries is more effective as an error filtering mechanism.
If a COM_STMT_EXECUTE has no metadata in it and it has more than one
parameter, it must be routed to the same backend where the previous
COM_STMT_EXECUTE with the same ID was routed to. This prevents MDEV-19811
that is triggered by MaxScale routing the queries to different backends.
mxs::Buffer::iterator is not a random-access iterator and hence will have
linear cost. This is too costly to do on every packet with even moderately
sized resultsets.
Due to there being no distinction between a temporarily stopped worker and
a permanently stopped one, we must allow posting of messages to stopped
workers.
By always restoring the ID, we are guaranteed to only store the query in
the form that it was originally sent in. This should be changed so that
the ID that the client sends can be used as-is in the backends.
By having a separate FINISHED state and a STOPPED state, it is possible to
know at which point in the worker's lifetime an event is done. Posting of
messages before a worker is started is allowed but posting them after the
worker has stopped is not.
This fixes avrorouter related failures and all other failures that stem
from worker messages being ignored at startup.
By stopping the REST API before the workers and moving the shutdown to the
same worker that handles REST API requests, we prevent the hang on
shutdown. This also makes the signal handler signal-safe.
If a worker has been stopped, tasks must not be executed on it. To prevent
this, the calling code should check whether the worker has been
stopped. This does not prevent the case where a message is successfully
posted to a worker but the worker is stopped before it processes it.