This could end up in infinite mutual recursion if no responses are
expected. Although this does not happen now that MXS-2587 is fixed, the
code should not even be there.
If a transaction replay fails, no queries must be routed before the
connection is closed. This could happen if the client received the error
from the replay failure and closes the connection before the fake hangup
generated by the replay failure is processed.
The error was only generated for COM_STMT_EXECUTE commands when all PS
commands should trigger it. In addition, large packets would get sent two
errors upon the arrival of the trailing end.
If a server fails mid-resultset, there's not a lot we can do to recover
the situation. A few cases could be handled (e.g. generate an ERR if the
resultset has proceeded to the row processing stage) but these fall
outside the scope of the original issue.
If a COM_STMT_EXECUTE has no metadata in it and it has more than one
parameter, it must be routed to the same backend where the previous
COM_STMT_EXECUTE with the same ID was routed to. This prevents MDEV-19811
that is triggered by MaxScale routing the queries to different backends.
By always restoring the ID, we are guaranteed to only store the query in
the form that it was originally sent in. This should be changed so that
the ID that the client sends can be used as-is in the backends.
If one slave is executing a query while another one is executing a session
command and the one that is executing the session command fails, the
ongoing query would get retried even though the server that failed was not
executing it. If the server was executing a session command, nothing needs
to be done.
If the execution of a session command fails on a master, it is retried
again. If the master is not available, the response will be returned from
one of the slaves.
The retrying of a read on a slave should only be done when the failing
server is waiting for a result and it was the last server from which a
result was expected.
If the master fails when a session command is being executed with
delayed_retry enabled, a null query would get placed into the query
queue. This change simply prevents the crash and closes the session even
though the query could be retried.
A query should not be queued if no responses are expected. The code that
executes queued queries should be dead code and this assertion would catch
it.
If a client requests an unknown binary protocol prepared statement handle,
a custom error shows the actual ID used instead of the "empty" ID of 0
that the backend sends.
The code that checked that only non-empty queries are stored in the query
queue was left out when the query queue fix was backported to 2.3. Since
MXS-2464 is caused by a still unknown bug, the runtime check should help
figure out in which cases the problem occurs.
Now considers other routing hints if first one fails. The order is inverted compared
to e.g. namedserver filter settings because of how routing hints are stored. If all hints
are unsuccessful, route to any slave.
If a transaction replay has to be executed twice due to a failure of the
original candidate master, the query queue could contain replayed
queries. The replayed queries would be placed into the queue if a new
connection needs to be created before the transaction replay can start.
Backported the changes that convert the query queue in readwritesplit into
a proper queue. This changes combines both
5e3198f8313b7bb33df386eb35986bfae1db94a3 and
6042a53cb31046b1100743723567906c5d8208e2 into one commit.
By storing the queries in the query queue and routing it once the
transaction replay is done, we prevent two problems:
* Multiple transaction replays would overwrite the m_interrupted_query
buffer that was used to store any queries executed during the
transaction replay.
* Incorrect ordering of queries when the query queue is not empty and a
new query is executed during transaction replay.
If the session starts with no master but later one becomes available, when
a transaction is started the code would unconditionally use the master's
name in a log message.
If a routing of a queued query caused it to be put back on the query
queue, the order in which the queue was reorganized was wrong. The first
query would get appended as the last query which caused the order to be
reversed.
Th discarding of connections in maintenance mode must be done after any
results have been written to them. This prevents closing of the connection
before the actual result is returned.
The candidate selection code used default values that would cause reads
past buffers. The code could also dereference the end iterator which
causes undefined behavior.
Queries in the query queue need to be explicitly parsed since they are
stored in a single buffer and thus share the query classification
information. In the next major version this should be changed into an
array of individual buffers instead of a shared buffer.
The protocol should not track the session state as the parsing is quite
expensive with the current code. This change is a workaround that enables
the parsing only when required. A proper way to handle this would be to do
all the response processing in one place thus avoiding the duplication of
work.
Given the fact that there exist only three possible categories, the map
can be replaced with a static array that needs no memory
allocations. Making this array thread-local allows it to be reused which
places an upper limit on the number of memory allocations.
The documentation stated that at most `max_sescmd_history` commands were
kept but in reality the number of commands kept in the history was one
command smaller than what was documented.
This commit adds a new parameter that, when enabled, prunes the session
command history to a known length. This makes it possible to keep a
client-side pooled connection open indefinitely at the cost of making
reconnections theoretically unsafe. In practice the maximum history length
can be set to a value that encompasses a single session using the pooled
connection with no risk to session state integrity. The default history
length of 50 commands is quite likely to be adequate for the majority of
use-cases.
When the connection state is reset by executing a COM_CHANGE_USER or
COM_RESET_CONNECTION, readwritesplit does not need to store the session
command history that was executed before it. With this, pooled connections
will effectively behave like normal connections if the pooling mechanism
is smart enough to reset the connection. This also prevents unwanted
visibility into the session states of other connections.
If the routing of a session command fails due to problems with the backend
connections, a more verbose error message is logged. The added status
information in the Backend class makes tracking the original cause of the
problem a lot easier due to knowing where, when and why the connection was
closed.
If a server was not chosen as the target of a routing hint, the server
status would not be logged. By logging the server state in the message, it
is easier to figure out why a server wasn't chosen as the routing target.
Both the replication lag and the message printing state are saved in SERVER,
although the values are mostly used by readwritesplit. A log message is printed
both when a server goes over the limit and when it comes back below.
Because of concurrency issues, a message may be printed multiple times before
different threads detect the new message state.
Documentation updated to explain the change.
If the connection to the master is lost, knowing what type of an error
caused the call to handleError helps deduce what was the real reason for
it. Logging the idle time of the connection helps detect when the
wait_timeout of a connection is exceeded.