The use of the server state is not transactional across multiple uses of
the function. This means that any assertions on the target state can fail
if the monitor updates the state between target selection and the
assertion.
This allows the same verbose information to be logged in the cases where
it is of use. Mostly this information can be used to figure out why a
certain session was closed.
If session command execution during server reconnection caused a query to
be queued, the query would be put on the tail end of the queue. This would
cause queries to be reordered if the queue wasn't empty. The correct thing
to do would be to put the next pending query back at the front of the
queue.
If a master reconnection occurred after the session command history was
disabled due to the limit being exceeded, a debug assertion would be hit
in prepare_target. This assert makes sure that a connection can be safely
created to the server which means that in release mode builds the session
state would be inconsistent on the new master.
As this is an unrecoverable situation, the session should stop immediately
even if delayed_retry is enabled. Currently the session will continue
until the delayed retry timeout is hit. This happens due to the fact that
the delayed retry mechanism handles all errors in a similar way.
The error was only generated for COM_STMT_EXECUTE commands when all PS
commands should trigger it. In addition, large packets would get sent two
errors upon the arrival of the trailing end.
If a COM_STMT_EXECUTE has no metadata in it and it has more than one
parameter, it must be routed to the same backend where the previous
COM_STMT_EXECUTE with the same ID was routed to. This prevents MDEV-19811
that is triggered by MaxScale routing the queries to different backends.
By always restoring the ID, we are guaranteed to only store the query in
the form that it was originally sent in. This should be changed so that
the ID that the client sends can be used as-is in the backends.
If a client requests an unknown binary protocol prepared statement handle,
a custom error shows the actual ID used instead of the "empty" ID of 0
that the backend sends.
Now considers other routing hints if first one fails. The order is inverted compared
to e.g. namedserver filter settings because of how routing hints are stored. If all hints
are unsuccessful, route to any slave.
Backported the changes that convert the query queue in readwritesplit into
a proper queue. This changes combines both
5e3198f8313b7bb33df386eb35986bfae1db94a3 and
6042a53cb31046b1100743723567906c5d8208e2 into one commit.
If the session starts with no master but later one becomes available, when
a transaction is started the code would unconditionally use the master's
name in a log message.
The protocol should not track the session state as the parsing is quite
expensive with the current code. This change is a workaround that enables
the parsing only when required. A proper way to handle this would be to do
all the response processing in one place thus avoiding the duplication of
work.
The documentation stated that at most `max_sescmd_history` commands were
kept but in reality the number of commands kept in the history was one
command smaller than what was documented.
This commit adds a new parameter that, when enabled, prunes the session
command history to a known length. This makes it possible to keep a
client-side pooled connection open indefinitely at the cost of making
reconnections theoretically unsafe. In practice the maximum history length
can be set to a value that encompasses a single session using the pooled
connection with no risk to session state integrity. The default history
length of 50 commands is quite likely to be adequate for the majority of
use-cases.
If the routing of a session command fails due to problems with the backend
connections, a more verbose error message is logged. The added status
information in the Backend class makes tracking the original cause of the
problem a lot easier due to knowing where, when and why the connection was
closed.
If a server was not chosen as the target of a routing hint, the server
status would not be logged. By logging the server state in the message, it
is easier to figure out why a server wasn't chosen as the routing target.
Both the replication lag and the message printing state are saved in SERVER,
although the values are mostly used by readwritesplit. A log message is printed
both when a server goes over the limit and when it comes back below.
Because of concurrency issues, a message may be printed multiple times before
different threads detect the new message state.
Documentation updated to explain the change.
By storing the server statistics object in side the session, the lookup
involved in getting a worker-local value is avoided. Since the lookup is
done multiple times for a single query, it is beneficial to store it in
the session.
As the worker-local value is never deleted, it is safe to store a
reference to it in the session. It is also never updated concurrently so
no atomic operations are necessary.
The code now only checks the need for a keepalive ping once every
keepalive interval. Reduced the number of mxs_clock calls to one so that
all servers use the same value.
The information stored for each prepared statement would not be cleared
until the end of the session. This is a problem if the sessions last for a
very long time as the stored information is unused once a COM_STMT_CLOSE
has been received.
In addition to this, the session command response maps were not cleared
correctly if all backends had processed all session commands.
The transaction replay could get mixed up with new queries if the client
managed to perform one while the delayed routing was taking place. A
proper way to solve this would be to cork the client DCB until the
transaction is fully replayed. As this change would be relatively more
complex compared to simply labeling queries that are being retried the
corking implementation is left for later when a more complete solution can
be designed.
This commit also adds some of the missing info logging for the transaction
replaying which makes analysis of failures easier.
Commit a9e236497963251f8b4afa07484b88ad97e73a03 changed where the PS ID
for a binary protocol command is replaced with the internal form. This
caused prepared statements that are also session commands to be always
routed with the external ID.
As the external ID is almost always the master's ID, the aforementioned
bug resulted in odd side-effects and the true cause of these was only
revealed when the error message sent by the slave was included in the log
messages.
If a PS command is routed multiple times, the ID will not be reverted to
the external ID in the failure cases. This prevented prepared statements
from being re-routed correctly.
With causal_reads enabled, the query would return with an error if the
slave was not able to catch up to the master fast enough. By automatically
retrying the query on the master, we're guaranteed that a valid result is
always returned to the client.
The debug assertion is wrong as the code was changed to prioritize hints
over the router target selection. Also removed the superficial check for
master, slave and relay master states as they are implied by the fact that
the connection is in use.
The collection of resultsets needs to be disabled by default when a
response is received to cover the cases where an error is returned.
The collection of results should also not be set for queries that do not
generate any responses.