The error was only generated for COM_STMT_EXECUTE commands when all PS
commands should trigger it. In addition, large packets would get sent two
errors upon the arrival of the trailing end.
If a server fails mid-resultset, there's not a lot we can do to recover
the situation. A few cases could be handled (e.g. generate an ERR if the
resultset has proceeded to the row processing stage) but these fall
outside the scope of the original issue.
If a COM_STMT_EXECUTE has no metadata in it and it has more than one
parameter, it must be routed to the same backend where the previous
COM_STMT_EXECUTE with the same ID was routed to. This prevents MDEV-19811
that is triggered by MaxScale routing the queries to different backends.
By always restoring the ID, we are guaranteed to only store the query in
the form that it was originally sent in. This should be changed so that
the ID that the client sends can be used as-is in the backends.
The connection counts are now always used to pick the best servers where
the initial connections are created. This covers both master and slaves
selection. Reconnections done while routing queries still pick the "best"
server according to the slave selection criteria. This allows better
servers to be taken into use when `lazy_connect` is enabled.
1. Remove persistence of performance data
2. Move global CanonicalPerformance into SmartRouter object
3. Implement another kill_all_others_v2. Left kill_all_others_v1
in case it should be fixed and used instead.
Does the measurments, usage of the same and persistence to file.
Kill is not implemented, so waits for all clusters to fully responds
The performance data uses a mutex and the persistence data file
is written while holding the mutex. This obviously needs to be
improved, but this commit shows the working concept.
If one slave is executing a query while another one is executing a session
command and the one that is executing the session command fails, the
ongoing query would get retried even though the server that failed was not
executing it. If the server was executing a session command, nothing needs
to be done.
If a resultset is followed by an ERR packet that is not expected
(e.g. server is shutting down), the packet must not be sent to the
client. This allows readwritesplit to replace the failing connection with
a new one thus hiding server shutdowns from clients.
The name of the object (i.e. the section name from the configuration
file), is now stored in the configuration object for that object.
That way, more contextual and hence morfe user friendly errors and
warnings can be generated.
If the server (a real one or a service exposed as a server) is
on the same machine as MaxScale, then for performance reasons
a Unix domain socket and not a TCP/IP socket should be used.
As an error returned by the server is now stored inside RWBackend,
irrespective of whether it is returned solely or e.g. last after
a result set, there is no need to examine the GWBUF in rws, but
we can use the information that exists.
This is the base for Smart Router. Review and TODO comments are in the
code. This commit will be squashed several times so don't pay attention to this
specific commit message. I will add and remove TODO's in the code, rather
than save them in git commits. RBCommons will contain the history.
If the execution of a session command fails on a master, it is retried
again. If the master is not available, the response will be returned from
one of the slaves.
The retrying of a read on a slave should only be done when the failing
server is waiting for a result and it was the last server from which a
result was expected.
If the master fails when a session command is being executed with
delayed_retry enabled, a null query would get placed into the query
queue. This change simply prevents the crash and closes the session even
though the query could be retried.
A query should not be queued if no responses are expected. The code that
executes queued queries should be dead code and this assertion would catch
it.