If the client sends another query before reading the previous result, it
was possible that another session finished the shard update and the
following query would reuse this result. This would cause the assertion in
the delayed call to fail as it assumes the shard would always be empty
when it was being called. To correctly handle this case, the delayed call
must be the one that moves the session into the normal routing state.
Putting the sessions that aren't doing the update on hold makes the new
mechanism work the same way the old one did with the exception that it
won't put any extra work on the database itself.
As there are no practical benefits to multiple sessions for the same user
mapping the databases at the same time, limiting them to one update per
user is sensible. This is especially true now that we know the
information_schema tables aren't the most efficient things in the world.
The current code implements this rate limiting by closing any extra
sessions that would start a second update. The final implementation should
suspend them for the duration of the update as it is far more
user-friendly.
The limits are currently global as the shard caches are also global. This
is a performance bottleneck and it could be solved by storing the shard
cache inside of a mxs::WorkerGlobal instead of having it as a global
cache.
If the schemarouter service used a filter, any queued queries would end up
being routed twice for them as well. This would break any filters that
would expect a result for each query and it could also cause a hang if the
filter did not forward it up to the router.
The correct thing to do is to call the routeQuery method of the router
directly.
When a transaction migration starts, the old master must be
unconditionally closed. This is the simplest way of resetting the
connection state and it also helps close unused connections.
The LOAD DATA LOCAL INFILE is handled in a way where it returns two
results that both are complete: the first one with the file being
requested and the second one with the final OK packet. Readwritesplit
called session_book_server_response for both statements which caused the
current query index to drop to -1 which in turn was unconditionally used
as the buffer offset.
The new check for the invalid index value will help prevent crashes in
production while still allowing it to be detected while testing.
Responses generated by replayed session commands must not be treated as
actual responses to retained statements. In 2.5 this is not a problem as
it is done implicitly with the pre-assignment of the server that delivers
the session command response.
Unlike readwritesplit, schemarouter will process all responses from
backends as if they are expected. There are cases where errors are
generated that aren't sent as a response to a query. These queries must be
ignored and not routed to the client. Copying the code as-is from
readwritesplit isn't the cleanest solution but it avoids refactoring code
in a patch release.
The custom error number (2003) used by the backend protocol code was not
an actual error number that the server would send. The error code in
question was for an error that only the C connector returns:
CR_CONN_HOST_ERROR. Using ER_CONNECTION_KILLED as the error number better
conveys the fact that the connection was killed due to a reason not
related to any ongoing query.
By using a known error number that is correctly handled, we also avoid
writing errors to the client in the middle of a resultset or as the
initial response to a result. This explains why the problem described in
MXS-3267 happened in the first place: an unrelated connection was lost in
the middle of a resultset and the error was interpreted as the end of a
resultset. As a result of there being more data to be read, the unexpected
result state messages were logged.
This could happen if a session command triggers a master reconnection and
the connection fails while the history replay is ongoing. The code assumed
that history replay would only happen when a query was in the query queue.
This correctly triggers the session command response processing to accept
results from other servers than the current master backend if the session
can continue. If the session cannot continue, it will be stopped
immediately.
The code used a null GWBUF with gwbuf_append which causes a crash. The
return value of the function that used it was also not correctly handled
and would be mistaken for a different error.
This allows the set of servers used by the service to also participate in
the cache value resolution. This will prevent the most obvious of problems
but any abstractions of the servers will prevent this from working.
Session commands did not trigger a reconnection process which caused
sessions to be closed in cases where recovery was possible.
Added a test case that verifies the patch fixes the problem.
If the session command could not be routed, the log message should contain
the actual command that was routed. This makes failure analysis much
easier.
If a limit on the replication lag is configured, servers with unmeasured
replication lag should not be used. The code in question did use them even
when a limit was set as the value used for undefined lag was -1 which
always measured lower than the limit.
The code relied on last_read for the idle time calculation which caused
the pings that were written to not reset the idle time. This increased the
chance of multiple COM_PING packets being sent to a backend before a reply
was received.