Currently the state change explanations are only added to mariadbmon. They
are less relevant for Galera clusters as they themselves explain why they
change their states but should still be added to make them easier to
analyze.
The event that isn't explained and is most often encountered is the loss
of a Slave status. Most often the loss of a Slave status happens because
either the IO thread or the SQL thread has stopped. Printing the states of
the threads as well as the latest error should hint at what caused the
outage.
The information can be added to the REST API in 2.5 where the monitors can
add extra information to the server JSON.
When a transaction migration starts, the old master must be
unconditionally closed. This is the simplest way of resetting the
connection state and it also helps close unused connections.
The LOAD DATA LOCAL INFILE is handled in a way where it returns two
results that both are complete: the first one with the file being
requested and the second one with the final OK packet. Readwritesplit
called session_book_server_response for both statements which caused the
current query index to drop to -1 which in turn was unconditionally used
as the buffer offset.
The new check for the invalid index value will help prevent crashes in
production while still allowing it to be detected while testing.
Responses generated by replayed session commands must not be treated as
actual responses to retained statements. In 2.5 this is not a problem as
it is done implicitly with the pre-assignment of the server that delivers
the session command response.
During switchover/failover, server events are altered. The ALTER
EVENT command automatically modifies the event charset and collation
to the values of the connetion running the query. This may cause
the event to become invalid.
Fixed this by changing connection charset and collation to the ones
in the event description just before altering it.
The current code assumes that the variable names are in lowercase. This
fixes the galera monitoring that was broken by commit
43068d20b43a34d5f3b4b4db0fcce701b3cd7cad. In addition, lowercase names
also helps when comparisons are done with std::string.
We will continue to look for "clustrix" as well so that MaxScale
will continue to work with older releases. Clustrix was replaced with
xpand in all symbols.
Unlike readwritesplit, schemarouter will process all responses from
backends as if they are expected. There are cases where errors are
generated that aren't sent as a response to a query. These queries must be
ignored and not routed to the client. Copying the code as-is from
readwritesplit isn't the cleanest solution but it avoids refactoring code
in a patch release.
The custom error number (2003) used by the backend protocol code was not
an actual error number that the server would send. The error code in
question was for an error that only the C connector returns:
CR_CONN_HOST_ERROR. Using ER_CONNECTION_KILLED as the error number better
conveys the fact that the connection was killed due to a reason not
related to any ongoing query.
By using a known error number that is correctly handled, we also avoid
writing errors to the client in the middle of a resultset or as the
initial response to a result. This explains why the problem described in
MXS-3267 happened in the first place: an unrelated connection was lost in
the middle of a resultset and the error was interpreted as the end of a
resultset. As a result of there being more data to be read, the unexpected
result state messages were logged.
This could happen if a session command triggers a master reconnection and
the connection fails while the history replay is ongoing. The code assumed
that history replay would only happen when a query was in the query queue.
This correctly triggers the session command response processing to accept
results from other servers than the current master backend if the session
can continue. If the session cannot continue, it will be stopped
immediately.
Fixed leak in load_utils.cc and the cache filter. Also changed all
instances of json_object_set with json_object_set_new to make sure it's
only used when the references are to be stolen.