Large session commands weren't properly handled which caused the router to
think that the trailing end of a multi-packet query was actually a new
query.
This cannot be confidently solved in 2.2 which is why the router session
is now closed the moment a large session command is noticed.
Only commands that can contain an SQL statements should be stored for
retrying (COM_QUERY and COM_EXECUTE). Other commands are either session
commands or do not work with query retrying.
The commands needs to be handled separately from the rest of the result
types.
Added a test case that reproduces the problem and verifies that the change
in code fixes it.
When a LOAD DATA LOCAL INFILE finishes, the client sends an empty
packet. The second case when the client sends an empty packet when the
previous packet was exactly 0xffffff bytes long. These two packets were
confused which caused the internal state to temporarily flip from inactive
to ending and back to inactive.
The aforementioned flip-flopping didn't have any practical differences but
it was caught by a debug assertion.
Only servers that qualify to be connected should be considered as
candidate servers. This triggered a debug assertion when a slave server
failed to execute a session command but it was chosen as a candidate
server later on.
The COM_STMT_FETCH command will create a response. This was a
readwritesplit-specific interpretation of the command and it was wrong.
Also record the currently executed command event for session commands.
Readwritesplit would not handle multiple overlapping COM_STMT_EXECUTE
commands properly if they opened cursors. This was due to the fact that
the result would not be marked as complete and COM_STMT_FETCH commands
were executed as if they did not return results.
The correct implementation is to consider a COM_STMT_EXECUTE that opens a
cursor complete only when the first EOF packet is read (that is, when the
resultset header is read). This allows subsequent COM_STMT_FETCH commands
to be handled separately.
The separate COM_STMT_FETCH handling must count the number of packets that
are being fetched. This allows correct tracking of the state of a
COM_STMT_FETCH by checking that the number of packets is correct or the
second EOF/ERR packet is read.
When a LOAD DATA LOCAL INFILE is actively rejected by the server, the
server sends an error to the client. This error was not detected and the
router was stuck in the special mode that handles LOAD DATA LOCAL INFILE.
The state could be factored out into a boolean variable as the reply
processing can be in two states: Either waiting for the response to
MASTER_GTID_WAIT or updating packet numbers.
The packet number updating can always be done as long as a buffer is
available. The discard_master_wait_gtid_result function discards the OK
packet before the packet numbers are updated so any trailing packets get
corrected properly.
If a query is interrupted that was sent to the master, it is now
retried. This allows all autocommit queries to be transparently retried if
the server in question fails.
Configuring the parameter current doesn't have enough of an effect to
warrant having a configuration option for it.
Also took mxs::extract_sql into use in the INFO level message.
By adding a boolean parameter, the feature can be enabled with sensible
default values. Renamed the query_retry parameters and set the defaults to
acceptable values.
Added new parameters to diagnostic output. Also did some minor renaming.
The GWBUF passed to RWBackend::write must be treated as freed memory if
the write succeeds. This means that the original buffer should be stored,
not the clone.
Now that the readwritesplit uses the same mechanism for both
retry_failed_reads and delayed query retries, the re-routing function
should accept a delay of 0 seconds. This makes the mechanism more suitable
for other uses e.g. delaying of queries in filters.
The logging in the RWSplit::select_connect_backend_servers was quite
cumbersome and doing the logging when the connection is created makes for
a more information rich output (we know what servers were taken into use
and when).
Changed the log messages to use the names of the servers instead of the
address and port.
It is possible that the routing fails even if master_reconnection is
enabled and a second master is available. This is the case when a
transaction is open or autocommit is disabled which is what the
mxs359_master_switch test tests.
The old hkheartbeat variable was changed to the mxs_clock() function that
simply wraps an atomic load of the variable. This allows it to be
correctly read by MaxScale as well as opening up the possibility of
converting the value load to a relaxed memory order read.
Renamed the header and associated macros. Removed inclusion of the
heartbeat header from the housekeeper header and added it to the files
that were missing it.
This is a proof-of-concept that validates the query retrying method. The
actual implementation of the query retrying mechanism needs more thought
as using the housekeeper is not very efficient.