Reading the binlog event header in a separate function makes it easier to
comprehend. Cleaned up some unused variables and removed code that is
never used.
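A minimal sketch of what such a helper could look like. The `EventHeader` struct and `read_event_header` name are illustrative, not the actual code; only the 19-byte v4 event header layout is taken from the replication protocol:

```cpp
#include <cstdint>

// Layout of the 19-byte replication event header (v4 binlog format).
struct EventHeader
{
    uint32_t timestamp;   // Seconds since epoch
    uint8_t  event_type;  // Replication event type
    uint32_t server_id;   // ID of the server that generated the event
    uint32_t event_size;  // Total size of the event, header included
    uint32_t next_pos;    // Position of the next event in the binlog
    uint16_t flags;       // Event flags
};

// Hypothetical helper: decode the header from a raw byte pointer.
// All fields are stored as little-endian integers.
static EventHeader read_event_header(const uint8_t* ptr)
{
    auto le32 = [](const uint8_t* p) {
        return uint32_t(p[0]) | uint32_t(p[1]) << 8
             | uint32_t(p[2]) << 16 | uint32_t(p[3]) << 24;
    };

    EventHeader hdr;
    hdr.timestamp  = le32(ptr);
    hdr.event_type = ptr[4];
    hdr.server_id  = le32(ptr + 5);
    hdr.event_size = le32(ptr + 9);
    hdr.next_pos   = le32(ptr + 13);
    hdr.flags      = uint16_t(ptr[17]) | uint16_t(ptr[18]) << 8;
    return hdr;
}
```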
This clarifies which parts of the router are specific to the binlogrouter
and which are common to the binlogrouter and avrorouter.
Ideally, the two modules would use the same infrastructure to handle the
processing of replication events. This is the first, albeit small, step
towards making the code in the binlogrouter the common infrastructure.
The Avro instance is now created inside a static class method. This brings
it in line with how other modules create instances.
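A sketch of the pattern, with a hypothetical signature rather than the real one:

```cpp
#include <memory>

class SERVICE;  // Opaque MaxScale service handle, assumed here

class Avro
{
public:
    // All instances are created through this factory method, matching
    // the convention used by other modules. Returns nullptr on failure.
    static Avro* create(SERVICE* service)
    {
        std::unique_ptr<Avro> inst(new Avro(service));
        // ... configuration reading and setup would happen here ...
        return inst.release();
    }

private:
    // A private constructor guarantees that a partially constructed
    // instance can never escape the factory method.
    Avro(SERVICE* service)
        : m_service(service)
    {
    }

    SERVICE* m_service;
};
```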
Converted all strings to std::string and updated their usage.
By storing the table maps in a std::unordered_map, the storage of mapped
tables is made dynamic. This also makes the management of mapped tables a
lot more robust.
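A rough sketch of the new storage, with an illustrative stand-in for the decoded event; pairing std::unordered_map with std::unique_ptr values is what makes the cleanup of replaced and discarded mappings automatic:

```cpp
#include <cstdint>
#include <memory>
#include <string>
#include <unordered_map>

// Simplified stand-in for a decoded TABLE_MAP event.
struct TableMapEvent
{
    uint64_t    table_id;
    std::string database;
    std::string table;
};

// Mapped tables keyed by the table ID of the TABLE_MAP event.
using TableMaps = std::unordered_map<uint64_t, std::unique_ptr<TableMapEvent>>;

void map_table(TableMaps& maps, std::unique_ptr<TableMapEvent> event)
{
    uint64_t id = event->table_id;

    // The container grows on demand and an old mapping for the same
    // table ID is destroyed automatically when it is replaced.
    maps[id] = std::move(event);
}
```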
Using std::string for names removes the need to handle memory
allocation. Moving the column attributes into a class of its own greatly
simplifies the creation of the TABLE_CREATE as well as the modifications
done to it.
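A sketch of the shape this could take; the class and member names are illustrative:

```cpp
#include <string>
#include <utility>
#include <vector>

// One column and its attributes. std::string members mean no manual
// memory management for the names or types.
class Column
{
public:
    Column(std::string name, std::string type, int length = -1)
        : name(std::move(name))
        , type(std::move(type))
        , length(length)
    {
    }

    std::string name;    // Column name
    std::string type;    // Type as written in the CREATE TABLE
    int         length;  // Optional field length, -1 when not set
};

// With columns as self-contained values, building the table definition
// and altering it later are both simple vector operations.
class TableCreate
{
public:
    std::string         database;
    std::string         table;
    std::vector<Column> columns;  // ADD COLUMN is a push_back or insert
};
```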
The HASHTABLE can be replaced with std::unordered_map. This simplifies the
management by making the deletion of old objects automatic.
More cleanup and refactoring is needed to make the contained classes
cleaner.
The avrorouter no longer uses the housekeeper for the conversion
task. This prevents the deadlock which could occur when clients were
notified at the same time that the binlogrouter was adding a master
reconnection task.
The instance and session objects are now C++ structs. The next pointers
for the sessions were removed, as a session is not the appropriate place
to store this information. This means that the client notification
functionality is broken in this commit.
When large binary protocol packets were handled, a part of the data was
replaced with a non-existing PS ID.
The replacement of the client PS ID with the internal ID and of the
internal ID with the server-specific ID must only be done when a large
packet is not being processed. This can be done on the router level
without adding knowledge of large packets to the RWBackend class.
A specific function, RWBackend::continue_write, was added to make it clear
that the buffer being written is a part of a larger query. The base class
Backend::write could be used but its usage is not self-explanatory.
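The division of labor might look roughly like this; the bodies are illustrative, only the class and function names come from the actual change:

```cpp
struct GWBUF;  // MaxScale's buffer type, opaque in this sketch

class Backend
{
public:
    virtual ~Backend() = default;

    virtual bool write(GWBUF* buffer)
    {
        return send_to_server(buffer);
    }

protected:
    bool send_to_server(GWBUF* /*buffer*/)
    {
        return true;  // The actual network write is omitted
    }
};

class RWBackend : public Backend
{
public:
    bool write(GWBUF* buffer) override
    {
        // Only the first packet of a statement carries the PS ID, so
        // the ID replacement is done here and nowhere else.
        replace_ps_id(buffer);
        return Backend::write(buffer);
    }

    // The buffer continues a large, multi-packet query: it is written
    // as-is, without any ID replacement.
    bool continue_write(GWBUF* buffer)
    {
        return Backend::write(buffer);
    }

private:
    void replace_ps_id(GWBUF* /*buffer*/)
    {
        // Mapping of the client PS ID to the internal ID omitted
    }
};
```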
The MariaDB implementation allows the last GTID to be tracked with the
`last_gtid` variable. To do this, the configuration option
`session_track_system_variables=last_gtid` must be used or it must be
enabled at runtime.
To work around a limitation in how the session command handling deals
with multi-part results, all session commands are now treated as gathered
results. This allows session commands which return result sets to be used
with MaxScale.
This change should not cause problems with practical workloads as they
usually do not return massive resultsets for session commands.
The optimal way to handle the multi-part responses would be to integrate
their handling into the result completion tracking process. This would
allow the prepared statement IDs to be extracted while the command is
being processed.
By relying on the server to tell us that it is requesting the loading of a
local infile, we can remove one state from the state machine that governs
the loading of local files. It also removes the need to handle error and
success cases separately.
A side-effect of this change is that execution of multi-statement LOAD
DATA LOCAL INFILE no longer hangs. This is done by checking whether the
completion of one command initiates a new load.
The current code recursively checks the reply state and clones the
buffers. Neither of these is required or should be done, but refactoring
the code is left for a separate commit.
Added two helper functions that are used to detect requests for local
infiles and to extract the total packet length from a non-contiguous
GWBUF.
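A simplified sketch of the two helpers. The Buffer chain below is a stand-in for GWBUF and the signatures are illustrative; the protocol facts are real: a LOCAL INFILE request packet starts its payload with 0xFB, and the packet length is a 3-byte little-endian integer in the header:

```cpp
#include <cstddef>
#include <cstdint>

// Simplified stand-in for MaxScale's GWBUF chain: each link holds a
// contiguous slice of the network data.
struct Buffer
{
    const uint8_t* data;
    size_t         len;
    Buffer*        next;
};

// Read one byte at an offset that may fall in any link of the chain.
static bool byte_at(const Buffer* buf, size_t offset, uint8_t* out)
{
    while (buf && offset >= buf->len)
    {
        offset -= buf->len;
        buf = buf->next;
    }
    if (!buf)
    {
        return false;
    }
    *out = buf->data[offset];
    return true;
}

// A LOCAL INFILE request is a packet whose first payload byte is 0xFB.
static bool is_local_infile(const Buffer* buf)
{
    uint8_t cmd;
    return byte_at(buf, 4, &cmd) && cmd == 0xFB;
}

// Total packet length: the 3-byte little-endian payload length from the
// header plus the 4 header bytes, read without assuming the header is
// contiguous.
static bool packet_length(const Buffer* buf, size_t* out)
{
    uint8_t b0, b1, b2;
    if (!byte_at(buf, 0, &b0) || !byte_at(buf, 1, &b1) || !byte_at(buf, 2, &b2))
    {
        return false;
    }
    *out = (size_t(b0) | size_t(b1) << 8 | size_t(b2) << 16) + 4;
    return true;
}
```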
The individual servers were missing a statistic that would give an
estimated query count. As there is no simple way to count queries for all
modules, counting the number of routed protocol packets is a suitable
substitute.
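In essence, a per-server atomic counter bumped once per routed packet; the names here are illustrative:

```cpp
#include <atomic>
#include <cstdint>

struct ServerStats
{
    // Estimated query count: the number of routed protocol packets.
    // Atomic, as multiple worker threads route to the same server.
    std::atomic<uint64_t> packets{0};
};

// Called once for every protocol packet routed to the server.
inline void count_routed_packet(ServerStats& stats)
{
    stats.packets.fetch_add(1, std::memory_order_relaxed);
}
```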
Session commands that span multiple packets are now allowed and will
work. However, if one is executed, the session command history is
disabled, as no interface for appending to session commands exists.
The backend protocol modules now also correctly track the current
command. This was a prerequisite for large session commands: they need to
be gathered into a single buffer, and doing so requires that the current
command is accurate.
Updated tests to expect success instead of failure for large prepared
statements.
Readwritesplit had redundant parameter values in the
`router_diagnostics`. All module parameters with their current values are
already displayed in the `parameters` member of the resource.
The router did not take large packets into account when determining
whether the server will respond. This caused the response counts to be off
by one for all large packets.
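The rule the fix relies on, sketched as a standalone check; only the short trailing packet of a large payload completes the command, so only it should count towards an expected response:

```cpp
#include <cstdint>

// The protocol splits payloads of 16 MiB or more into multiple packets:
// every full 0xFFFFFF-byte packet is continued by the next one and only
// the short trailing packet completes the command. Counting an expected
// response for each routed packet therefore over-counts by one for every
// large packet.
inline bool completes_command(const uint8_t* header)
{
    uint32_t payload = uint32_t(header[0])
                     | uint32_t(header[1]) << 8
                     | uint32_t(header[2]) << 16;

    return payload != 0xFFFFFF;  // Only the trailing packet gets a response
}
```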
The creation of the EOF packet is not needed as the last packet of a
result set is always guaranteed to be of the correct type. This also
allows non-resultsets to be correctly processed as the internal packet
number will be at 0 when the last result arrives.
Cleaned up some of the function names and changed the signatures to be
better suited for their use-cases.
Used angle bracket includes, combined some of the more unwieldy
conditionals into functions, and added more comments.
The resultset processing for MySQL requires some extra work as it lacks
the proper SERVER_MORE_RESULTS_EXIST flag in the last EOF packet. Instead,
the first EOF packet has the SERVER_PS_OUT_PARAMS flag which needs to be
interpreted as a SERVER_MORE_RESULTS_EXIST flag for the second EOF packet.
Also corrected the EOF packet handling to do the flag checks in the code
that deals with the EOF packets.
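A sketch of the flag check using the real protocol constants; the helper names and the exact integration point are illustrative:

```cpp
#include <cstdint>

// Real status flags from the protocol.
constexpr uint16_t SERVER_MORE_RESULTS_EXIST = 0x0008;
constexpr uint16_t SERVER_PS_OUT_PARAMS      = 0x1000;

// An EOF packet is [0xFE][warnings:2][status:2] after the 4-byte
// packet header, so the status flags sit at bytes 7 and 8.
inline uint16_t eof_status(const uint8_t* packet)
{
    return uint16_t(packet[7]) | uint16_t(packet[8]) << 8;
}

// The workaround: if the first EOF carries SERVER_PS_OUT_PARAMS, treat
// it as if SERVER_MORE_RESULTS_EXIST were set so that the second EOF
// is still expected.
inline bool more_results_expected(const uint8_t* packet, bool first_eof)
{
    uint16_t status = eof_status(packet);

    return (status & SERVER_MORE_RESULTS_EXIST)
        || (first_eof && (status & SERVER_PS_OUT_PARAMS));
}
```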
As the modutil_state parameter is now used for more than large packet
tracking, the correct solution is to store this state object in the
readwritesplit session instead of interpreting it as a boolean value.
Added the `transaction_replay_max_size` parameter that controls the
maximum size of a transaction that can be replayed. If the limit is
exceeded, the stored statements are released, thus preventing the
transaction from being replayed.
This limitation prevents accidental misuse of the transaction replaying
system when autocommit is disabled. It also allows the user to control the
amount of memory that MaxScale will use.
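The bookkeeping amounts to something like the following sketch, which stores statements as plain byte strings where the router holds GWBUF pointers:

```cpp
#include <cstddef>
#include <deque>
#include <string>

// Statements are kept for replay only while the running total stays
// under transaction_replay_max_size. Once the limit is exceeded, the
// log is released, making the transaction impossible to replay.
class TrxLog
{
public:
    explicit TrxLog(size_t max_size)
        : m_max_size(max_size)
    {
    }

    void add_stmt(std::string stmt)
    {
        if (m_capped)
        {
            return;  // Already over the limit, nothing is stored
        }

        m_size += stmt.size();

        if (m_size > m_max_size)
        {
            m_log.clear();  // Too large to replay: drop the stored log
            m_capped = true;
        }
        else
        {
            m_log.push_back(std::move(stmt));
        }
    }

    bool can_replay() const
    {
        return !m_capped;
    }

private:
    std::deque<std::string> m_log;
    size_t m_size = 0;
    size_t m_max_size;
    bool   m_capped = false;
};
```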
The transaction retrying behavior is now configurable and documented. The
`transaction_replay` parameter implicitly enables the router functionality
that it requires.
As the current query was added to the transaction log before it finished,
m_current_query contained a duplicate of the latest transaction log
entry. To correctly log only successful transactions, the statement should
be added only after it has successfully completed. This change also
removed the unnecessary cloning that took place when the statement was
added to the log before it finished.
With the fixed transaction logging, the value of m_current_query can be
stashed for later retrying while the replay process is happening. If the
replay completes successfully and the checksums match, the interrupted
query is retried.
Also added a clarifying comment to can_retry_query to explain why a query
inside a transaction cannot be retried.
Added the initial implementation of transaction replay. Transactions are
only replayed if the master fails when no statement is being executed.
The validity of the replayed transaction is checked by verifying that the
checksums of the returned results are equal.
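A sketch of the checksum comparison; zlib's crc32 stands in here for whatever checksum the router actually computes over the result packets:

```cpp
#include <cstddef>
#include <cstdint>
#include <zlib.h>

// A running checksum over result packets. One instance is filled from
// the original results, another from the replayed ones, and the two
// final values are compared.
class ResultChecksum
{
public:
    void update(const uint8_t* data, size_t len)
    {
        m_crc = crc32(m_crc, data, static_cast<uInt>(len));
    }

    bool operator==(const ResultChecksum& rhs) const
    {
        return m_crc == rhs.m_crc;
    }

private:
    uLong m_crc = crc32(0L, Z_NULL, 0);  // zlib's initial CRC value
};
```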
Added a close function into the Trx class to make resetting its state
easier. Also changed the return type of the pop_stmt to GWBUF* as the
places where it is used expect a raw GWBUF pointer.
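An illustrative sketch of the resulting Trx interface, with minimal stand-ins for the GWBUF API:

```cpp
#include <deque>

// Minimal stand-ins; the real GWBUF is a reference-counted buffer chain
// freed with gwbuf_free.
struct GWBUF { /* contents omitted */ };
inline void gwbuf_free(GWBUF* buf) { delete buf; }

class Trx
{
public:
    void add_stmt(GWBUF* stmt)
    {
        m_log.push_back(stmt);
    }

    // Returns a raw GWBUF*, as the callers expect one. Ownership of
    // the buffer moves to the caller.
    GWBUF* pop_stmt()
    {
        if (m_log.empty())
        {
            return nullptr;
        }
        GWBUF* stmt = m_log.front();
        m_log.pop_front();
        return stmt;
    }

    // Frees any remaining statements and resets the state so that the
    // object is ready for the next transaction.
    void close()
    {
        for (GWBUF* stmt : m_log)
        {
            gwbuf_free(stmt);
        }
        m_log.clear();
    }

private:
    std::deque<GWBUF*> m_log;
};
```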