The code relied on last_read for the idle time calculation, so writing a
ping did not reset the idle time. This increased the chance of multiple
COM_PING packets being sent to a backend before a reply was received.
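
A minimal sketch of an idle time calculation that counts writes as well as
reads. The ConnTimes struct and both function names are hypothetical and
are not the actual DCB fields or helpers.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical stand-in for the per-connection timestamps.
struct ConnTimes
{
    int64_t last_read;   // tick of the last read from the backend
    int64_t last_write;  // tick of the last write, including written pings
};

// Idle time is measured from the most recent activity in either direction.
int64_t idle_ticks(const ConnTimes& c, int64_t now)
{
    return now - std::max(c.last_read, c.last_write);
}

// A new ping is due only if nothing was read or written within the interval.
bool ping_due(const ConnTimes& c, int64_t now, int64_t interval)
{
    return idle_ticks(c, now) >= interval;
}
```

With this shape, a ping written at tick 400 resets the idle time even
though last_read is still older, so a second ping is not queued right away.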
The use of the server state is not transactional across multiple uses of
the function. This means that any assertions on the target state can fail
if the monitor updates the state between target selection and the
assertion.
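
One way to make such checks safe, sketched under assumptions: read the
status once and carry the snapshot with the selected target, so later
checks look at the value the selection was based on. The status bits and
the Server and Target types here are hypothetical.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical status bits; the real flags differ.
constexpr uint64_t STATUS_RUNNING = 1u << 0;
constexpr uint64_t STATUS_MASTER = 1u << 1;

struct Server
{
    std::atomic<uint64_t> status{0};  // updated concurrently by the monitor
};

struct Target
{
    const Server* server = nullptr;
    uint64_t status_at_selection = 0;  // snapshot used for later checks
};

// Selects the server as a master target using a single read of its status.
bool select_master(const Server& s, Target* out)
{
    const uint64_t snapshot = s.status.load();
    const uint64_t wanted = STATUS_RUNNING | STATUS_MASTER;

    if ((snapshot & wanted) != wanted)
    {
        return false;
    }

    out->server = &s;
    out->status_at_selection = snapshot;
    return true;
}
```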
The backend didn't expect AuthSwitchRequest packets in response to the
handshake response packets. This is allowed by the protocol and appears to
happen with at least MySQL 8.0.
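
A rough sketch of how a reply to the handshake response could be
classified. In the MySQL protocol an AuthSwitchRequest payload starts with
0xfe followed by a NUL-terminated plugin name and the plugin data. The enum
and function names are hypothetical, and the payload is assumed to already
have the 4-byte packet header stripped.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

enum class AuthReply { OK, ERR, AUTH_SWITCH, OTHER };

// Classifies the payload by its first byte.
AuthReply classify_auth_reply(const std::vector<uint8_t>& payload)
{
    if (payload.empty())
    {
        return AuthReply::OTHER;
    }

    switch (payload[0])
    {
    case 0x00: return AuthReply::OK;
    case 0xff: return AuthReply::ERR;
    case 0xfe: return AuthReply::AUTH_SWITCH;
    default:   return AuthReply::OTHER;
    }
}

// Returns the plugin name of an AuthSwitchRequest, or "" if there is none.
std::string auth_switch_plugin(const std::vector<uint8_t>& payload)
{
    if (payload.size() < 2 || payload[0] != 0xfe)
    {
        return "";
    }

    auto begin = payload.begin() + 1;
    auto end = std::find(begin, payload.end(), uint8_t(0));
    return std::string(begin, end);
}
```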
Connector-C was changed to always return only the client's charset, not
the actual charset that the connection ends up using. To cope with this,
the code has to use SQL to join the default character set name to its
default collation, from which the numeric ID of the charset can be
extracted.
The slave backend was closed twice if it both responded with a different
result and was closed due to a hangup before the master responded.
Added a test case that reproduced the problem.
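
A generic guard against this kind of double close is to make the close
operation idempotent. The sketch below is illustrative only; the
BackendHandle class is hypothetical and is not the actual fix.

```cpp
// Hypothetical backend wrapper whose close() is safe to call twice.
class BackendHandle
{
public:
    // Returns true only for the call that actually closed the backend.
    bool close()
    {
        if (m_closed)
        {
            return false;  // already closed by another code path
        }

        m_closed = true;
        ++m_close_count;   // stands in for the real teardown work
        return true;
    }

    bool is_closed() const { return m_closed; }
    int close_count() const { return m_close_count; }

private:
    bool m_closed = false;
    int m_close_count = 0;
};
```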
If the client DCB of the session was passed into the function, it was
possible that the session pointer for it was already set to null. The
session pointer of an open DCB is never null but a client DCB's session
pointer can be null if accessed via the MXS_SESSION object.
By incrementing the counters when the session is created, we know that the
counter will always be decremented correctly. This does cause the listener
session to be counted as an actual session but this is already present in
the statistics calculations and is something we have to live with in 2.3.
This change also makes it possible to overshoot the connection count
limitation as the session creation is delayed until authentication
fails. Both of these problems are fixed in 2.4.
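
The idea of tying the counter to the session lifetime can be sketched with
a constructor and destructor pair. The ServiceStats and Session types below
are hypothetical simplifications of the real ones.

```cpp
#include <atomic>

// Hypothetical per-service statistics.
struct ServiceStats
{
    std::atomic<int> n_current{0};  // sessions currently alive
};

// The counter goes up when the session is created and down when it is
// destroyed, so every increment has exactly one matching decrement.
class Session
{
public:
    explicit Session(ServiceStats& stats)
        : m_stats(stats)
    {
        ++m_stats.n_current;
    }

    ~Session()
    {
        --m_stats.n_current;
    }

    Session(const Session&) = delete;
    Session& operator=(const Session&) = delete;

private:
    ServiceStats& m_stats;
};
```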
The default database was not exposed in the warning that was logged when
authentication failed. The authentication uses the username, host and the
default database to find the user entry and the lack of the default
database made it hard to know for sure which user entry a client should've
matched against.
This causes the connection failure to be counted as an authentication
failure instead of a connection error. The former never causes the host to
be blocked which effectively solves the problem for most cases. The only
case where this would not work is where the network buffer for a backend
DCB is full right after the connection is created.
The password values are now masked with asterisks. This shows whether a
password is set without exposing any information about the password
itself.
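
A minimal sketch of such masking. The function name and the fixed mask
width are assumptions; a fixed width is used here so that the output does
not reveal the password length.

```cpp
#include <string>

// Returns a fixed-width mask for any non-empty value and an empty string
// when no password is set.
std::string mask_password(const std::string& value)
{
    return value.empty() ? "" : "*****";
}
```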
The events are similar to normal query events except that they have an
extra 13 bytes of static data. This data is of no relevance to MaxScale
and thus can be ignored. This also allows the reuse of the same query
event code for execute load query events.
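
As a sketch of the reuse, the only difference when parsing is how many
fixed bytes precede the variable part of the event. The event type codes
and the 13-byte query post-header come from the binlog format; the function
name is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>

constexpr uint8_t QUERY_EVENT = 0x02;
constexpr uint8_t EXECUTE_LOAD_QUERY_EVENT = 0x12;

// Fixed part of a query event: thread ID (4), execution time (4),
// database name length (1), error code (2), status variable length (2).
constexpr size_t QUERY_POST_HEADER_LEN = 13;

// Extra static data carried by execute load query events.
constexpr size_t EXECUTE_LOAD_QUERY_EXTRA_LEN = 13;

// Offset from the end of the common event header to the variable part.
size_t query_body_offset(uint8_t event_type)
{
    size_t offset = QUERY_POST_HEADER_LEN;

    if (event_type == EXECUTE_LOAD_QUERY_EVENT)
    {
        offset += EXECUTE_LOAD_QUERY_EXTRA_LEN;  // skipped, not interpreted
    }

    return offset;
}
```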
It appears that rollback errors are possible outside of
transactions. Since this was not something we expected to see, logging it
as an error allows us to see why this happens in production deployments.
The errors that are ignored by readwritesplit are now stored as the
current close reason in the Backend. This allows the information about the
error to be retained and it can be used later in the error handler to
display the true reason why the connection was closed.
The hangup and error handlers now have unique messages. Although the
behavior in the handlers is practically the same in both cases, the cause
of the error is not the same.
If a socket error is present, it is added to the error message. If an
error is present, it should clearly show the reason why the TCP socket was
closed.
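
A small sketch of appending the socket error to a message. The function
name and message layout are hypothetical; the error code is assumed to have
been read from the socket beforehand, for example with getsockopt() and
SO_ERROR.

```cpp
#include <cstring>
#include <string>

// Appends the error number and its description when an error is present.
std::string with_socket_error(const std::string& base, int socket_error)
{
    std::string msg = base;

    if (socket_error != 0)
    {
        msg += ": ";
        msg += std::to_string(socket_error);
        msg += ", ";
        msg += std::strerror(socket_error);
    }

    return msg;
}
```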
The is_fake_event boolean helps distinguish fake events from real
ones. This makes figuring out the real source of hangup events easier.
By checking whether the users have changed whenever they are reloaded, we
improve the visibility of the user reloading process. Using a checksum
allows us to easily compress the information with acceptable loss of
accuracy. Using a CAS loop prevents duplicate messages without losing any
updates even if multiple user reloads result in different outcomes.
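
The combination can be sketched as follows. The FNV-1a hash is only a
stand-in for whatever checksum is actually used, and both function names
are hypothetical.

```cpp
#include <atomic>
#include <cstdint>
#include <string>
#include <vector>

// Stand-in checksum: 64-bit FNV-1a over all user entries.
uint64_t users_checksum(const std::vector<std::string>& entries)
{
    const uint64_t prime = 1099511628211ULL;
    uint64_t h = 14695981039346656037ULL;

    for (const auto& e : entries)
    {
        for (unsigned char c : e)
        {
            h ^= c;
            h *= prime;
        }

        h ^= 0xff;  // separator so {"ab"} and {"a", "b"} hash differently
        h *= prime;
    }

    return h;
}

// Returns true for the one caller that installs a changed checksum; that
// caller is the one that should log the change.
bool publish_if_changed(std::atomic<uint64_t>& current, uint64_t fresh)
{
    uint64_t seen = current.load();

    while (seen != fresh)
    {
        if (current.compare_exchange_weak(seen, fresh))
        {
            return true;
        }
        // On failure 'seen' now holds the latest value and is re-checked.
    }

    return false;  // someone already published this exact checksum
}
```

If two reloads race with the same result, only one of them sees true. If
they race with different results, each differing value is still installed
by some caller, so no change goes unreported.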