The backend didn't expect AuthSwitchRequest packets in response to the
handshake response packets. This is allowed by the protocol and appears to
happen with at least MySQL 8.0.
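As a rough illustration (not MaxScale's actual code), the backend's reply to the handshake response can be classified by its first payload byte, where 0xFE marks an AuthSwitchRequest that must now be handled:

    #include <cstddef>
    #include <cstdint>

    enum class HandshakeReply
    {
        OK,
        ERR,
        AUTH_SWITCH,
        UNKNOWN
    };

    // 'data' points to a complete MySQL packet: 3-byte length, 1-byte sequence
    // number and then the payload. The first payload byte identifies the type.
    HandshakeReply classify_handshake_reply(const uint8_t* data, size_t len)
    {
        if (len < 5)
        {
            return HandshakeReply::UNKNOWN;
        }

        switch (data[4])
        {
        case 0x00:
            return HandshakeReply::OK;
        case 0xFF:
            return HandshakeReply::ERR;
        case 0xFE:
            // AuthSwitchRequest: the server asks for authentication to be
            // restarted with the plugin named in the payload.
            return HandshakeReply::AUTH_SWITCH;
        default:
            return HandshakeReply::UNKNOWN;
        }
    }
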
If the client DCB of the session was passed into the function, it was
possible that its session pointer had already been set to null. The
session pointer of an open DCB is never null, but the session pointer of a
client DCB can be null when the DCB is accessed via the MXS_SESSION object.
By incrementing the counters when the session is created, we know that the
counters will always be decremented correctly. This does cause the listener
session to be counted as an actual session, but this discrepancy is already
present in the statistics calculations and is something we have to live
with in 2.3.
This change also makes it possible to overshoot the connection count
limitation, as session creation is delayed until authentication
fails. Both of these problems are fixed in 2.4.
This causes the connection failure to be counted as an authentication
failure instead of a connection error. The former never causes the host to
be blocked, which effectively solves the problem in most cases. The only
case where this would not work is when the network buffer for a backend
DCB is full right after the connection is created.
The hangup and error handlers now have unique messages. Although the
behavior in the handlers is practically the same in both cases, the cause
of the error is not the same.
If a socket error is present, it is added to the error message. When an
error is present, the message should clearly show the reason why the TCP
socket was closed.
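A minimal sketch of how the pending socket error could be fetched for the message, assuming a plain POSIX socket; the helper name is illustrative:

    #include <cstring>
    #include <string>
    #include <sys/socket.h>

    // Returns a human-readable description of the pending error on 'fd', if any.
    std::string describe_socket_error(int fd)
    {
        int error = 0;
        socklen_t len = sizeof(error);

        if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &error, &len) == 0 && error != 0)
        {
            return std::string("Socket error ") + std::to_string(error)
                   + ": " + strerror(error);
        }

        // No pending error: the close was not caused by a TCP-level error.
        return "No socket error";
    }
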
The is_fake_event boolean helps distinguish fake events from real
ones. This makes figuring out the real source of hangup events easier.
The client count was incremented before authentication was complete, so it
must be decremented if authentication fails. Otherwise the service
connection limit can easily be reached.
If an error is generated while a COM_CHANGE_USER is in progress, it would
always use sequence number 1. To handle this case properly and send the
correct sequence number, the progress of the COM_CHANGE_USER needs to be
tracked at the session level.
The information needs to be shared between the backend and client
protocols as the final OK to the COM_CHANGE_USER, with the sequence number
3, is the one that the backend server returns. Only after this response
has been received and routed to the client can the COM_CHANGE_USER
processing stop.
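A rough sketch of the idea, using hypothetical names rather than MaxScale's actual types: the session records that a COM_CHANGE_USER is in flight so that a proxy-generated error can pick a sequence number that matches the exchange:

    #include <cstdint>

    struct ChangeUserTracker
    {
        bool in_progress = false;

        // Called when the client's COM_CHANGE_USER is forwarded to the backend.
        void start() { in_progress = true; }

        // Called once the backend's final OK (sequence number 3) has been
        // routed back to the client.
        void finish() { in_progress = false; }

        // Sequence number for an error generated by the proxy itself.
        uint8_t error_seqno() const { return in_progress ? 3 : 1; }
    };
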
As is explained in MDEV-19893:
Client reads from socket, gets the packet from 1. with seqno=0, which
it does not expect, since seqno is supposed to be incremented. Client
complains, throws tantrums and exceptions.
To cater for clients that do not expect out-of-band messages
(i.e. server-initiated packets with seqno 0), all messages generated by
MaxScale should use at least sequence number 1.
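For illustration, a proxy-generated ERR packet with sequence number 1 could be built as follows; this is a hedged sketch, not the function MaxScale uses:

    #include <cstdint>
    #include <cstring>
    #include <string>
    #include <vector>

    // 'sqlstate' is assumed to be exactly five characters, e.g. "HY000".
    std::vector<uint8_t> create_err_packet(uint16_t code, const std::string& sqlstate,
                                           const std::string& msg)
    {
        // Payload: 0xFF, 2-byte error code, '#', 5-byte SQLSTATE, message text.
        size_t payload_len = 1 + 2 + 1 + 5 + msg.size();
        std::vector<uint8_t> packet(4 + payload_len);

        packet[0] = payload_len & 0xFF;
        packet[1] = (payload_len >> 8) & 0xFF;
        packet[2] = (payload_len >> 16) & 0xFF;
        packet[3] = 1;      // sequence number 1: clients do not accept seqno 0 here
        packet[4] = 0xFF;   // ERR packet header
        packet[5] = code & 0xFF;
        packet[6] = (code >> 8) & 0xFF;
        packet[7] = '#';
        memcpy(packet.data() + 8, sqlstate.data(), 5);
        memcpy(packet.data() + 13, msg.data(), msg.size());

        return packet;
    }
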
Deep-copying prevents subsequent modifications done by the caller from
affecting the data that may end up stored in the write queue of the
backend's DCB.
mxs::Buffer::iterator is not a random-access iterator, so advancing it has
linear cost. Doing this for every packet is too costly with even moderately
sized resultsets.
RWBackend did not expect that a resultset and an unexpected ERR packet
could be stored in the same buffer. This can happen for example if a
server shuts down immediately after the resultset is sent.
If a packet with a KILL query was followed by another packet in the same
network buffer, the code wouldn't work, as it expected to receive only one
packet at a time.
By iterating over the servers and sending the master's charset, we are
guaranteed a "known good" charset. This also solves the problem where a
deactivated server reference would be used as the source of the charset and
server version.
This makes iterating over packets in buffers faster while still
maintaining the requirements for forward iterators. Not using operator+=
makes it clear that this is not a random access iterator.
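An illustrative use of such a forward iterator, assuming only that it yields raw bytes: callers advance it with std::advance, which is linear, instead of a nonexistent operator+=:

    #include <cstddef>
    #include <cstdint>
    #include <iterator>

    // Counts the complete packets in [it, end), assuming the range holds only
    // whole MySQL packets.
    template <class Iter>
    size_t count_packets(Iter it, Iter end)
    {
        size_t packets = 0;

        while (it != end)
        {
            // The payload length is in the first three header bytes.
            uint32_t len = *it;
            len |= static_cast<uint32_t>(*std::next(it, 1)) << 8;
            len |= static_cast<uint32_t>(*std::next(it, 2)) << 16;

            // std::advance is linear for forward iterators, which is exactly
            // what the missing operator+= signals to the caller.
            std::advance(it, 4 + len);
            ++packets;
        }

        return packets;
    }
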
If the DCB was closed before the handshake for the LocalClient connection
was received, gw_decode_mysql_server_handshake would use the closed DCB to
log the connection ID. Clearing out the pointer prevents this.
The largest part of the code deals with the start of a response. Moving
this into a subfunction makes the function clearer, as the nested switch
statement is removed.
By processing the packets one at a time, the reply state is updated
correctly regardless of how many packets are received. This removes the
need for the clunky code that used modutil_count_signal_packets to detect
the end of the result set.
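A heavily simplified per-packet state machine along these lines is sketched below; it ignores column definitions and the CLIENT_DEPRECATE_EOF capability and uses invented names:

    #include <cstddef>
    #include <cstdint>

    enum class ReplyState
    {
        START,      // waiting for the first packet of the response
        RSET_ROWS,  // inside a resultset
        DONE        // the response is complete
    };

    // 'packet' points to one complete packet, header included.
    void process_one_packet(ReplyState& state, const uint8_t* packet, size_t payload_len)
    {
        const uint8_t first = packet[4];   // first payload byte

        switch (state)
        {
        case ReplyState::START:
            if (first == 0x00 || first == 0xFF)
            {
                state = ReplyState::DONE;       // OK or ERR: no resultset follows
            }
            else
            {
                state = ReplyState::RSET_ROWS;  // column count: a resultset starts
            }
            break;

        case ReplyState::RSET_ROWS:
            // An EOF packet (0xFE with a short payload) or an ERR packet ends the rows.
            if ((first == 0xFE && payload_len < 9) || first == 0xFF)
            {
                state = ReplyState::DONE;
            }
            break;

        case ReplyState::DONE:
            break;
        }
    }
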
Given the assumption that queries are rarely 16MB long and that
realistically the only time that happens is during a large dump of data,
we can limit the size of a single read to at most one MariaDB/MySQL packet
at a time. This change allows the network throttling to engage a lot
sooner and reduces the maximum overshoot of throttling to 16MB.
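As a sketch of the idea (ignoring partial reads, EINTR and MaxScale's actual I/O layer), the read size can be capped by peeking at the 4-byte packet header first:

    #include <cstdint>
    #include <sys/socket.h>
    #include <vector>

    // Reads at most one MySQL packet (header plus payload) from 'fd' into 'out'.
    ssize_t read_one_packet(int fd, std::vector<uint8_t>& out)
    {
        uint8_t header[4];

        // Peek at the header so the payload length is known before reading.
        ssize_t rc = recv(fd, header, sizeof(header), MSG_PEEK);
        if (rc < static_cast<ssize_t>(sizeof(header)))
        {
            return rc;   // error, EOF or not enough data buffered yet
        }

        uint32_t payload = header[0] | (header[1] << 8) | (header[2] << 16);
        size_t total = 4 + payload;   // at most 4 + 0xFFFFFF bytes, i.e. ~16MB

        out.resize(total);
        return recv(fd, out.data(), total, 0);
    }
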
By logging the connection ID for each created connection, failures can be
traced back from the backend server all the way up to the client
application.
Some SQL clients may default to a different authentication plugin than
"mysql_native_password". Since this is the only plugin supported by the
MySQL authenticator, the client is instructed to swap its plugin.
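For illustration, an AuthSwitchRequest instructing the client to use mysql_native_password could be built like this; the helper is hypothetical and the scramble handling is simplified:

    #include <cstdint>
    #include <cstring>
    #include <string>
    #include <vector>

    std::vector<uint8_t> create_auth_switch_request(uint8_t seqno, const uint8_t* scramble,
                                                    size_t scramble_len)
    {
        const std::string plugin = "mysql_native_password";

        // Payload: 0xFE, plugin name + NUL, plugin data (the scramble) + NUL.
        size_t payload = 1 + plugin.size() + 1 + scramble_len + 1;
        std::vector<uint8_t> packet(4 + payload);

        packet[0] = payload & 0xFF;
        packet[1] = (payload >> 8) & 0xFF;
        packet[2] = (payload >> 16) & 0xFF;
        packet[3] = seqno;
        packet[4] = 0xFE;   // AuthSwitchRequest header
        memcpy(packet.data() + 5, plugin.c_str(), plugin.size() + 1);
        memcpy(packet.data() + 5 + plugin.size() + 1, scramble, scramble_len);
        packet.back() = 0;

        return packet;
    }
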
If a result consists of only OK packets, they would be processed
recursively, which most of the time leads to a stack overflow. This can be
prevented by consuming all OK packets in the result in one go.
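A minimal sketch of the iterative approach, assuming a flat byte buffer of complete packets; the helper name is invented:

    #include <cstddef>
    #include <cstdint>

    // Returns the number of bytes consumed from a buffer that starts with a run
    // of complete OK packets (first payload byte 0x00).
    size_t consume_ok_packets(const uint8_t* data, size_t len)
    {
        size_t offset = 0;

        while (offset + 4 <= len)
        {
            uint32_t payload = data[offset] | (data[offset + 1] << 8) | (data[offset + 2] << 16);

            if (payload == 0 || offset + 4 + payload > len || data[offset + 4] != 0x00)
            {
                break;   // incomplete packet or not an OK packet
            }

            // Each OK packet is consumed inside the loop: no recursion, so a
            // long run of OK packets cannot exhaust the stack.
            offset += 4 + payload;
        }

        return offset;
    }
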
If a hangup event was sent to a DCB via dcb_hangup_foreach shortly after
the DCB was closed, the DCB would still receive it even though it had been
closed. To prevent this, events must only be delivered to DCBs that haven't
been closed.
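An illustrative guard, with a stand-in state enum rather than the real DCB states:

    #include <vector>

    enum class DCBState { POLLING, CLOSED };

    struct DCB
    {
        DCBState state = DCBState::POLLING;
    };

    void hangup_foreach(std::vector<DCB*>& dcbs, void (*handler)(DCB*))
    {
        for (DCB* dcb : dcbs)
        {
            // Skip DCBs that were closed between the broadcast and the delivery.
            if (dcb->state != DCBState::CLOSED)
            {
                handler(dcb);
            }
        }
    }
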
The protocol should not track the session state, as the parsing is quite
expensive with the current code. This change is a workaround that enables
the parsing only when required. A proper way to handle this would be to do
all the response processing in one place, thus avoiding the duplication of
work.