The hangup and error handlers now have unique messages. Although the
behavior in the handlers is practically the same in both cases, the cause
of the error is not the same.
If a socket error is present, it is appended to the error message so that
the reason why the TCP socket was closed is clearly visible.
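
As an illustration of the check involved (a sketch, not MaxScale's actual
handler, and the function name is made up), the pending socket error can be
read with getsockopt(SO_ERROR) and folded into the message:

    #include <cstring>
    #include <string>
    #include <sys/socket.h>

    // Sketch: build a close reason that includes the pending socket error,
    // if any. Name and message format are illustrative only.
    std::string describe_close_reason(int fd)
    {
        int err = 0;
        socklen_t len = sizeof(err);
        std::string reason = "TCP socket was closed";

        if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) == 0 && err != 0)
        {
            reason += ": ";
            reason += std::strerror(err);   // e.g. "Connection reset by peer"
        }

        return reason;
    }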
The is_fake_event boolean helps distinguish fake events from real
ones. This makes figuring out the real source of hangup events easier.
The client count was incremented before authentication was complete and
should be decremented if authentication fails. Otherwise the service
connection limit can easily be reached.
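
A minimal sketch of the accounting, with hypothetical names; the point is
that the counter incremented when the client connects must be given back if
authentication fails:

    #include <atomic>

    // Hypothetical per-service connection counter.
    std::atomic<int> client_count{0};

    bool handle_new_client(bool authentication_succeeded)
    {
        ++client_count;             // counted as soon as the client connects

        if (!authentication_succeeded)
        {
            --client_count;         // give the slot back, otherwise the
                                    // service connection limit fills up
            return false;
        }

        return true;
    }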
If an error was generated while a COM_CHANGE_USER was in progress, it would
always use sequence number 1. To handle this case properly and send the
correct sequence number, the progress of the COM_CHANGE_USER needs to be
tracked at the session level.
The information needs to be shared between the backend and client
protocols as the final OK to the COM_CHANGE_USER, with the sequence number
3, is the one that the backend server returns. Only after this response
has been received and routed to the client can the COM_CHANGE_USER
processing stop.
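
A sketch of the idea with hypothetical names: the session records that a
COM_CHANGE_USER is in flight so that any packet MaxScale generates in the
meantime can use the sequence number the client expects, and the flag is
cleared only once the backend's final OK has been routed to the client:

    #include <cstdint>

    // Hypothetical session-level state shared by the client and backend
    // protocols.
    struct SessionState
    {
        bool    changing_user = false;  // COM_CHANGE_USER in progress
        uint8_t next_seq = 1;           // seqno a generated reply should use
    };

    void on_change_user_sent(SessionState& s)
    {
        s.changing_user = true;
        s.next_seq = 3;                 // the backend's final OK uses seqno 3
    }

    void on_backend_reply_routed(SessionState& s)
    {
        if (s.changing_user)
        {
            s.changing_user = false;    // processing stops only after the
            s.next_seq = 1;             // final OK is routed to the client
        }
    }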
As is explained in MDEV-19893:
  Client reads from socket, gets the packet from 1. with seqno=0, which
  it does not expect, since seqno is supposed to be incremented. Client
  complains, throws tantrums and exceptions.
To cater for clients that do not expect out-of-band messages
(i.e. server-initiated packets with seqno 0), all messages generated by
MaxScale should use at least sequence number 1.
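
For illustration, a sketch of an ERR packet built by the proxy itself. The
header layout (3-byte payload length, 1-byte sequence number) is standard
MySQL protocol; the helper name is made up and the point is that the
sequence byte is set to 1 rather than 0:

    #include <cstdint>
    #include <cstring>
    #include <string>
    #include <vector>

    // Sketch: build a server-initiated ERR packet with sequence number 1.
    std::vector<uint8_t> make_err_packet(uint16_t code,
                                         const std::string& sqlstate,
                                         const std::string& msg)
    {
        std::vector<uint8_t> payload;
        payload.push_back(0xff);                    // ERR packet header byte
        payload.push_back(code & 0xff);             // error code, little-endian
        payload.push_back(code >> 8);
        payload.push_back('#');                     // SQL state marker
        payload.insert(payload.end(), sqlstate.begin(), sqlstate.end());
        payload.insert(payload.end(), msg.begin(), msg.end());

        std::vector<uint8_t> packet(4 + payload.size());
        packet[0] = payload.size() & 0xff;          // 3-byte payload length
        packet[1] = (payload.size() >> 8) & 0xff;
        packet[2] = (payload.size() >> 16) & 0xff;
        packet[3] = 1;                              // seqno 1, never 0
        memcpy(packet.data() + 4, payload.data(), payload.size());
        return packet;
    }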
Deep-copying prevents subsequent modifications made by the caller from
affecting data that may end up stored in the write queue of the backend's
DCB.
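
A generic sketch of the same idea (not the actual buffer API): the payload
is copied when it is queued, so the caller's buffer can be reused or
modified without touching what the DCB will eventually write:

    #include <cstdint>
    #include <deque>
    #include <vector>

    // Hypothetical write queue holding independent copies of each payload.
    std::deque<std::vector<uint8_t>> write_queue;

    void queue_write(const std::vector<uint8_t>& data)
    {
        write_queue.push_back(data);    // deep copy: later changes to `data`
                                        // by the caller do not affect the queue
    }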
mxs::Buffer::iterator is not a random-access iterator, so advancing it has
linear cost. This is too expensive to do for every packet, even with
moderately sized resultsets.
RWBackend did not expect that a resultset and an unexpected ERR packet
could be stored in the same buffer. This can happen for example if a
server shuts down immediately after the resultset is sent.
If a packet with a KILL query was followed with another packet in the same
network buffer, the code wouldn't work as it expected to receive only one
packet at a time.
By iterating over the servers and sending the master's charset we are
guaranteed a "known good" charset. This also solves the problem where a
deactivated server reference would be used as the charset and server
version source.
This makes iterating over packets in buffers faster while still
maintaining the requirements for forward iterators. Not using operator+=
makes it clear that this is not a random access iterator.
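
A usage sketch, with std::list standing in for the chained buffer: callers
advance a forward iterator with std::advance, which makes the linear cost
explicit, while operator+= stays reserved for random-access iterators:

    #include <iterator>
    #include <list>

    int main()
    {
        std::list<char> buffer = {'a', 'b', 'c', 'd', 'e'};

        auto it = buffer.begin();
        std::advance(it, 3);    // legal for forward iterators; visibly O(n)
        // it += 3;             // would not compile: no operator+= here

        return *it == 'd' ? 0 : 1;
    }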
If the DCB was closed before the handshake for the LocalClient connection
was received, gw_decode_mysql_server_handshake would use the closed DCB to
log the connection ID. Clearing out the pointer prevents this.
The largest part of the code deals with the start of a response. Moving
this into a subfunction makes the function clearer, as it removes a switch
statement nested inside another switch statement.
By processing the packets one at a time, the reply state is updated
correctly regardless of how many packets are received. This removes the
need for the clunky code that used modutil_count_signal_packets to detect
the end of the result set.
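
A sketch of the splitting step, assuming only that each MySQL packet starts
with a 3-byte little-endian payload length and a 1-byte sequence number; the
per-packet callback stands in for the reply-state update:

    #include <cstddef>
    #include <cstdint>
    #include <vector>

    // Sketch: walk a network buffer one MySQL packet at a time.
    void for_each_packet(const std::vector<uint8_t>& buf,
                         void (*process_packet)(const uint8_t* data, size_t len))
    {
        size_t pos = 0;

        while (pos + 4 <= buf.size())
        {
            size_t payload = buf[pos] | (buf[pos + 1] << 8) | (buf[pos + 2] << 16);
            size_t total = 4 + payload;

            if (pos + total > buf.size())
            {
                break;  // partial packet, wait for more data
            }

            process_packet(buf.data() + pos, total);
            pos += total;
        }
    }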
Given the assumption that queries are rarely 16MB long and that,
realistically, the only time that happens is during a large dump of data,
we can limit the size of a single read to at most one MariaDB/MySQL packet
at a time. This change allows the network throttling to engage much sooner
and reduces the maximum overshoot of throttling to 16MB.
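
The arithmetic behind the limit: the payload length field is 3 bytes, so a
single packet carries at most 2^24 - 1 bytes (roughly 16MB) of payload. A
sketch of reading one packet per call, with a made-up read_bytes() helper:

    #include <cstddef>
    #include <cstdint>
    #include <unistd.h>
    #include <vector>

    constexpr size_t MAX_PACKET_PAYLOAD = 0xFFFFFF;     // 2^24 - 1, ~16MB

    // Helper: read at most n bytes from fd in one call (short reads are fine).
    static std::vector<uint8_t> read_bytes(int fd, size_t n)
    {
        std::vector<uint8_t> out(n);
        ssize_t r = ::read(fd, out.data(), n);
        out.resize(r > 0 ? static_cast<size_t>(r) : 0);
        return out;
    }

    // Sketch: read at most one MySQL packet per call so that throttling can
    // engage between packets instead of after an arbitrarily large read.
    std::vector<uint8_t> read_one_packet(int fd)
    {
        std::vector<uint8_t> packet = read_bytes(fd, 4);    // length + seqno
        if (packet.size() < 4)
        {
            return packet;          // incomplete header, retry later
        }

        size_t payload = packet[0] | (packet[1] << 8) | (packet[2] << 16);
        // payload never exceeds MAX_PACKET_PAYLOAD: the field is only 3 bytes
        std::vector<uint8_t> body = read_bytes(fd, payload);
        packet.insert(packet.end(), body.begin(), body.end());
        return packet;
    }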
By logging the connection ID for each created connection, failures can be
traced back from the backend server all the way up to the client
application.
Some SQL clients may default to a different authentication plugin than
"mysql_native_password". Since this is the only one supported by the MySQL
authenticator, the client is instructed to swap its plugin.
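
For reference, a sketch of the switch request sent to the client. The
payload layout (0xFE tag, null-terminated plugin name, auth data) follows
the MySQL protocol's AuthSwitchRequest; the helper name is made up:

    #include <cstdint>
    #include <string>
    #include <vector>

    // Sketch: payload of an AuthSwitchRequest asking the client to switch to
    // mysql_native_password. The 20-byte scramble comes from the handshake.
    std::vector<uint8_t> make_auth_switch_payload(const std::vector<uint8_t>& scramble)
    {
        const std::string plugin = "mysql_native_password";

        std::vector<uint8_t> payload;
        payload.push_back(0xfe);                            // AuthSwitchRequest tag
        payload.insert(payload.end(), plugin.begin(), plugin.end());
        payload.push_back('\0');                            // name is null-terminated
        payload.insert(payload.end(), scramble.begin(), scramble.end());
        payload.push_back('\0');
        return payload;
    }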
If a result consists of only OK packets, they would be processed
recursively which most of the time leads to a stack overflow. This can be
prevented by consuming all OK packets in the result in one go.
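
The shape of the fix, as a generic sketch: consuming the OK packets in a
loop keeps the stack depth constant no matter how many packets the result
contains, where a recursive handler would add a frame per packet:

    #include <cstddef>
    #include <cstdint>
    #include <vector>

    // Sketch: consume consecutive OK packets (header byte 0x00) iteratively.
    // Returns the offset of the first non-OK packet, or buf.size() if all
    // packets were OK packets.
    size_t consume_ok_packets(const std::vector<uint8_t>& buf)
    {
        size_t pos = 0;

        while (pos + 5 <= buf.size() && buf[pos + 4] == 0x00)
        {
            size_t payload = buf[pos] | (buf[pos + 1] << 8) | (buf[pos + 2] << 16);
            pos += 4 + payload;     // skip header + payload, no recursion
        }

        return pos;
    }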
If a hangup event was sent to a DCB via dcb_hangup_foreach shortly after
the DCB was closed, the DCB would still receive it. To prevent this, events
must only be delivered to DCBs that have not been closed.
The protocol should not track the session state as the parsing is quite
expensive with the current code. This change is a workaround that enables
the parsing only when required. A proper way to handle this would be to do
all the response processing in one place thus avoiding the duplication of
work.
If an ignorable packet was followed by more than one queued packet, they
would all get routed in the same batch. This would cause unexpected
replies from the server if multiple ignorable packets were queued up.
There is a race condition between the addition of the DCB into epoll and
the execution of the event that initializes the protocol pointer for the
DCB and sends the handshake to the client. If a hangup event occurred
before the handshake was sent, the DCB could be freed before the code that
sends the handshake was executed.
By picking the worker that owns the DCB before the DCB is placed into the
owner's epoll instance, we make sure no events arrive on the DCB while
control is transferred from the accepting worker to the owning worker.
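
A sketch of the ordering that closes the race, with hypothetical types; the
essential part is that the owner is assigned before epoll_ctl(EPOLL_CTL_ADD),
so by the time events can fire the DCB already knows which worker handles
them:

    #include <sys/epoll.h>

    // Hypothetical types standing in for the DCB and worker objects.
    struct Worker { int epoll_fd; };
    struct Dcb    { int fd; Worker* owner = nullptr; };

    bool assign_and_register(Dcb& dcb, Worker& worker)
    {
        dcb.owner = &worker;                    // 1. pick the owner first

        epoll_event ev{};
        ev.events = EPOLLIN | EPOLLRDHUP;
        ev.data.ptr = &dcb;

        // 2. only now make the fd visible to epoll; any event that arrives
        //    is already routed to the owning worker, never the accepting one
        return epoll_ctl(worker.epoll_fd, EPOLL_CTL_ADD, dcb.fd, &ev) == 0;
    }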
If the connection to the master is lost, knowing what type of error caused
the call to handleError helps deduce the real reason for it. Logging the
idle time of the connection helps detect when the wait_timeout of a
connection is exceeded.
The prefix was always added even when the original version would've been
acceptable. For example, a version string of 5.5.40 would get converted to
5.5.5-5.5.40 which is quite confusing for older client applications.
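
A sketch of the corrected behaviour with a made-up helper; the exact
condition in the real fix may differ, but the idea is that the
replication-protocol "5.5.5-" prefix is only prepended when the reported
version actually needs it, so 5.5.40 passes through untouched:

    #include <string>

    // Sketch: add the compatibility prefix only for MariaDB 10 or newer and
    // only if it is not already present. The condition is an assumption.
    std::string maybe_prefix_version(const std::string& version)
    {
        if (version.rfind("5.5.5-", 0) == 0)
        {
            return version;                 // already prefixed
        }

        int major = std::stoi(version);     // leading digits of "10.3.7" -> 10
        return major >= 10 ? "5.5.5-" + version : version;
    }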