In most cases it is reasonable to stop attempting transaction replays
after a certain number of failed attempts. This prevents transactions from
being repeatedly replayed on the same server over and over again if, for
example, it keeps crashing.
Duration values converted to JSON are now again returned as integers. This
keeps the REST API backwards compatible until suffixed durations are no
longer supported at which point all duration values can be represented in
milliseconds.
If a connection attempt is not accepted due to the host being blocked, the
protocol can now return an error message that is sent to the client. Only
mariadb_client implements this as it is the only one who calls the auth
failure methods in the first place.
The RateLimit class stores authentication failure data mapped by the
client IP addresses. The authentication failures are limited
per thread. The limits are still hard-coded and at least the number of
failures should be made configurable.
The simplest, most maintainable and acceptably efficient implementation
for DDoS protection is a thread-local unordered_map. The unwanted
side-effect of "scaling" of the number of allowed authentication failures
is unlikely to be problematic in most use-cases.
As the blocking of a host is only temporary, the behavior differs from the
one in the MariaDB server. This allows the number of failures to be set to
a much lower value negating some of the problems caused by the relatively
simple implementation.
Currently it's too laborious to use duration suffixes when saving
generated configs and also to handle suffixes when changes are made
dynamically using maxctrl.
It will be trivial to do that when the new configuration mechanism
has been taken into use everywhere. That will not happen before
MaxScale 2.5.
So, in MaxScale 2.4 duration suffixes will be accepted in manually
created configuration files, but no warning will be logged if a
suffix is not used.
It's now possible to specify in the config parameter declaration
that the smallest allowed unit is seconds. For parameters whose
granularity is seconds, allowing to specify a duration in
milliseconds would open up a possibility for hard to detect errors.
Now the desired type must be specified when getting a duration.
The type also dictates how durations without suffixes should be
interpreted.
That removes the need for remembering that to convert a returned
millisecond duration to a second duration.
Now considers other routing hints if first one fails. The order is inverted compared
to e.g. namedserver filter settings because of how routing hints are stored. If all hints
are unsuccessful, route to any slave.
The code that selects which worker to assign the DCB to is now completely
in the Listener class. This removes the need to change the ownership of a
DCB after it has been allocated.
The DCB is now fully allocated on the thread that owns it. This guarantees
that the owner is always correct when it is used.
The code in poll_add_dcb still manipulates which worker the DCB is
allocated. This needs to be removed and the detection of special needs
(maxadmin, maxinfo) must be moved into the listener.
If
- transaction replay is enabled,
- an error is returned and
- the error is one of the recoverable Clustrix errors
we will retry the transaction.
If it succeeds, then the client will not notice anything but
for a short delay.
Note that the error message is looked for irrespective of whether
the backend is Clustrix or not. However, as errors are not common
the price for doing that can probably be ignored.
However, a bigger problem is that explicit knowledge of different
backends should *not* be coded into routers.