Some rearrangements to ensure that what should be private
can be kept private.
- WatchdogNotifier made a friend.
- WatchdogWorkaround defined in RoutingWorker and made a friend.
- mxs::WatchdogWorker defined with 'using'.
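A minimal sketch of the resulting arrangement; the member names and
bodies are illustrative, not the actual MaxScale declarations:

    #include <atomic>

    namespace mxs
    {

    class WatchdogNotifier;

    class RoutingWorker
    {
        // The notifier must be able to tick the watchdog on the
        // worker's behalf, so it is granted access to private state.
        friend class WatchdogNotifier;

    public:
        // Scope guard used by a worker around blocking operations;
        // nested in RoutingWorker and befriended so that it too can
        // reach the private notification state.
        class WatchdogWorkaround;
        friend class WatchdogWorkaround;

    private:
        std::atomic<bool> m_alive {true};
    };

    // The watchdog-related interface is exposed under its own name.
    using WatchdogWorker = RoutingWorker;

    }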
The systemd watchdog mechanism requires notifications at regular
intervals. If a synchronous operation of some kind is performed by a
worker, then those notifications will not be generated.
This change provides each worker with a secondary thread that
can be used for triggering those notifications when the worker
itself is busy doing other stuff. The effect is that there will
be an additional thread for each worker, but most of the time
that thread will be idle.
So far this is only the mechanism; subsequent changes will take it
into use.
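A sketch of that mechanism under stated assumptions (the names and the
exact signaling are hypothetical): the worker arms the notifier around
a blocking operation and the secondary thread notifies at regular
intervals until it is disarmed.

    #include <chrono>
    #include <condition_variable>
    #include <mutex>
    #include <thread>

    class WatchdogNotifier
    {
    public:
        explicit WatchdogNotifier(std::chrono::milliseconds interval)
            : m_interval(interval)
            , m_thread(&WatchdogNotifier::run, this)
        {
        }

        ~WatchdogNotifier()
        {
            {
                std::lock_guard<std::mutex> guard(m_lock);
                m_running = false;
            }
            m_cond.notify_one();
            m_thread.join();
        }

        void arm()
        {
            std::lock_guard<std::mutex> guard(m_lock);
            m_armed = true;
            m_cond.notify_one();
        }

        void disarm()
        {
            std::lock_guard<std::mutex> guard(m_lock);
            m_armed = false;
        }

    private:
        void run()
        {
            std::unique_lock<std::mutex> guard(m_lock);
            while (m_running)
            {
                if (m_armed)
                {
                    notify();  // e.g. mark this worker as alive
                    m_cond.wait_for(guard, m_interval);
                }
                else
                {
                    m_cond.wait(guard);  // idle most of the time
                }
            }
        }

        void notify()
        {
            // The actual notification is worker-specific (not shown).
        }

        std::chrono::milliseconds m_interval;
        std::mutex m_lock;
        std::condition_variable m_cond;
        bool m_armed {false};
        bool m_running {true};
        std::thread m_thread;
    };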
If the server where a query is being executed is shutting down,
readwritesplit should treat it as an error to make retrying of the query
possible.
By treating server shutdowns as network errors, the same code path that is
used for actual network errors can be taken. This removes the need for any
extra retrying logic for this particular case.
The transaction replay could get mixed up with new queries if the
client managed to perform one while the delayed routing was taking
place. A proper way to solve this would be to cork the client DCB
until the transaction is fully replayed. As that change would be
considerably more complex than simply labeling queries that are being
retried, the corking implementation is left for later, when a more
complete solution can be designed.
This commit also adds some of the missing info logging for the transaction
replaying which makes analysis of failures easier.
The systemd watchdog is now notified at a little more than 2/3 of the
systemd-configured time. In the service config (maxscale.service),
add e.g. WatchdogSec=30s to set and enable the watchdog.
For building: install libsystemd-dev.
The next commit will modify the CMake configuration and code so that
the new code is conditionally compiled based on whether libsystemd-dev
is present.
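A sketch of the notification loop using only the documented libsystemd
calls; the function itself is illustrative:

    #include <systemd/sd-daemon.h>

    #include <chrono>
    #include <cstdint>
    #include <thread>

    void watchdog_loop()
    {
        uint64_t usec = 0;

        // Returns a positive value and fills in the WatchdogSec
        // interval (in microseconds) when the watchdog is enabled.
        if (sd_watchdog_enabled(0, &usec) > 0)
        {
            // Notify at roughly 2/3 of the configured time, well
            // within the limit that systemd enforces.
            auto interval = std::chrono::microseconds(2 * usec / 3);

            for (;;)
            {
                sd_notify(0, "WATCHDOG=1");
                std::this_thread::sleep_for(interval);
            }
        }
    }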
By exposing a (currently undocumented) debug endpoint that lets one
monitor interval pass, we make the reuse of the monitor waiting
functionality a lot easier. With it, when MaxScale is started by the
test framework, the framework knows that at least one monitor interval
will have passed for all monitors and that the system is ready to
accept queries.
This will simply cause a task to be posted to each worker.
If the workers are running normally, the task will reach them, the
associated semaphore will be posted, and the REST-API call will
return. If any worker is not running normally, the task will not be
processed and the REST-API call will hang.
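A self-contained sketch of that flow; the Worker type here is a
stand-in whose post() runs the task inline, whereas the real worker
queues it to its event loop (so a stuck worker never runs it):

    #include <condition_variable>
    #include <functional>
    #include <mutex>
    #include <vector>

    struct Worker
    {
        void post(std::function<void()> task)
        {
            task();  // the real implementation queues this instead
        }
    };

    // Returns only once every worker has executed the posted task;
    // if some worker is not processing events, this blocks, which is
    // exactly what the hanging REST-API call exposes.
    void wait_for_all_workers(std::vector<Worker*>& workers)
    {
        std::mutex lock;
        std::condition_variable cond;
        size_t pending = workers.size();

        for (Worker* worker : workers)
        {
            worker->post([&]() {
                std::lock_guard<std::mutex> guard(lock);
                if (--pending == 0)
                {
                    cond.notify_one();
                }
            });
        }

        std::unique_lock<std::mutex> guard(lock);
        cond.wait(guard, [&]() { return pending == 0; });
    }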
As the router is the only one that knows what backends a particular
statement has been sent to, it is the responsibility of the router
to keep the session bookkeeping up to date. If it doesn't, we will
know what statements a session has received (provided at least some
component in the routing chain has the RCAP_TYPE_STMT_INPUT
capability),
but not how long their processing took. Currently only readwritesplit
does that.
All queries are stored, not just COM_QUERY, as that makes the overall
bookkeeping simpler; at clientReply() time we do not need to know
whether or not to bookkeep the information, we can just do it. When
session information is queried for, we report as much information as
we have available.
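A hypothetical shape for that bookkeeping; the names are illustrative,
not the actual session classes:

    #include <chrono>
    #include <cstdint>
    #include <deque>
    #include <string>

    using Clock = std::chrono::steady_clock;

    // One entry is appended per statement when it is routed and
    // completed when the reply arrives in clientReply().
    struct QueryInfo
    {
        std::string       sql;        // empty for non-COM_QUERY packets
        uint8_t           command;    // the MySQL command byte
        Clock::time_point received;   // set when routing the query
        Clock::time_point completed;  // set in clientReply()
        bool              complete = false;
    };

    // Kept per session; when session information is queried for, as
    // much of this as is available is reported.
    std::deque<QueryInfo> session_queries;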
This commit introduces the plumbing support for obtaining
classification information of a statement using the REST-API.
It introduces a URL like
/v1/maxscale/query_classifier/classify?sql=SELECT+1
whose response will contain a JSON object with the
information. Subsequent commits will provide the actual
information.
Fix comments.
Fix a bug in make_valid().
Change the sync time (when the average should be pushed to the server
EMA) to only depend on time, not on sample_max. This decreases the
number of sync calls and allows for a much shorter sync time. Testing
shows this to be more stable, and it allows better control of
sample_max, such as making it adaptive.
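A sketch of the time-based sync under these assumptions (the names and
the alpha value are illustrative):

    #include <chrono>

    class LocalAverage
    {
    public:
        explicit LocalAverage(std::chrono::milliseconds sync_duration)
            : m_sync_duration(sync_duration)
        {
        }

        void add(double sample)
        {
            // Standard exponential moving average update.
            m_value = m_alpha * sample + (1 - m_alpha) * m_value;

            // Push to the shared server EMA purely on a timer; no
            // sample_max condition is involved anymore.
            auto now = std::chrono::steady_clock::now();
            if (now - m_last_sync >= m_sync_duration)
            {
                sync_to_server_ema(m_value);
                m_last_sync = now;
            }
        }

    private:
        void sync_to_server_ema(double /*value*/)
        {
            // Update of the shared, server-wide EMA (not shown).
        }

        double m_alpha = 0.04;
        double m_value = 0;
        std::chrono::milliseconds m_sync_duration;
        std::chrono::steady_clock::time_point m_last_sync =
            std::chrono::steady_clock::now();
    };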
The reads now read as much of the data as is available to reduce the
number of distinct malloc calls that need to be made. The SSL_read also
now allocates the buffer before reading into it so that the amount of
copying is reduced.
Also removed some of the not quite helpful debug messages.
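The sizing idea in isolation, not the actual DCB code; note that for
TLS the FIONREAD figure counts ciphertext bytes, so it is only a hint:

    #include <openssl/ssl.h>
    #include <sys/ioctl.h>

    #include <cstdint>
    #include <vector>

    std::vector<uint8_t> read_available(SSL* ssl, int fd)
    {
        int available = 0;
        ioctl(fd, FIONREAD, &available);  // bytes waiting in the socket
        available += SSL_pending(ssl);    // bytes OpenSSL has buffered

        // Allocate once, before reading, instead of growing in a loop.
        std::vector<uint8_t> buffer(available > 0 ? available : 1024);
        int n = SSL_read(ssl, buffer.data(),
                         static_cast<int>(buffer.size()));
        buffer.resize(n > 0 ? n : 0);
        return buffer;
    }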
The client connections had the Nagle algorithm enabled, which could
cause bad performance with smaller workloads. The common network
configuration code in utils.cc, currently used by the backend
connections, configures this properly.
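What that configuration amounts to, sketched on a bare socket:

    #include <netinet/in.h>
    #include <netinet/tcp.h>
    #include <sys/socket.h>

    // Disable the Nagle algorithm so small writes are sent without
    // waiting for previously sent packets to be acknowledged.
    bool disable_nagle(int fd)
    {
        int one = 1;
        return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY,
                          &one, sizeof(one)) == 0;
    }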
Replaced SPINLOCK with std::mutex where possible, leaving out the more
complex cases. The big offenders remaining are the binlogrouter and the
gateway.cc OpenSSL locks.
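The mechanical shape of the replacement, assuming the usual
acquire/release spinlock call pair:

    #include <mutex>

    std::mutex lock;  // was: SPINLOCK lock

    void update_shared_state()
    {
        // Replaces the explicit acquire/release calls; the RAII guard
        // releases the mutex on every exit path, including early
        // returns and exceptions.
        std::lock_guard<std::mutex> guard(lock);
        // ... critical section ...
    }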
In progress, does not yet overwrite existing code.
The new promotion mechanism automatically retries queries which timed out. It also
handles multimaster situations correctly.
Removed the almost-equal comparison and the subsequent selection based
on the historical number of connections. Its effect was to select the
server that has historically, weights or not, been slower. Tested this
with 2.2, with MaxScale on one server and MariaDB servers on two other
servers with different network lags. The tests with historical selects
were clearly slower.
The schemarouter now uses the RWBackend to track the response states. This
fixes the debug assertions that happened with the mxs1113_schemarouter_ps
test.
By splitting the processing and state querying into two separate
functions, the result can be inspected multiple times without triggering
the result processing.
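An illustrative interface for that split (names hypothetical):

    #include <cstddef>
    #include <cstdint>

    class ReplyTracker
    {
    public:
        // Advances the internal state; called once per result buffer.
        void process(const uint8_t* /*data*/, size_t /*len*/)
        {
            // ... parse packets and update m_complete ...
        }

        // A pure query on the processed state; safe to call any
        // number of times without re-triggering the processing.
        bool is_complete() const
        {
            return m_complete;
        }

    private:
        bool m_complete = false;
    };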
Added `match` and `exclude` functionality. This allows versatile filtering
without a large investment of development time by leveraging the benefits
of PCRE2 regular expressions.
Also cleaned up the filter and removed the single table matching and
the active parameter, both of which were obsoleted by the regular
expression parameters.
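A sketch of the resulting decision using the PCRE2 API; parameter
handling and match options are simplified:

    #define PCRE2_CODE_UNIT_WIDTH 8
    #include <pcre2.h>

    #include <cstddef>

    static bool matches(pcre2_code* re, const char* sql, size_t len)
    {
        pcre2_match_data* md =
            pcre2_match_data_create_from_pattern(re, nullptr);
        int rc = pcre2_match(re, (PCRE2_SPTR)sql, len, 0, 0, md, nullptr);
        pcre2_match_data_free(md);
        return rc >= 0;
    }

    // A statement is logged when it matches `match` and does not
    // match `exclude`; an unset pattern never restricts anything.
    bool should_log(pcre2_code* match_re, pcre2_code* exclude_re,
                    const char* sql, size_t len)
    {
        bool included = !match_re || matches(match_re, sql, len);
        bool excluded = exclude_re && matches(exclude_re, sql, len);
        return included && !excluded;
    }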
The read-write distribution in readwritesplit is now stored in a map
partitioned by the servers that the router has used. Currently, the
statistics for removed servers aren't dropped so some filtering still
needs to be added.
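A hypothetical layout for that map:

    #include <cstdint>
    #include <map>
    #include <string>

    struct TargetStats
    {
        uint64_t reads = 0;
        uint64_t writes = 0;
    };

    // Keyed by server name; entries of removed servers linger until
    // the filtering mentioned above is added.
    std::map<std::string, TargetStats> distribution;

    void book_read(const std::string& server)
    {
        ++distribution[server].reads;
    }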
The statistic.hh header defines a set of functions that complement the
standard library numeric functions. They differ from the standard library
functions in that they take a container reference and a pointer-to-member
as parameters and calculate the statistic based on the pointed-to member.
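A sketch of that style; the function name and exact signature are
illustrative:

    #include <vector>

    // Computes the mean of the member pointed to by 'member' over
    // all items in the container.
    template <class Container, class T, class R>
    double mean(const Container& container, R T::*member)
    {
        double total = 0;

        for (const auto& item : container)
        {
            total += item.*member;
        }

        return container.empty() ? 0 : total / container.size();
    }

    struct Sample
    {
        double duration;
    };

    // Usage: mean(samples, &Sample::duration) averages the
    // pointed-to member over a std::vector<Sample>.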