With the global configuration parameter 'query_classifier_cache'
the query classification cache can be turned on. At the moment it
does not matter what value it has; its presence simply enables the
caching.
Eventually you will be able to specify how much memory the cache
is allowed to consume.
Now takes a structure that, if present, enables the query
classification caching and specifies the properties of the
cache.
For the time being no actual properties are yet available.
The sql mode affects the result of the query classification.
Consequently, we cannot use the cached result if it was generated
when the sql mode was something else than what it currently is.
The mapping from a canonical statement to the query classification
result is maintained by the class QCInfoCache of which there exist
an instance per thread. That way no locking is needed but the
information will be cached multiple times (but that is a smaller
price to pay). Currently the information is stored in a regular
std::unordered_map, which means that the consumed amount of
memory will just keep on growing unless the number of canonical
statements used by clients happens to have an upper bound.
The LRU cache (that provides means for putting a bound on the
amount of memory used and number of items) used in the cache filter
will be generalized and be taken into use here as well.
The key is now the canonical statement itself, which means that
a fair amount of memory will be used. To preserve memory it might
make sense to use a hashed value instead, although that at least
in principle opens up the possibility for unintended collisions.
This feature will also be made configurable.
Only for qc_sqlite.
After a second look qc_mysqlembedded will not support dupping
the statement information. Without additional changes, simply stashing
an info object away, parsing another new GWBUF, deleting that and
then using the stashed away info object will not work; the THD object
will be corrupted. As qc_mysqlembedded is _only_ used for verifying the
sqlite-based parser this is not important anyway.
The query classifier stores information about the statement carried
by a GWBUF in the GWBUF itself. We need to be able to store that
object out side the lifetime of the GWBUF. So, we require that a
query classifier is capable of duplicating references to that object.
Minor cleanup:
- Unit variables places in anonymous namespace.
- Unit variables accessed via this_unit struct.
This is the first commit of many where the mapping from canonical
statement to query classification result is introduced.
The mapping is placed above the actual query classifier, so that all
query classifiers will benefit from it.
The mapping will be maintained by thread so that there will not be
any synchronization issues. Further, initially a simple map without
upper boundary will be used; if this is found to provide measurable
benefits, the map will then be replaced with an LRU mechanism so
that it becomes possible to specify just how much memory the mapping
may use.
The evq_length file held the returned number of descriptors from
the last epoll_wait() call. As such it is highly temporal and not
particularly meaningful.
That has now been removed and the instead the average number of
returned descriptors is maintained. That information changes slowly
and thus carries some meaning.
It is now possible to prevent the masking filter from rejecting
statements using functions in conjunction with fields to be
masked. So now it is possible to not use the blanket rejection
of the masking filter and replace it with more detailed firewall
rules.
The masking filter works only on the result-set. However, if
functions are used, the column names will not be available in
the result-set, and hence masking will not take place.
Now, the statement is checked and if functions are used in
conjunction with columns that should be masked, the statement
is rejected. Thus, functions can no longer be used for bypassing
the masking. That was possible earlier as well, but required
manually setting up the firewall filter.
Before the state of the backend servers is checked, MaxScale needs to be
stopped to prevent the automated failover from interfering in the start-up
process.
If a service has no active servers and users are injected, a warning would
be logged. This is a misleading warning if the service has no servers and
should only be logged if the failure to load any users is an unexpected
situation.
The log manager is the only one that uses the mlist_t versioned list. The
counter that keeps track of the version number was not modified using
atomic operations meaning that the compiler is free to optimize away parts
of the lock-free versioning mechanism that uses it.
To prevent this optimization, the variable is declared volatile. A rewrite
is direly needed but it cannot be done in 2.2.
Displaying the MaxScale version helps identify which package the
executable was bundled with. As the MaxCtrl source is a part of MaxScale,
there's no need for separate versioning.
Empty databases were not mapped to ServerMap because they had no tables
to map. Modified the query to also include empty databases making it
possible to also connect to empty databases through schemarouter.
Removed the excessive comments in favor of a simplified description. Use
stack-allocated TestConnections and simplify assertions.
The main change is the different SQL used to update the user with the old
password. Direct modification of the `mysql`.`user` database isn't very
neat but it guarantees that the value is updated.
The test should stop MaxScale at the start unless the manual debug flag is
given on the command line. This fixes the connection failure of mxs1719
but reveals a problem with the filter itself.
Due to the skewed accept distribution without SO_REUSEPORT, we use
round-robin assignment of workers for new client connections. This
provides better performance as work is more likely to be evenly
distributed across all threads.
Using a least-busy-worker algorithm would provide a more stable result but
this is not trivially simple to implement. For this reason, the
round-robin based approach was chosen for 2.2.