Before this change, the masking could be bypassed simply by
> set @@sql_mode='ANSI_QUOTES';
> select concat("ssn") from person;
The reason is that the query classifier is not aware of whether
'ANSI_QUOTES' is on, so it does not know that what appears above
to be the string "ssn" is actually the field name `ssn`. Consequently,
the select is not blocked and the result is returned in cleartext.
It's now possible to instruct the query classifier to report all string
arguments of functions as fields, which will prevent the above. However,
it will also mean that there may be false positives.
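
A minimal sketch of the idea, not the query classifier's actual API:
`fields_of` and its flag are hypothetical names, and the sketch only
distinguishes quoted from unquoted arguments. With the flag on, a quoted
argument is reported as a field, which catches "ssn" but would also
report a genuine string literal such as "hello" (a false positive).

```cpp
#include <string>
#include <vector>

// Hypothetical helper: returns the function arguments that should be
// reported as field names. Unquoted arguments are always treated as
// identifiers; quoted ones only when treat_strings_as_fields is set.
std::vector<std::string> fields_of(const std::vector<std::string>& args,
                                   bool treat_strings_as_fields)
{
    std::vector<std::string> fields;

    for (const auto& arg : args)
    {
        bool quoted = arg.size() >= 2
            && (arg.front() == '"' || arg.front() == '\'')
            && arg.back() == arg.front();

        if (!quoted)
        {
            fields.push_back(arg);
        }
        else if (treat_strings_as_fields)
        {
            // Strip the quotes so that "ssn" is reported as ssn.
            fields.push_back(arg.substr(1, arg.size() - 2));
        }
    }

    return fields;
}
```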
If a query spans more than a single packet, it will never be successfully
classified because the complete SQL is never available to the query
classifier. For this reason, it is pointless to cache such queries.
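
A sketch of how such a guard could look, assuming the statement arrives
as a raw MySQL protocol packet. The function names are hypothetical; the
only protocol facts relied upon are the 3-byte little-endian payload
length in the header and that a payload of exactly 0xFFFFFF bytes means
the data continues in the next packet.

```cpp
#include <cstdint>

// Largest payload a single MySQL protocol packet can carry. A payload of
// exactly this size signals that the data continues in another packet.
constexpr uint32_t MAX_PAYLOAD_LEN = 0xFFFFFF;

// Reads the 3-byte little-endian payload length from a packet header.
uint32_t payload_length(const uint8_t* header)
{
    return static_cast<uint32_t>(header[0])
        | (static_cast<uint32_t>(header[1]) << 8)
        | (static_cast<uint32_t>(header[2]) << 16);
}

// Hypothetical guard: only statements that fit in one packet are
// candidates for the classification cache.
bool is_cacheable(const uint8_t* header)
{
    return payload_length(header) < MAX_PAYLOAD_LEN;
}
```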
That URL will now return information about the statements in
the query classifier cache. The information is collected using
the same map in a serial manner from all routing workers (that
each have their own cache). Since all caches will contain the
same statements, collecting the information in a serial manner
means that the overall memory consumption will be lower than
what it would be if the information was collected in parallel.
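
The memory argument can be sketched as follows. This is an illustration
under assumptions, not the actual implementation: the worker caches are
modelled as plain maps in a vector, and `Entry`, `WorkerCache` and
`collect_serially` are hypothetical names. Because one map is reused
for every worker, it grows to roughly the size of a single cache when
the caches overlap, instead of holding one copy per worker.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical per-statement entry; a real cache would store more.
struct Entry
{
    int64_t hits = 0;
};

using WorkerCache = std::map<std::string, Entry>;

// Visits the worker caches one at a time and merges them into a single
// shared map. Statements present in several caches occupy one slot.
std::map<std::string, Entry> collect_serially(const std::vector<WorkerCache>& workers)
{
    std::map<std::string, Entry> combined;

    for (const auto& cache : workers)
    {
        for (const auto& kv : cache)
        {
            combined[kv.first].hits += kv.second.hits;
        }
    }

    return combined;
}
```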
Storing all the runtime errors makes it possible to return all of them
via the REST API. MaxAdmin will still only show the latest error but
MaxCtrl will now show all errors if more than one error occurs.
The deleter for std::unique_ptr<GWBUF> was not included in that file which
caused it to be deleted with the default deleter. The same should apply to
std::unique_ptr<json_t> as well.
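
The mechanism can be illustrated with a stand-in type. `Buffer` and
`buffer_free` below are hypothetical, not the GWBUF API. The point is
that a std::default_delete specialization only takes effect in files
where it is visible; elsewhere the primary template, which calls plain
delete, is used silently.

```cpp
#include <memory>

// Hypothetical stand-in for a type that must be released with its own
// free function rather than with plain delete.
struct Buffer
{
    int* freed_counter;
};

inline void buffer_free(Buffer* b)
{
    if (b)
    {
        *b->freed_counter += 1;     // record that the custom path ran
        delete b;
    }
}

namespace std
{
// Must be declared in a header included by every file that destroys a
// std::unique_ptr<Buffer>; otherwise the default deleter is used.
template<>
struct default_delete<Buffer>
{
    void operator()(Buffer* b) const
    {
        buffer_free(b);
    }
};
}
```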
This commit introduces the plumbing support for obtaining
classification information of a statement using the REST-API.
It introduces a URL like
/v1/maxscale/query_classifier/classify?sql=SELECT+1
that in the response will return a JSON object with the
information. Subsequent commits will provide the actual
information.
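
For illustration, the `sql` value in such a URL is encoded in the usual
query-string fashion, so `SELECT+1` stands for `SELECT 1`. The decoder
below is a generic sketch of that decoding; `decode_query_value` is a
hypothetical helper and not how MaxScale parses the request.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Decodes a query-string value: '+' becomes a space and %XX becomes the
// byte with that hexadecimal value. Malformed escapes are kept as-is.
std::string decode_query_value(const std::string& in)
{
    std::string out;

    for (std::size_t i = 0; i < in.size(); ++i)
    {
        if (in[i] == '+')
        {
            out += ' ';
        }
        else if (in[i] == '%' && i + 2 < in.size()
                 && std::isxdigit(static_cast<unsigned char>(in[i + 1]))
                 && std::isxdigit(static_cast<unsigned char>(in[i + 2])))
        {
            out += static_cast<char>(std::stoi(in.substr(i + 1, 2), nullptr, 16));
            i += 2;     // skip the two hex digits just consumed
        }
        else
        {
            out += in[i];
        }
    }

    return out;
}
```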
The cache size now refers to the total memory used by the cache instead of
the per thread limit. This makes it easier to use as well as more
predictable by removing the dependency on the number of worker threads.
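
One way to derive a per-thread limit from the total is an even split.
That the budget is divided evenly is an assumption made for this sketch,
and `per_thread_limit` is a hypothetical name.

```cpp
#include <cstdint>

// Assumption: the configured total is split evenly across the worker
// threads, so the overall memory use does not depend on their number.
int64_t per_thread_limit(int64_t total_cache_size, int n_threads)
{
    return n_threads > 0 ? total_cache_size / n_threads : 0;
}
```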
Added `match` and `exclude` functionality. This allows versatile filtering
without a large investment of development time by leveraging the benefits
of PCRE2 regular expressions.
Also cleaned up the filter and removed the single table matching and
active parameter that were obsoleted by the regular expression parameters.
See script directory for method. The script to run in the top level
MaxScale directory is called maxscale-uncrustify.sh, which uses
another script, list-src, from the same directory (so you need to set
your PATH). The uncrustify version was 0.66.
qc_thread_init() must now be called explicitly in every thread,
including the one where qc_process_init() is called.
This change was caused by QC_INIT_SELF initialization actually
being performed in query_classifier.cc. Before this change, there
actually was a leak in the routing worker running in the main
thread: the query classification cache was created twice.
In principle it would be better if the qc information were
obtained via a specific query_classifier resource. However,
there are multiple problems with that (e.g. the qc has no way
of safely accessing information of another thread) and hence
the worker-specific qc cache statistics are reported as part of
the worker statistics.
The cache now enforces the defined maximum size by evicting some
entries in case the insertion of a new entry would cause the max
size to be exceeded. Currently the eviction algorithm simply
removes a random element.
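
A minimal sketch of random eviction under a size limit. `BoundedCache`
is a hypothetical class, and the size of an entry is approximated here
as the length of the statement plus the length of the stored value; the
real accounting is not described in this message.

```cpp
#include <cstddef>
#include <iterator>
#include <random>
#include <string>
#include <unordered_map>

// Hypothetical size-bounded cache that evicts random entries on insert.
class BoundedCache
{
public:
    explicit BoundedCache(std::size_t max_size)
        : m_max_size(max_size)
    {
    }

    void insert(const std::string& stmt, const std::string& info)
    {
        std::size_t size = stmt.size() + info.size();

        if (size > m_max_size || m_entries.count(stmt) != 0)
        {
            return;     // can never fit, or is already cached
        }

        // Evict random entries until the new one fits.
        while (m_size + size > m_max_size)
        {
            evict_random();
        }

        m_entries.emplace(stmt, info);
        m_size += size;
    }

    bool contains(const std::string& stmt) const
    {
        return m_entries.count(stmt) != 0;
    }

    std::size_t size() const { return m_size; }
    std::size_t count() const { return m_entries.size(); }

private:
    void evict_random()
    {
        std::uniform_int_distribution<std::size_t> pick(0, m_entries.size() - 1);
        auto it = std::next(m_entries.begin(), pick(m_rng));
        m_size -= it->first.size() + it->second.size();
        m_entries.erase(it);
    }

    std::size_t m_max_size;
    std::size_t m_size = 0;
    std::unordered_map<std::string, std::string> m_entries;
    std::mt19937 m_rng{std::random_device{}()};
};
```

Stepping to a random position in an unordered_map is linear in the
number of entries, so a real implementation may pick its victim
differently.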
With the global configuration parameter 'query_classifier_cache'
the query classification cache can be turned on. At the moment it
does not matter what value it has; its presence simply enables the
caching.
Eventually you will be able to specify how much memory the cache
is allowed to consume.
Now takes a structure that, if present, enables the query
classification caching and specifies the properties of the
cache.
For the time being, no actual properties are available.