The masking filter will assume payloads less than 2^24 - 1. The
behaviour if payloads larger than than are encountered can be
configured.
The actual implementation follows in a subsequent change.
Added documentation for dbfwfilter and mentioned the new directory in the
release notes. The configuration guide also gives an example of how the
path parameters are resolved.
Moved the qlafilter parameters to module options. This removes the need to
parse the options in the filter.
Split the options into separate parameters. This allows common options to
be combined as enumerations under common parameters.
- Hard TTL; the maximum time a value will be used from the cache.
- Soft TLL; the time after which the cache value should be updated
from the server.
So as not to unnecessarily fetch the same value multiple times, when
the soft TTL has been reached, the value will be updated for the first
client, while all other clients will use the stale value until it has
become updated.
With different soft and hard TTLs there is a definite upper bound for
how old a value can be used.
The luafilter exposes two of the main functions provided by the query
classifier API; the type and operation classification.
The functions can be used by the Lua script with minimal overhead as the
current query being executed is stored only as a pointer. The functions
should only be called inside the `routeQuery` entry point of a Lua script.
0 is now the default of all cache configuration parameters and in
all cases the meaning is the same; that is, no limit. Internally
all limits but ttl are now for the sake of consistency 64-bit.
The documentation listed the rules as a comma separated list when they
were parsed as a whitespace separated list. The match specifiers were also
defined as optional when in fact they were mandatory.
MXS-848 (partially). The QLA-filter now has additional options
to control the printing.
1. "append"
This toggles append-mode, where the filter opens the log files in
update mode (if file already existed) and only adds text to the end.
2. "print_service"
This toggles writing the service name onto each row. Mostly useful
with the unified_file-setting.
3. "print_session"
This toggles writing the session number onto each row. Mostly useful
with the unified_file-setting.
Also, the filter now writes a header to the beginning of the file
when creating it.
The printing has been separated to its own helper-function, in case
more accurate control will be added in the future.
The maximum count and maximum size of the cache can now be
specified and a storage can declare what capabilities it has.
If a storage modile cannot enforce the maximum count or maximum
size limits, the storage is decorated with an LRU storage that
can.
Added a document that describes the module command system and added the
necessary information in the dbfwfilter documentation.
The release notes also point to the newly created document.
Since it's a cache, we do not need to retain the data from one
MaxScale invocation to the next. Consequently, we can turn off
write ahead logging completely if we simply wipe the RocksDB
database at each startup. In addition, the location of the
cache directory can now be specified explicitly, so it can be
placed, for instance, on a RAM disk.
With the advent of qc_get_field_info, columns can now be matched.
However, there is still some undeterminism caused by the table
information not containing contextual information (exactly where
is the table used).
Further, suppose table X contains the column A and table Y contains
the column B, then given a statement like
SELECT a, b from X, Z;
we cannot know whether a is in X or Z, or b in X or Z, without being
aware of the schema, which we currently are not.
Consequently, as long as MaxScale is not aware of the schema, some
heuristics must be applied. For instance, if exactly one table is
referred to, then we can assume that columns that are not explicitly
qualified are from that table.
The rule tests are currently rather rudimentary and need to be
expanded.
The qlafilter now has an option to log all messages to a single
file instead of session-specific files. Session ids are printed
to the beginning of the line when using this mode. Documentation
updated to match. Also, added an option to flush after every
write.
The concept of 'allowed_references' was removed from the
documentation and the code. Now that COM_INIT_DB is tracked,
we will always know what the default database is and hence
we can create a cache key that distinguises between identical
queries targeting different default database (that is not
implemented yet in this change).
The rules for the cache is expressed using a JSON object.
There are two decisions to be made; when to store data to the
cache and when to use data from the cache. The latter is
obviously dependent on the former.
In this change, the 'store' handling is implemented; 'use'
handling will be in a subsequent change.
When a query has been sent to a backend, the response is now
processed to the extent that the cache is capable of figuring
out how many rows are being returned, so that the cache setting
`max_resultset_rows` can be processed.
The code is now also written in such a manner that it should be
insensitive to how a package has been split up into a chain of
GWBUFs.
The cache filter consists of two separate components; the cache
itself that evaluates whether a particular query is subject to
caching and the actual cache storage. The storage is loaded at
runtime by the cache filter. Currently using a custom mechanism;
once the new plugin loading macros/mechanism is in place, I'll see
if that can be used.
There are a few open questions/issues.
- Can a GWBUF delivered to the filter contain more MySQL packets
than one? If yes, then some queueing mechanism needs to be
introduced. Currently the code is written so that the packets
are processes in a loop, which will not work.
- Currently, the storage API is synchronous. That may work with a
storage built upon RocksDB, that writes asynchronously to disk,
but not with memcached that can be (and in MaxScale's case
would have to be) used asynchronously.
Reading may be problematic with RocksDB as values are returned
synchronously. So that will stall the thread being used. However,
as RocksDB uses in-memory caching and it is possible to arrange
so that e.g. selects targeting the same table are stored together,
it is not obvious what the impact would be.
So as not to block the MaxScale worker threads, there'd have to
be a separate thread-pool for RocksDB access and then arrange
the data to be moved across.
But initially, the inteface is synchronous.
- How is the cache configured? The original requirement mentions
all sorts of parameters - database name, table name, column name,
presence of WHERE clause, regexp, date/time of query, user -
but it's not alltogether clear exactly how they should be specified
and how they should interract. So initially all selects will
be subject to caching with a TTL.