The concept of 'allowed_references' was removed from the
documentation and the code. Now that COM_INIT_DB is tracked,
we will always know what the default database is and hence
we can create a cache key that distinguishes between identical
queries targeting different default databases (that is not yet
implemented in this change).
The rules for the cache are expressed using a JSON object.
There are two decisions to be made: when to store data to the
cache and when to use data from the cache. The latter is
obviously dependent on the former.
In this change, the 'store' handling is implemented; 'use'
handling will be in a subsequent change.
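As a sketch only (the schema is not defined in this change, so the
attribute and operator names below are hypothetical), the rules object
could pair a `store` section with a `use` section, each holding a list
of conditions:

```json
{
    "store": [
        { "attribute": "table", "op": "=", "value": "some_table" }
    ],
    "use": [
        { "attribute": "user", "op": "=", "value": "'app_user'@'%'" }
    ]
}
```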
When a query has been sent to a backend, the response is now
processed to the extent that the cache is capable of figuring
out how many rows are being returned, so that the cache setting
`max_resultset_rows` can be processed.
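To illustrate the row counting, here is a minimal self-contained sketch
in C. It assumes the classic (pre-`DEPRECATE_EOF`) protocol, where the
column definitions and the rows are each terminated by an EOF packet,
and it assumes the whole resultset sits in one contiguous buffer;
`count_resultset_rows` is a hypothetical name, not the filter's actual
function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Count the rows of a complete resultset held in one contiguous buffer.
 * Every MySQL packet starts with a 3-byte little-endian payload length
 * and a 1-byte sequence number. An EOF packet has first payload byte
 * 0xFE and a payload shorter than 9 bytes. */
static int count_resultset_rows(const uint8_t *buf, size_t len)
{
    size_t i = 0;
    int eofs_seen = 0; /* 1st EOF ends column defs, 2nd ends the rows */
    int rows = 0;

    while (eofs_seen < 2)
    {
        if (i + 4 > len)
        {
            return -1; /* resultset is not complete */
        }

        uint32_t payload_len =
            buf[i] | ((uint32_t)buf[i + 1] << 8) | ((uint32_t)buf[i + 2] << 16);

        if (i + 4 + payload_len > len)
        {
            return -1; /* partial packet */
        }

        const uint8_t *payload = buf + i + 4;

        if (payload_len > 0 && payload_len < 9 && payload[0] == 0xFE)
        {
            ++eofs_seen;
        }
        else if (eofs_seen == 1)
        {
            ++rows; /* a packet between the two EOFs is one row */
        }

        i += 4 + payload_len;
    }

    return rows;
}
```

With a counter like this, enforcing `max_resultset_rows` is a simple
comparison once the second EOF has been seen.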
The code is now also written in such a manner that it should be
insensitive to how a packet has been split up into a chain of
GWBUFs.
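The key to that insensitivity is never reading a field directly out of
one buffer, but copying the wanted bytes out of the chain first. A
hedged sketch, using a hypothetical `fragment` struct as a stand-in for
a GWBUF chain (the real GWBUF API is not reproduced here):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a chain of buffers. */
typedef struct fragment
{
    const uint8_t   *data;
    size_t           len;
    struct fragment *next;
} fragment;

/* Copy n bytes starting at logical offset `offset` of the chain into
 * dst. Returns 0 on success, -1 if the chain is too short. A packet
 * header or payload can then be read even if it straddles fragments. */
static int chain_copy(const fragment *frag, size_t offset,
                      uint8_t *dst, size_t n)
{
    /* skip whole fragments that lie before the offset */
    while (frag && offset >= frag->len)
    {
        offset -= frag->len;
        frag = frag->next;
    }

    while (n > 0)
    {
        if (!frag)
        {
            return -1; /* ran out of data */
        }

        size_t avail = frag->len - offset;
        size_t take = n < avail ? n : avail;

        for (size_t i = 0; i < take; ++i)
        {
            dst[i] = frag->data[offset + i];
        }

        dst += take;
        n -= take;
        offset = 0;
        frag = frag->next;
    }

    return 0;
}
```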
The cache filter consists of two separate components: the cache
itself, which evaluates whether a particular query is subject to
caching, and the actual cache storage. The storage is loaded at
runtime by the cache filter. Currently a custom mechanism is used;
once the new plugin loading macros/mechanism is in place, I'll see
if that can be used instead.
There are a few open questions/issues.
- Can a GWBUF delivered to the filter contain more MySQL packets
than one? If yes, then some queueing mechanism needs to be
introduced. Currently the code is written so that the packets
are processed in a loop, which will not work.
- Currently, the storage API is synchronous. That may work with a
storage built upon RocksDB, that writes asynchronously to disk,
but not with memcached, which can be (and in MaxScale's case
would have to be) used asynchronously.
Reading may be problematic with RocksDB as values are returned
synchronously. So that will stall the thread being used. However,
as RocksDB uses in-memory caching and it is possible to arrange
so that e.g. selects targeting the same table are stored together,
it is not obvious what the impact would be.
So as not to block the MaxScale worker threads, there'd have to
be a separate thread-pool for RocksDB access and then arrange
the data to be moved across.
But initially, the interface is synchronous.
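A synchronous storage API might look roughly like the following. This
is a sketch under stated assumptions, not the actual interface: the
type and function names are hypothetical, and the single-entry
in-memory implementation exists only to make the shape concrete.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical synchronous storage API: every call blocks until the
 * value has been fetched or stored. */
typedef struct cache_storage cache_storage;

typedef struct storage_api
{
    bool (*get)(cache_storage *s, const char *key,
                char *value, size_t size);
    bool (*put)(cache_storage *s, const char *key, const char *value);
} storage_api;

/* Trivial single-entry in-memory backend, for illustration only. */
struct cache_storage
{
    char key[64];
    char value[256];
    bool used;
};

static bool mem_get(cache_storage *s, const char *key,
                    char *value, size_t size)
{
    if (s->used && strcmp(s->key, key) == 0)
    {
        strncpy(value, s->value, size - 1);
        value[size - 1] = '\0';
        return true;
    }
    return false;
}

static bool mem_put(cache_storage *s, const char *key, const char *value)
{
    strncpy(s->key, key, sizeof(s->key) - 1);
    s->key[sizeof(s->key) - 1] = '\0';
    strncpy(s->value, value, sizeof(s->value) - 1);
    s->value[sizeof(s->value) - 1] = '\0';
    s->used = true;
    return true;
}

static const storage_api mem_api = { mem_get, mem_put };
```

An asynchronous variant would instead take a completion callback, which
is what a memcached-backed storage (or a RocksDB thread pool) would
need.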
- How is the cache configured? The original requirement mentions
all sorts of parameters - database name, table name, column name,
presence of WHERE clause, regexp, date/time of query, user -
but it's not altogether clear exactly how they should be specified
and how they should interact. So initially all selects will
be subject to caching with a TTL.
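With TTL-only caching, the "use" decision reduces to an age check on
the stored entry. A minimal sketch (the function name is hypothetical):

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* A cached entry is usable if it was stored less than `ttl` seconds
 * before `now`; entries "from the future" are rejected as stale too. */
static bool cache_entry_is_fresh(time_t stored_at, time_t now,
                                 unsigned ttl)
{
    return now >= stored_at && (unsigned)(now - stored_at) < ttl;
}
```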