A linefeed is whitespace, so given the rules
"\n"+ return '\n'
{SPACE} ;
a line consisting of spaces followed by a linefeed will be matched
as space and not as a linefeed, and hence will cause the parser to
barf.
When the unit tests were run without installing the libraries in their
final locations, the loading of the modules would fail. Using locations
relative to the build directory allows unit testing without having to
install the libraries.
When the database firewall filter is used in white-list mode,
'USE <db>' should be allowed. When connecting, it is always
possible to specify the database anyway so restricting
'USE <db>' serves no purpose.
MXS-1412: while discarding a result set, don't buffer any data: this
avoids storing useless data.
Additionally, the column definitions buffer is used instead of the offset
value.
By processing each buffer individually, the need to iterate over the whole
resultset is removed. Profiling showed that most of the time was spent
navigating the linked list of buffers when an offset into the whole
resultset was used instead of an offset to the individual response buffer.
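The following is a minimal sketch of the cost being avoided, using a hypothetical buffer chain type rather than the actual MaxScale structures: locating an offset into the whole result set means walking the linked list on every lookup, whereas an offset into an individual response buffer is a plain pointer addition.

#include <stddef.h>
#include <stdint.h>

/* Hypothetical buffer chain, for illustration only. */
typedef struct buffer
{
    struct buffer* next;
    uint8_t*       data;
    size_t         length;
} buffer_t;

/* Offset into the whole result set: every lookup has to walk the chain. */
static uint8_t* at_resultset_offset(buffer_t* chain, size_t offset)
{
    while (chain && offset >= chain->length)
    {
        offset -= chain->length;
        chain = chain->next;
    }
    return chain ? chain->data + offset : NULL;
}

/* Offset into an individual response buffer: no traversal needed. */
static uint8_t* at_buffer_offset(buffer_t* buffer, size_t offset)
{
    return offset < buffer->length ? buffer->data + offset : NULL;
}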
The query classifier should only be used to parse text protocol
statements. The insertstream filter exploited the fact that any statements
that the filter did not expect would be classified as unknown
commands. This led to repetitive error messages with binary protocol
statements.
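A hedged sketch of the kind of check that avoids this, assuming only the standard MySQL packet layout (the helper name is made up): only COM_QUERY packets carry text protocol statements, so anything else need not be handed to the query classifier.

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MYSQL_COM_QUERY 0x03

/* Hypothetical helper: byte 4 of a MySQL packet is the command byte,
 * following the 3-byte payload length and the 1-byte sequence number.
 * Only COM_QUERY payloads contain text protocol statements. */
static bool is_text_protocol_statement(const uint8_t* packet, size_t len)
{
    return len > 4 && packet[4] == MYSQL_COM_QUERY;
}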
If a rule is defined with only an optional part, it should be of the
permission type. This type is used to signal that the rule matches if the
optional constraints are fulfilled.
Due to refactoring, the default type was changed from RT_PERMISSION to
RT_UNDEFINED.
The HintParser wrongly ignored line breaks, causing parsing faults,
e.g. parsing too far or accepting invalid comments. Now the parser
detects a line break and terminates comments unless they started with
'/*'. Also fixed a memory leak when parsing parameter-value combinations.
The modutil_get_SQL()-function allocates storage, while
modutil_extract_SQL() does not. The strings given by the latter
are not 0-terminated so require a length limit when matched using
regexec().
This commit changes the function used in those cases where the
SQL string is not modified and the pointer is not saved for later use.
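A sketch of the length-limited matching mentioned above, assuming the REG_STARTEND extension to regexec() is available (it is not part of POSIX); the helper name is illustrative.

#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Match a pattern against a string that is not 0-terminated, such as the one
 * returned by modutil_extract_SQL(), by giving regexec() explicit bounds. */
static bool sql_matches(regex_t* re, const char* sql, size_t len)
{
    regmatch_t bounds;
    bounds.rm_so = 0;    /* start of the region to match */
    bounds.rm_eo = len;  /* end of the region; no terminator is required */

    return regexec(re, sql, 1, &bounds, REG_STARTEND) == 0;
}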
If a complete response is delivered in many buffers, then calling
gwbuf_length() whenever you need the complete size starts to hurt.
By caching the length of the data received so far and by updating
the length in clientReply(), gwbuf_length() will be called exactly
once for each buffer (chain) delivered to routeQuery().
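A rough sketch of the caching, assuming MaxScale's GWBUF API (gwbuf_length() and gwbuf_append()); the struct and function names here are illustrative, not the actual filter code.

#include <maxscale/buffer.h>

/* Illustrative per-session state. */
typedef struct
{
    GWBUF* res_data;  /* response buffers accumulated so far */
    size_t res_size;  /* cached total size of res_data */
} response_state_t;

static void accumulate_reply(response_state_t* state, GWBUF* buffer)
{
    /* gwbuf_length() is called exactly once per delivered buffer (chain);
     * the running total replaces recomputing the length of the whole chain. */
    state->res_size += gwbuf_length(buffer);
    state->res_data = gwbuf_append(state->res_data, buffer);
}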
If the output buffer given to pcre2_substitute is too small, an error
value is written to the last parameter (output length). That value
should not be used for calculations. This patch passes a copy as the
parameter instead.
Coincidentally, this commit fixes the crashes of query classifier tests.
Also, increase buffer growth rate in utils.c.
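A minimal sketch of the pattern, using PCRE2's pcre2_substitute() interface (the wrapper itself is made up): the caller's buffer-size variable is never passed directly, so an error value written to the output-length parameter cannot leak into later growth calculations.

#define PCRE2_CODE_UNIT_WIDTH 8
#include <pcre2.h>
#include <stddef.h>

static int substitute(pcre2_code* re,
                      const char* subject, size_t subject_len,
                      const char* replacement,
                      char* output, size_t output_size)
{
    /* Pass a copy: pcre2_substitute() may overwrite this value on error. */
    PCRE2_SIZE out_len = output_size;

    return pcre2_substitute(re, (PCRE2_SPTR)subject, subject_len, 0, 0,
                            NULL, NULL,
                            (PCRE2_SPTR)replacement, PCRE2_ZERO_TERMINATED,
                            (PCRE2_UCHAR*)output, &out_len);
}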
A new parameter has been added to the maxrows filter:
max_resultset_return=empty|error|ok
The default, 'empty', returns an empty result set, as the current
implementation does.
'error' returns an ERR reply with the input SQL statement.
'ok' returns an OK packet.
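For illustration, a configuration section using the new parameter might look as follows (the section name is arbitrary):

[MaxRows-Filter]
type=filter
module=maxrows
max_resultset_return=error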
When log messages are written with both address and port information, IPv6
addresses can cause confusion if the normal address:port formatting is
used. RFC 3986 suggests that all IPv6 addresses be expressed as a
bracket-enclosed address, optionally followed by the port, separated
from the address by a colon.
In practice, the "all interfaces" address and port number 3306 can be
written in IPv4 numbers-and-dots notation as 0.0.0.0:3306 and in IPv6
notation as [::]:3306. Using the latter format in log messages keeps the
output consistent with all types of addresses.
The details of the standard can be found at the following addresses:
https://www.ietf.org/rfc/rfc3986.txt
https://www.rfc-editor.org/std/std66.txt
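A small sketch of the formatting rule (the helper is illustrative, not the actual logging code): wrap the address in brackets whenever it contains a colon, i.e. whenever it is an IPv6 address.

#include <stdio.h>
#include <string.h>

static void format_address(char* dest, size_t size, const char* addr, int port)
{
    if (strchr(addr, ':'))
    {
        snprintf(dest, size, "[%s]:%d", addr, port);  /* e.g. [::]:3306 */
    }
    else
    {
        snprintf(dest, size, "%s:%d", addr, port);    /* e.g. 0.0.0.0:3306 */
    }
}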
It is now possible to specify what information the caller is interested
in. This avoids the cost of collecting information during query parsing
that nobody is interested in.
The match data needs to be unique for each thread, so for the time
being it is created whenever it is needed. A more performant (although
possibly by a negligible amount) solution would be to have a separate
match data object for each thread, but that will have to wait for 2.2.
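A sketch of the current create-on-demand approach using the PCRE2 API; the wrapper name is made up. A per-thread match data object would remove this allocation from the hot path.

#define PCRE2_CODE_UNIT_WIDTH 8
#include <pcre2.h>
#include <stdbool.h>

static bool regex_matches(const pcre2_code* re, const char* sql, size_t len)
{
    /* Created for every call so that concurrent threads never share it. */
    pcre2_match_data* md = pcre2_match_data_create_from_pattern(re, NULL);
    int rc = pcre2_match(re, (PCRE2_SPTR)sql, len, 0, 0, md, NULL);
    pcre2_match_data_free(md);

    return rc >= 0;  /* a non-negative return means the pattern matched */
}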
- Selects are picked out using custom parsing, so if a statement is
anything else but a SELECT, the cache will never cause the statement
to be parsed.
- The setting of the cache parameter `selects` is taken into account.
If it is `assume_cacheable`, then the statement will also not be parsed
even if it is a SELECT, as sketched below.
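A condensed sketch of the gating logic described in the two points above; the names are illustrative, not the cache filter's actual ones.

#include <stdbool.h>

typedef enum
{
    CACHE_SELECTS_VERIFY_CACHEABLE,
    CACHE_SELECTS_ASSUME_CACHEABLE
} cache_selects_t;

/* Parse only when parsing can actually affect the caching decision. */
static bool should_parse(cache_selects_t selects, bool is_select)
{
    if (!is_select)
    {
        return false;  /* custom parsing already ruled out caching */
    }

    if (selects == CACHE_SELECTS_ASSUME_CACHEABLE)
    {
        return false;  /* cacheability is assumed, so parsing gains nothing */
    }

    return true;
}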
The original approach was made for RocksDB where it is beneficial
to keep keys of stuff related to each other close to each other.
However, as RocksDB is no longer the primary focus, it just causes
additional cost to dig out the table names.
The key is a 64-bit integer, but crc32 only gives us a 32-bit one.
We create another 32-bit value by running crc32 over the same SQL,
using the first CRC value as the seed.
I think that further reduces the chance of clashes:
uint32_t crc0 = crc32(0, Z_NULL, 0);
uint32_t crc1;
uint32_t crc2;
crc1 = crc32(crc0, "codding", 7) => 1774765869
crc2 = crc32(crc1, "codding", 7) => 1409592046
crc1 = crc32(crc0, "gnu", 3) => 1774765869
crc2 = crc32(crc1, "gnu", 3) => 1213798908
Note that the first value is the same, but the second is not.
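Put together, the key construction might look roughly like this (a sketch using zlib's crc32(); the function name and the exact way the two halves are combined are assumptions):

#include <stdint.h>
#include <string.h>
#include <zlib.h>

static uint64_t cache_key(const char* sql)
{
    uInt len = (uInt)strlen(sql);

    uint32_t crc0 = (uint32_t)crc32(0, Z_NULL, 0);
    uint32_t crc1 = (uint32_t)crc32(crc0, (const Bytef*)sql, len);
    /* Second pass over the same SQL, seeded with the first result. */
    uint32_t crc2 = (uint32_t)crc32(crc1, (const Bytef*)sql, len);

    return ((uint64_t)crc1 << 32) | crc2;
}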
The process and thread initialization/finalization of the query
classifier plugins is handled using the process and thread
initialization/finalization functions in the module object.
However, the top-level query classifier will also need to perform
process and thread initialization when transaction boundaries are
detected using regular expressions.
Both the listeners and servers now support IPv6 addresses.
The namedserverfilter does not yet use the new structures and needs to be
fixed in a following commit.
The result sets of SELECTs that use functions whose results will
always vary, or whose results depend upon the user executing the
query, should not be cached. The list of functions is the same
as that specified for the query cache of MariaDB:
https://mariadb.com/kb/en/mariadb/query-cache/
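As an illustration only, a handful of entries from that list and a made-up lookup helper:

#include <stdbool.h>
#include <strings.h>

/* A few of the functions whose results vary between executions or depend on
 * the executing user; the full list is the one linked above. */
static const char* NON_CACHEABLE_FUNCTIONS[] =
{
    "CURRENT_TIMESTAMP", "NOW", "RAND", "SYSDATE", "USER", "UUID", NULL
};

static bool is_non_cacheable_function(const char* name)
{
    for (int i = 0; NON_CACHEABLE_FUNCTIONS[i]; i++)
    {
        if (strcasecmp(name, NON_CACHEABLE_FUNCTIONS[i]) == 0)
        {
            return true;
        }
    }

    return false;
}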