In the case of qc_sqlite, it is done "precisely", while in the
case of qc_mysqlembedded rather bluntly. Not time well spent
to figure out exactly which pointer chains need to be walked.
To make it possible to have more tokens than 255.
Parsers operators (i.e. tokens) is one thing and opcodes
for the virtual machine of sqlite3 is another. Unfortunately
they are not completely separate, but some of the opcodes
in <build-directory>/opcodes.h are the same as the tokens in
<build-directory>/parse.h. And while the parser tokens are now
16-bit, the VM opcodes are 8-bit. However, this is probably not
a problem even if some of the parser tokens that are duplicated
in the opcodes are > 256 as we only use sqlite3 for parsing and
not for executing anything (on the sqlite3 VM).
The original code for catenating an SrcList to another assumed
that the list to be catenated had only 1 element. Now works
regardless of the number of items.
When the tokenizer encounters a keyword, it sniffs whether the
last non-whitespace character before it happens to be a '.' and
if it is, the keyword is assumed to be the second part of a
qualified name.
Thus, before this commit
-- blah.
UPDATE ...
would not be parsed as KEYWORD (UPDATE) followed by stuff, but as
an ID (blah.UPDATE) followed by stuff.
With this change, newlines are no longer counted as whitespace.
GCC is smart enough to detect that the address of a local variable is
returned. Since this appears to be code used for a debug assertion, we can
just return a null pointer.
Problem was that 'handler' is a keyword. To make it work,
the keyword must be listed as one of those that turns into
an id where it cannot be used as a keyword.
Lemon (the sqlite parser generator) destructors are needed for
all rules that return dynamically allocated structures. Otherwise
there may be leaks if a statement is not completely parsed.
DIV and MOD are now also accepted instead of / and % respectively.
MOD is a keyword but (in principle incorrectly) decays into an id
when used in some other context. That is so that it will be
parser by the general function rule ("id ( ... )"). If used
incorrectly, the server will later reject.
In the tokenizer we will now recognize the character set names
of MariaDB and return a specific token for those. However, where
a character set name is not expected, it will automatically be
treated as an identifier.
Note that when the character set name is explicitly specified
for a literal string, the name must be prefixed with an underscore.
That is, if the character set name is "latin1", when used when
specifying a literal string, it's used as "_latin1 'a'".
Note that this does not fix the sqlite3 bug causing a leak, but
since the statement will now correctly be parsed, the leak will
not manifest itself.
The mkopcodeh.tcl of sqlite3 version 3110100 has a bug that
manifests itself so that it generates broken code depending on
what keywords there are and in what order. The mkopcodeh.tcl
from 3200000 does not have that problem.
Originally, the sqlite installation was imported into the MaxScale
repository in the one gigantic MaxScale 1.4 -> 2.0 commit.
Consequently, there is no import commit to compare to if you want
to extract all MaxScale specific changes. To make it simpler in the
future, sqlite will now be imported in a commit of its own.
The purpose is to recognize e.g. /_utf8mb4 0xD091D092D093/ as
a valid string. The rule actually accepts /id integer/, but in
case the statement is something else but an '_' immediately
followed by a character set, followed by a hex number, it will
be rejected by the server so no harm done.
With this change, a parenthesized top-level SELECT, such as
"(SELECT f FROM t)" will be fully parsed. Before this change,
the statement was classified as invalid and would thus have
been sent to the master.
With this change also statements like
(SELECT f FROM t1) UNION (SELECT f FROM t2)
will be correctly classified, although only partially parsed.
With these changes
SET @saved_cs_client= @@character_set_client;
will be classified as QUERY_TYPE_USERVAR_WRITE and
SELECT 1 AS c1 FROM t1 ORDER BY ( SELECT 1 AS c2 FROM
t1 GROUP BY GREATEST(LAST_INSERT_ID(), t1.a) ORDER BY
GREATEST(LAST_INSERT_ID(), t1.a) LIMIT 1);
will be classified as QUERY_TYPE_READ|QUERY_TYPE_MASTER_READ
With this change, a parenthesized top-level SELECT, such as
"(SELECT f FROM t)" will be fully parsed. Before this change,
the statement was classified as invalid and would thus have
been sent to the master.
With this change also statements like
(SELECT f FROM t1) UNION (SELECT f FROM t2)
will be correctly classified, although only partially parsed.
When a statement like 'DESCRIBE tbl' is classified, the table
name will now be available so that a router can check whether the
table is a temporary one. In that case, the statement must be sent
to the master.
RESET QUERY CACHE is reported to be a session command, which will
cause it to be sent to all servers. RESET [MASTER|SLAVE] are
classified as write, which will cause them to be sent to the master.
It could be argued that RESET [MASTER|SLAVE] should cause an error
to be sent to the client.
Recognize the XA keyword and classify the statement as write.
Needs to be dealt with explicitly as sqlite3 assumes there are
no keywords starting with the letter X.
A non version specific executable comment, such as "/*! SELECT 1; */"
is during classification handled as if it would not be a comment. That
is, the contained statement will *always* be parsed.
A version specific executable comment, such as "/*!99999 CREATE PROCEDURE
bypass BEGIN */ SELECT ... " is during classification handled as it would
be a general comment. That is, the contained statement will *never* be
parsed.
In addition, in the latter case the parse result will never be better than
QC_QUERY_PARTIALLY_PARSED. The rationale is that since the comment is version
specific, we cannot know how the server will actually interpret the statement.
This will have an impact on the masking filter and the database firewall that
now will reject statements containing _version specific_ executable comments.