The purpose is to recognize e.g. /_utf8mb4 0xD091D092D093/ as
a valid string. The rule actually accepts /id integer/, but in
case the statement is something else but an '_' immediately
followed by a character set, followed by a hex number, it will
be rejected by the server so no harm done.
With this change, a parenthesized top-level SELECT, such as
"(SELECT f FROM t)" will be fully parsed. Before this change,
the statement was classified as invalid and would thus have
been sent to the master.
With this change also statements like
(SELECT f FROM t1) UNION (SELECT f FROM t2)
will be correctly classified, although only partially parsed.
With these changes
SET @saved_cs_client= @@character_set_client;
will be classified as QUERY_TYPE_USERVAR_WRITE and
SELECT 1 AS c1 FROM t1 ORDER BY ( SELECT 1 AS c2 FROM
t1 GROUP BY GREATEST(LAST_INSERT_ID(), t1.a) ORDER BY
GREATEST(LAST_INSERT_ID(), t1.a) LIMIT 1);
will be classified as QUERY_TYPE_READ|QUERY_TYPE_MASTER_READ
With this change, a parenthesized top-level SELECT, such as
"(SELECT f FROM t)" will be fully parsed. Before this change,
the statement was classified as invalid and would thus have
been sent to the master.
With this change also statements like
(SELECT f FROM t1) UNION (SELECT f FROM t2)
will be correctly classified, although only partially parsed.
On RHEL8 the former may give rise to incorrect
error: 'char* strncpy(char*, const char*, size_t)' destination
unchanged after copying no bytes [-Werror=stringop-truncation]
When a statement like 'DESCRIBE tbl' is classified, the table
name will now be available so that a router can check whether the
table is a temporary one. In that case, the statement must be sent
to the master.
Before this change, if the firewall was configured to block the use
of certain columns, it could be be bypassed simply by
> set @@sql_mode='ANSI_QUOTES';
> select "ssn" from person;
The reason is that as the query classifier is not aware of whether
'ANSI_QUOTES' is on or not, it will not know that what above appears
to be the string "ssn", actually is the field name `ssn`. Consequently,
the select will not be blocked and the result returned in cleartext.
It's now possible to instruct the query classifier to report all strings
as fields, which will prevent the above. However, it will also mean that
there may be false positives.
Before this change, the masking could be bypassed simply by
> set @@sql_mode='ANSI_QUOTES';
> select concat("ssn") from person;
The reason is that as the query classifier is not aware of whether
'ANSI_QUOTES' is on or not, it will not know that what above appears
to be the string "ssn", actually is the field name `ssn`. Consequently,
the select will not be blocked and the result returned in cleartext.
It's now possible to instruct the query classifier to report all string
arguments of functions as fields, which will prevent the above. However,
it will also mean that there may be false positives.
The type mask of CREATE, ALTER, etc. that cause an implicit commit
will no longer contain the bit QUERY_TYPE_COMMIT.
As an implicit commit does not change the transaction state as seen
by MaxScale, it does not make sense to claim that the statement is
a commit.
RESET QUERY CACHE is reported to be a session command, which will
cause it to be sent to all servers. RESET [MASTER|SLAVE] are
classified as write, which will cause them to be sent to the master.
It could be argued that RESET [MASTER|SLAVE] should cause an error
to be sent to the client.
RESET is neither parsed not tokenized, so RESET statements ends
up being sent to the master. That is fine for all but RESET QUERY
CACHE that should be sent to all servers.
Recognize the XA keyword and classify the statement as write.
Needs to be dealt with explicitly as sqlite3 assumes there are
no keywords starting with the letter X.
That URL will now return information about the statements in
the query classifier cache. The information is collected using
the same map in a serial manner from all routing workers (that
each have their own cache). Since all caches will contains the
same statements, collecting the information in a serial manner
means that the overall memory consumption will be lower than
what it would be if the information was collected in parallel.