MXS-2473 Simplify regular expression settings documentation

The settings "match", "exclude" and "options" are now explained once
in the general documentation. The individual filter documentation refers
to the general explanation.
Esa Korhonen 2019-05-15 17:04:33 +03:00
parent 96a477ec89
commit bb706394f6
11 changed files with 148 additions and 248 deletions

@ -7,7 +7,7 @@ This filter was introduced in MariaDB MaxScale 2.3.0.
The `binlogfilter` can be combined with a `binlogrouter` service to selectively
replicate the binary log events to slave servers.
The filter uses two parameters, `match` and `exclude`, to decide which events
The filter uses two parameters, *match* and *exclude*, to decide which events
are replicated. If a binlog event does not match or is excluded, the event is
replaced with an empty data event. The empty event is always 35 bytes which
translates to a space reduction in most cases.
@ -18,33 +18,25 @@ that there are no ambiguities in the event filtering.
## Configuration
Both the `match` and `exclude` parameters are optional. If neither of them is
defined, the filter does nothing and all events are replicated.
### `match` and `exclude`
Both the *match* and *exclude* parameters are optional and work mostly as other
[typical regular expression parameters](../Getting-Started/Configuration-Guide.md#standard-regular-expression-settings-for-filters).
If neither of them is defined, the filter does nothing and all events are replicated. This
filter does not accept regular expression options as a separate parameter; such settings
must be defined in the patterns themselves. See the
[PCRE2 API documentation](https://www.pcre.org/current/doc/html/pcre2api.html#SEC20) for
more information.
The two parameters are matched against the database and table name concatenated
with a period. For example, the string the patterns are matched against for the
database `test` and table `t1` is `test.t1`.
For statement based replication, the pattern is matched against all the tables
in the statements. If any of the tables matches the `match` pattern, the event
is replicated. If any of the tables matches the `exclude` pattern, the event is
in the statements. If any of the tables matches the *match* pattern, the event
is replicated. If any of the tables matches the *exclude* pattern, the event is
not replicated.
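As a rough illustration of the above, the following sketch assumes a filter section named `BinlogFilter` (a hypothetical name) and the database `test` used in the earlier example. Because this filter has no separate *options* parameter, case-insensitivity is requested inside the pattern with `(?i)`.

```
[BinlogFilter]
type=filter
module=binlogfilter
match=/(?i)^test[.]/
exclude=/^test[.]t1$/
```

With these settings, events for tables in the database `test` would be expected to replicate, except those for `test.t1`, which the *exclude* pattern removes.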
### `match`
A [PCRE2 regular expression](../Getting-Started/Configuration-Guide.md#regular-expressions)
that is matched against the database and table name. If the pattern matches, the
event is replicated to the slave. If no `match` parameter is defined, all events
are considered to match.
### `exclude`
A [PCRE2 regular expression](../Getting-Started/Configuration-Guide.md#regular-expressions)
that is matched against the database and table name. If the pattern matches, the
event is excluded and is not replicated to the slave. If no `exclude` pattern is
defined, the event filtering is controlled completely by the `match` parameter.
## Example Configuration
With the following configuration, only events belonging to database `customers`

@ -35,22 +35,6 @@ comment. The `match`-comment typically has no effect, since write queries by
default trigger the filter anyway. It can be used to override an ignore-type
regular expression that would otherwise prevent triggering.
## Filter Options
The CCR filter accepts the following options.
|Option |Description |
|-----------|--------------------------------------------|
|ignorecase |Use case-insensitive matching (default) |
|case |Use case-sensitive matching |
|extended |Use extended regular expression syntax (ERE)|
To use multiple filter options, list them in a comma-separated list.
```
options=case,extended
```
## Filter Parameters
The CCR filter has no mandatory parameters.
@ -81,27 +65,17 @@ the counter reaches zero, the statements are routed normally. If a new data
modifying SQL statement is processed, the counter is reset to the value of
_count_.
### `match`
### `match`, `ignore` and `options`
An optional parameter that can be used to control which statements trigger the
statement re-routing. The parameter value is a regular expression that is used
to match against the SQL text. Only non-SELECT statements are inspected. If this
parameter is defined, *only* matching SQL-queries will trigger the filter
(assuming no ccr hint comments in the query).
These [regular expression settings](../Getting-Started/Configuration-Guide.md#standard-regular-expression-settings-for-filters)
control which statements trigger statement re-routing. Only non-SELECT statements are
inspected. For CCRFilter, the *exclude*-parameter is instead named *ignore*, yet works
similarly.
```
match=.*INSERT.*
```
### `ignore`
An optional parameter that can be used to control which statements don't trigger
the statement re-routing. This does the opposite of the _match_ parameter. The
parameter value is a regular expression that is used to match against the SQL
text. Only non-SELECT statements are inspected.
```
ignore=.*UPDATE.*
options=case,extended
```
## Example Configuration

@ -318,10 +318,11 @@ rule examplerule match not_function columns ssn
#### `regex`
This rule blocks all queries matching a regex enclosed in single or double
quotes. The regex string expects a PCRE2 syntax regular expression. For more
information about the PCRE2 syntax, read the [PCRE2
documentation](http://www.pcre.org/current/doc/html/pcre2syntax.html).
This rule blocks all queries matching the regular expression. The regex string expects a
PCRE2 syntax regular expression. For more information about PCRE2 syntax, read the
[PCRE2 documentation](http://www.pcre.org/current/doc/html/pcre2syntax.html). Unlike
typical MaxScale regex parameters, the value should be enclosed in single or double
quotes, not in `/.../`. Any compilation options must be included in the pattern itself.
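A minimal sketch of such a rule, with a compilation option embedded in the pattern, could look like the following. The rule name `no_drop_table` is hypothetical, and the rule would still need to be tied to users in the rule file as with any other rule.

```
rule no_drop_table match regex '(?i)^drop +table'
```

Here `(?i)` requests case-insensitive matching from within the pattern itself, since this rule type has no separate options setting.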
##### Example

@ -50,23 +50,29 @@ filters=NamedServerFilter
## Filter Parameters
The NamedServerFilter requires two mandatory parameters.
NamedServerFilter requires at least one *matchXY* - *targetXY* pair.
### `matchXY`
### `matchXY`, `options`
Regular expression the SQL-query is matched against. XY must be a number in the
range 01 - 25. Each *match* setting must have a similarly indexed *target*
setting.
*matchXY* defines a
[PCRE2 regular expression](../Getting-Started/Configuration-Guide.md#regular-expressions)
against which the incoming SQL query is matched. *XY* must be a number in the range
01 - 25. Each *match*-setting pairs with a similarly indexed *target*-setting. If one is
defined, the other must be defined as well. If a query matches the pattern, the filter
attaches a routing hint defined by the *target*-setting to the query. The
*options*-parameter affects how the patterns are compiled as
[usual](../Getting-Started/Configuration-Guide.md#standard-regular-expression-settings-for-filters).
```
match01=^SELECT
options=case,extended
```
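As a sketch of how indexed pairs fit together, the following assumes two servers named `server2` and `server3` (hypothetical names, not defined in this document). The more specific pattern gets the lower index so that it is tested first.

```
match01=^SELECT.*customers
target01=server2
match02=^SELECT
target02=server3
```

Under these assumptions, a `SELECT` mentioning `customers` would receive a hint to `server2`, while other `SELECT` queries would receive a hint to `server3`.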
### `targetXY`
This is the hint which will be attached to the queries matching the regex. If a
compatible router is used in the service the query will be routed accordingly.
The target can be one of the following:
The hint which is attached to the queries matching the regular expression defined by
*matchXY*. If a compatible router is used in the service the query will be routed
accordingly. The target can be one of the following:
* a server name (adds a `HINT_ROUTE_TO_NAMED_SERVER` hint)
* a list of server names, comma-separated (adds several
@ -115,26 +121,7 @@ names is simply left as is and routed straight through.
user=john
```
## Filter Options
The named server filter accepts the following options.
|Option |Description |
|----------|--------------------------------------------|
|ignorecase|Use case-insensitive matching (default) |
|case |Use case-sensitive matching |
|extended |Ignore white space and # comments |
To use multiple filter options, list them in a comma-separated list.
```
options=case,extended
```
**Note:** The *ignorecase* and *case* options are mutually exclusive and only
one of them should be used.
## Notes
## Additional remarks
The maximum number of accepted *match* - *target* pairs may be higher and can
change if other features are added to the filter. A minimum of 25 is guaranteed
@ -143,7 +130,7 @@ for now.
In the configuration, the indexed match and target settings may be in any order
and may skip numbers. During SQL-query matching, however, the regexes are tested
in ascending order: match01, match02, match03 and so on. As soon as a match is
found for a qiven query, the routing hints are written and the packet is
found for a given query, the routing hints are written and the packet is
forwarded to the next filter or router. Any possibly remaining match regexes are
ignored. This means the *match* - *target* pairs should be indexed in priority
order, or, if priority is not a factor, in order of decreasing match

@ -23,28 +23,6 @@ password=mypasswd
filters=MyLogFilter
```
## Filter Options
The QLA filter accepts the following options.
Option | Description
-------| -----------
ignorecase | Use case-insensitive matching
case | Use case-sensitive matching
extended | Use extended regular expression syntax (ERE)
To use multiple filter options, list them in a comma-separated list. If no
options are given, default will be used. Multiple options can be enabled
simultaneously.
```
options=case,extended
```
**Note**: older the version of the QLA filter in 0.7 of MariaDB MaxScale used
the `options` to define the location of the log files. This functionality is not
supported anymore and the `filebase` parameter should be used instead.
## Filter Parameters
The QLA filter has one mandatory parameter, `filebase`, and a number of optional
@ -60,25 +38,18 @@ added to the filename for each written session file. For unified log files,
filebase=/tmp/SqlQueryLog
```
The filebase may also be set as the filter option. If both option and parameter
are set, the parameter setting will be used and the filter option ignored.
### `match`, `exclude` and `options`
### `match` and `exclude`
These optional parameters limit logging on a query level. The parameter values
are regular expressions which are matched against the SQL query text. Only SQL
statements that match the regular expression in *match* but do not match the
*exclude* expression are logged.
These
[regular expression settings](../Getting-Started/Configuration-Guide.md#standard-regular-expression-settings-for-filters)
limit which queries are logged.
```
match=select.*from.*customer.*where
exclude=^insert
options=case,extended
```
*match* is checked before *exclude*. If *match* is empty, all queries are
considered matching. If *exclude* is empty, no query is exluded. If both are
empty, all queries are logged.
### `user` and `source`
These optional parameters limit logging on a session level. If `user` is

@ -28,27 +28,29 @@ password=mypasswd
filters=MyRegexfilter
```
## Filter Options
The Regex filter accepts the options ignorecase or case. These define if the pattern text should take the case of the string it is matching against into consideration or not.
## Filter Parameters
The Regex filter requires two mandatory parameters to be defined.
The Regex filter has two mandatory parameters: *match* and *replace*.
### `match`
### `match`, `options`
A parameter that can be used to match text in the SQL statement which should be replaced.
*match* is a
[PCRE2 regular expression](../Getting-Started/Configuration-Guide.md#regular-expressions)
which defines the text in the SQL statements that is replaced.
The *options*-parameter affects how the patterns are compiled as
[usual](../Getting-Started/Configuration-Guide.md#standard-regular-expression-settings-for-filters).
Regex filter does not support the `extended`-option.
```
match=TYPE[ ]*=
options=case
```
If the filter option ignorecase is used all regular expressions are evaluated with the option to ignore the case of the text, therefore a match option of select will match both type, TYPE and any form of the word with upper or lowercase characters.
### `replace`
The replace parameter defines the text that should replace the text in the SQL text which matches the match.
This is the text that should replace the part of the SQL-query matching the pattern
defined in *match*.
```
replace=ENGINE =

@ -36,46 +36,14 @@ filters=DataMartFilter
The tee filter requires a mandatory parameter to define the service to replicate
statements to and accepts a number of optional parameters.
### `match`
### `match`, `exclude` and `options`
An optional parameter used to limit the queries that will be replicated by the
tee filter. The parameter value is a PCRE2 regular expression that is used to
match against the SQL text. Only SQL statements that match the text passed as
the value of this parameter will be sent to the service defined in the filter
section.
These [regular expression settings](../Getting-Started/Configuration-Guide.md#standard-regular-expression-settings-for-filters)
limit the queries replicated by the tee filter.
```
match=/insert.*into.*order*/
```
### `exclude`
An optional parameter used to limit the queries that will be replicated by the
tee filter. The parameter value is a PCRE2 regular expression that is used to
match against the SQL text. Any SQL statements that match the text passed as the
value of this parameter will be excluded from the replication stream.
```
exclude=/select.*from.*t1/
```
If both `match` and `exclude` parameters are defined, `exclude` takes
precedence.
### `options`
The options parameter controls the regular expression options. The following
options are accepted.
|Option |Description |
|----------|--------------------------------------------|
|ignorecase|Use case-insensitive matching |
|case |Use case-sensitive matching |
|extended |Use extended regular expression syntax (ERE)|
To use multiple filter options, list them in a comma-separated list.
```
options=case,extended
```

@ -9,7 +9,7 @@ Table of Contents
* [Filter Parameters](#filter-parameters)
* [filebase](#filebase)
* [count](#count)
* [match](#match)
* [match, exclude and options](#match-exclude-and-options)
* [exclude](#exclude)
* [source](#source)
* [user](#user)
@ -44,22 +44,6 @@ password=mypasswd
filters=MyLogFilter
```
### Filter Options
The top filter accepts the following options.
|Option |Description |
|----------|--------------------------------------------|
|ignorecase|Use case-insensitive matching |
|case |Use case-sensitive matching |
|extended |Use extended regular expression syntax (ERE)|
To use multiple filter options, list them in a comma-separated list.
```
options=case,extended
```
### Filter Parameters
The top filter has one mandatory parameter, `filebase`, and a number of optional
@ -88,36 +72,17 @@ count=30
The default value for the number of statements recorded is 10.
#### `match`
#### `match`, `exclude` and `options`
An optional parameter that can be used to limit the queries that will be logged
by the top filter. The parameter value is a regular expression that is used to
match against the SQL text. Only SQL statements that matches the text passed as
the value of this parameter will be logged.
These [regular expression settings](../Getting-Started/Configuration-Guide.md#standard-regular-expression-settings-for-filters)
limit the queries logged by the top filter.
```
match=select.*from.*customer.*where
```
All regular expressions are evaluated with the option to ignore the case of the
text, therefore a match option of select will match both select, SELECT and any
form of the word with upper or lowercase characters.
#### `exclude`
An optional parameter that can be used to limit the queries that will be logged
by the top filter. The parameter value is a regular expression that is used to
match against the SQL text. SQL statements that match the text passed as the
value of this parameter will be excluded from the log output.
```
exclude=where
options=case,extended
```
All regular expressions are evaluated with the option to ignore the case of the
text, therefore an exclude option of select will exclude statements that contain
both where, WHERE or any form of the word with upper or lowercase characters.
#### `source`
The optional source parameter defines an address that is used to match against

@ -247,16 +247,66 @@ max_size=1000000M
max_size=1000G
max_size=1T
```
#### Regular Expressions
When a regular expression (regex) parameter is accepted, the pattern string
should be enclosed in slashes e.g. `match=/^select/` defines the pattern
`^select`. The slashes allow whitespace to be read from the ends of the regex
string contrary to a normal string parameter and are removed before compiling
the pattern. For backwards compatibility, the slashes are not yet mandatory.
Omitting them is, however, deprecated and will be rejected in the next release
of MaxScale. Currently, *QLAFilter* accepts parameters in regular expression
form.
Many modules have settings which accept a regular expression. In most cases, these
settings are named either *match* or *exclude*, and are used to filter users or queries.
MaxScale uses the [PCRE2-library](https://www.pcre.org/current/doc/html/) for matching
regular expressions.
When writing a regular expression (regex) type parameter to a MaxScale configuration file,
the pattern string should be enclosed in slashes e.g. `^select` -> `match=/^select/`. This
clarifies where the pattern begins and ends, even if it includes whitespace. Without
slashes, the configuration loader trims whitespace from the ends of the pattern. The slashes are removed
before compiling the pattern. For backwards compatibility, the slashes are not yet
mandatory. Omitting them is, however, deprecated and will be rejected in a future release
of MaxScale. Currently, *binlogfilter*, *ccrfilter*, *qlafilter*, *tee* and *avrorouter*
accept parameters in this type of regular expression form. Some other modules may not
yet handle the slashes correctly.
PCRE2 supports a complicated regular expression
[syntax](https://www.pcre.org/current/doc/html/pcre2syntax.html). MaxScale typically uses
regular expressions in a simple way, only checking whether the pattern matches the subject at
some point. For example, using the QLAFilter and setting `match=/SELECT/` causes the filter to
accept any query with the text "SELECT" somewhere within. To force the pattern to only
match at the beginning of the query, set `match=/^SELECT/`. To only match the end, set
`match=/SELECT$/`.
Modules which accept regular expression parameters also often accept options which affect
how the patterns are compiled. Typically, this setting is called *options* and accepts
values such as `ignorecase`, `case` and `extended`. `ignorecase` causes the regular
expression matcher to ignore letter case, and is often on by default. `extended` ignores
whitespace in the pattern. `case` turns on case-sensitive matching. These settings can
also be defined in the pattern itself, so they can be used even in modules without
pattern compilation settings. The pattern settings are `(?i)` for `ignorecase` and `(?x)`
for `extended`. See the
[PCRE2 API documentation](https://www.pcre.org/current/doc/html/pcre2api.html#SEC20)
for more information.
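To illustrate in-pattern settings, here is a sketch using the QLA filter, reusing the section name `MyLogFilter` and the `filebase` value from the QLA filter documentation. The pattern carries its own options, so no *options* line is needed.

```
[MyLogFilter]
type=filter
module=qlafilter
filebase=/tmp/SqlQueryLog
# (?i) ignores letter case and (?x) ignores whitespace in the pattern,
# so this is read as the pattern ^select.*from.*customer
match=/(?ix) ^select .* from .* customer/
```

The enclosing slashes keep the spaces inside the pattern intact until it is compiled, at which point `(?x)` tells PCRE2 to ignore them.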
##### Standard regular expression settings for filters
Many filters use the settings *match*, *exclude* and *options*. Since these settings are
used in a similar way across these filters, the settings are explained here. The
documentation of each filter links here and describes any exceptions to this
generalized explanation.
These settings typically limit the queries the filter module acts on. *match* and
*exclude* define PCRE2 regular expression patterns while *options* affects how both of the
patterns are compiled. *options* works as explained above, accepting the values
`ignorecase`, `case` and `extended`, with `ignorecase` being the default.
The queries are matched as they arrive at the filter on their way to a routing module. If
*match* is defined, the filter only acts on queries matching that pattern. If *match* is
not defined, all queries are considered to match.
If *exclude* is defined, the filter only acts on queries not matching that pattern. If
*exclude* is not defined, nothing is excluded.
If both are defined, the query needs to match *match* but not match *exclude*.
Even if a filter does not act on a query, the query is not lost. The query is simply
passed on to the next module in the processing chain as if the filter was not there.
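The combined effect of *match* and *exclude* can be sketched with the QLA filter. The section name and `filebase` value are taken from the QLA filter documentation, while the patterns and queries are only illustrative.

```
[MyLogFilter]
type=filter
module=qlafilter
filebase=/tmp/SqlQueryLog
match=/^select/
exclude=/from.*customer/
```

Assuming the default `ignorecase` option, the filter would be expected to treat queries as follows:

* `SELECT * FROM orders` matches *match* and not *exclude*, so it is logged.
* `SELECT * FROM customer` matches both patterns, so it is not logged.
* `INSERT INTO orders VALUES (1)` does not match *match*, so it is not logged.

In all three cases the query itself continues to the next module unchanged.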
### Global Settings

@ -29,8 +29,7 @@ Table of Contents
* [Router Parameters](#router-parameters)
* [source](#source)
* [codec](#codec)
* [match](#match)
* [exclude](#exclude)
* [match and exclude](#match-and-exclude)
* [Router Options](#router-options)
* [General Options](#general-options)
* [binlogdir](#binlogdir)
@ -104,39 +103,11 @@ _deflate_. These are the mandatory compression algorithms required by the
Avro specification. For more information about the compression types,
refer to the [Avro specification](https://avro.apache.org/docs/current/spec.html#Required+Codecs).
#### `match`
#### `match` and `exclude`
Only process events for tables that match this PCRE2 regular expression. See
[Regular Expressions](../Getting-Started/Configuration-Guide.md#regular-expressions)
for more information about regular expressions.
This parameter was added in MaxScale 2.2.14.
#### `exclude`
Ignore events for tables that match this PCRE2 regular expression. This can be
combined with the `match` parameter to implement table event filtering.
This parameter was added in MaxScale 2.2.14.
**Note:** Since the 2.1 version of MaxScale, all of the router options can also
be defined as parameters.
```
[replication-router]
type=service
router=binlogrouter
router_options=server-id=4000,binlogdir=/var/lib/mysql,filestem=binlog
user=maxuser
password=maxpwd
[avro-router]
type=service
router=avrorouter
binlogdir=/var/lib/mysql
filestem=binlog
avrodir=/var/lib/maxscale
```
These [regular expression settings](../Getting-Started/Configuration-Guide.md#standard-regular-expression-settings-for-filters)
filter events for processing depending on table names. Avrorouter does not support the
*options*-parameter for regular expressions.
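A sketch of how these settings might be used, building on the `avro-router` service shown in this document, follows. It assumes that table names are presented to the patterns in `database.table` form. The database `test` and the `tmp` prefix are hypothetical.

```
[avro-router]
type=service
router=avrorouter
binlogdir=/var/lib/mysql
filestem=binlog
avrodir=/var/lib/maxscale
match=/^test[.]/
exclude=/^test[.]tmp/
```

Under that assumption, only events for tables in `test` would be processed, skipping tables whose names begin with `tmp`. Since *options* is not supported here, any compilation settings such as `(?i)` would have to be written into the patterns.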
### Router Options
@ -202,6 +173,24 @@ currently, if used with Avrorouter, the option `mariadb10_master_gtid` must be
set to off in the Binlog Server configuration in order to correctly read the
binlog files.
##### Example configuration
```
[replication-router]
type=service
router=binlogrouter
router_options=server-id=4000,binlogdir=/var/lib/mysql,filestem=binlog
user=maxuser
password=maxpwd
[avro-router]
type=service
router=avrorouter
binlogdir=/var/lib/mysql
filestem=binlog
avrodir=/var/lib/maxscale
```
#### Avro file options
These options control how large the Avro file data blocks can get.

@ -123,8 +123,9 @@ List of databases to ignore when checking for duplicate databases.
### `ignore_databases_regex`
Regular expression that is matched against database names when checking for
duplicate databases.
A
[PCRE2 regular expression](../Getting-Started/Configuration-Guide.md#regular-expressions)
that is matched against database names when checking for duplicate databases.
### `preferred_server`