From bb706394f69c90bdd5604abec3f0ffaac5877c27 Mon Sep 17 00:00:00 2001 From: Esa Korhonen Date: Wed, 15 May 2019 17:04:33 +0300 Subject: [PATCH] MXS-2473 Simplify regular expression settings documentation The settings "match", "exclude" and "options" are now explained once in the general documentation. The individual filter documentation refers to the general explanation. --- Documentation/Filters/BinlogFilter.md | 32 ++++----- Documentation/Filters/CCRFilter.md | 38 ++--------- .../Filters/Database-Firewall-Filter.md | 9 +-- Documentation/Filters/Named-Server-Filter.md | 45 +++++-------- Documentation/Filters/Query-Log-All-Filter.md | 39 ++--------- Documentation/Filters/Regex-Filter.md | 22 ++++--- Documentation/Filters/Tee-Filter.md | 38 +---------- Documentation/Filters/Top-N-Filter.md | 45 ++----------- .../Getting-Started/Configuration-Guide.md | 66 ++++++++++++++++--- Documentation/Routers/Avrorouter.md | 57 +++++++--------- Documentation/Routers/SchemaRouter.md | 5 +- 11 files changed, 148 insertions(+), 248 deletions(-) diff --git a/Documentation/Filters/BinlogFilter.md b/Documentation/Filters/BinlogFilter.md index e4e180e98..e8b80b3f5 100644 --- a/Documentation/Filters/BinlogFilter.md +++ b/Documentation/Filters/BinlogFilter.md @@ -7,7 +7,7 @@ This filter was introduced in MariaDB MaxScale 2.3.0. The `binlogfilter` can be combined with a `binlogrouter` service to selectively replicate the binary log events to slave servers. -The filter uses two parameters, `match` and `exclude`, to decide which events +The filter uses two parameters, *match* and *exclude*, to decide which events are replicated. If a binlog event does not match or is excluded, the event is replaced with an empty data event. The empty event is always 35 bytes which translates to a space reduction in most cases. @@ -18,33 +18,25 @@ that there are no ambiguities in the event filtering. ## Configuration -Both the `match` and `exclude` parameters are optional. 
If neither of them is -defined, the filter does nothing and all events are replicated. +### `match` and `exclude` + +Both the *match* and *exclude* parameters are optional and work mostly as other +[typical regular expression parameters](../Getting-Started/Configuration-Guide.md#standard-regular-expression-settings-for-filters). +If neither of them is defined, the filter does nothing and all events are replicated. This +filter does not accept regular expression options as a separate parameter, such settings +must be defined in the patterns themselves. See the +[PCRE2 api documentation](https://www.pcre.org/current/doc/html/pcre2api.html#SEC20) for +more information. The two parameters are matched against the database and table name concatenated with a period. For example, the string the patterns are matched against for the database `test` and table `t1` is `test.t1`. For statement based replication, the pattern is matched against all the tables -in the statements. If any of the tables matches the `match` pattern, the event -is replicated. If any of the tables matches the `exclude` pattern, the event is +in the statements. If any of the tables matches the *match* pattern, the event +is replicated. If any of the tables matches the *exclude* pattern, the event is not replicated. - -### `match` - -A [PCRE2 regular expression](../Getting-Started/Configuration-Guide.md#regular-expressions) -that is matched against the database and table name. If the pattern matches, the -event is replicated to the slave. If no `match` parameter is defined, all events -are considered to match. - -### `exclude` - -A [PCRE2 regular expression](../Getting-Started/Configuration-Guide.md#regular-expressions) -that is matched against the database and table name. If the pattern matches, the -event is excluded and is not replicated to the slave. If no `exclude` pattern is -defined, the event filtering is controlled completely by the `match` parameter. 
- ## Example Configuration With the following configuration, only events belonging to database `customers` diff --git a/Documentation/Filters/CCRFilter.md b/Documentation/Filters/CCRFilter.md index 3028e1af3..a8ba14152 100644 --- a/Documentation/Filters/CCRFilter.md +++ b/Documentation/Filters/CCRFilter.md @@ -35,22 +35,6 @@ comment. The `match`-comment typically has no effect, since write queries by default trigger the filter anyway. It can be used to override an ignore-type regular expression that would othewise prevent triggering. -## Filter Options - -The CCR filter accepts the following options. - -|Option |Description | -|-----------|--------------------------------------------| -|ignorecase |Use case-insensitive matching (default) | -|case |Use case-sensitive matching | -|extended |Use extended regular expression syntax (ERE)| - -To use multiple filter options, list them in a comma-separated list. - -``` -options=case,extended -``` - ## Filter Parameters The CCR filter has no mandatory parameters. @@ -81,27 +65,17 @@ the counter reaches zero, the statements are routed normally. If a new data modifying SQL statement is processed, the counter is reset to the value of _count_. -### `match` +### `match`, `ignore` and `options` -An optional parameter that can be used to control which statements trigger the -statement re-routing. The parameter value is a regular expression that is used -to match against the SQL text. Only non-SELECT statements are inspected. If this -parameter is defined, *only* matching SQL-queries will trigger the filter -(assuming no ccr hint comments in the query). +These [regular expression settings](../Getting-Started/Configuration-Guide.md#standard-regular-expression-settings-for-filters) +control which statements trigger statement re-routing. Only non-SELECT statements are +inspected. For CCRFilter, the *exclude*-parameter is instead named *ignore*, yet works +similarly. 
``` match=.*INSERT.* -``` - -### `ignore` - -An optional parameter that can be used to control which statements don't trigger -the statement re-routing. This does the opposite of the _match_ parameter. The -parameter value is a regular expression that is used to match against the SQL -text. Only non-SELECT statements are inspected. - -``` ignore=.*UPDATE.* +options=case,extended ``` ## Example Configuration diff --git a/Documentation/Filters/Database-Firewall-Filter.md b/Documentation/Filters/Database-Firewall-Filter.md index 9d575f35f..6929318fd 100644 --- a/Documentation/Filters/Database-Firewall-Filter.md +++ b/Documentation/Filters/Database-Firewall-Filter.md @@ -318,10 +318,11 @@ rule examplerule match not_function columns ssn #### `regex` -This rule blocks all queries matching a regex enclosed in single or double -quotes. The regex string expects a PCRE2 syntax regular expression. For more -information about the PCRE2 syntax, read the [PCRE2 -documentation](http://www.pcre.org/current/doc/html/pcre2syntax.html). +This rule blocks all queries matching the regular expression. The regex string expects a +PCRE2 syntax regular expression. For more information about PCRE2 syntax, read the +[PCRE2 documentation](http://www.pcre.org/current/doc/html/pcre2syntax.html). Unlike +typical MaxScale regex parameters, the value should be enclosed in single or double +quotes, not in `/.../`. Any compilation options must be included in the pattern itself. ##### Example diff --git a/Documentation/Filters/Named-Server-Filter.md b/Documentation/Filters/Named-Server-Filter.md index 458b1bc1c..b8fe7c7e4 100644 --- a/Documentation/Filters/Named-Server-Filter.md +++ b/Documentation/Filters/Named-Server-Filter.md @@ -50,23 +50,29 @@ filters=NamedServerFilter ## Filter Parameters -The NamedServerFilter requires two mandatory parameters. +NamedServerFilter requires at least one *matchXY* - *targetXY* pair. 
-### `matchXY` +### `matchXY`, `options` -Regular expression the SQL-query is matched against. XY must be a number in the -range 01 - 25. Each *match* setting must have a similarly indexed *target* -setting. +*matchXY* defines a +[PCRE2 regular expression](../Getting-Started/Configuration-Guide.md#regular-expressions) +against which the incoming SQL query is matched. *XY* must be a number in the range +01 - 25. Each *match*-setting pairs with a similarly indexed *target*- setting. If one is +defined, the other must be defined as well. If a query matches the pattern, the filter +attaches a routing hint defined by the *target*-setting to the query. The +*options*-parameter affects how the patterns are compiled as +[usual](../Getting-Started/Configuration-Guide.md#standard-regular-expression-settings-for-filters). ``` match01=^SELECT +options=case,extended ``` ### `targetXY` -This is the hint which will be attached to the queries matching the regex. If a -compatible router is used in the service the query will be routed accordingly. -The target can be one of the following: +The hint which is attached to the queries matching the regular expression defined by +*matchXY*. If a compatible router is used in the service the query will be routed +accordingly. The target can be one of the following: * a server name (adds a `HINT_ROUTE_TO_NAMED_SERVER` hint) * a list of server names, comma-separated (adds several @@ -115,26 +121,7 @@ names is simply left as is and routed straight through. user=john ``` -## Filter Options - -The named server filter accepts the following options. - -|Option |Description | -|----------|--------------------------------------------| -|ignorecase|Use case-insensitive matching (default) | -|case |Use case-sensitive matching | -|extended |Ignore white space and # comments | - -To use multiple filter options, list them in a comma-separated list. 
- -``` -options=case,extended -``` - -**Note:** The *ignorecase* and *case* options are mutually exclusive and only -one of them should be used. - -## Notes +## Additional remarks The maximum number of accepted *match* - *target* pairs may be higher and can change if other features are added to the filter. A minimum of 25 is guaranteed @@ -143,7 +130,7 @@ for now. In the configuration, the indexed match and target settings may be in any order and may skip numbers. During SQL-query matching, however, the regexes are tested in ascending order: match01, match02, match03 and so on. As soon as a match is -found for a qiven query, the routing hints are written and the packet is +found for a given query, the routing hints are written and the packet is forwarded to the next filter or router. Any possibly remaining match regexes are ignored. This means the *match* - *target* pairs should be indexed in priority order, or, if priority is not a factor, in order of decreasing match diff --git a/Documentation/Filters/Query-Log-All-Filter.md b/Documentation/Filters/Query-Log-All-Filter.md index 9490ca84c..17ad382c8 100644 --- a/Documentation/Filters/Query-Log-All-Filter.md +++ b/Documentation/Filters/Query-Log-All-Filter.md @@ -23,28 +23,6 @@ password=mypasswd filters=MyLogFilter ``` -## Filter Options - -The QLA filter accepts the following options. - - Option | Description - -------| ----------- - ignorecase | Use case-insensitive matching - case | Use case-sensitive matching - extended | Use extended regular expression syntax (ERE) - -To use multiple filter options, list them in a comma-separated list. If no -options are given, default will be used. Multiple options can be enabled -simultaneously. - -``` -options=case,extended -``` - -**Note**: older the version of the QLA filter in 0.7 of MariaDB MaxScale used -the `options` to define the location of the log files. This functionality is not -supported anymore and the `filebase` parameter should be used instead. 
- ## Filter Parameters The QLA filter has one mandatory parameter, `filebase`, and a number of optional @@ -60,25 +38,18 @@ added to the filename for each written session file. For unified log files, filebase=/tmp/SqlQueryLog ``` -The filebase may also be set as the filter option. If both option and parameter -are set, the parameter setting will be used and the filter option ignored. +### `match`, `exclude` and `options` -### `match` and `exclude` - -These optional parameters limit logging on a query level. The parameter values -are regular expressions which are matched against the SQL query text. Only SQL -statements that match the regular expression in *match* but do not match the -*exclude* expression are logged. +These +[regular expression settings](../Getting-Started/Configuration-Guide.md#standard-regular-expression-settings-for-filters) +limit which queries are logged. ``` match=select.*from.*customer.*where exclude=^insert +options=case,extended ``` -*match* is checked before *exclude*. If *match* is empty, all queries are -considered matching. If *exclude* is empty, no query is exluded. If both are -empty, all queries are logged. - ### `user` and `source` These optional parameters limit logging on a session level. If `user` is diff --git a/Documentation/Filters/Regex-Filter.md b/Documentation/Filters/Regex-Filter.md index 263746bd3..bf88b94b4 100644 --- a/Documentation/Filters/Regex-Filter.md +++ b/Documentation/Filters/Regex-Filter.md @@ -28,27 +28,29 @@ password=mypasswd filters=MyRegexfilter ``` -## Filter Options - -The Regex filter accepts the options ignorecase or case. These define if the pattern text should take the case of the string it is matching against into consideration or not. - ## Filter Parameters -The Regex filter requires two mandatory parameters to be defined. +The Regex filter has two mandatory parameters: *match* and *replace*. 
-### `match` +### `match`, `options` -A parameter that can be used to match text in the SQL statement which should be replaced. +*match* is a +[PCRE2 regular expression](../Getting-Started/Configuration-Guide.md#regular-expressions) +which defines the text in the SQL statements that is replaced. + +The *options*-parameter affects how the patterns are compiled as +[usual](../Getting-Started/Configuration-Guide.md#standard-regular-expression-settings-for-filters). +Regex filter does not support the `extended`-option. ``` match=TYPE[ ]*= +options=case ``` -If the filter option ignorecase is used all regular expressions are evaluated with the option to ignore the case of the text, therefore a match option of select will match both type, TYPE and any form of the word with upper or lowercase characters. - ### `replace` -The replace parameter defines the text that should replace the text in the SQL text which matches the match. +This is the text that should replace the part of the SQL-query matching the pattern +defined in *match*. ``` replace=ENGINE = diff --git a/Documentation/Filters/Tee-Filter.md b/Documentation/Filters/Tee-Filter.md index 4c75d2a36..f4bb7fa32 100644 --- a/Documentation/Filters/Tee-Filter.md +++ b/Documentation/Filters/Tee-Filter.md @@ -36,46 +36,14 @@ filters=DataMartFilter The tee filter requires a mandatory parameter to define the service to replicate statements to and accepts a number of optional parameters. -### `match` +### `match`, `exclude` and `options` -An optional parameter used to limit the queries that will be replicated by the -tee filter. The parameter value is a PCRE2 regular expression that is used to -match against the SQL text. Only SQL statements that match the text passed as -the value of this parameter will be sent to the service defined in the filter -section. 
+These [regular expression settings](../Getting-Started/Configuration-Guide.md#standard-regular-expression-settings-for-filters) +limit the queries replicated by the tee filter. ``` match=/insert.*into.*order*/ -``` - -### `exclude` - -An optional parameter used to limit the queries that will be replicated by the -tee filter. The parameter value is a PCRE2 regular expression that is used to -match against the SQL text. Any SQL statements that match the text passed as the -value of this parameter will be excluded from the replication stream. - -``` exclude=/select.*from.*t1/ -``` - -If both `match` and `exclude` parameters are defined, `exclude` takes -precedence. - -### `options` - -The options parameter controls the regular expression options. The following -options are accepted. - -|Option |Description | -|----------|--------------------------------------------| -|ignorecase|Use case-insensitive matching | -|case |Use case-sensitive matching | -|extended |Use extended regular expression syntax (ERE)| - -To use multiple filter options, list them in a comma-separated list. - -``` options=case,extended ``` diff --git a/Documentation/Filters/Top-N-Filter.md b/Documentation/Filters/Top-N-Filter.md index c35e7bfbb..6f46f0634 100644 --- a/Documentation/Filters/Top-N-Filter.md +++ b/Documentation/Filters/Top-N-Filter.md @@ -9,7 +9,7 @@ Table of Contents * [Filter Parameters](#filter-parameters) * [filebase](#filebase) * [count](#count) - * [match](#match) + * [match, exclude and options](#match-exclude-and-options) * [exclude](#exclude) * [source](#source) * [user](#user) @@ -44,22 +44,6 @@ password=mypasswd filters=MyLogFilter ``` -### Filter Options - -The top filter accepts the following options. 
- -|Option |Description | -|----------|--------------------------------------------| -|ignorecase|Use case-insensitive matching | -|case |Use case-sensitive matching | -|extended |Use extended regular expression syntax (ERE)| - -To use multiple filter options, list them in a comma-separated list. - -``` -options=case,extended -``` - ### Filter Parameters The top filter has one mandatory parameter, `filebase`, and a number of optional @@ -88,36 +72,17 @@ count=30 The default value for the number of statements recorded is 10. -#### `match` +#### `match`, `exclude` and `options` -An optional parameter that can be used to limit the queries that will be logged -by the top filter. The parameter value is a regular expression that is used to -match against the SQL text. Only SQL statements that matches the text passed as -the value of this parameter will be logged. +These [regular expression settings](../Getting-Started/Configuration-Guide.md#standard-regular-expression-settings-for-filters) +limit the queries logged by the top filter. ``` match=select.*from.*customer.*where -``` - -All regular expressions are evaluated with the option to ignore the case of the -text, therefore a match option of select will match both select, SELECT and any -form of the word with upper or lowercase characters. - -#### `exclude` - -An optional parameter that can be used to limit the queries that will be logged -by the top filter. The parameter value is a regular expression that is used to -match against the SQL text. SQL statements that match the text passed as the -value of this parameter will be excluded from the log output. - -``` exclude=where +options=case,extended ``` -All regular expressions are evaluated with the option to ignore the case of the -text, therefore an exclude option of select will exclude statements that contain -both where, WHERE or any form of the word with upper or lowercase characters. 
- #### `source` The optional source parameter defines an address that is used to match against diff --git a/Documentation/Getting-Started/Configuration-Guide.md b/Documentation/Getting-Started/Configuration-Guide.md index 4e77370f7..101d41227 100644 --- a/Documentation/Getting-Started/Configuration-Guide.md +++ b/Documentation/Getting-Started/Configuration-Guide.md @@ -247,16 +247,66 @@ max_size=1000000M max_size=1000G max_size=1T ``` + #### Regular Expressions -When a regular expression (regex) parameter is accepted, the pattern string -should be enclosed in slashes e.g. `match=/^select/` defines the pattern -`^select`. The slashes allow whitespace to be read from the ends of the regex -string contrary to a normal string parameter and are removed before compiling -the pattern. For backwards compatibility, the slashes are not yet mandatory. -Omitting them is, however, deprecated and will be rejected in the next release -of MaxScale. Currently, *QLAFilter* accepts parameters in regular expression -form. +Many modules have settings which accept a regular expression. In most cases, these +settings are named either *match* or *exclude*, and are used to filter users or queries. +MaxScale uses the [PCRE2-library](https://www.pcre.org/current/doc/html/) for matching +regular expressions. + +When writing a regular expression (regex) type parameter to a MaxScale configuration file, +the pattern string should be enclosed in slashes e.g. `^select` -> `match=/^select/`. This +clarifies where the pattern begins and ends, even if it includes whitespace. Without +slashes the configuration loader trims the pattern from the ends. The slashes are removed +before compiling the pattern. For backwards compatibility, the slashes are not yet +mandatory. Omitting them is, however, deprecated and will be rejected in a future release +of MaxScale. Currently, *binlogfilter*, *ccrfilter*, *qlafilter*, *tee* and *avrorouter* +accept parameters in this type of regular expression form. 
Some other modules may not +handle the slashes correctly yet. + +PCRE2 supports a complicated regular expression +[syntax](https://www.pcre.org/current/doc/html/pcre2syntax.html). MaxScale typically uses +regular expressions simply, only checking whether the pattern and subject match at some +point. For example, using the QLAFilter and setting `match=/SELECT/` causes the filter to +accept any query with the text "SELECT" somewhere within. To force the pattern to only +match at the beginning of the query, set `match=/^SELECT/`. To only match the end, set +`match=/SELECT$/`. + +Modules which accept regular expression parameters also often accept options which affect +how the patterns are compiled. Typically, this setting is called *options* and accepts +values such as `ignorecase`, `case` and `extended`. `ignorecase` causes the regular +expression matcher to ignore letter case, and is often on by default. `extended` ignores +whitespace in the pattern. `case` turns on case-sensitive matching. These settings can +also be defined in the pattern itself, so they can be used even in modules without +pattern compilation settings. The pattern settings are `(?i)` for `ignorecase` and `(?x)` +for `extended`. See the +[PCRE2 api documentation](https://www.pcre.org/current/doc/html/pcre2api.html#SEC20) +for more information. + +##### Standard regular expression settings for filters + +Many filters use the settings *match*, *exclude* and *options*. Since these settings are +used in a similar way across these filters, the settings are explained here. The +documentation of the filters links here and describes any exceptions to this +generalized explanation. + +These settings typically limit the queries the filter module acts on. *match* and +*exclude* define PCRE2 regular expression patterns while *options* affects how both of the +patterns are compiled. *options* works as explained above, accepting the values +`ignorecase`, `case` and `extended`, with `ignorecase` being the default.
+ +The queries are matched as they arrive to the filter on their way to a routing module. If +*match* is defined, the filter only acts on queries matching that pattern. If *match* is +not defined, all queries are considered to match. + +If *exclude* is defined, the filter only acts on queries not matching that pattern. If +*exclude* is not defined, nothing is excluded. + +If both are defined, the query needs to match *match* but not match *exclude*. + +Even if a filter does not act on a query, the query is not lost. The query is simply +passed on to the next module in the processing chain as if the filter was not there. ### Global Settings diff --git a/Documentation/Routers/Avrorouter.md b/Documentation/Routers/Avrorouter.md index 02589bab6..8a443cc75 100644 --- a/Documentation/Routers/Avrorouter.md +++ b/Documentation/Routers/Avrorouter.md @@ -29,8 +29,7 @@ Table of Contents * [Router Parameters](#router-parameters) * [source](#source) * [codec](#codec) - * [match](#match) - * [exclude](#exclude) + * [match and exclude](#match-and-exclude) * [Router Options](#router-options) * [General Options](#general-options) * [binlogdir](#binlogdir) @@ -104,39 +103,11 @@ _deflate_. These are the mandatory compression algorithms required by the Avro specification. For more information about the compression types, refer to the [Avro specification](https://avro.apache.org/docs/current/spec.html#Required+Codecs). -#### `match` +#### `match` and `exclude` -Only process events for tables that match this PCRE2 regular expression. See -[Regular Expressions](../Getting-Started/Configuration-Guide.md#regular-expressions) -for more information about regular expressions. - -This parameter was added in MaxScale 2.2.14. - -#### `exclude` - -Ignore events for tables that match this PCRE2 regular expression. This can be -combined with the `match` parameter to implement table event filtering. - -This parameter was added in MaxScale 2.2.14. 
- -**Note:** Since the 2.1 version of MaxScale, all of the router options can also -be defined as parameters. - -``` -[replication-router] -type=service -router=binlogrouter -router_options=server-id=4000,binlogdir=/var/lib/mysql,filestem=binlog -user=maxuser -password=maxpwd - -[avro-router] -type=service -router=avrorouter -binlogdir=/var/lib/mysql -filestem=binlog -avrodir=/var/lib/maxscale -``` +These [regular expression settings](../Getting-Started/Configuration-Guide.md#standard-regular-expression-settings-for-filters) +filter events for processing depending on table names. Avrorouter does not support the +*options*-parameter for regular expressions. ### Router Options @@ -202,6 +173,24 @@ currently, if used with Avrorouter, the option `mariadb10_master_gtid` must be set to off in the Binlog Server configuration in order to correclty read the binlog files. +##### Example configuration + +``` +[replication-router] +type=service +router=binlogrouter +router_options=server-id=4000,binlogdir=/var/lib/mysql,filestem=binlog +user=maxuser +password=maxpwd + +[avro-router] +type=service +router=avrorouter +binlogdir=/var/lib/mysql +filestem=binlog +avrodir=/var/lib/maxscale +``` + #### Avro file options These options control how large the Avro file data blocks can get. diff --git a/Documentation/Routers/SchemaRouter.md b/Documentation/Routers/SchemaRouter.md index 51872ad08..e46934f36 100644 --- a/Documentation/Routers/SchemaRouter.md +++ b/Documentation/Routers/SchemaRouter.md @@ -123,8 +123,9 @@ List of databases to ignore when checking for duplicate databases. ### `ignore_databases_regex` -Regular expression that is matched against database names when checking for -duplicate databases. +A +[PCRE2 regular expression](../Getting-Started/Configuration-Guide.md#regular-expressions) +that is matched against database names when checking for duplicate databases. ### `preferred_server`
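Note on the *match* / *exclude* / *options* semantics added to Configuration-Guide.md in this patch: the decision rule ("the query needs to match *match* but not match *exclude*") can be illustrated with a minimal sketch. This is not MaxScale code. It assumes Python's `re` module as a stand-in for PCRE2, and the names `OPTION_FLAGS`, `strip_slashes`, `compile_pattern` and `filter_acts_on` are hypothetical helpers invented for the illustration.

```python
import re

# Hypothetical mapping from the documented option names to Python `re` flags.
# MaxScale compiles patterns with PCRE2; Python's `re` is only a stand-in here.
OPTION_FLAGS = {
    "ignorecase": re.IGNORECASE,  # ignore letter case (documented default)
    "case": 0,                    # case-sensitive matching, no flag needed
    "extended": re.VERBOSE,       # ignore whitespace in the pattern
}


def strip_slashes(value):
    """Return the pattern text of a configured value such as '/^select/'.

    Enclosing slashes are removed; a value without slashes is trimmed at
    both ends, as the guide describes for the configuration loader.
    """
    if len(value) >= 2 and value.startswith("/") and value.endswith("/"):
        return value[1:-1]
    return value.strip()


def compile_pattern(value, options=("ignorecase",)):
    """Compile a configured pattern using the given option names."""
    flags = 0
    for name in options:
        flags |= OPTION_FLAGS[name]
    return re.compile(strip_slashes(value), flags)


def filter_acts_on(query, match=None, exclude=None):
    """Apply the rule: act if `match` matches (or is unset) and `exclude`
    does not match (or is unset). Patterns may match anywhere in the query."""
    if match is not None and not match.search(query):
        return False
    if exclude is not None and exclude.search(query):
        return False
    return True


if __name__ == "__main__":
    match = compile_pattern("/^select/")
    exclude = compile_pattern("/from.*t1/")
    for query in ("SELECT * FROM t2", "SELECT * FROM t1", "INSERT INTO t2 VALUES (1)"):
        print(query, "->", filter_acts_on(query, match, exclude))
```

A query the filter does not act on is not dropped. In the terms of the guide, it is passed on to the next module as if the filter were not there, so the boolean above only decides whether the filter's own logic runs.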