From 78724745955f700f6bbbf5c041d91ba3089b2a78 Mon Sep 17 00:00:00 2001
From: Johan Wikman
Date: Mon, 3 Dec 2018 15:30:47 +0200
Subject: [PATCH 1/2] Add missing bug fix to release notes

---
 Documentation/Release-Notes/MaxScale-2.3.2-Release-Notes.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Documentation/Release-Notes/MaxScale-2.3.2-Release-Notes.md b/Documentation/Release-Notes/MaxScale-2.3.2-Release-Notes.md
index 50e319bec..0206b4ac8 100644
--- a/Documentation/Release-Notes/MaxScale-2.3.2-Release-Notes.md
+++ b/Documentation/Release-Notes/MaxScale-2.3.2-Release-Notes.md
@@ -40,6 +40,7 @@ setting `assume_unique_hostnames` should be disabled.
 ## Bug fixes
 
 * [MXS-2189](https://jira.mariadb.org/browse/MXS-2189) optimistic_trx is rolled back if master fails
+* [MXS-2188](https://jira.mariadb.org/browse/MXS-2188) MaxScale crashing when replicating
 * [MXS-2187](https://jira.mariadb.org/browse/MXS-2187) Transaction replay is only attempted once
 * [MXS-2186](https://jira.mariadb.org/browse/MXS-2186) SHOW DATABASES is routed to the master
 * [MXS-2184](https://jira.mariadb.org/browse/MXS-2184) event_number is not incremented for updates

From a6cacbec1725d6ac3bf20c098ef44f8adda9e439 Mon Sep 17 00:00:00 2001
From: Esa Korhonen
Date: Tue, 4 Dec 2018 17:53:10 +0200
Subject: [PATCH 2/2] Update MariaDBMonitor documentation

Updates troubleshooting section.
---
 Documentation/Monitors/MariaDB-Monitor.md | 39 ++++++++++++++++-------
 1 file changed, 27 insertions(+), 12 deletions(-)

diff --git a/Documentation/Monitors/MariaDB-Monitor.md b/Documentation/Monitors/MariaDB-Monitor.md
index 38e849c83..0a83997cd 100644
--- a/Documentation/Monitors/MariaDB-Monitor.md
+++ b/Documentation/Monitors/MariaDB-Monitor.md
@@ -36,7 +36,9 @@ Table of Contents
 * [servers_no_promotion](#servers_no_promotion)
 * [promotion_sql_file and demotion_sql_file](#promotion_sql_file-and-demotion_sql_file)
 * [handle_server_events](#handle_server_events)
- * [Troubleshooting](#troubleshooting)
+ * [Troubleshooting](#troubleshooting)
+ * [Failover/switchover fails](#failoverswitchover-fails)
+ * [Slave detection shows external masters](#slave-detection-shows-external-masters)
 * [Using the MariaDB Monitor With Binlogrouter](#using-the-mariadb-monitor-with-binlogrouter)
 * [Example 1 - Monitor script](#example-1---monitor-script)
 
@@ -754,25 +756,38 @@ the monitor rejoins the server and disables the events. This should only be an
 issue for events running more often than the monitor interval or events that run
 immediately after the server has restarted.
 
-### Troubleshooting
+## Troubleshooting
+
+### Failover/switchover fails
 
 Before performing failover or switchover, the MariaDB Monitor first checks that
 prerequisites are fulfilled, printing any found errors. This should catch and
 explain most issues with failover or switchover not working. If the operations
 are attempted and still fail, then most likely one of the commands the monitor
-issued to a server failed or timed out. To find out exactly what the queries
-sent to the servers are, start MaxScale with `--debug=enable-statement-logging`.
-This setting prints all queries sent to the backends by monitors and
-authenticators.
+issued to a server failed or timed out. The log should explain which query failed.
+To print out all queries sent to the servers, start MaxScale with
+`--debug=enable-statement-logging`. This setting prints all queries sent to the
+backends by monitors and authenticators.
 
-A typical reason for failure is that a command such as `STOP SLAVE` takes longer
-than the `backend_read_timeout` of the monitor, causing the connection to break.
-In this case , increasing the timeout settings of the monitor should help.
-Another settings to look at are `query_retries` and `query_retry_timeout`. These
-are general MaxScale settings described in the
-[Configuration guide](#../Getting-Started/Configuration-Guide.md). Setting
+A typical reason for failure is that a command such as `STOP SLAVE` takes longer than the
+`backend_read_timeout` of the monitor, causing the connection to break. As of 2.3, the
+monitor will retry most such queries if the failure was caused by a timeout. The retrying
+continues until the total time for a failover or switchover has been spent. If the log
+shows warnings or errors about commands timing out, increasing the backend timeout
+settings of the monitor should help. Another settings to look at are `query_retries` and
+`query_retry_timeout`. These are general MaxScale settings described in the
+[Configuration guide](../Getting-Started/Configuration-Guide.md). Setting
 `query_retries` to 2 is a reasonable first try.
 
+### Slave detection shows external masters
+
+If a slave is shown in _maxadmin_ or _maxctrl_ as "Slave of External Server" instead of
+"Slave", the reason is likely that the "Master_Host"-setting of the replication connection
+does not match the MaxScale server definition. As of 2.3.2, the MariaDB Monitor by default
+assumes that the slave connections (as shown by `SHOW ALL SLAVES STATUS`) use the exact
+same "Master_Host" as used the MaxScale configuration file server definitions. This is
+controlled by the setting [assume_unique_hostnames](#assume_unique_hostnames).
+
 ## Using the MariaDB Monitor With Binlogrouter
 
 Since MaxScale 2.2 it's possible to detect a replication setup