Remove the old "detect_standalone_master" feature and update documentation

The auto_failover feature is a more reliable solution and should be used instead. Several unused parameters were removed from the code; they can still be defined in the config file but are ignored. The relevant parts of the documentation were updated.
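
As a rough illustration of how `failcount` gates automatic failover after this change (the master must fail `failcount` consecutive times, and `auto_failover` must be enabled), here is a minimal C++ sketch. The `FailoverTrigger` type and its members are hypothetical and do not exist in the MaxScale sources; only the two setting names come from the documentation in this commit.

```cpp
// Hypothetical sketch, not MaxScale code: counts consecutive master failures
// and reports when an automatic failover would be warranted.
struct FailoverTrigger
{
    int  failcount;              // mirrors the `failcount` setting (default 5)
    bool auto_failover;          // mirrors the `auto_failover` setting
    int  consecutive_failures;   // failed attempts since the last success

    FailoverTrigger(int count, bool enabled)
        : failcount(count)
        , auto_failover(enabled)
        , consecutive_failures(0)
    {
    }

    // Call once per monitor tick with the result of contacting the master.
    // Returns true once the failure threshold has been reached.
    bool on_tick(bool master_responded)
    {
        if (master_responded)
        {
            consecutive_failures = 0;   // any success resets the count
            return false;
        }
        consecutive_failures++;
        return auto_failover && consecutive_failures >= failcount;
    }
};
```

With `auto_failover` disabled, `on_tick()` never returns true in this sketch, which matches the documented requirement that automatic failover must be enabled for `failcount` to have this effect.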
2018-07-04 15:16:01 +03:00
parent f7538db3b7
commit 936bcde135
6 changed files with 42 additions and 234 deletions
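
For orientation before reading the diff, the `standalone_master_required()` check removed from `cluster_discovery.cc` reduces to roughly the logic below. This is a simplified sketch: the `ServerState` struct is a hypothetical stand-in for `MariaDBServer`, and its field names are assumptions chosen to mirror the members used in the removed code.

```cpp
#include <vector>

// Hypothetical, simplified stand-in for MariaDBServer (not the MaxScale class).
struct ServerState
{
    bool running;                // server answered the last monitor query
    bool read_only;              // value of @@read_only
    bool has_slave_connections;  // server is configured as a slave
    int  error_count;            // failed connection attempts so far
};

// Returns true only when exactly one server is running, that server is
// writable and not a slave, and every other server has failed at least
// `failcount` times.
bool standalone_master_required(const std::vector<ServerState>& servers, int failcount)
{
    int candidates = 0;
    for (const ServerState& server : servers)
    {
        if (server.running)
        {
            candidates++;
            if (server.read_only || server.has_slave_connections || candidates > 1)
            {
                return false;
            }
        }
        else if (server.error_count < failcount)
        {
            // A down server has not yet failed often enough.
            return false;
        }
    }
    return candidates == 1;
}
```

The removed code was passive in the sense that it only relabelled the last running server; per the commit message, `auto_failover` is the suggested replacement.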
--- a/Documentation/Monitors/MariaDB-Monitor.md
+++ b/Documentation/Monitors/MariaDB-Monitor.md
@@ -4,9 +4,8 @@ Up until MariaDB MaxScale 2.2.0, this monitor was called _MySQL Monitor_.

 ## Overview

-The MariaDB Monitor is a monitoring module for MaxScale that monitors a Master-Slave
-replication cluster. It assigns master and slave roles inside MaxScale according to
-the actual replication tree in the cluster.
+The MariaDB Monitor monitors a Master-Slave replication cluster. It monitors the
+state of the backends and assigns master and slave roles.

 ## Configuration

@@ -20,14 +19,14 @@ module=mariadbmon
 servers=server1,server2,server3
 user=myuser
 passwd=mypwd
-
 ```
-Note that from MaxScale 2.2.1 onwards, the module name is `mariadbmon`; up until
-MaxScale 2.2.0 it was `mysqlmon`. The name `mysqlmon` has been deprecated but can
-still be used, although it will cause a warning to be logged.

-The user requires the REPLICATION CLIENT privilege to successfully monitor the
-state of the servers.
+From MaxScale 2.2.1 onwards, the module name is `mariadbmon` instead of
+`mysqlmon`. The old name can still be used.
+
+The `user` requires the REPLICATION CLIENT privilege to successfully monitor the
+state of the servers. SUPER privilege is required for cluster manipulation
+features such as failover.

 ```
 MariaDB [(none)]> grant replication client on *.* to 'maxscale'@'maxscalehost';
@@ -49,16 +48,17 @@ A boolean value which controls if replication lag between the master and the
 slaves is monitored. This allows the routers to route read queries to only
 slaves that are up to date. Default value for this parameter is _false_.

-To detect the replication lag, MaxScale uses the _maxscale_schema.replication_heartbeat_
-table. This table is created on the master server and it is updated at every heartbeat
-with the current timestamp. The updates are then replicated to the slave servers
-and when the replicated timestamp is read from the slave servers, the lag between
-the slave and the master can be calculated.
+To measure the replication lag, MaxScale uses the
+*maxscale_schema.replication_heartbeat* table. This table is created on the
+master server and it is updated at every heartbeat with the current timestamp.
+The updates are then replicated to the slave servers and when the replicated
+timestamp is read from the slave servers, the lag between the slave and the
+master is calculated.

 The monitor user requires INSERT, UPDATE, DELETE and SELECT permissions on the
-maxscale_schema.replication_heartbeat table and CREATE permissions on the
-maxscale_schema database. The monitor user will always try to create the database
-and the table if they do not exist.
+*maxscale_schema.replication_heartbeat* table and CREATE permissions on the
+maxscale_schema database. The monitor creates the database and the table if they
+do not exist.

 ### `detect_stale_master`

@@ -97,38 +97,11 @@ detect_stale_slave=true

 ### `mysql51_replication`

-Enable support for MySQL 5.1 replication monitoring. This is needed if a MySQL
-server older than 5.5 is used as a slave in replication.
-
-```
-mysql51_replication=true
-```
+Deprecated and unused as of MaxScale 2.3. Can be defined but is ignored.

 ### `multimaster`

-Detect multi-master replication topologies. This feature is disabled by default.
-
-When enabled, the multi-master detection looks for the root master servers in
-the replication clusters. These masters can be found by detecting cycles in the
-graph created by the servers. When a cycle is detected, it is assigned a master
-group ID. Every master in a master group will receive the Master status. The
-special group ID 0 is assigned to all servers which are not a part of a
-multi-master replication cycle.
-
-If one or more masters in a group has the `@@read_only` system variable set to
-`ON`, those servers will receive the Slave status even though they are in the
-multi-master group. Slave servers with `@@read_only` disabled will never receive
-the master status.
-
-By setting the servers into read-only mode, the user can control which
-server receive the master status. To do this:
-
- Enable `@@read_only` on all servers (preferably through the configuration file)
- Manually disable `@@read_only` on the server which should be the master
-
-This functionality is similar to the [Multi-Master Monitor](MM-Monitor.md)
-functionality. The only difference is that the MariaDB monitor will also detect
-traditional Master-Slave topologies.
+Deprecated and unused as of MaxScale 2.3. Can be defined but is ignored.

 ### `ignore_external_masters`

@@ -149,89 +122,22 @@ External Server, Running` labels will instead get the `Master, Running` labels.

 ### `detect_standalone_master`

-Detect standalone master servers. This feature takes a boolean parameter and
-from MaxScale 2.2.1 onwards is enabled by default. Up until MaxScale 2.2.0 it
-was disabled by default. In MaxScale 2.1.0, this parameter was called `failover`.
+Detect standalone master servers. This feature takes a boolean parameter and is
+enabled by default.

-This parameter is intended to be used with simple, two node master-slave pairs
-where the failure of the master can be resolved by "promoting" the slave as the
-new master. Normally this is done by using an external agent of some sort
-(possibly triggered by MaxScale's monitor scripts), like
-[MariaDB Replication Manager](https://github.com/tanji/replication-manager)
-or [MHA](https://code.google.com/p/mysql-master-ha/).
-
-When the number of running servers in the cluster drops down to one, MaxScale
-cannot be absolutely certain whether the last remaining server is a master or a
-slave. At this point, MaxScale will try to deduce the type of the server by
-looking at the system variables of the server in question.
-
-By default, MaxScale will only attempt to deduce if the server can be used as a
-slave server (controlled by the `detect_stale_slave` parameter). When the
-`detect_standalone_master` mode is enabled, MaxScale will also attempt to deduce
-whether the server can be used as a master server. This is done by checking that
-the server is not in read-only mode and that it is not configured as a slave.
-
-This mode in mariadbmon is completely passive in the sense that it does not modify
-the cluster or any of the servers in it. It only labels the last remaining
-server in a cluster as the master server.
-
-Before a server is labelled as a standalone master, the following conditions must
-have been met:
-
- Previous attempts to connect to other servers in the cluster have failed,
-  controlled by the `failcount` parameter
-
- There is only one running server among the monitored servers
-
- The value of the `@@read_only` system variable is set to `OFF`
-
-In 2.1.1, the following additional condition was added:
-
- The last running server is not configured as a slave
-
-If the value of the `allow_cluster_recovery` parameter is set to false, the monitor
-sets all other servers into maintenance mode. This is done to prevent accidental
-use of the failed servers if they came back online. If the failed servers come
-back up, the maintenance mode needs to be manually cleared once replication has
-been set up.
-
-**Note**: A failover will cause permanent changes in the data of the promoted
-  server. Only use this feature if you know that the slave servers are capable
-  of acting as master servers.
+This setting controls whether a standalone server can be a master. A standalone
+server is a server from which no other server in the cluster is attempting to
+replicate. In most cases this should be left on.

 ### `failcount`

-Number of failures that must occur on all failed servers before a standalone
-server is labelled as a master. The default value is 5 failures.
-
-The monitor will attempt to contact all servers once per monitoring cycle. When
-`detect_standalone_master` is enabled, all of the failed servers must fail
-_failcount_ number of connection attempts before the last server is labeled as
-the master.
-
-The formula for calculating the actual number of milliseconds before the server
-is labelled as the master is `monitor_interval * failcount`.
-
-If automatic failover is enabled (`auto_failover=true`), this setting also
-controls how many times the master server must fail to respond before failover
-begins.
+Number of failures that must consecutively occur on a failed master before an
+automatic failover triggers. The default value is 5 failures. Automatic failover
+must be enabled for this effect (`auto_failover=true`).

 ### `allow_cluster_recovery`

-Allow recovery after the cluster has dropped down to one server. This feature
-takes a boolean parameter is enabled by default. This parameter requires that
-`detect_standalone_master` is set to true. In MaxScale 2.1.0, this parameter was
-called `failover_recovery`.
-
-When this parameter is disabled, if the last remaining server is labelled as the
-master, the monitor will set all of the failed servers into maintenance
-mode. When this option is enabled, the failed servers are allowed to rejoin the
-cluster.
-
-This option should be enabled only when MaxScale is used in conjunction with an
-external agent that automatically reintegrates failed servers into the
-cluster. One of these agents is the _replication-manager_ which automatically
-configures the failed servers as new slaves of the current master.
+Deprecated and unused as of MaxScale 2.3. Can be defined but is ignored.

 ### `enforce_read_only_slaves`

--- a/server/modules/monitor/mariadbmon/cluster_discovery.cc
+++ b/server/modules/monitor/mariadbmon/cluster_discovery.cc
@@ -463,78 +463,6 @@ static bool check_replicate_wild_ignore_table(MXS_MONITORED_SERVER* database)
    return rval;
 }

-/**
- * @brief Check whether standalone master conditions have been met
- *
- * This function checks whether all the conditions to use a standalone master are met. For this to happen,
- * only one server must be available and other servers must have passed the configured tolerance level of
- * failures.
- *
- * @return True if standalone master should be used
- */
-bool MariaDBMonitor::standalone_master_required()
-{
-    int candidates = 0;
-    for (auto iter = m_servers.begin(); iter != m_servers.end(); iter++)
-    {
-        MariaDBServer* server = *iter;
-        if (server->is_running())
-        {
-            candidates++;
-            if (server->m_read_only || !server->m_slave_status.empty() || candidates > 1)
-            {
-                return false;
-            }
-        }
-        else if (server->m_server_base->mon_err_count < m_failcount)
-        {
-            return false;
-        }
-    }
-
-    return candidates == 1;
-}
-
-/**
- * @brief Use standalone master
- *
- * This function assigns the last remaining server the master status and sets all other servers into
- * maintenance mode. By setting the servers into maintenance mode, we prevent any possible conflicts when
- * the failed servers come back up.
- *
- * @return True if standalone master was set
- */
-bool MariaDBMonitor::set_standalone_master()
-{
-    bool rval = false;
-    for (auto iter = m_servers.begin(); iter != m_servers.end(); iter++)
-    {
-        MariaDBServer* server = *iter;
-        auto mon_server = server->m_server_base;
-        if (server->is_running())
-        {
-            if (!server->is_master() && m_warn_set_standalone_master)
-            {
-                MXS_WARNING("Setting standalone master, server '%s' is now the master.%s",
-                            server->name(), m_allow_cluster_recovery ? "" :
-                            " All other servers are set into maintenance mode.");
-                m_warn_set_standalone_master = false;
-            }
-
-            monitor_set_pending_status(mon_server, SERVER_MASTER | SERVER_WAS_MASTER);
-            monitor_clear_pending_status(mon_server, SERVER_SLAVE);
-            m_master = server;
-            rval = true;
-        }
-        else if (!m_allow_cluster_recovery)
-        {
-            server->set_status(SERVER_MAINT);
-        }
-    }
-
-    return rval;
-}
-
 /**
 * Find the server with the best reach in the candidates-array. Running state or 'read_only' is ignored by
 * this method.
--- a/server/modules/monitor/mariadbmon/mariadbmon.cc
+++ b/server/modules/monitor/mariadbmon/mariadbmon.cc
@@ -42,6 +42,7 @@ static const char CN_NO_PROMOTE_SERVERS[]           = "servers_no_promotion";
 static const char CN_FAILOVER_TIMEOUT[]             = "failover_timeout";
 static const char CN_SWITCHOVER_ON_LOW_DISK_SPACE[] = "switchover_on_low_disk_space";
 static const char CN_SWITCHOVER_TIMEOUT[]           = "switchover_timeout";
+static const char CN_DETECT_STANDALONE_MASTER[]     = "detect_standalone_master";
 static const char CN_MAINTENANCE_ON_LOW_DISK_SPACE[] = "maintenance_on_low_disk_space";
 // Parameters for master failure verification and timeout
 static const char CN_VERIFY_MASTER_FAILURE[]    = "verify_master_failure";
@@ -58,7 +59,6 @@ MariaDBMonitor::MariaDBMonitor(MXS_MONITOR* monitor)
    , m_cluster_topology_changed(true)
    , m_cluster_modified(false)
    , m_switchover_on_low_disk_space(false)
-    , m_warn_set_standalone_master(true)
    , m_log_no_master(true)
    , m_warn_failover_precond(true)
    , m_warn_cannot_rejoin(true)
@@ -181,11 +181,9 @@ bool MariaDBMonitor::configure(const MXS_CONFIG_PARAMETER* params)
    m_detect_stale_master = config_get_bool(params, "detect_stale_master");
    m_detect_stale_slave = config_get_bool(params, "detect_stale_slave");
    m_detect_replication_lag = config_get_bool(params, "detect_replication_lag");
-    m_detect_multimaster = config_get_bool(params, "multimaster");
    m_ignore_external_masters = config_get_bool(params, "ignore_external_masters");
-    m_detect_standalone_master = config_get_bool(params, "detect_standalone_master");
+    m_detect_standalone_master = config_get_bool(params, CN_DETECT_STANDALONE_MASTER);
    m_failcount = config_get_integer(params, CN_FAILCOUNT);
-    m_allow_cluster_recovery = config_get_bool(params, "allow_cluster_recovery");
    m_script = config_get_string(params, "script");
    m_events = config_get_enum(params, "events", mxs_monitor_event_enum_values);
    m_failover_timeout = config_get_integer(params, CN_FAILOVER_TIMEOUT);
@@ -244,7 +242,7 @@ void MariaDBMonitor::diagnostics(DCB *dcb) const
    dcb_printf(dcb, "\nServer information:\n-------------------\n\n");
    for (auto iter = m_servers.begin(); iter != m_servers.end(); iter++)
    {
-        string server_info = (*iter)->diagnostics(m_detect_multimaster) + "\n";
+        string server_info = (*iter)->diagnostics() + "\n";
        dcb_printf(dcb, "%s", server_info.c_str());
    }
 }
@@ -256,10 +254,8 @@ json_t* MariaDBMonitor::diagnostics_json() const
    json_object_set_new(rval, "detect_stale_master", json_boolean(m_detect_stale_master));
    json_object_set_new(rval, "detect_stale_slave", json_boolean(m_detect_stale_slave));
    json_object_set_new(rval, "detect_replication_lag", json_boolean(m_detect_replication_lag));
-    json_object_set_new(rval, "multimaster", json_boolean(m_detect_multimaster));
-    json_object_set_new(rval, "detect_standalone_master", json_boolean(m_detect_standalone_master));
+    json_object_set_new(rval, CN_DETECT_STANDALONE_MASTER, json_boolean(m_detect_standalone_master));
    json_object_set_new(rval, CN_FAILCOUNT, json_integer(m_failcount));
-    json_object_set_new(rval, "allow_cluster_recovery", json_boolean(m_allow_cluster_recovery));
    json_object_set_new(rval, CN_AUTO_FAILOVER, json_boolean(m_auto_failover));
    json_object_set_new(rval, CN_FAILOVER_TIMEOUT, json_integer(m_failover_timeout));
    json_object_set_new(rval, CN_SWITCHOVER_TIMEOUT, json_integer(m_switchover_timeout));
@@ -281,7 +277,7 @@ json_t* MariaDBMonitor::diagnostics_json() const
        json_t* arr = json_array();
        for (auto iter = m_servers.begin(); iter != m_servers.end(); iter++)
        {
-            json_array_append_new(arr, (*iter)->diagnostics_json(m_detect_multimaster));
+            json_array_append_new(arr, (*iter)->diagnostics_json());
        }
        json_object_set_new(rval, "server_info", arr);
    }
@@ -491,20 +487,6 @@ void MariaDBMonitor::tick()
        }
    }

-    /* Check if need to use standalone master. TODO: Rewrite these methods. */
-    if (m_detect_standalone_master)
-    {
-        if (standalone_master_required())
-        {
-            // Other servers have died, set last remaining server as master
-            set_standalone_master();
-        }
-        else
-        {
-            m_warn_set_standalone_master = true;
-        }
-    }
-
    if (m_master != NULL && m_master->is_master())
    {
        // Update cluster-wide values dependant on the current master.
@@ -1273,10 +1255,10 @@ extern "C" MXS_MODULE* MXS_CREATE_MODULE()
            {"detect_stale_master", MXS_MODULE_PARAM_BOOL, "true"},
            {"detect_stale_slave",  MXS_MODULE_PARAM_BOOL, "true"},
            {"mysql51_replication", MXS_MODULE_PARAM_BOOL, "false", MXS_MODULE_OPT_DEPRECATED},
-            {"multimaster", MXS_MODULE_PARAM_BOOL, "false"},
-            {"detect_standalone_master", MXS_MODULE_PARAM_BOOL, "true"},
+            {"multimaster", MXS_MODULE_PARAM_BOOL, "false", MXS_MODULE_OPT_DEPRECATED},
+            {CN_DETECT_STANDALONE_MASTER, MXS_MODULE_PARAM_BOOL, "true"},
            {CN_FAILCOUNT, MXS_MODULE_PARAM_COUNT, "5"},
-            {"allow_cluster_recovery", MXS_MODULE_PARAM_BOOL, "true"},
+            {"allow_cluster_recovery", MXS_MODULE_PARAM_BOOL, "true", MXS_MODULE_OPT_DEPRECATED},
            {"ignore_external_masters", MXS_MODULE_PARAM_BOOL, "false"},
            {
                "script",
--- a/server/modules/monitor/mariadbmon/mariadbmon.hh
+++ b/server/modules/monitor/mariadbmon/mariadbmon.hh
@@ -143,14 +143,11 @@ private:
    CycleInfo m_master_cycle_status;     /**< Info about master server cycle from previous round */

    // Replication topology detection settings
-    bool m_allow_cluster_recovery;       /**< Allow failed servers to rejoin the cluster */
    bool m_detect_replication_lag;       /**< Monitor flag for MySQL replication heartbeat */
-    bool m_detect_multimaster;           /**< Detect and handle multi-master topologies */
    bool m_detect_stale_master;          /**< Monitor flag for MySQL replication Stale Master detection */
    bool m_detect_stale_slave;           /**< Monitor flag for MySQL replication Stale Slave detection */
    bool m_detect_standalone_master;     /**< If standalone master are detected */
    bool m_ignore_external_masters;      /**< Ignore masters outside of the monitor configuration */
-    bool m_mysql51_replication;          /**< Use MySQL 5.1 replication */

    // Failover, switchover and rejoin settings
    bool m_auto_failover;                /**< Is automatic master failover is enabled? */
@@ -174,7 +171,6 @@ private:
    // Other settings
    std::string m_script;                /**< Script to call when state changes occur on servers */
    uint64_t m_events;                   /**< enabled events */
-    bool m_warn_set_standalone_master;   /**< Log a warning when setting standalone master */
    bool m_log_no_master;                /**< Should it be logged that there is no master */
    bool m_warn_no_valid_in_cycle;       /**< Log a warning when a replication cycle has no valid master */
    bool m_warn_no_valid_outside_cycle;  /**< Log a warning when a replication topology has no valid master
@@ -196,8 +192,6 @@ private:
    // Cluster discovery and status assignment methods
    void update_server(MariaDBServer& server);
    void find_graph_cycles();
-    bool standalone_master_required();
-    bool set_standalone_master();
    void log_master_changes();
    void update_gtid_domain();
    void update_external_master();
--- a/server/modules/monitor/mariadbmon/mariadbserver.cc
+++ b/server/modules/monitor/mariadbmon/mariadbserver.cc
@@ -487,7 +487,7 @@ const char* MariaDBServer::name() const
    return m_server_base->server->name;
 }

-string MariaDBServer::diagnostics(bool multimaster) const
+string MariaDBServer::diagnostics() const
 {
    std::stringstream ss;
    ss << "Server:                 " << name() << "\n";
@@ -507,14 +507,14 @@ string MariaDBServer::diagnostics(bool multimaster) const
    {
        ss << "Gtid binlog position:   " << m_gtid_binlog_pos.to_string() << "\n";
    }
-    if (multimaster)
+    if (m_node.cycle != NodeData::CYCLE_NONE)
    {
        ss << "Master group:           " << m_node.cycle << "\n";
    }
    return ss.str();
 }

-json_t* MariaDBServer::diagnostics_json(bool multimaster) const
+json_t* MariaDBServer::diagnostics_json() const
 {
    json_t* srv = json_object();
    json_object_set_new(srv, "name", json_string(name()));
@@ -541,7 +541,7 @@ json_t* MariaDBServer::diagnostics_json(bool multimaster) const
        json_object_set_new(srv, "gtid_io_pos",
                            json_string(m_slave_status[0].gtid_io_pos.to_string().c_str()));
    }
-    if (multimaster)
+    if (m_node.cycle != NodeData::CYCLE_NONE)
    {
        json_object_set_new(srv, "master_group", json_integer(m_node.cycle));
    }
--- a/server/modules/monitor/mariadbmon/mariadbserver.hh
+++ b/server/modules/monitor/mariadbmon/mariadbserver.hh
@@ -299,18 +299,16 @@ public:
    /**
     * Print server information to a json object.
     *
-     * @param multimaster Print multimaster group
     * @return Json diagnostics object
     */
-    json_t* diagnostics_json(bool multimaster) const;
+    json_t* diagnostics_json() const;

    /**
     * Print server information to a string.
     *
-     * @param multimaster Print multimaster group
     * @return Diagnostics string
     */
-    std::string diagnostics(bool multimaster) const;
+    std::string diagnostics() const;

    /**
     * Check if server is using gtid replication.