Fix for broken replication

A fix for broken replication has been added to mysql_monitor.
Both the Slave_IO and Slave_SQL threads must be running for a server to
be assigned the SERVER_SLAVE status. If only Slave_IO is running, the
master_id is still assigned to the current server and the replication
tree continues to be built; if there are no slaves at all, the master
remains available.
The "detect_stale_master" option has been added; its default is 0.
If set to 1, the monitor keeps the last detected master even if
replication is not working at all, i.e. neither the Slave_IO nor the
Slave_SQL thread is running. This applies only to the server that
was the master before.
If the monitor or MaxScale is restarted while replication is still
stopped or not configured, there will be no master, because the
replication topology tree cannot be computed.
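
The rules described above can be sketched in C as follows. This is only
an illustration of the decision logic, not the code from this commit:
the names repl_state, classify_replica and keep_stale_master are
hypothetical, and the real monitor works on MaxScale's own server and
monitor structures.

```c
#include <stdbool.h>

/* Hypothetical result type: how one server is treated by the monitor. */
typedef struct {
    bool is_slave;       /* would be given the SERVER_SLAVE status */
    bool has_master_id;  /* master_id recorded for building the tree */
} repl_state;

/*
 * Both replication threads must run for the slave status. With only
 * Slave_IO running, the master_id is still recorded so that the
 * replication tree can be built.
 */
static repl_state classify_replica(bool io_running, bool sql_running)
{
    repl_state r;
    r.is_slave = io_running && sql_running;
    r.has_master_id = io_running;
    return r;
}

/*
 * Assumed reading of detect_stale_master: keep the previous master only
 * when the option is 1, the server was the master before, and the
 * topology scan found no master. After a restart nothing "was master",
 * so no master is kept.
 */
static bool keep_stale_master(int detect_stale_master, bool was_master,
                              bool topology_found_master)
{
    return detect_stale_master == 1 && was_master && !topology_found_master;
}
```

With the default of 0, keep_stale_master() never returns true, which
matches the behaviour before this option was introduced.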
MassimilianoPinto
2014-09-01 11:18:57 +02:00
parent 493feb49ba
commit 63d267e5ef
8 changed files with 127 additions and 32 deletions

@@ -34,6 +34,7 @@
* 29/05/14 Mark Riddoch Addition of filter definition
* 23/05/14 Massimiliano Pinto Added automatic set of maxscale-id: first listening ipv4_raw + port + pid
* 28/05/14 Massimiliano Pinto Added detect_replication_lag parameter
* 28/08/14 Massimiliano Pinto Added detect_stale_master parameter
*
* @endverbatim
*/
@@ -650,6 +651,7 @@ int error_count = 0;
char *passwd;
unsigned long interval = 0;
int replication_heartbeat = 0;
int detect_stale_master = 0;
module = config_get_value(obj->parameters, "module");
servers = config_get_value(obj->parameters, "servers");
@@ -663,6 +665,10 @@ int error_count = 0;
replication_heartbeat = atoi(config_get_value(obj->parameters, "detect_replication_lag"));
}
if (config_get_value(obj->parameters, "detect_stale_master")) {
detect_stale_master = atoi(config_get_value(obj->parameters, "detect_stale_master"));
}
if (module)
{
obj->element = monitor_alloc(obj->object, module);
@@ -686,6 +692,10 @@ int error_count = 0;
if(replication_heartbeat == 1)
monitorSetReplicationHeartbeat(obj->element, replication_heartbeat);
/* detect stale master */
if(detect_stale_master == 1)
monitorDetectStaleMaster(obj->element, detect_stale_master);
/* get the servers to monitor */
s = strtok(servers, ",");
while (s)
@@ -1346,6 +1356,7 @@ static char *monitor_params[] =
"passwd",
"monitor_interval",
"detect_replication_lag",
"detect_stale_master",
NULL
};
/**