All monitors now persist the state of the server in a monitor journal
file.
Moved the removal of stale journals into the core and removed them from
the monitor journal interface.
Enabling the option hinders the use of maintenance mode with the root
master node in most use-cases.
This behavior occurs due to the fact that the maintenance mode causes a
server to be treted as if it was down. The Galera monitor waits for the
cluster to reorganize before assigning a new master node. This is correct
(but very unexpected) behavior for single instance use-cases.
The multimaster node detection uses stacks to sort the node groups. The
size of this stack was always assumed to be positive but it was possible
that it dropped down to -1 causing a crash when the stack was accessed
with the index number.
The multimaster node detection uses stacks to sort the node groups. The
size of this stack was always assumed to be positive but it was possible
that it dropped down to -1 causing a crash when the stack was accessed
with the index number.
The function should actually be in include/maxscale/protocol/mysql.h
and the implementation in MySQLCommon, but the monitors do not link
to that yet.
All MySQL related should be moved to MySQLCommon and the core
refactored so that no MySQL knowledge is needed there.
Whenever a server which is a slave of an external master is detected, it
will be assigned the slave status. This will allow the status to be used
the way it was intended to be used.
The MySQL monitor replication_heartbeat table now uses the default storage
engine of the database when creating the table. Most of the time the
default is InnoDB which makes the table crash-safe.
If a monitor is started and stopped before the external monitoring thread
has had time to start, a deadlock will occur.
The first thing that the monitoring threads do is read the monitor handle
from the monitor object. This handle is given as the return value of
startMonitor and it is stored in the monitor object. As this can still be
NULL when the monitor thread starts, the threads use locks to prevent
this.
The correct way to prevent this is to pass the handle as the thread
parameter so that no locks are required.
This number (defaults to 1) sets how many times mon_connect_to_db
will try to connect to a backend before returning an error. Every
connection attempt may take backend_connect_timeout seconds to
complete.
Also refactored code a bit. Renamed mon_connect_to_db to
mon_ping_or_connect_to_db, since it does not connect if the connection
is already alive.
When log messages are written with both address and port information, IPv6
addresses can cause confusion if the normal address:port formatting is
used. The RFC 3986 suggests that all IPv6 addresses are expressed as a
bracket enclosed address optionally followed by the port that is separate
from the address by a colon.
In practice, the "all interfaces" address and port number 3306 can be
written in IPv4 numbers-and-dots notation as 0.0.0.0:3306 and in IPv6
notation as [::]:3306. Using the latter format in log messages keeps the
output consistent with all types of addresses.
The details of the standard can be found at the following addresses:
https://www.ietf.org/rfc/rfc3986.txthttps://www.rfc-editor.org/std/std66.txt
The static capabilities declared in getCapabilities allows certain
capabilities to be queried before instances are created. The intended use
of this capability is to remove the need for the `is_internal_service`
function.
The server state information should be persisted even if a controlled
shutdown is done. This will allow the monitor to retain the server state
information across a restart.
The MySQL monitor stores the server states in a backup file which can be
used to restore the state of the servers even if MaxScale is stoppen in an
uncontrolled fashion.
When a standalone master server is detected, it should receive the stale
status to prevent it from losing the master status if another server is
started and allow_cluster_recovery is enabled.
When the real root master server went down, it still received the master
status bit due to how the replication tree was built. The bit should only
be set for servers that are running.
Also fixed a false state change event when the master status bit was
manually cleared from the downed root master server.
If all but one server in a cluster fail and `failover` is enabled for
mysqlmon, the last server would be used as if it were a master. With this
change, the restrictions on failover also require that the last server is
not configured as a slave.
This change will prevent unintended failovers from happening when network
connectivity is bad. It also allows external actors to clear the slave
configuration from the last remaining server to signal MaxScale that the
server can be used as a master.
The `failover_recovery` option allows failed servers to rejoin the
cluster. This should make using MaxScale with two node clusters easier.
One use case for this is when the replication-manager promotes the last
node in the cluster as the master. When this is done, the slave
configuration is cleared and the read-only mode is disabled. Since the
failover requires that the server is not configured as a slave and that it
is not in read-only mode, it is safe to use `failover_recovery` with
replication-manager.
Monitored nodes could be part of different cluster UUIDs: select only
the ones belonging to UUID with more joined nodes.
In case of different UUIDs if the joined numbers is less than (n_nodes
/ 2 ) + 1 don’t consider any node part of the cluster
Moved some typedefs to router.h and server.h, changed a few
constants to these enums. Renamed some types in config.h to
remove "Gateway".
There are still some functions in the public header which are
only used in core, but they seem to fit the theme of public functions
so were not moved.