Using timestamps to detect whether MaxScale was active or passive can
cause problems if multiple events happen at the same time. This can be
avoided by separating events into actively observed and passively observed
events. This clarifies the logic by removing the ambiguity of timestamps.
As the monitoring threads are separate from the worker threads, it is
prudent to use atomic operations to modify and read the state of the
MaxScale. This will impose an happens-before relation between MaxScale
being set into passive mode and events being classified as being passively
observed.
The master failure can now be verified by checking when the slaves are
connected to the master. If the slaves do not receive any events from the
master, the connections are considered as down after a configurable limit.
Added two parameters for controlling whether the check is done and for how
long the monitor waits before doing the failover.
The slave heartbeat count and period are collected from the SHOW ALL
SLAVES STATUS output. This, in addition to the relay log position, is used
to calculate the point in time when a slave has last interacted with the
master.
By using this timestamp, the monitor can enforce a minimum "timeout" for
the master before a failover is performed.
Now SHOW ALL SLAVES STATUS reports new fields:
Retried_transactions;
Max_relay_log_size,
Executed_log_entries,
Slave_received_heartbeats,
Slave_heartbeat_period,
Gtid_Slave_Pos"
Currently binlog server doesn't send to slaves these event types:
- MARIADB10_START_ENCRYPTION_EVENT
- IGNORABLE_EVENT
It also skips events with LOG_EVENT_IGNORABLE_F flag.
This modification allows sending events with that flag.
Moved mon_process_failover() from monitor.cc to mysql_mon.cc. Renamed
some functions and variables related to previous failover functionality
to avoid confusion.
The get_server_info function takes the monitor handle and a database and
returns the corresponding MYSQL_SERVER_INFO struct. This hides a part of
the actual implementation of the info struct from the monitor code,
allowing future refactoring to be done. It also makes the code a bit more
readable.
The values in the MYSQL_SERVER_INFO struct can now be updated with the
update_slave_status function.
Also moved the number of configured and running slave configurations into
the info struct. This removes the need to pass output parameters.
If something is SELECTed that should be cached for some, but not
for the current user, the cached entry it nevertheless updated.
That way the cached data will always be the last fetched value
and it is also possible to use this behaviour for explicitly
updating the cache entry.
The MYSQL_SERVER_INFO struct is updated first and then the server status
is updated. This allows the function to be called without it affecting the
server state.
The JSON API specification states that all resources must support direct
modification of resource relationships by providing only the definition
for a particular relationship type to a /:type/:id/relationships/:type
endpoint.
The relevant part of the JSON API specification:
http://jsonapi.org/format/#crud-updating-to-many-relationships
The test did not properly move the relationships from the old monitor to
the new one. The test to passed as the relationship modification was not
really tested.
When pre-parsing the configuration file, the existence of environment
variables is only done for the [maxscale] section. For other sections
a nicer error message is obtained if the comlplaint is made when the
configuration file is actually loaded.
Mechanism for providing custom error message from the pre-parsing
function added.
If 'substitute_variables' has been set to true, then the value of
a parameter like `some_param=$SOME_VAR' is replaced with the value
of the environment variable 'SOME_VAR'.
It is a fatal error to refer to a variable that does not exist.
With this variables set to true, if $VAR is used as a value in the
configuration file, then `$VAR` will be replaced with the value of
the environment variable VAR.
When binary data was processed, it was possible that the values were
misinterpreted as OK packets which caused debug assertions to trigger.
In addition to this, readwritesplit did not handle the case when all
packets were routed individually.
The authentication phase expects full packets. If the packets aren't
complete a debug assertion would get hit. To detect this, the result of
the extracted buffer needs to be checked.
If multiple queries that only generate OK packets were executed, the
result returned by the server would consist of a chain of OK packets. This
special case needs to be handled by the modutil_count_signal_packets.
The current implementation is very ugly as it simulates a result with at
least one resultset in it. A better implementation would hide it behind a
simple boolean return value and an internal state object.
A multi-statements can return multiple resultsets in one response. To
accommodate for this, both the readwritesplit and modutil code must be
altered.
By ignoring complete resultsets in readwritesplit, the code can deduce
whether a result is complete or not.
The original offset needs to be separately tracked to assert that an OK
packet is not the first packet in the buffer. The functional offset into
the buffer is modified to reduce the need to iterate over buffers that
have already been processed.
When LEAST_BEHIND_MASTER routing criteria was used, the info level logging
function would fall through to the default case. In debug builds, this
would trigger a debug assertion.
Returning the results of a query as a set of packets is currently more
efficient. This is mainly due to the fact that each individual packet for
single packet routing is allocated from the heap which causes a
significant loss in performance.
Took the new capability into use in readwritesplit and modified the
reply_is_complete function to work with non-contiguous results.
The GLIBC backtrace functionality doesn't generate file names and line
numbers in the generated stacktrace. This can to be done manually by
executing a set of system commands.
Conceptually doing non-signal-safe operations in a signal handler is very
wrong but as stacktraces are only printed when something has gone horribly
wrong, there is no real need to worry about making things worse.
As a safeguard for fatal errors while the stacktrace is being generated,
it is first dumped into the standard error output of the process. This
will function even if malloc is corrupted.
When the router requires statement based output, the gathering of complete
packets can be skipped as the process of splitting the complete packets
into individual packets implies that only complete packets are handled.
Also added a quicker check for stored protocol commands than a call to
protocol_get_srv_command.
The backend protocol command tracking didn't check whether the session was
the dummy session. The DCB's session is always set to this value when it
is put into the persistent pool.
If a query was processed in the client protocol module when a prepared
statement was being executed by the backend module, the current command
would get overwritten. This caused a debug assertion in readwritesplit to
trigger as the result was neither a single packet nor a collected result.
The RCAP_TYPE_STMT_INPUT capability guarantees that a buffer contains a
complete packet. This information can be used to track the currently
executed command based on the buffer contents which allows asynchronicity
betweent the client and backend protocol. In practice this only comes in
play when routers queue queries for later execution.
The GWBUF shared buffer and its data is now allocated in one
chunk so that the data directly follows the shared buffer.
That way, creating a GWBUF will involve 2 and not 3 calls to
malloc and freeing one will involve 2 and not 3 calls to free.