Commit Graph

12087 Commits

Author SHA1 Message Date
f29e5b65de MXS-2057 systemd watchdog
Systemd watchdog notification at a little more than 2/3 of the
systemd configured time. In the service config (maxscale.service)
add e.g. WatchdogSec=30s to set and enable the watchdog.
For building: install libsystemd-dev.

The next commit will modify cmake configuration and code to
conditionally compile the new code based on existence of libsystemd-dev.
2018-11-09 16:45:59 +02:00
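
As a rough illustration of the timing described in the MXS-2057 entry above, the sketch below computes how often a notification would be sent for a given WatchdogSec value. The function name and the 0.67 factor are assumptions standing in for "a little more than 2/3"; the value used in the actual MaxScale code is not stated in the commit message and may differ.

```python
def notify_interval(watchdog_sec: float, fraction: float = 0.67) -> float:
    """Return the number of seconds between watchdog notifications.

    `fraction` is a hypothetical stand-in for "a little more than 2/3"
    of the systemd-configured WatchdogSec value.
    """
    if watchdog_sec <= 0:
        raise ValueError("watchdog_sec must be positive")
    return watchdog_sec * fraction


if __name__ == "__main__":
    # With WatchdogSec=30s the notification is due a bit after 20 seconds.
    print(notify_interval(30.0))
```

Notifying well before the full timeout leaves some slack, so that a short delay in the notifying thread does not cause systemd to restart the service.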
2a6df0e724 Merge branch '2.2' into 2.3 2018-11-09 14:22:28 +02:00
00eb7cb4ee Automatically stop secondary MaxScale
If the test uses two MaxScales, they are automatically stopped after the
test. This prevents the second MaxScale from interfering with subsequent
tests.
2018-11-09 12:13:22 +02:00
0a9d24230a Fix cross-thread buffer usage
If the initial handshake that is sent by the accepting thread is buffered,
the subsequent flushing of it is done by the owning thread. As
cross-thread buffer usage is not allowed, the initial handshake must be
sent by the owning thread.
2018-11-09 12:13:22 +02:00
514b5f7856 Fix mxs359_read_only
Due to the changes in the monitor, an explicit failcount=1 and extra waits
are required to make sure the master actually changes.
2018-11-09 12:13:22 +02:00
a9e2364979 Fix unknown PS ID on query re-routing
If a PS command is routed multiple times, the ID will not be reverted to
the external ID in the failure cases. This prevented prepared statements
from being re-routed correctly.
2018-11-09 12:13:22 +02:00
226fe4871d Log mysql error message in mxs1776_ps_exec_hang
If the test fails, log the error message. This should help understand why
the test failed.
2018-11-09 12:13:21 +02:00
5b3a209643 Update the masking documentation 2018-11-09 12:05:13 +02:00
00d0ec5f8e Move wait_for_maxscale functionality inside MaxScale
By exposing a (currently undocumented) debug endpoint that lets one
monitor interval pass, we make the reuse of the monitor waiting
functionality a lot easier. With it, when MaxScale is started by the test
framework it knows that at least one monitor interval will have passed for
all monitors and that the system is ready to accept queries.
2018-11-09 09:13:27 +02:00
a53dbeec57 Always use service restart for startup
By starting MaxScale with `service restart maxscale`, the start() function
is idempotent: MaxScale is started from a stopped state.
2018-11-09 09:13:27 +02:00
eb1bc0b768 Add more error logging to Galera checks
The reason for the failure is now logged.
2018-11-09 09:13:27 +02:00
f085abf720 Use one ssh connection for block/unblock operations
As the ssh_node_f function supports full shell syntax, all of the work can
be done with a single ssh connection. This removes the overhead that each
extra ssh connection adds.
2018-11-09 09:13:27 +02:00
4f3ae823a9 Speed up log copying
The collection of the various artifacts generated by a test case and the
core dump detection is now done in the same SSH command. This removes the
extra overhead that it added.
2018-11-09 09:13:27 +02:00
ff9b26b7fa Remove unnecessary SSHing to MaxScale at test startup
There were a total of five SSH connections opened at the start of each
test. Only two of these are currently required: the SSL certificate
directory check and the actual command that restarts MaxScale. Two of the
three remaining commands, stopping of MaxScale and copying of the
configuration, can be made conditional or combined into other
commands.

The stopping of MaxScale is done to prevent it from interfering with the
cluster setup process. As MaxScale does nothing if nothing is wrong, it is
safe to make the restart conditional so that it is done only when a
problem in the cluster setup is detected.

The final SSH command, the MaxScale health check via maxadmin, can be
removed as it is redundant: the daemonization already covers this by
exiting only after MaxScale is ready.
2018-11-09 09:13:26 +02:00
4e3d1a29b6 Clean up Galera_nodes::check_galera
The code can be simplified as only one of the nodes needs to be checked to
see how many nodes are in the cluster.
2018-11-09 09:13:26 +02:00
69bf3a90d3 Fix and improve Galera startup
A certain templated parameter was only substituted when the VMs were
provisioned. This needs to be handled by the test framework to allow
changes to the Galera cluster configuration.

Also made the startup of the "lesser" nodes parallel to minimize the
startup time.
2018-11-09 09:13:26 +02:00
2ac1656fc7 Fix galera initialization
The galera configurations need pre-processing before they can be
used. Switched to std::endl to automatically flush the output at the end
of each line. This makes it easier to see what is happening when the tests
are run by buildbot. Also removed the extra startup of the servers that
was done right after installing the database.
2018-11-09 09:13:26 +02:00
4e87d7da4c Remove unused files
The files weren't used or were built but not used.
2018-11-09 09:13:26 +02:00
6479656445 Remove commented out tests
The tests weren't enabled in 2.1 so they are unlikely to be up to date.
2018-11-09 09:13:26 +02:00
d7e809f525 Group blr and avrorouter tests
Grouped all binlogrouter and avrorouter tests so that they are executed as
the last tests. This helps prevent some side effects that result from the
"aggressive" replication modifications the tests do. Also removed some
commented out test cases.
2018-11-09 09:13:25 +02:00
8f542d05ba Organize tests by backend server type
Grouping the tests helps detect Galera specific problems.
2018-11-09 09:13:25 +02:00
70dfb447a2 Use normal config for bug681
The test doesn't require Galera backends.
2018-11-09 09:13:25 +02:00
df003a3e7c Use normal config for server_weight
The backends used for the test don't have to be Galera servers as the
functionality is generic.
2018-11-09 09:13:25 +02:00
8c9ecf2756 Remove redundant tests
The tests tested generic functionality and the backend type should not
affect the test results.
2018-11-09 09:13:25 +02:00
ccec2a387a Fix replication in parallel
If the replication is broken between the nodes, it is now fixed in
parallel on all nodes instead of doing it one server at a time.

This reduces the time from about 120 seconds to 13 seconds. The time was
measured by running the check_backend test first with all backends broken
and then with the backends fixed, and subtracting the time of the latter
from the former.
2018-11-09 09:13:25 +02:00
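
The idea in the entry above, repairing all nodes at once rather than one at a time, can be sketched roughly as follows. This is not the test framework's code: `fix_replication` is a hypothetical placeholder for the per-node repair work, and the node names are made up.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fix_replication(node: str) -> str:
    """Hypothetical placeholder for the per-node repair work."""
    time.sleep(0.1)  # stands in for the slow remote commands
    return f"{node}: fixed"


def fix_all(nodes):
    """Run the repair on every node concurrently, keeping input order."""
    with ThreadPoolExecutor(max_workers=max(1, len(nodes))) as pool:
        return list(pool.map(fix_replication, nodes))


if __name__ == "__main__":
    for line in fix_all(["node0", "node1", "node2", "node3"]):
        print(line)
```

With one thread per node, the total time is close to that of the slowest node instead of the sum over all nodes.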
04e4f17618 Sort tests by replication type
The tests that require GTID replication are now all grouped together. This
removes the need to reconfigure the test environment multiple times.
2018-11-09 09:13:25 +02:00
3a5b49caf1 Speed up mxs1751_available_when_donor_crash
As the wait_for_monitor function guarantees that the monitor notices the
state change, we can skip the replication fixing which was somewhat
superficial in the first place.
2018-11-09 09:13:25 +02:00
c523bf74b8 Rewrite binlog_change_master tests
The tests were consistently unstable and as a result of this did not
provide any actionable output. In addition to this, these two tests were the
longest running tests in the whole MaxScale test suite so a re-design was
warranted.

Instead of emulating a client and a server failure, testing the
functionality directly gives a test that is faster, more precise and
provides more actionable output. Because the new test is single-threaded,
no cross-thread dependencies are present. In addition, the superfluous
log flushing is no longer done, as it almost always happened after all
transactions were already complete.

The estimated saving in test time alone is around 1100 seconds (roughly
18 minutes).
2018-11-09 09:13:25 +02:00
b77d5568d8 Add output to mxs1743_rconn_bitmask
This helps analyze why the test is hanging when the slaves are synced.
2018-11-09 09:13:24 +02:00
aadd6f38dc MXS-2153: Update available_when_donor documentation
Added a more precise description of what the parameter does, what the
accepted values are and where to get more information.
2018-11-09 08:04:45 +02:00
293d45aaf1 Limit line length in galeramon documentation
Split long lines that exceeded 80 characters.
2018-11-09 07:45:02 +02:00
bfc8cb4803 MXS-2151: Always log fatal master connection errors
When the connection to the master is broken, the session is not configured
to use the read-only modes, and the monitor can still connect to the
server, the connection is closed and an error is sent to the
client. To leave some trace of this problem in the MaxScale logs, a
message should always be logged when a network error occurs.
2018-11-09 00:39:32 +02:00
ecc7442358 Detect manual commands faster
Previously, MariaDBMonitor would wait until the next monitor interval before detecting
a new manual command. The commands are now checked every 100 ms.
2018-11-08 19:12:00 +02:00
8058e46309 Add link to maxctrl 2018-11-08 13:52:54 +02:00
112c25ab1f MXS-1780 Update release notes 2018-11-08 13:50:42 +02:00
d65269fabb Always process responses inside RootResource
The REST API would skip propagation of the requests to the RootResource if
it was a request to the empty resource.
2018-11-08 12:14:36 +02:00
7d54df74dc Hard-code minimum suggestion distance to 4
This way only close matches are suggested.
2018-11-08 12:14:36 +02:00
809d3549ae MXS-2149 Add REST-API watchdog
This will simply cause a task to be posted to each worker.
If the workers are running normally, the task will reach the
workers and the associated semaphore posted, and the REST-API
call will return. If any worker is not running normally, the
task will not be processed and the REST-API call will hang.
2018-11-08 12:13:02 +02:00
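
A minimal sketch of the mechanism described in the MXS-2149 entry above: one task per worker, each posting a shared semaphore. It is an illustration only, not MaxScale's implementation. All names are hypothetical, and the `timeout` argument is an addition for demonstration purposes; with `timeout=None` the call blocks indefinitely, matching the behaviour the commit describes.

```python
import queue
import threading


def worker_loop(tasks: queue.Queue) -> None:
    """Hypothetical worker: run posted tasks until None is received."""
    while True:
        task = tasks.get()
        if task is None:
            break
        task()


def ping_workers(worker_queues, timeout=None) -> bool:
    """Post one task to every worker and wait until each has run it.

    Returns True if every worker posted the semaphore. `timeout` applies
    to each individual wait; None means wait forever.
    """
    sem = threading.Semaphore(0)
    for q in worker_queues:
        q.put(sem.release)  # the task simply posts the semaphore
    for _ in worker_queues:
        if not sem.acquire(timeout=timeout):
            return False  # at least one worker did not respond in time
    return True


if __name__ == "__main__":
    queues = [queue.Queue() for _ in range(3)]
    for q in queues:
        threading.Thread(target=worker_loop, args=(q,), daemon=True).start()
    print(ping_workers(queues, timeout=2.0))
    for q in queues:
        q.put(None)  # stop the workers
```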
1ca03fb85c MXS-1780 Show last queries of session
'maxctrl show sessions' now shows the last queries of each session, if
the retaining of statements has been enabled.
2018-11-08 12:08:42 +02:00
99bd621874 MXS-1780 Adjust maxctrl output according to console width
Figure out the console width and adjust output accordingly.
In default mode use '\n' as separator (necessary for making the
session query output sensible) and in tsv mode ','.
2018-11-08 12:08:42 +02:00
32f2e769f4 MXS-1780 Make retain_last_statements service specific 2018-11-08 12:08:42 +02:00
2bd2b4a32e MXS-1780 Report times using localtime instead of gmtime 2018-11-08 12:08:42 +02:00
dd712a06fa MXS-1780 Make statement retaining session specific
Whether or not a session should retain its statements is now
a property of the session. This is in preparation for making the
whole functionality a property of the service that can be enabled
and disabled at runtime.
2018-11-08 12:08:42 +02:00
c899f00541 MXS-1780 Collect server response information
As the router is the only one that knows what backends a particular
statement has been sent to, it is the responsibility of the router
to keep the session bookkeeping up to date. If it doesn't, we will
know what statements a session has received (provided at least some
component in the routing chain has RCAP_TYPE_STMT_INPUT capability),
but not how long their processing took. Currently only readwritesplit
does that.

All queries are stored and not just COM_QUERY as that makes the
overall bookkeeping simpler; at clientReply() time we do not need to
know whether or not to bookkeep information, we can just do it.

When session information is queried for, we report as much information
as we have available.
2018-11-08 12:04:55 +02:00
c78c5a615d MXS-1780 Store time when statement was received 2018-11-08 12:03:50 +02:00
c6378e1006 MXS-1780 Make retained statements available via REST-API 2018-11-08 12:03:50 +02:00
fa13b8036a MXS-1780 Store last statement as cloned GWBUF
The last statements of a session are now stored as a cloned
GWBUF instead of as a copy of the SQL.
2018-11-08 12:03:50 +02:00
3ccdb508de Fix bug in roulette wheel
Slot values were changed after the total was calculated. Fix bug
and adjust the offending code.
2018-11-08 10:50:23 +02:00
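
The class of bug named in the entry above can be illustrated with a generic roulette wheel selection in which every slot adjustment happens before the total is summed. This is a sketch under assumptions, not the MaxScale code: the `floor` adjustment is a hypothetical example of a slot change, and the caller supplies the random number `r`.

```python
def roulette_pick(slots, r, floor=0.0):
    """Pick a slot index with probability proportional to its weight.

    slots: non-negative weights
    r:     a number in [0, 1), e.g. from random.random()
    floor: hypothetical adjustment, the minimum weight given to any slot
    """
    # Adjust the slot values first ...
    adjusted = [max(s, floor) for s in slots]
    # ... and only then compute the total, so both agree.
    total = sum(adjusted)
    if total <= 0:
        raise ValueError("at least one slot must have a positive weight")

    threshold = r * total
    running = 0.0
    for index, weight in enumerate(adjusted):
        running += weight
        if threshold < running:
            return index
    return len(adjusted) - 1  # guard against floating point rounding


if __name__ == "__main__":
    import random
    print(roulette_pick([1.0, 1.0, 2.0], random.random()))
```

If the total were computed from the unadjusted values, the threshold would be scaled against the wrong sum and the selection probabilities would no longer match the slot weights.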
c692c864e2 MXS-2078 Take new statistics into use 2018-11-08 10:44:32 +02:00
5175d2b2d7 MXS-2078 Add support for holding router specific server data.
New class to hold the statistics, part of which is currently in
RWSplitSession. Simple API in Backend to create session
specific data.
2018-11-08 10:44:32 +02:00