12264 Commits

Author SHA1 Message Date
54370618bc Stop keepalived after the tests
Once the tests are done keepalived must be stopped. This is done to
prevent it from affecting other tests.
2018-11-14 16:23:47 +02:00
07231747bf Print server logs on failure to start
When the MariaDB server exits with an error, the logs help explain why
that happened.
2018-11-14 16:23:47 +02:00
370483fb4b Log slave error message on failed session command
If the master succeeds in executing a session command but the slave fails,
the error message could help explain why it failed. At the moment this is
mainly relevant for inspection of test results.
2018-11-14 16:23:46 +02:00
e3c9ac9e98 Document csmon grants
Documented the grants the monitor user must have in order to operate
properly.
2018-11-14 16:23:46 +02:00
77f8a3b71b MXS-2057 System test 2018-11-14 16:20:42 +02:00
e371964f8b MXS-2057 Add log output for system test, and two random code fixes 2018-11-14 16:20:42 +02:00
a84748e67f MXS-2057 Add documentation 2018-11-14 16:20:42 +02:00
2650a9174e Fix addition of galera config options
The options weren't properly set as the galera nodes used different file
names.
2018-11-14 13:08:27 +02:00
896ce87332 Allow loading any plugin for tests
Required for 'disks'-plugin.
2018-11-14 11:05:23 +02:00
a377a9fc5a Add gtid event in reset-replication
Adds a "FLUSH TABLES" command at the end so that the new master has a non-empty
gtid_binlog_pos after the operation.
2018-11-14 11:01:48 +02:00
14e38e4e08 MXS-2158 Return true if update_gtids() succeeds, even if no data is returned
Previously, if the server had no GTIDs, the method would fail, leading to
a confusing error message. This could even stop the monitor from working entirely
if a recent server version (10.X) did not have any gtid events.
2018-11-14 10:56:42 +02:00
f03c5e0fef MXS-2077 Expand 'maxctrl list sessions' somewhat
'maxctrl list sessions' will now show the connection
time and idleness in addition to the id, user, host
and service of the session. Further, the columns have
been reordered somewhat so that the id, user and host are
shown first, and the service last.
2018-11-14 09:52:15 +02:00
433c6708bf Merge branch '2.2' into 2.3 2018-11-13 17:35:45 +02:00
cdce11391a Add script template into Luafilter documentation
This makes copy-pasting it for testing a lot easier.
2018-11-13 16:48:03 +02:00
c32bb18862 Fix transaction replay checksum mismatches
The transaction replay could get mixed up with new queries if the client
managed to perform one while the delayed routing was taking place. A
proper way to solve this would be to cork the client DCB until the
transaction is fully replayed. As that change would be considerably more
complex than simply labeling the queries that are being retried, the
corking implementation is left for later, when a more complete solution can
be designed.

This commit also adds some of the missing info logging for the transaction
replaying which makes analysis of failures easier.
2018-11-13 16:48:03 +02:00
0355398425 Fix typo in unblock_node
The command is called ip6tables.
2018-11-13 16:48:03 +02:00
ae1a062a58 MXS-2160 Use CLOCK_MONOTONIC_COARSE
We measure time in milliseconds and as CLOCK_MONOTONIC_COARSE
provides 1ms granularity we should use that since it is cheaper.
2018-11-13 16:44:30 +02:00
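
As a minimal sketch of the idea in MXS-2160 (not the actual MaxScale code; the function name `coarse_monotonic_ms` is hypothetical), a millisecond timestamp can be read from the Linux-specific `CLOCK_MONOTONIC_COARSE` clock like this. The real resolution of the coarse clock depends on the kernel tick and can be checked with `clock_getres`.

```cpp
#include <cstdint>
#include <time.h>

// Hypothetical helper: returns a monotonic timestamp in milliseconds,
// or -1 if the clock could not be read.
int64_t coarse_monotonic_ms()
{
    timespec ts {};

    // CLOCK_MONOTONIC_COARSE is Linux-specific. It trades resolution for speed.
    if (clock_gettime(CLOCK_MONOTONIC_COARSE, &ts) != 0)
    {
        return -1;
    }

    // Convert seconds and nanoseconds to whole milliseconds.
    return static_cast<int64_t>(ts.tv_sec) * 1000 + ts.tv_nsec / 1000000;
}
```

Two consecutive calls may return the same value, as the coarse clock only advances once per kernel tick.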
fb84b2690a MXS-2159: Combine client capability bits
If the client sends two different sets of capability bits during the
authentication phase of an SSL enabled connection, both sets need to be
combined. This prevents capabilities from degrading mid-connection which
is the case when Oracle Connector/J drops the SSL capability bit
mid-authentication.
2018-11-13 11:37:48 +02:00
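
The combining described in MXS-2159 amounts to a bitwise OR of the two capability sets. The sketch below is an illustration only: `combine_capabilities` and the `CAP_*` names are hypothetical, and only the two bit values are taken from the MySQL protocol (`CLIENT_PROTOCOL_41` is bit 9, `CLIENT_SSL` is bit 11).

```cpp
#include <cstdint>

// Illustrative constants mirroring two MySQL protocol capability bits.
constexpr uint32_t CAP_PROTOCOL_41 = 1u << 9;   // CLIENT_PROTOCOL_41
constexpr uint32_t CAP_SSL = 1u << 11;          // CLIENT_SSL

// Hypothetical helper: a capability announced in either packet stays set,
// so a bit dropped in the second packet does not degrade the connection.
uint32_t combine_capabilities(uint32_t first, uint32_t second)
{
    return first | second;
}
```

For example, if the first set contains `CAP_SSL | CAP_PROTOCOL_41` and the second only `CAP_PROTOCOL_41`, the combined value still has `CAP_SSL` set.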
ad52834c9b Merge branch '2.2' into 2.3 2018-11-12 14:51:49 +02:00
f7db955101 Add proxy protocol test
The test creates a user with only the client ip as allowed host and
then uses that client to log in.
2018-11-12 14:33:59 +02:00
b59eb28802 Merge branch '2.2' into 2.3 2018-11-12 12:51:18 +02:00
ae0e9b359d Fix use of zero-weight servers
Servers with a zero weight would always be used over ones that have a
weight. This means that the behavior was inverted, which caused the
mxs2054_hybrid_cluster test to fail in 2.3.

Also fixed a typo in the deprecation message.
2018-11-12 10:13:59 +02:00
f2688784cf Reconnect before sync in mxs1743_rconn_bitmask
The blocking of the nodes that happens before it could cause the
connections to break. This also removes the need for the fixing of the
replication which takes time.
2018-11-12 10:13:59 +02:00
b443bb7525 Store PS session commands with internal ID
Commit a9e236497963251f8b4afa07484b88ad97e73a03 changed where the PS ID
for a binary protocol command is replaced with the internal form. This
caused prepared statements that are also session commands to be always
routed with the external ID.

As the external ID is almost always the master's ID, the aforementioned
bug resulted in odd side-effects and the true cause of these was only
revealed when the error message sent by the slave was included in the log
messages.
2018-11-12 10:13:59 +02:00
7e54cb8132 Fix crash in cat
The router used the wrong capabilities and results weren't delivered as
complete and contiguous packets.
2018-11-12 10:13:22 +02:00
f4dd0628da Fix COM_CHANGE_USER handling
If the service doesn't require collection of complete packets, the user
reauthentication done with COM_CHANGE_USER would be skipped. This caused
the change_user test to fail.

By temporarily switching to full packet collection mode for the duration
of the COM_CHANGE_USER, we avoid duplicating the code for the streaming
router types.
2018-11-11 17:19:52 +02:00
1108132cbd MXS-2057 Do not require systemd libraries
Exclude systemd usage if the library is not installed.
Only excluding what is necessary. This keeps the object size the
same and still compiles most of the code.
2018-11-09 16:45:59 +02:00
f29e5b65de MXS-2057 systemd watchdog
Systemd watchdog notification at a little more than 2/3 of the
systemd configured time. In the service config (maxscale.service)
add e.g. WatchdogSec=30s to set and enable the watchdog.
For building: install libsystemd-dev.

The next commit will modify cmake configuration and code to
conditionally compile the new code based on existence of libsystemd-dev.
2018-11-09 16:45:59 +02:00
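
A rough sketch of how a notification interval could be derived from the configured watchdog time, without linking against libsystemd. It assumes the documented systemd behavior of exporting the timeout to the service in the `WATCHDOG_USEC` environment variable when `WatchdogSec=` is set. The function name is hypothetical, and exactly 2/3 is used here for simplicity. This is not the MaxScale implementation.

```cpp
#include <cstdint>
#include <cstdlib>

// Hypothetical helper: returns the interval, in microseconds, at which the
// watchdog should be notified, or 0 if no watchdog timeout is configured.
uint64_t watchdog_notify_interval_usec()
{
    const char* value = std::getenv("WATCHDOG_USEC");

    if (!value)
    {
        return 0;   // Watchdog not enabled for this service
    }

    char* end = nullptr;
    unsigned long long timeout = std::strtoull(value, &end, 10);

    if (end == value || *end != '\0')
    {
        return 0;   // Not a plain decimal number
    }

    // Assumption: notify at 2/3 of the configured timeout.
    return timeout * 2 / 3;
}
```

With `WatchdogSec=30s`, systemd sets `WATCHDOG_USEC=30000000` and the function returns 20000000, i.e. one notification every 20 seconds.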
2a6df0e724 Merge branch '2.2' into 2.3 2018-11-09 14:22:28 +02:00
00eb7cb4ee Automatically stop secondary MaxScale
If the test uses two MaxScales, they are automatically stopped after the
test. This prevents the second MaxScale from interfering with subsequent
tests.
2018-11-09 12:13:22 +02:00
0a9d24230a Fix cross-thread buffer usage
If the initial handshake that is sent by the accepting thread is buffered,
the subsequent flushing of it is done by the owning thread. As
cross-thread buffer usage is not allowed, the initial handshake must be
sent by the owning thread.
2018-11-09 12:13:22 +02:00
514b5f7856 Fix mxs359_read_only
Due to the changes in the monitor, an explicit failcount=1 and extra waits
are required to make sure the master actually changes.
2018-11-09 12:13:22 +02:00
a9e2364979 Fix unknown PS ID on query re-routing
If a PS command is routed multiple times, the ID will not be reverted to
the external ID in the failure cases. This prevented prepared statements
from being re-routed correctly.
2018-11-09 12:13:22 +02:00
226fe4871d Log mysql error message in mxs1776_ps_exec_hang
If the test fails, log the error message. This should help understand why
the test failed.
2018-11-09 12:13:21 +02:00
5b3a209643 Update the masking documentation 2018-11-09 12:05:13 +02:00
00d0ec5f8e Move wait_for_maxscale functionality inside MaxScale
By exposing a (currently undocumented) debug endpoint that lets one
monitor interval pass, we make the reuse of the monitor waiting
functionality a lot easier. With it, when MaxScale is started by the test
framework it knows that at least one monitor interval will have passed for
all monitors and that the system is ready to accept queries.
2018-11-09 09:13:27 +02:00
a53dbeec57 Always use service restart for startup
By starting MaxScale with `service restart maxscale`, the start() function
is idempotent: MaxScale is started from a stopped state.
2018-11-09 09:13:27 +02:00
eb1bc0b768 Add more error logging to Galera checks
The reason for the failure is now logged.
2018-11-09 09:13:27 +02:00
f085abf720 Use one ssh connection for block/unblock operations
As the ssh_node_f function supports full shell syntax, all of the work can
be done with a single ssh connection. This removes the overhead that each
extra ssh connection adds.
2018-11-09 09:13:27 +02:00
4f3ae823a9 Speed up log copying
The collection of the various artifacts generated by a test case and the
core dump detection is now done in the same SSH command. This removes the
extra overhead that it added.
2018-11-09 09:13:27 +02:00
ff9b26b7fa Remove unnecessary SSHing to MaxScale at test startup
There were a total of five SSH connections opened at the start of each
test. Only two of these are currently required: the SSL certificate
directory check and the actual command that restarts MaxScale. Two of the
three remaining commands, stopping of MaxScale and copying of the
configuration, can be made conditional or combined into other
commands.

The stopping of MaxScale is done to prevent it from interfering with the
cluster setup process. As MaxScale does nothing if nothing is wrong, it is
safe to make the restart conditional so that it is done only when a
problem in the cluster setup is detected.

The final SSH command, the MaxScale health check via maxadmin, can be
removed as it is redundant: the daemonization already covers this by
exiting only after MaxScale is ready.
2018-11-09 09:13:26 +02:00
4e3d1a29b6 Clean up Galera_nodes::check_galera
The code can be simplified as only one of the nodes needs to be checked to
see how many nodes are in the cluster.
2018-11-09 09:13:26 +02:00
69bf3a90d3 Fix and improve Galera startup
A certain templated parameter was only substituted when the VMs were
provisioned. This needs to be handled by the test framework to allow
changes to the Galera cluster configuration.

Also made the startup of the "lesser" nodes parallel to minimize the
startup time.
2018-11-09 09:13:26 +02:00
2ac1656fc7 Fix galera initialization
The galera configurations need pre-processing before they can be
used. Switched to std::endl to automatically flush the output at the end
of each line. This makes it easier to see what is happening when the tests
are run by buildbot. Also removed the extra startup of the servers that
was done right after installing the database.
2018-11-09 09:13:26 +02:00
4e87d7da4c Remove unused files
The files weren't used or were built but not used.
2018-11-09 09:13:26 +02:00
6479656445 Remove commented out tests
The tests weren't enabled in 2.1 so they are unlikely to be up to date.
2018-11-09 09:13:26 +02:00
d7e809f525 Group blr and avrorouter tests
Grouped all binlogrouter and avrorouter tests so that they are executed as
the last tests. This helps prevent some side effects that result from the
"aggressive" replication modifications the tests do. Also removed some
commented out test cases.
2018-11-09 09:13:25 +02:00
8f542d05ba Organize tests by backend server type
Grouping the tests helps detect Galera specific problems.
2018-11-09 09:13:25 +02:00
70dfb447a2 Use normal config for bug681
The test doesn't require Galera backends.
2018-11-09 09:13:25 +02:00
df003a3e7c Use normal config for server_weight
The backends used for the test don't have to be Galera servers as the
functionality is generic.
2018-11-09 09:13:25 +02:00