Commit Graph

2925 Commits

Author SHA1 Message Date
55276be6f2 MXS-1699: Log progress messages at startup
When MaxScale is starting, the loading of the listeners can take a while
if there are a large number of services and users to load. To signal this
to the user, progress messages should be logged after every service is
started.
2018-03-06 15:56:07 +02:00
86eae02366 Log message on failed worker message
When a worker message fails, an error message should be logged so that the
reason for the failure is known.
2018-03-06 13:35:15 +02:00
b619fb0707 MXS-1699: Log progress messages at startup
When MaxScale is starting, the loading of the listeners can take a while
if there are a large number of services and users to load. To signal this
to the user, progress messages should be logged after every service is
started.
2018-03-06 12:47:20 +02:00
ea83420620 Merge branch '2.2' into develop 2018-03-01 18:08:45 +02:00
a197e6c859 Remove unnecessary code
A descriptor is always added to the global epoll instance or
to a specific worker, never just to _any_ worker.
2018-02-28 20:11:27 +02:00
236e906d88 Revert "Turn MariaDB Monitor struct to class with public fields"
This reverts commit cb6f70119d9857b277306e9af5881fe29c574a32.
2018-02-24 15:37:50 +02:00
e385e6fcd4 Merge branch '2.2' into develop 2018-02-22 10:13:55 +02:00
6e9e83ccaf MXS-1674 Change load granularity to 1 second
With a granularity of 1 second, the load reflects, from a human
perspective, the current situation. It also means that the maxadmin
output shows "natural" steps: 1s, 1m and 1h.
2018-02-21 13:05:58 +02:00
cb6f70119d Turn MariaDB Monitor struct to class with public fields
Allows using std::string for strings. Also, cleanup.
2018-02-21 11:00:42 +02:00
fd4fd4eead MXS-1674 Add worker load calculation
By definition, the load is calculated using the following formula:

  L = 100 * ((T - t) / T)

where T is a time period and t the time of that period that the worker
spends in epoll_wait(). So, if there is so much work that epoll_wait()
always returns immediately, then the load is 100 and if the thread
spends the entire period in epoll_wait(), then the load is 0.

The basic idea is that the timeout given to epoll_wait() is adjusted
so that epoll_wait() will always return at roughly 10-second intervals.
By making a note of when we are about to enter epoll_wait() and when we
return from it, we have all the information we need for calculating the
load.

Due to the nature of things, we will not be able to calculate the load
at exact 10-second boundaries, but it will be pretty close. And the load
is always calculated using the true length of the period.

We will then calculate 1 minute load by averaging the load value for 6
consecutive 10-second periods and the 1 hour load by averaging the load
value of 60 consecutive 1 minute loads.

So, while the 10-second load represents the load of the most recently
measured 10-second period (and not the load of the most recent 10
seconds), the 1 minute load and the 1 hour load represent the load of
the most recent minute and hour respectively.
2018-02-20 09:18:43 +02:00
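The load formula and the averaging described in fd4fd4eead can be illustrated with a small sketch. This is Python written for illustration only, not MaxScale's C++ code; `load_percent` and `LoadAverage` are hypothetical names, and the sample wait times are made up.

```python
from collections import deque


def load_percent(period: float, waited: float) -> float:
    """L = 100 * ((T - t) / T): T is the period, t the time spent waiting."""
    if period <= 0:
        raise ValueError("period must be positive")
    return 100.0 * (period - waited) / period


class LoadAverage:
    """Average of the most recent `size` load values (hypothetical helper)."""

    def __init__(self, size: int):
        self._values = deque(maxlen=size)

    def add(self, value: float) -> None:
        self._values.append(value)

    def full(self) -> bool:
        return len(self._values) == self._values.maxlen

    def value(self) -> float:
        return sum(self._values) / len(self._values) if self._values else 0.0


# Six consecutive 10-second loads make up the 1-minute load.
one_minute = LoadAverage(6)
for waited in (10.0, 5.0, 0.0, 10.0, 5.0, 0.0):
    one_minute.add(load_percent(10.0, waited))
```

A 1-hour average would be built the same way, from a `LoadAverage(60)` fed with 1-minute values.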
350eaf0e90 Add complete set of atomic_store-operations 2018-02-16 15:06:24 +02:00
1042b861bb MXS-1669: Fix load average tracking
The output of `show threads` could have a negative historic thread load
average that could be explained by the overflow of the signed 32-bit
integer used to count the number of samples.

The time that each thread started to process an event for a DCB used an
old value that is no longer used. Updating this to DCB::last_read retains
the 2.0 behavior.
2018-02-15 11:18:22 +02:00
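How an overflowing signed 32-bit sample counter can turn an average negative, as described in 1042b861bb, can be reproduced in a few lines. This is a Python sketch that emulates the wrap-around with `ctypes.c_int32`; the variable names and the total are hypothetical, and the log does not show the arithmetic MaxScale actually used.

```python
import ctypes


def add_int32(a: int, b: int) -> int:
    """Add two values and wrap the result the way a two's-complement int32 does."""
    return ctypes.c_int32(a + b).value


INT32_MAX = 2**31 - 1

# A sample counter sitting at the largest representable value...
n_samples = INT32_MAX
# ...wraps to the most negative value on the next increment.
n_samples = add_int32(n_samples, 1)

# Dividing a positive total by the wrapped counter yields a negative average.
total_load = 1000.0
average = total_load / n_samples
```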
f388e2f838 Merge branch '2.2' into develop 2018-02-12 14:00:40 +02:00
b8d3da4968 Add error tolerance to "servers_no_promotion"
Previously, if the list contained servers that were not monitored by
the monitor yet were valid servers, an error value would be returned
and the monitor failed to start.

With this update, the non-monitored servers are simply ignored when
forming the final list.

Also, added printing of the list to diagnostics.
2018-02-12 10:49:28 +02:00
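The tolerant handling of "servers_no_promotion" in b8d3da4968 amounts to filtering the configured list against the monitored servers. A minimal Python sketch of that idea follows; `parse_no_promotion` and the server names are hypothetical, and the real monitor code is not shown in this log.

```python
def parse_no_promotion(value: str, monitored: set) -> list:
    """Split a comma-separated server list and keep only monitored servers."""
    names = [name.strip() for name in value.split(",") if name.strip()]
    # Servers that are not monitored are ignored instead of causing an error.
    return [name for name in names if name in monitored]


monitored = {"server1", "server2"}
excluded = parse_no_promotion("server1, server3", monitored)
```

An empty parameter value produces an empty list rather than an error.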
faaf43ff39 Add gtid to monitor diagnostics, clean up formatting
GTIDs are now queried every monitor loop.

diagnostics() no longer prints slave-related info if the server has
no slave connection.
2018-02-10 12:32:56 +02:00
e346968e0e Merge branch '2.1' into 2.2 2018-02-10 08:28:11 +02:00
b4760c5bbe MXS-1661 Introduce 'users_refresh_time'
It is now possible to explicitly specify how frequently MaxScale
may refresh the users of a service.
2018-02-09 13:33:17 +02:00
ae160f3ff2 MXS-1661 Now only the time affects the reloading of users
Now the users will be reloaded at most once during each
USERS_REFRESH_TIME period. Earlier they could be reloaded at
most USERS_REFRESH_MAX_PER_TIME times, which in practice meant
that with repeated unauthorized login attempts they were reloaded
N times in rapid succession, without the situation being likely to
change in between.
2018-02-09 13:33:17 +02:00
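The "at most once per period" rule from ae160f3ff2 can be sketched as a small time-based limiter. This is an illustrative Python sketch, not the MaxScale implementation; `RefreshLimiter` and its injectable clock are assumptions made so the behaviour can be exercised without waiting.

```python
import time


class RefreshLimiter:
    """Permit an action at most once per `period` seconds (hypothetical helper)."""

    def __init__(self, period: float, clock=time.monotonic):
        self._period = period
        self._clock = clock
        self._last = None  # time of the last permitted refresh

    def try_refresh(self) -> bool:
        now = self._clock()
        if self._last is not None and now - self._last < self._period:
            return False  # still inside the current period: skip the reload
        self._last = now
        return True
```

With this shape, repeated failed logins inside one period trigger a single reload instead of several in rapid succession.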
b23ad6d2ef MXS-1661 Turn error into warning and suppress logging
The error regarding the refresh rate having been exceeded

    error: [RWSplit] Refresh rate limit exceeded ...

has been turned into a warning. Further, the warning will be
logged at most once per refresh period, which is currently 30s.
2018-02-09 13:33:17 +02:00
83ce603e3e Fix CentOS 6 build failure
The __sync load builtins do not work with pointers to constant variables.
2018-02-09 09:19:46 +02:00
fcde23e6fe Merge branch '2.2' into develop 2018-02-08 18:40:29 +02:00
fa37198da1 MXS-1653: Fix hang on preparation of BEGIN
When a BEGIN statement is prepared using the binary protocol, it returns a
single OK packet. Due to a bug in the code that deals with multi-statement
results and EOF packets, the response was never sent to the client.

Also added back the error messages of failed session commands to the INFO
level. This way it's still possible to see why a session command fails but
the log isn't flooded by them in normal usage.
2018-02-08 16:59:00 +02:00
a6fc2d3f88 Add missing <string> header
load_utils.cc used std::string without including the header.
2018-02-08 13:40:11 +02:00
5326c8db5c Merge branch '2.2' into develop 2018-02-08 12:48:06 +02:00
13498d8ec8 MXS-1655 Allow symbolic links to files in config directory 2018-02-07 16:43:13 +02:00
fa8f6a5da3 Fix monitor error with empty servers_no_promotion 2018-02-07 16:11:47 +02:00
1cf3de4a74 Add config parameter for excluding servers from failover
"servers_no_promotion" is a comma-separated list of servers
which cannot be chosen when selecting a new master during failover
(auto or manual), or when automatically selecting a new master
for switchover (currently disabled).

The servers in the list are redirected normally and can be promoted
by switchover when manually selecting a new master.
2018-02-07 14:07:10 +02:00
6f6c11e6a3 Ignore events for closed DCBs
If a closed DCB receives an event, it is ignored.
2018-02-06 16:36:04 +02:00
4089b6b6fd MXS-1647: Detect API version mismatch
If the API versions do not match, MaxScale will treat this as an
error. The API versioning would allow backwards compatible changes but the
functionality to handle that is not implemented in MaxScale.

Updated API versions based on changes done to module APIs in 2.2.
2018-02-06 14:51:07 +02:00
90fdbf8860 MXS-1652 Add possibility to log SQL statements
With the flag --debug=enable-statement-logging it is now possible
to instruct MaxScale to log all SQL statements it sends to the
servers.

The format of the logged string looks like:

    notice : SQL(127.0.0.1): 0, "SELECT ..."

First comes the fixed string "SQL", followed by the server address in
parentheses, followed by the actual return value of mysql_query(),
followed by the statement itself.

The "SQL" string makes the lines easy to grep for, and having the
return value before the statement makes it easier to spot, since
the length of the return value string does not vary much, but the
length of the statements does vary a lot.
2018-02-06 14:30:29 +02:00
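The line format quoted in 90fdbf8860 is regular enough to build and parse mechanically. The following Python sketch assumes exactly the layout shown in the commit message; `format_sql_log` and `SQL_LINE` are hypothetical names, and the statement is a made-up example.

```python
import re


def format_sql_log(address: str, rc: int, statement: str) -> str:
    """Build a line in the format quoted in the commit message."""
    return f'notice : SQL({address}): {rc}, "{statement}"'


# Pattern for picking the three fields back out of such a line.
SQL_LINE = re.compile(
    r'SQL\((?P<address>[^)]*)\): (?P<rc>-?\d+), "(?P<statement>.*)"$'
)

line = format_sql_log("127.0.0.1", 0, "SELECT 1")
match = SQL_LINE.search(line)
```

Because the return value sits at a nearly fixed offset after the address, failed statements can be found by searching for a non-zero value in that position.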
099b976773 MXS-1646 Remove non-blocking/timed epoll_wait calls
Since a shutdown message will now be sent via the regular epoll route,
there is no need to regularly wake up from epoll in order to check
whether shutdown has been initiated; we can simply wait in epoll_wait
until told to wake up.
2018-02-05 10:35:14 +02:00
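The idea in 099b976773, blocking indefinitely and letting the shutdown request arrive as an ordinary event, can be sketched with Python's `selectors` module (which uses epoll on Linux). The pipe, the one-byte `SHUTDOWN` message and `run_worker` are assumptions for illustration; MaxScale's own worker messaging is not shown in this log.

```python
import os
import selectors
import threading

SHUTDOWN = b"S"  # hypothetical one-byte shutdown message


def run_worker(read_fd: int, log: list) -> None:
    """Block in the selector until a message arrives; stop on SHUTDOWN."""
    with selectors.DefaultSelector() as sel:
        sel.register(read_fd, selectors.EVENT_READ)
        running = True
        while running:
            # timeout=None: wait indefinitely, no periodic wake-ups.
            for key, _events in sel.select(timeout=None):
                for byte in os.read(key.fd, 64):
                    if bytes([byte]) == SHUTDOWN:
                        log.append("shutdown")
                        running = False
                    else:
                        log.append("message")


read_fd, write_fd = os.pipe()
log = []
worker = threading.Thread(target=run_worker, args=(read_fd, log))
worker.start()
os.write(write_fd, b"x")      # an ordinary message
os.write(write_fd, SHUTDOWN)  # the shutdown message takes the same route
worker.join(timeout=5)
os.close(read_fd)
os.close(write_fd)
```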
771716e9db Merge branch '2.2' into develop 2018-02-05 10:22:43 +02:00
8a0c8e63f2 MXS-199: Support Causal Read in Read Write Splitting (#164)
* MXS-199: Support Causal Read in Read Write Splitting

* move most causal read logic into rwsplit router and get server type from monitor

* misc fix: remove new line

* refactor, move config to right place, replace ltrim with gwbuf_consume

* refactor a little for previous commit

* fix code style
2018-02-05 09:09:18 +02:00
e1f1d8e58a Merge branch '2.1' into 2.2 2018-02-02 16:05:14 +02:00
7ae931ce9c MXS-1635 Allow using specific address when connecting
In some cases you might want to use a specific address/interface
when connecting to a server instead of the default one. With the
global parameter 'local_address' it can now be specified which
address to use.
2018-02-02 15:17:22 +02:00
9f0a691233 Always stop the session by closing the client DCB
By always starting the session shutdown process by stopping the client
DCB, the manipulation of the session state can be removed from the backend
protocol modules and replaced with a fake hangup event.

Delivering this event via the core allows the actual dcb_close call on the
client DCB to be done only when the client DCB is being handled by a
worker.
2018-02-02 12:28:07 +02:00
ebf0d6fc5f Close client DCB with a hangup in the backend protocol
Directly closing the client DCB in the backend protocol modules is not
correct anymore as the state of the session doesn't change when the client
DCB is closed. By propagating the shutdown of the session with a fake
hangup to the client DCB, the closing of the DCB is done only once.

Added debug assertions that make sure all DCBs are closed only
once. Removed redundant code in the backend protocol error handling code.
2018-02-02 12:28:07 +02:00
12f5cabc50 Discard fake events for closed DCBs
If a fake event is sent to a DCB that has been closed, it should be
discarded.
2018-01-31 13:38:45 +02:00
1febafabf3 Crash after double close on debug builds
When a double close is detected in a debug build, a debug assertion is
triggered. This will generate a core dump which should help investigate
the double close.
2018-01-31 13:38:45 +02:00
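The behaviour described in 1febafabf3 and 12f5cabc50 (assert on a double close in debug builds, discard events for closed DCBs) can be modelled roughly as follows. `Descriptor` is a hypothetical Python stand-in, not the DCB structure itself; Python's `assert` is used because, like a debug assertion, it is stripped when running with `python -O`.

```python
class Descriptor:
    """Hypothetical stand-in for a DCB; only tracks whether it is closed."""

    def __init__(self):
        self.closed = False
        self.handled = 0

    def close(self) -> None:
        # Debug-only check: `python -O` strips assert statements.
        assert not self.closed, "descriptor closed twice"
        self.closed = True

    def handle_event(self) -> bool:
        if self.closed:
            return False  # events for a closed descriptor are discarded
        self.handled += 1
        return True
```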
255250652d Refactor pre-switchover, add similar checks as in failover
Now detects some erroneous situations before starting switchover.
Switchover can be activated without specifying the current master.
In this case, the cluster master server is selected.
2018-01-31 10:40:09 +02:00
3dfb972d87 Merge branch '2.1' into 2.2 2018-01-30 16:28:11 +02:00
66ec4792cd MXS-1575: Fix DATETIME handling
DATETIME values in old formats should always be 8 bytes long. This is how
MariaDB 10.2 stores them and only DATETIME2 values are stored with a
fractional part.
2018-01-30 15:59:05 +02:00
b7e475f316 MXS-1621: Detect TABLE_MAP ↔ TABLE_CREATE column count mismatch
If the TABLE_MAP and TABLE_CREATE have different column counts, an error
is logged and the row events are skipped.
2018-01-30 15:59:05 +02:00
6410b4f19a MXS-1633 Turn off collecting of sqlite3 memstats
According to customer reports, collecting the statistics has a significant
impact on the performance. As we don't need that information, we might just
as well turn it off.

Further, since maxscale-common now links to the sqlite3-library, no
module needs to do that explicitly.
2018-01-30 13:58:37 +02:00
a56d0f8992 Merge branch '2.2' into develop 2018-01-26 10:26:22 +02:00
dcd57ea21b MXS-1623 Expose descriptor counts through maxadmin 2018-01-26 10:25:19 +02:00
11b0f84b8e MXS-1623 Maintain count of current/total descriptors 2018-01-26 10:25:19 +02:00
9093f19c8b Clean up atomic_load-functions 2018-01-25 10:52:03 +02:00
b8c78ca9fe Remove erroneous casts 2018-01-25 10:52:03 +02:00
6b877de5bc Make gwbuf_{add|get}_property const correct
The functions now take const arguments.
2018-01-24 20:29:09 +02:00