Merge branch '2.3' into develop

This commit is contained in:
Johan Wikman
2018-11-06 09:51:49 +02:00
20 changed files with 316 additions and 109 deletions

View File

@ -101,10 +101,14 @@ Common options for all monitor modules.
Module specific documentation.
- [Aurora Monitor](Monitors/Aurora-Monitor.md)
- [Galera Monitor](Monitors/Galera-Monitor.md)
- [Multi-Master Monitor](Monitors/MM-Monitor.md)
- [MariaDB Monitor](Monitors/MariaDB-Monitor.md)
- [Galera Monitor](Monitors/Galera-Monitor.md)
- [ColumnStore Monitor](Monitors/ColumnStore-Monitor.md)
- [Aurora Monitor](Monitors/Aurora-Monitor.md)
Legacy monitors that have been deprecated.
- [Multi-Master Monitor](Monitors/MM-Monitor.md)
- [MySQL Cluster Monitor](Monitors/NDB-Cluster-Monitor.md)
## Protocols

View File

@ -0,0 +1,53 @@
# ColumnStore Monitor
The ColumnStore monitor, `csmon`, is a monitor module for MariaDB ColumnStore
servers. It supports multiple UM nodes and can detect the correct server for
DML/DDL statements which will be labeled as the master. Other UM nodes will be
used for reads.
## Master Selection
Automatic master detection requires ColumnStore 1.1.7 or later (1.1.7 is a
planned version at the time of writing). Older versions of ColumnStore do not
implement the functionality required to automatically detect which of the
servers is the primary UM.
With older versions the `primary` parameter must be defined to tell the monitor
which of the servers is the primary UM node. This guarantees that DDL statements
are only executed on the primary UM.
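The version requirement above can be pictured as a numeric comparison. The following is a minimal sketch, assuming the version is available as a plain `major.minor.patch` string; the function names `encode_cs_version` and `supports_auto_master_detection` are hypothetical and not part of the module.

```cpp
#include <cstdio>
#include <string>

// Encode "major.minor.patch" as major * 10000 + minor * 100 + patch,
// so that 1.1.7 becomes 10107. Returns 0 if the string cannot be parsed.
// Hypothetical helper for illustration only.
int encode_cs_version(const std::string& version)
{
    int major = 0;
    int minor = 0;
    int patch = 0;

    if (std::sscanf(version.c_str(), "%d.%d.%d", &major, &minor, &patch) != 3)
    {
        return 0;
    }

    return major * 10000 + minor * 100 + patch;
}

// Hypothetical helper: true if the version is new enough for automatic
// master detection (1.1.7 or later).
bool supports_auto_master_detection(const std::string& version)
{
    return encode_cs_version(version) >= 10107;
}
```

Encoding the version as a single integer keeps the check to one comparison, and an unparsable version falls back to the older behaviour that relies on the `primary` parameter.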
## Configuration
Read the [Monitor Common](Monitor-Common.md) document for a list of supported
common monitor parameters.
### `primary`
The `primary` parameter controls which server is chosen as the master
server. This is an optional parameter.
If the server pointed to by this parameter is available and is ready to process
queries, it receives the _Master_ status. If the parameter is not defined and
the ColumnStore server does not support the `mcsSystemPrimary` function, no
master server is chosen.
Note that this parameter is only used when the server does not implement the
required functionality. Otherwise the parameter is ignored as the information
from ColumnStore itself is more reliable.
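The selection rules described above can be sketched as a small decision function. This is an illustration under stated assumptions, not the module's code: the `UmNode` struct, the `Role` enum and `choose_role` are hypothetical names, and the inputs are assumed to have been collected from the server beforehand.

```cpp
// Hypothetical role values for illustration.
enum class Role
{
    None,
    Master,
    Slave
};

// Hypothetical snapshot of what the monitor knows about one UM node.
struct UmNode
{
    bool ready;                  // server is available and ready to process queries
    bool has_primary_function;   // server supports the mcsSystemPrimary function
    bool reports_primary;        // value reported by mcsSystemPrimary, if supported
    bool is_configured_primary;  // the `primary` parameter points at this server
};

Role choose_role(const UmNode& um)
{
    if (!um.ready)
    {
        // A server that is not ready gets neither status.
        return Role::None;
    }

    if (um.has_primary_function)
    {
        // Information from ColumnStore itself wins; `primary` is ignored.
        return um.reports_primary ? Role::Master : Role::Slave;
    }

    // Older servers: fall back to the `primary` parameter. If it is not
    // defined, no server is flagged and therefore no master is chosen.
    return um.is_configured_primary ? Role::Master : Role::Slave;
}
```

Note that a node named in `primary` is still treated as a slave when the server reports its own role, which matches the rule that the parameter is only a fallback.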
## Example
The following is an example of a `csmon` configuration.
```
[CS-Monitor]
type=monitor
module=csmon
servers=um1,um2,um3
user=myuser
passwd=mypwd
monitor_interval=5000
primary=um1
```
It defines a set of three UM nodes and designates `um1` as the primary UM.

View File

@ -1,5 +1,7 @@
# Multi-Master Monitor
**NOTE:** This module has been deprecated; do not use it.
## Overview
The Multi-Master Monitor is a monitoring module for MaxScale that monitors Master-Master replication.

View File

@ -1,5 +1,7 @@
# NDB Cluster Monitor
**NOTE:** This module has been deprecated; do not use it.
## Overview
The MySQL Cluster Monitor is a monitoring module for MaxScale that monitors a MySQL Cluster. It assigns an NDB status for the server if it is a part of a MySQL Cluster.

View File

@ -55,15 +55,25 @@ the [JSON API](http://jsonapi.org/format/) specification.
- [sessions](Resources-Session.md)
- [users](Resources-User.md)
All of the current resources are in the `/v1/` namespace of the MaxScale REST
API. Further additions to the namespace can be added that do not break backwards
compatibility of any existing resources.
In addition to the named resources, the REST API will respond with an HTTP 200 OK
response to GET requests on the root resource (`/`) as well as the namespace
root resource (`/v1/`). These can be used for HTTP health checks to determine
whether MaxScale is running.
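A health check of this kind boils down to sending a GET request and looking at the status code. The sketch below only shows the request text and the status-line check as pure functions; it does no networking, and the function names and the host value passed in are assumptions made for illustration.

```cpp
#include <string>

// Build a minimal HTTP/1.1 GET request for a path such as "/" or "/v1/".
// The host value is supplied by the caller; nothing here is MaxScale-specific.
std::string make_health_request(const std::string& host, const std::string& path)
{
    return "GET " + path + " HTTP/1.1\r\n"
           "Host: " + host + "\r\n"
           "Connection: close\r\n\r\n";
}

// Return true if the first line of the raw response reports status 200.
// The status line has the form "HTTP/<version> <code> <reason>".
bool is_healthy(const std::string& response)
{
    const std::string status_line = response.substr(0, response.find("\r\n"));
    const std::string::size_type first_space = status_line.find(' ');

    if (status_line.compare(0, 5, "HTTP/") != 0 || first_space == std::string::npos)
    {
        return false;
    }

    const std::string::size_type second_space = status_line.find(' ', first_space + 1);
    const std::string code = status_line.substr(
        first_space + 1,
        second_space == std::string::npos ? std::string::npos
                                          : second_space - first_space - 1);

    return code == "200";
}
```

A caller would send the output of `make_health_request` over a connection to the REST API listener and pass whatever comes back to `is_healthy`.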
## API Versioning
All of the current resources are in the `/v1/` namespace of the MaxScale REST
API. New resources can be added to the namespace as long as they do not break
backwards compatibility of any existing resources. What this means in practice is that:
* No resources or URLs will be removed
* The API will be JSON API compliant
Note that this means that the contents of individual resources can change. New
fields can be added, old ones can be removed and the meaning of existing fields
can change. The aim is to be as backwards compatible as reasonably possible
without sacrificing the clarity and functionality of the API.
### Resource Relationships
All resources return complete JSON objects. The returned objects can have a

View File

@ -393,8 +393,6 @@ alter:
set:
set server - Set the status of a server
set pollsleep - Set poll sleep period
set nbpolls - Set non-blocking polls
set log_throttling - Set the log throttling configuration
clear:
@ -1743,57 +1741,11 @@ maxadmin call command dbfwfilter rules/reload my-firewall-filter /home/user/rule
Here the name of the filter is _my-firewall-filter_ and the optional rule file
path is `/home/user/rules.txt`.
# Tuning MariaDB MaxScale
# MaxScale Internals
The way that MariaDB MaxScale does its polling is that each of the polling
threads, as defined by the threads parameter in the configuration file, will
call epoll_wait to obtain the events that are to be processed. The events are
then added to a queue for execution. Any thread can read from this queue, not
just the thread that added the event.
Once the thread has done an epoll call with no timeout it will either do an
epoll_wait call with a timeout or it will take an event from the queue if there
is one. These two new parameters affect this behavior.
The first parameter, which may be set by using the non_blocking_polls option in
the configuration file, controls the number of epoll_wait calls that will be
issued without a timeout before MariaDB MaxScale will make a call with a timeout
value. The advantage of performing a call without a timeout is that the kernel
treats this case differently and will not reschedule the process.
If a timeout is passed then the system call will cause the MariaDB
MaxScale thread to be put back in the scheduling queue and may result in lost
CPU time to MariaDB MaxScale. Setting the value of this parameter too high will
cause MariaDB MaxScale to consume a lot of CPU when there is infrequent work to
be done. The default value of this parameter is 3.
This parameter may also be set via the maxadmin client using the command _set
nbpolls <number>_.
The second parameter is the maximum sleep value that MariaDB MaxScale will pass
to epoll_wait. What normally happens is that MariaDB MaxScale will do an
epoll_wait call with a sleep value that is 10% of the maximum; each time the
call returns and there is no more work to be done, MariaDB MaxScale will increase
this percentage by 10%. This will continue until the maximum value is reached or
until there is some work to be done. Once the thread finds some work to be done
it will reset the sleep time it uses to 10% of the maximum.
The maximum sleep time is set in milliseconds and can be placed in the
[maxscale] section of the configuration file with the poll_sleep
parameter. Alternatively it may be set in the maxadmin client using the command
_set pollsleep <number>_. The default value of this parameter is 1000.
Setting this value too high means that if a thread collects a large number of
events and adds to the event queue, the other threads might not return from the
epoll_wait calls they are running for some time resulting in less overall
performance. Setting the sleep time too low will cause MariaDB MaxScale to wake
up too often and consume CPU time when there is no work to be done.
The _show epoll_ command can be used to see how often we actually poll with a
timeout, the first two values output are significant. Also the "Number of wake
with pending events" is a good measure. This is the count of the number of times
a blocking call returned to find there was some work waiting from another
thread. If the value is increasing rapidly reducing the maximum sleep value and
increasing the number of non-blocking polls should help the situation.
The _show epoll_ command can be used to see what kind of events have been
processed and also how many events on average have been returned by each
call to `epoll_wait`.
```
MaxScale> show epoll
@ -1801,16 +1753,12 @@ MaxScale> show epoll
Poll Statistics.
No. of epoll cycles: 343
No. of epoll cycles with wait: 66
No. of epoll calls returning events: 19
No. of non-blocking calls returning events: 10
No. of read events: 2
No. of write events: 15
No. of error events: 0
No. of hangup events: 0
No. of accept events: 4
No. of times no threads polling: 4
Total event queue length: 1
Average event queue length: 1
Maximum event queue length: 1
No of poll completions with descriptors
@ -1828,19 +1776,10 @@ No of poll completions with descriptors
MaxScale>
```
If the "Number of DCBs with pending events" grows rapidly it is an indication
that MariaDB MaxScale needs more threads to be able to keep up with the load it
is under.
The _show threads_ command can be used to see the historic average for the
pending events queue; it gives 15 minute, 5 minute and 1 minute averages. The
load average it displays is the event count per poll cycle data. An ideal load is
1; in this case MariaDB MaxScale threads are fully occupied but nothing is
waiting for threads to become available for processing.
The _show eventstats_ command can be used to see statistics about how long
events have been queued before processing takes place and also how long the
events took to execute once they have been allocated a thread to run on.
The _show eventstats_ command can be used to see statistics about how
long events have waited after being returned from `epoll_wait` until
they were processed, and how long the processing of events has taken
once it has started.
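The two durations reported by this command can be thought of as differences between three timestamps per event. The following is a minimal sketch of that bookkeeping, assuming each event records when `epoll_wait` returned it, when processing started and when processing finished; `EventTimes` and `EventStats` are hypothetical types, not MaxScale's own.

```cpp
#include <algorithm>
#include <chrono>

using Clock = std::chrono::steady_clock;
using Millis = std::chrono::milliseconds;

// Hypothetical per-event timestamps.
struct EventTimes
{
    Clock::time_point returned;  // epoll_wait returned the event
    Clock::time_point started;   // processing of the event began
    Clock::time_point finished;  // processing of the event ended
};

// Hypothetical accumulator for the maximum queue and execution times.
struct EventStats
{
    Millis max_queue_time {0};
    Millis max_exec_time {0};

    void record(const EventTimes& t)
    {
        // Queue time: from epoll_wait returning until processing starts.
        const Millis queued = std::chrono::duration_cast<Millis>(t.started - t.returned);
        // Execution time: from the start of processing until it ends.
        const Millis exec = std::chrono::duration_cast<Millis>(t.finished - t.started);

        max_queue_time = std::max(max_queue_time, queued);
        max_exec_time = std::max(max_exec_time, exec);
    }
};
```

A monotonic clock is used in the sketch so that wall-clock adjustments cannot produce negative durations.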
```
MaxScale> show eventstats
@ -1849,7 +1788,6 @@ Event statistics.
Maximum queue time: 000ms
Maximum execution time: 000ms
Maximum event queue length: 1
Total event queue length: 4
Average event queue length: 1
| Number of events

View File

@ -16,6 +16,31 @@ Secondary masters can now be specified also when file + position
based replication is used. Earlier it was possible only in conjunction
with GTID based replication.
### `mmmon` and `ndbclustermon`
Both of these modules have been deprecated and will be removed in a future
release. The functionality in `mmmon` has been largely obsoleted by the
advancements in `mariadbmon`. The `ndbclustermon` module is largely obsolete
because it has virtually no users.
### Deprecated options
The following configuration file options have been deprecated and will
be removed in 2.4.
#### Global section
* `non_blocking_polls`, ignored.
* `poll_sleep`, ignored.
* `thread_stack_size`, ignored.
#### Services and Monitors
* `passwd`, replaced with `password`.
### MaxAdmin
The commands `set pollsleep` and `set nbpolls` have been deprecated and
will be removed in 2.4.
## New Features
### MaxCtrl

View File

@ -162,7 +162,7 @@ int main(int argc, char** argv)
// If initialisation failed, fail the test immediately.
if (test.global_result != 0)
{
delete_event(test);
try_delete_event(test);
return test.global_result;
}
@ -182,7 +182,7 @@ int main(int argc, char** argv)
// Again, stop on failure.
if (test.global_result != 0)
{
delete_event(test);
try_delete_event(test);
return test.global_result;
}
@ -206,7 +206,7 @@ int main(int argc, char** argv)
if (test.global_result != 0)
{
delete_event(test);
try_delete_event(test);
return test.global_result;
}
@ -225,7 +225,7 @@ int main(int argc, char** argv)
check_event_status(test, 0, EVENT_NAME, "ENABLED");
if (test.global_result != 0)
{
delete_event(test);
try_delete_event(test);
return test.global_result;
}
@ -237,7 +237,7 @@ int main(int argc, char** argv)
test.expect(states.count("Slave") == 1, "%s is not a slave.", server_name.c_str());
}
delete_event(test);
try_delete_event(test);
if (test.global_result != 0)
{
test.repl->fix_replication();

View File

@ -48,11 +48,8 @@ struct WORKER_STATISTICS
int64_t n_accept = 0; /*< Number of accept events */
int64_t n_polls = 0; /*< Number of poll cycles */
int64_t n_pollev = 0; /*< Number of polls returning events */
int64_t n_nbpollev = 0; /*< Number of polls returning events */
int64_t evq_avg = 0; /*< Average event queue length */
int64_t evq_max = 0; /*< Maximum event queue length */
int64_t blockingpolls = 0; /*< Number of epoll_waits with a timeout
* specified */
int64_t maxqtime = 0;
int64_t maxexectime = 0;
std::array<int64_t, MAXNFDS> n_fds {}; /*< Number of wakeups with particular n_fds value */

View File

@ -1277,7 +1277,7 @@ void dcb_final_close(DCB* dcb)
else
{
// Only listeners are closed with a fd of -1
mxb_assert(dcb->dcb_role == DCB_ROLE_SERVICE_LISTENER);
mxb_assert(dcb->dcb_role == DCB_ROLE_SERVICE_LISTENER || dcb->dcb_role == DCB_ROLE_INTERNAL);
}
dcb->state = DCB_STATE_DISCONNECTED;

View File

@ -104,9 +104,7 @@ void dprintPollStats(DCB* dcb)
dcb_printf(dcb, "\nPoll Statistics.\n\n");
dcb_printf(dcb, "No. of epoll cycles: %" PRId64 "\n", s.n_polls);
dcb_printf(dcb, "No. of epoll cycles with wait: %" PRId64 "\n", s.blockingpolls);
dcb_printf(dcb, "No. of epoll calls returning events: %" PRId64 "\n", s.n_pollev);
dcb_printf(dcb, "No. of non-blocking calls returning events: %" PRId64 "\n", s.n_nbpollev);
dcb_printf(dcb, "No. of read events: %" PRId64 "\n", s.n_read);
dcb_printf(dcb, "No. of write events: %" PRId64 "\n", s.n_write);
dcb_printf(dcb, "No. of error events: %" PRId64 "\n", s.n_error);

View File

@ -397,9 +397,9 @@ bool qc_setup(const QC_CACHE_PROPERTIES* cache_properties,
if (cache_max_size)
{
int64_t size_per_thr = cache_max_size / config_get_global_options()->n_threads;
MXS_NOTICE("Query classification results are cached and reused. "
"Memory used per thread: %s",
mxb::to_binary_size(cache_max_size).c_str());
"Memory used per thread: %s", mxb::to_binary_size(size_per_thr).c_str());
}
else
{

View File

@ -759,10 +759,8 @@ Worker::STATISTICS RoutingWorker::get_statistics()
cs.n_accept = mxs::sum(s, &STATISTICS::n_accept);
cs.n_polls = mxs::sum(s, &STATISTICS::n_polls);
cs.n_pollev = mxs::sum(s, &STATISTICS::n_pollev);
cs.n_nbpollev = mxs::sum(s, &STATISTICS::n_nbpollev);
cs.evq_avg = mxs::avg(s, &STATISTICS::evq_avg);
cs.evq_max = mxs::max(s, &STATISTICS::evq_max);
cs.blockingpolls = mxs::sum(s, &STATISTICS::blockingpolls);
cs.maxqtime = mxs::max(s, &STATISTICS::maxqtime);
cs.maxexectime = mxs::max(s, &STATISTICS::maxexectime);
cs.n_fds = mxs::sum_element(s, &STATISTICS::n_fds);
@ -1033,9 +1031,6 @@ public:
json_object_set_new(pStats, "errors", json_integer(s.n_error));
json_object_set_new(pStats, "hangups", json_integer(s.n_hup));
json_object_set_new(pStats, "accepts", json_integer(s.n_accept));
json_object_set_new(pStats, "blocking_polls", json_integer(s.blockingpolls));
// TODO: When REST-API v2 is published, remove 'event_queue_length'.
json_object_set_new(pStats, "event_queue_length", json_integer(s.evq_avg));
json_object_set_new(pStats, "avg_event_queue_length", json_integer(s.evq_avg));
json_object_set_new(pStats, "max_event_queue_length", json_integer(s.evq_max));
json_object_set_new(pStats, "max_exec_time", json_integer(s.maxexectime));

View File

@ -571,7 +571,7 @@ void dprintServer(DCB* dcb, const SERVER* srv)
{
ave_os << "not available";
}
dcb_printf(dcb, "\tAverage response time: %s\n", ave_os.str().c_str());
dcb_printf(dcb, "\tAdaptive avg. select time: %s\n", ave_os.str().c_str());
if (server->persistpoolmax)
{
@ -1448,6 +1448,9 @@ static json_t* server_json_attributes(const SERVER* server)
json_object_set_new(stats, "active_operations", json_integer(server->stats.n_current_ops));
json_object_set_new(stats, "routed_packets", json_integer(server->stats.packets));
maxbase::Duration response_ave(server_response_time_average(server));
json_object_set_new(stats, "adaptive_avg_select_time", json_string(to_string(response_ave).c_str()));
json_object_set_new(attr, "statistics", stats);
return attr;

View File

@ -4,3 +4,4 @@ add_subdirectory(grmon)
add_subdirectory(mariadbmon)
add_subdirectory(mmmon)
add_subdirectory(ndbclustermon)
add_subdirectory(csmon)

View File

@ -0,0 +1,4 @@
add_library(csmon SHARED csmon.cc)
target_link_libraries(csmon maxscale-common)
set_target_properties(csmon PROPERTIES VERSION "1.0.0")
install_module(csmon core)

View File

@ -0,0 +1,141 @@
/*
* Copyright (c) 2016 MariaDB Corporation Ab
*
* Use of this software is governed by the Business Source License included
* in the LICENSE.TXT file and at www.mariadb.com/bsl11.
*
* Change Date: 2022-01-01
*
* On the date above, in accordance with the Business Source License, use
* of this software will be governed by version 2 or later of the General
* Public License.
*/
#define MXS_MODULE_NAME "csmon"
#include "csmon.hh"
#include <regex>
#include <vector>
#include <string>
#include <maxscale/modinfo.h>
#include <maxscale/mysql_utils.h>
namespace
{
constexpr const char* alive_query = "SELECT mcsSystemReady() = 1 && mcsSystemReadOnly() <> 2";
constexpr const char* role_query = "SELECT mcsSystemPrimary()";
// Helper for extracting string results from queries
static std::string do_query(MXS_MONITORED_SERVER* srv, const char* query)
{
std::string rval;
MYSQL_RES* result;
if (mxs_mysql_query(srv->con, query) == 0 && (result = mysql_store_result(srv->con)))
{
MYSQL_ROW row = mysql_fetch_row(result);
if (row && row[0])
{
rval = row[0];
}
mysql_free_result(result);
}
else
{
mon_report_query_error(srv);
}
return rval;
}
// Returns a numeric version similar to mysql_get_server_version
int get_cs_version(MXS_MONITORED_SERVER* srv)
{
std::string result = do_query(srv, "SELECT @@version_comment");
std::regex re("Columnstore ([0-9]*)[.]([0-9]*)[.]([0-9]*)-[0-9]*");
std::smatch match;
int rval = 0;
if (std::regex_match(result, match, re) && match.size() == 4)
{
rval = atoi(match[1].str().c_str()) * 10000 + atoi(match[2].str().c_str()) * 100
+ atoi(match[3].str().c_str());
}
return rval;
}
}
CsMonitor::CsMonitor(MXS_MONITOR* monitor)
: maxscale::MonitorInstanceSimple(monitor)
, m_primary(config_get_server(monitor->parameters, "primary"))
{
}
CsMonitor::~CsMonitor()
{
}
// static
CsMonitor* CsMonitor::create(MXS_MONITOR* monitor)
{
return new CsMonitor(monitor);
}
bool CsMonitor::has_sufficient_permissions() const
{
return check_monitor_permissions(m_monitor, alive_query);
}
void CsMonitor::update_server_status(MXS_MONITORED_SERVER* srv)
{
monitor_clear_pending_status(srv, SERVER_MASTER | SERVER_SLAVE | SERVER_RUNNING);
int status = 0;
if (do_query(srv, alive_query) == "1")
{
status |= SERVER_RUNNING;
if (get_cs_version(srv) >= 10107)
{
// 1.1.7 should support the mcsSystemPrimary function
// TODO: Update when the actual release is out
status |= do_query(srv, role_query) == "1" ? SERVER_MASTER : SERVER_SLAVE;
}
else
{
status |= srv->server == m_primary ? SERVER_MASTER : SERVER_SLAVE;
}
}
monitor_set_pending_status(srv, status);
}
extern "C" MXS_MODULE* MXS_CREATE_MODULE()
{
static MXS_MODULE info =
{
MXS_MODULE_API_MONITOR,
MXS_MODULE_BETA_RELEASE,
MXS_MONITOR_VERSION,
"MariaDB ColumnStore monitor",
"V1.0.0",
MXS_NO_MODULE_CAPABILITIES,
&maxscale::MonitorApi<CsMonitor>::s_api,
NULL, /* Process init. */
NULL, /* Process finish. */
NULL, /* Thread init. */
NULL, /* Thread finish. */
{
{"primary", MXS_MODULE_PARAM_SERVER},
{MXS_END_MODULE_PARAMS}
}
};
return &info;
}

View File

@ -0,0 +1,35 @@
/*
* Copyright (c) 2018 MariaDB Corporation Ab
*
* Use of this software is governed by the Business Source License included
* in the LICENSE.TXT file and at www.mariadb.com/bsl11.
*
* Change Date: 2022-01-01
*
* On the date above, in accordance with the Business Source License, use
* of this software will be governed by version 2 or later of the General
* Public License.
*/
#pragma once
#include <maxscale/ccdefs.hh>
#include <maxscale/monitor.hh>
class CsMonitor : public maxscale::MonitorInstanceSimple
{
public:
CsMonitor(const CsMonitor&) = delete;
CsMonitor& operator=(const CsMonitor&) = delete;
~CsMonitor();
static CsMonitor* create(MXS_MONITOR* monitor);
protected:
bool has_sufficient_permissions() const;
void update_server_status(MXS_MONITORED_SERVER* monitored_server);
private:
CsMonitor(MXS_MONITOR* monitor);
SERVER* m_primary;
};

View File

@ -302,7 +302,7 @@ static void log_server_connections(select_criteria_t criteria, const SRWBackendL
maxbase::Duration response_ave(server_response_time_average(b->server));
std::ostringstream os;
os << response_ave;
MXS_INFO("Average response time : %s from \t[%s]:%d %s",
MXS_INFO("adaptive avg. select time: %s from \t[%s]:%d %s",
os.str().c_str(),
b->server->address,
b->server->port,

View File

@ -575,6 +575,17 @@ void RWSplitSession::clientReply(GWBUF* writebuf, DCB* backend_dcb)
mxb_assert(backend->get_reply_state() == REPLY_STATE_DONE);
MXS_INFO("Reply complete, last reply from %s", backend->name());
ResponseStat& stat = backend->response_stat();
stat.query_ended();
if (stat.is_valid() && (stat.sync_time_reached()
|| server_response_time_num_samples(backend->server()) == 0))
{
server_add_response_average(backend->server(),
stat.average().secs(),
stat.num_samples());
stat.reset();
}
if (m_config.causal_reads)
{
// The reply should never be complete while we are still waiting for the header.
@ -650,18 +661,6 @@ void RWSplitSession::clientReply(GWBUF* writebuf, DCB* backend_dcb)
m_can_replay_trx = true;
}
ResponseStat& stat = backend->response_stat();
stat.query_ended();
if (stat.is_valid() && (stat.sync_time_reached()
|| server_response_time_num_samples(backend->server()) == 0))
{
server_add_response_average(backend->server(),
stat.average().secs(),
stat.num_samples());
stat.reset();
}
if (backend->in_use() && backend->has_session_commands())
{
// Backend is still in use and has more session commands to execute