Just like the thread stats and poll stats earlier, the queue stats
are now moved to the worker.
A little more refactoring, and the polling will only work on local
data.
Each worker now has a separate structure for collecting the
polling statistics, which is passed to epoll_waitevents(). When
the stats are asked for, we loop over all the separate stats and
combine them. So, instead of each individual statistic of every
thread being one cacheline apart, each thread now has all of its
statistics in one lump that, for obvious reasons, is apart from
the lumps of the other threads.
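Schematically, with names of my own invention (the actual structure
in MaxScale differs):

#include <stdint.h>
#include <string.h>

#define N_WORKERS 4 /* Illustrative; the real count is configurable. */

typedef struct worker_stats
{
    int64_t n_read;   /* Read events handled by this worker. */
    int64_t n_write;  /* Write events handled by this worker. */
    int64_t n_accept; /* Accepts handled by this worker. */
} WORKER_STATS;

/* Each worker owns one lump of statistics; the lumps of different
 * workers end up on different cachelines. */
static WORKER_STATS stats[N_WORKERS];

/* Combined only when the stats are asked for. */
static void combine_stats(WORKER_STATS *combined)
{
    memset(combined, 0, sizeof(*combined));

    for (int i = 0; i < N_WORKERS; i++)
    {
        combined->n_read += stats[i].n_read;
        combined->n_write += stats[i].n_write;
        combined->n_accept += stats[i].n_accept;
    }
}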
The primary purpose of this exercise is to remove the hardwired
nature of the statistics collection. For instance, the admin
thread will be doing I/O, but that I/O should not be included
in the statistics of the workers.
The Worker no longer creates a pipe and implements the cross
worker/thread message mechanism itself. Instead it has a
MessageQueue instance variable for that purpose.
MessageQueue encapsulates a message queue built on top of a
pipe. The message queue needs a handler for receiving messages
and must be added to a worker for pumping messages through the
pipe.
Each Worker will have an instance of MessageQueue.
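A rough sketch of the mechanism in plain C, with illustrative names
(the real MessageQueue is a C++ class with a different interface):
fixed-size messages are written to a pipe whose read end sits in the
worker's poll set, so messages are always delivered on the worker's
own thread.

#include <unistd.h>
#include <stdint.h>

typedef struct message
{
    uint32_t id;   /* What the message means. */
    intptr_t arg1; /* Message specific argument. */
    intptr_t arg2; /* Message specific argument. */
} MESSAGE;

/* Called from any thread; the write is atomic, as the message is
 * smaller than PIPE_BUF. */
static int message_queue_post(int write_fd, const MESSAGE *msg)
{
    return write(write_fd, msg, sizeof(*msg)) == sizeof(*msg);
}

/* Called on the worker's own thread when the read descriptor,
 * which sits in the worker's poll set, turns readable. */
static void message_queue_pump(int read_fd, void (*handler)(const MESSAGE *))
{
    MESSAGE msg;

    while (read(read_fd, &msg, sizeof(msg)) == sizeof(msg))
    {
        handler(&msg);
    }
}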
This is the first step in turning the worker mechanism and everything
around it into a set of C++ classes. In this change, the original C
API is still present, but it will be removed in subsequent changes.
This is not globally safe yet, but all other access is directly or
indirectly related to maxadmin, which is irrelevant as far as
performance testing is concerned.
Now possible to send a function and arguments to a specific worker
thread for execution.
In particular, this will be used for transferring the injection of
fake hangup events into DCBs, related to a particular server, from
the monitor thread to the worker threads, thus removing the need
for locks.
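A sketch of how such a call might be posted, assuming the
mxs_worker_post_message function described below and a hypothetical
MXS_WORKER_MSG_CALL message id; both the signature and the message id
are assumptions about the actual API:

#include <stdbool.h>
#include <stdint.h>

typedef struct mxs_worker MXS_WORKER;

/* Assumed signature; can be called from any thread. */
bool mxs_worker_post_message(MXS_WORKER *worker, uint32_t msg_id,
                             intptr_t arg1, intptr_t arg2);

/* Assumed message id meaning "call arg1 as a function with arg2". */
#define MXS_WORKER_MSG_CALL 1

/* Runs on the worker's own thread, so it may freely touch the DCBs
 * owned by that thread; no locks are needed. */
static void inject_fake_hangup(void *server)
{
    /* Inject hangup events into this thread's DCBs of 'server'. */
    (void)server;
}

static void post_fake_hangup(MXS_WORKER *worker, void *server)
{
    mxs_worker_post_message(worker, MXS_WORKER_MSG_CALL,
                            (intptr_t)inject_fake_hangup,
                            (intptr_t)server);
}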
The shutdown is now performed so that a shutdown message is
sent to all workers. When the workers receive that message, they
turn on a shutdown flag, which is subsequently checked in the poll
loop.
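In outline (names illustrative):

#include <stdbool.h>
#include <stdint.h>

#define MSG_SHUTDOWN 2 /* Illustrative message id. */

typedef struct worker
{
    bool should_shutdown; /* Only ever touched by the owning thread. */
    /* ... poll set, message queue, statistics, ... */
} WORKER;

/* Runs on the worker's thread when a message arrives via the pipe. */
static void worker_handle_message(WORKER *worker, uint32_t msg_id)
{
    if (msg_id == MSG_SHUTDOWN)
    {
        worker->should_shutdown = true;
    }
}

/* The poll loop checks the flag between epoll_wait() rounds; the
 * message pipe is in the poll set, so the message wakes the loop up. */
static void worker_poll_loop(WORKER *worker)
{
    while (!worker->should_shutdown)
    {
        /* epoll_wait(...) and dispatch events, including the
         * message pipe becoming readable. */
    }
}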
MXS_WORKER is an abstraction of a worker aka worker thread.
It has a pipe whose read descriptor is added to the worker/thread
specific poll set and a write descriptor used for sending messages
to the worker.
The worker exposes a function mxs_worker_post_message, with which
messages can be sent to the worker. These messages can be sent from
any thread but will be delivered on the thread dedicated for the
worker.
To illustrate how it works, maxadmin has been provided with a new
command "ping workers" that sends a message to every worker, which
then logs a message to the log.
Additional refactoring is needed, since there are currently overlaps
and undesirable interactions between the poll mechanism, the thread
mechanism and the worker mechanism.
This is currently visible, for instance, in it not being possible to
shut down MaxScale. The reason is that the workers should be shut down
first, then the poll mechanism and finally the threads. The shutdown
needs to be arranged so that a shutdown message is sent to the workers,
which then cause the polling loop to exit, which in turn will cause
the threads to exit.
That can be arranged cleanly by making poll_waitevents() a "method"
of the worker, which implies that the poll set becomes a "member
variable" of the worker.
To be continued.
The whole worker thread mechanism assumes EPOLLET and non-blocking
descriptors, so that should be the default.
TODO: In debug mode, check that the provided file descriptor indeed
is non-blocking.
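For reference, the standard epoll arrangement this assumes:

#include <fcntl.h>
#include <sys/epoll.h>

/* EPOLLET requires a non-blocking descriptor. */
static int make_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    return flags == -1 ? -1 : fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Add a descriptor to a poll set in edge-triggered mode. In debug
 * mode one could assert that fcntl(fd, F_GETFL, 0) & O_NONBLOCK. */
static int add_fd(int epoll_fd, int fd, void *data)
{
    struct epoll_event ev;
    ev.events = EPOLLIN | EPOLLOUT | EPOLLET;
    ev.data.ptr = data;
    return epoll_ctl(epoll_fd, EPOLL_CTL_ADD, fd, &ev);
}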
The handler callback should now return a bitmask with bits set
according to what it did when it was called. That way the actual
statistics gathering can be done in poll_waitevents() and the
handler need not be aware of any thread structs.
Actually, the only thing that needs any assistance is accept handling,
because in poll_waitevents() we do not know whether a READ event
relates to a listening or a normal socket, that is, should the
event be counted as an accept or as a read.
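Sketched with hypothetical bit names (the actual constants may be
named differently):

#include <stdint.h>

/* Hypothetical return bits of the handler callback; only the handler
 * knows whether a READ event on its descriptor was an accept on a
 * listening socket or a read on a normal one. */
#define MXS_POLL_ACCEPT 0x01
#define MXS_POLL_READ   0x02
#define MXS_POLL_WRITE  0x04

typedef struct
{
    int64_t n_accept;
    int64_t n_read;
    int64_t n_write;
} STATS;

/* In poll_waitevents(): gather the statistics from the returned
 * bits; the handler needs no knowledge of any thread structs. */
static void gather_stats(STATS *ts, uint32_t actions)
{
    if (actions & MXS_POLL_ACCEPT)
    {
        ts->n_accept += 1;
    }

    if (actions & MXS_POLL_READ)
    {
        ts->n_read += 1;
    }

    if (actions & MXS_POLL_WRITE)
    {
        ts->n_write += 1;
    }
}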
This is just a first step in a trial that will allow the addition
of any file descriptor to the general poll mechanism and hence
allow any i/o to be handled by the worker threads.
There is a structure

typedef struct mxs_poll_data
{
    void (*handler)(struct mxs_poll_data *data, int wid, uint32_t events);
    struct
    {
        int id;
    } thread;
} MXS_POLL_DATA;
that any other structure (e.g. a DCB) encapsulating a file descriptor must
have as its first member (a C++ struct could basically derive from it).
The structure contains two members: 'handler' and 'thread.id'. 'handler' is
a pointer to a function taking a pointer to a struct mxs_poll_data, a worker
thread id and an epoll event mask as arguments.
So, DCB is modified to have MXS_POLL_DATA as its first member and 'handler'
is initialized with a function that *knows* the passed MXS_POLL_DATA can
be downcast to a DCB.
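Schematically, using the MXS_POLL_DATA definition above (the real DCB
has many more members):

typedef struct dcb
{
    MXS_POLL_DATA poll; /* Must be the first member. */
    int fd;
    /* ... the rest of the DCB ... */
} DCB;

/* Will be set in dcb_alloc(); *knows* that the MXS_POLL_DATA it is
 * handed is embedded at the start of a DCB. */
static void dcb_poll_handler(struct mxs_poll_data *data, int wid,
                             uint32_t events)
{
    DCB *dcb = (DCB *)data; /* Safe: 'poll' is the first member. */
    (void)wid;
    (void)events;
    /* Handle the epoll events for this DCB. */
    (void)dcb;
}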
process_pollq no longer exists; it is now called process_pollq_dcb. The
general stuff related to statistics etc. will be moved to poll_waitevents
itself, after which the whole function will be moved to dcb.c. At that
point, the handler pointer will be set in dcb_alloc().
Effectively poll.[h|c] will provide a generic mechanism for listening on
whatever descriptors and the dcb stuff will be part of dcb.[h|c].
A module can now declare a path parameter for a directory that does not
yet exist. If the directory does not exist, MaxScale will create the
directory with the requested permissions.
This number (defaults to 1) sets how many times mon_connect_to_db
will try to connect to a backend before returning an error. Every
connection attempt may take backend_connect_timeout seconds to
complete.
Also refactored code a bit. Renamed mon_connect_to_db to
mon_ping_or_connect_to_db, since it does not connect if the connection
is already alive.
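Assuming the new parameter is called backend_connect_attempts (the
name is an assumption here, as it is not stated above), a monitor
section might look like:

[MariaDB-Monitor]
type=monitor
module=mysqlmon
servers=server1,server2
user=monitor_user
password=monitor_pw
backend_connect_timeout=3
backend_connect_attempts=2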
New parameter added to the maxrows filter:

max_resultset_return=empty|error|ok

The default, 'empty', is to return an empty result set, as in the
current implementation. 'error' will return an ERR reply containing
the input SQL statement and 'ok' will return an OK packet.
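For example (the section name and the other lines are illustrative):

[MaxRows]
type=filter
module=maxrows
max_resultset_return=error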
The new type allows routers to send queries and get complete result sets
as a response. This makes it easy for a router to send commands that
create result sets and then parse the results itself.
Currently only the schemarouter benefits from this new capability as it
generates the database mappings by parsing the output of a SHOW DATABASES
query.
Some of the protocol modules use ssize_t instead of size_t.
Split the function that counts the number of response packets a session
command will receive into two parts. This allows it to be reused
elsewhere.
The readwritesplit now sends COM_PING queries to backend servers that have
been idle for too long. The option is configured with the
`connection_keepalive` parameter.
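A configuration sketch, assuming the value is a time in seconds and
with the other parameters purely illustrative:

[RW-Split-Router]
type=service
router=readwritesplit
servers=server1,server2
user=maxscale_user
password=maxscale_pw
connection_keepalive=300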
The contents of the existing filter.cc were copied into filter.c, which
was subsequently renamed to filter.cc.
The way the session is called as the last filter in the filter chain
is really dubious and ought to be rearranged so that the blind casting
of a session to a filter and back is not needed.
By default, only the essentials - the type and the operation - of
a statement will be collected and only if fields, tables, functions
and databases are explicitly asked for, will they be collected.
However, a statement will be parsed at most twice; if parsing is
needed a second time then all information will be collected.
If it is known that some particular information is needed, then
qc_parse() can be called explicitly to ensure it is collected
at first parsing.
It is now possible to specify what information the caller is interested
in. With this, the cost of collecting information during query parsing
that nobody is interested in can be avoided.
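A sketch of the intended use, with assumed flag names and signatures
(the actual query classifier API may differ):

#include <stdint.h>

typedef struct gwbuf GWBUF; /* Opaque statement buffer. */

/* Assumed collection flags. */
#define QC_COLLECT_ESSENTIALS 0x00 /* Type and operation only. */
#define QC_COLLECT_TABLES     0x01
#define QC_COLLECT_FIELDS     0x02
#define QC_COLLECT_ALL        0xff

/* Assumed signature; parses and caches the requested information. */
void qc_parse(GWBUF *stmt, uint32_t collect);

/* A caller that knows it will need the tables asks for them up
 * front, so the statement need not be parsed a second time. */
static void classify(GWBUF *stmt)
{
    qc_parse(stmt, QC_COLLECT_ESSENTIALS | QC_COLLECT_TABLES);
}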
- Non-GCC intrinsics alternative implementation removed. Let's worry
about the absence of the intrinsics once/if that becomes relevant.
- Spinlock release now performed using __sync_lock_release, as per
svoj's advice (see the sketch after this list).
- While-looping on the variable used as the lock removed, so it no
longer needs to be volatile.
- Boolean function returns bool.
- Size of profiling counters increased.
- Risk of division-by-zero removed.
- Documentation moved from implementation to header.
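A minimal sketch of the resulting acquire/release pair (the real
implementation also maintains the profiling counters):

typedef struct spinlock
{
    int lock; /* 0 = free, 1 = taken; need not be volatile. */
} SPINLOCK;

static void spinlock_acquire(SPINLOCK *l)
{
    /* Atomically set the lock to 1 and spin while the previous
     * value was 1; the intrinsic is an acquire barrier, so no
     * plain while-looping on the variable is needed. */
    while (__sync_lock_test_and_set(&l->lock, 1))
    {
    }
}

static void spinlock_release(SPINLOCK *l)
{
    /* Release barrier plus store of 0, as per svoj's advice. */
    __sync_lock_release(&l->lock);
}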
If a server points to a local MaxScale listener, the permission checks for
that server are skipped. This allows permission checks to be used with a
mix of external servers and internal services.
The static module capabilities are now used to query the capabilities of
filters and routers. The new RCAP_TYPE_NOAUTH capability is also taken
into use. These changes remove the need for the `is_internal_service`
function.
The static capabilities declared in getCapabilities allow certain
capabilities to be queried before instances are created. The intended use
of this capability is to remove the need for the `is_internal_service`
function.