A Worker::Task is an object that can be sent to a worker for
execution. The task is sent to the worker using the messaging
mechanism where the `execute` function of the task will be
called in the thread context of the worker.
There are two kinds of tasks; regular tasks and disposable tasks.
The former are just sent to the worker for execution while the
latter are sent and subsequently disposed of, once the task has
been executed.
A disposable task can be sent to either one worker or to all
workers. In the latter case, the task will be deleted once it
has been executed by all workers.
A semaphore can be associated with a regular task. Once the task
has been executed by the worker, the semaphore will automatically
be posted. That way, it is trivial to send a task for execution
to a worker and wait until the task has been executed. For instance:
Semaphore sem;
MyTask task;
pWorker->execute(&task, &sem);
sem.wait();
const MyResult& result = task.result();
The low level mechanism for posting and broadcasting messages will
be removed.
All workers share an epoll instance that is added level-triggered
to the epoll instance of each Worker. This is intended to be used
together with listening sockets.
When a listening socket is added to the shared epoll instance the
effect is that EPOLLIN will be active for it whenever there is a
connection pending on a listening socket added to that epoll
instance.
When that occurs all workers in their epoll_wait()-calls will return.
When the workers subsequently call epoll_wait() on the shared epoll
instance, that will return with an event provided some other thread(s)
has not yet called accept() on the listening socket.
As each worker extracts just one event at a time and calls accept just
once before calling epoll_wait(), it means that the client connections
will be distributed evenly across all workers, provided the load on
the workers is roughly the same. If it isn't then a worker with less
load will get more connections to handle (which will even out the load).
All debug messages from dcb.cc were prefixed with the pthread ID of the
current thread. If the thread ID is needed, it should be logged by the log
manager.
The functions used to track the resultset EOF packets now expose the
position of the end of the result set. This allows the modules that use
them to check if more results exist in the same buffer.
Added the status bits for OK and EOF packets to the mysql.h protocol
header. This can be used to check for various state changes that happen in
the session. Currently the status bits are only used to detect if more
results are expected.
Now the statistics is in a single structure and the property of the
Worker instance in question. Methods are provided for obtaining the
statistics of all workers in one go.
The existing load calculation does not fit the 2.0 thread approach
that well. So it is removed entirely now, to be replaced with some
new approach later.
Showing dcb addresses, the number of fds and the events is
meaningless as the information is completely transient and
is likely to have changed the moment is was displayed.
Just like the thread stats and poll stats earlier, the queue stats
are now moved to worker.
A litte refactoring still, and the polling will only work on local
data.
Each worker now has a separate structure for collecting the
polling statistics that is passed to epoll_waitevents(). When
the stats are asked for, we loop over all separate stats and
combine them. So, instead of having every statistics of each
thread one cacheline apart, each thread has all its statistics
in one lump that, for obvious reasons, are going to be apart.
The primary purpose of this excersize is to remove the hardwired
nature of the statistics collection. For instance, the admin
thread will be doing I/O but that I/O should not be included
in the statistics of the workers.
The Worker no longer creates a pipe and implements the cross
worker/thread message mechanism itself. Instead it has a
MessageQueue instance variable for that purpose.
MessageQueue encapsulates a message queue built on top of a
pipe. The message queue needs a handler for receiving messages
and must be added to a worker for pumping messages through the
pipe.
Each Worker will have an instance of MessageQueue.
Now the epoll instance of the Worker is used when polling. The
work is still done in poll.cc and the worker provides the descriptor
and thread id.
The comments of poll_waitevents() have been removed; they were not
accurate anymore so better to let the code speak for itself.
And add the Worker header...
The epoll instance is not used yet, but the common creation of epoll
instances in poll.cc will be removed and the epoll instances of each
worker used instead.
This is the first step in turning the worker mechanism and everything
around it into a set of C++ classes. In this change, the original C
API is still present, but in subsequent changes that will be removed.
This is not globally safe yet, but all other access is directly or
indirectly related to maxadmin, which is irrelevant as far as
performance testing is concerned.
Now possible to send a function and arguments to a specific worker
thread for execution.
In particular, this will be used for transferring the injection of
fake hangup events into DCBs, related to a particular server, from
the monitor thread to the worker threads, thus removing the need
for locks.
The shutdown is now performed so that a shutdown message is
sent to all workers. When the workers receive that message, they
turn on a shutdown flag, which subsequently is checked in the poll
loop.
MXS_WORKER is an abstraction of a worker aka worker thread.
It has a pipe whose read descriptor is added to the worker/thread
specific poll set and a write descriptor used for sending messages
to the worker.
The worker exposes a function mxs_worker_post_message using which
messages can be sent to the worker. These messages can be sent from
any thread but will be delivered on the thread dedicated for the
worker.
To illustrate how it works, maxadmin has been provided with a new
command "ping workers" that sends a message to every worker, which
then logs a message to the log.
Additional refactoring are needed, since there currently are overlaps
and undesirable interactions between the poll mechanism, the thread
mechanism and the worker mechanism.
This is visible currently, for instance, by it not being possible to
shut down MaxScale. The reason is that the workers should be shut down
first, then the poll mechanism and finally the threads. The shutdown
need to be arranged so that a shutdown message is sent to the workers
who then cause the polling loop to exit, which will cause the threads
to exit.
That can be arranged cleanly by making poll_waitevents() a "method"
of the worker, which implies that the poll set becomes a "member
variable" of the worker.
To be continued.
The whole worker thread mechanism assumes EPOLLET and non-blocking
descriptors, so that should be the default.
TODO: In debug mode, check that the provided file descriptor indeed
is non-blocking.