The connections that relate to a particular session are now a part of the
sessions resource. Currently, only the generic information is stored for
each connection (id and server name).
By storing a link to the backend DCBs in the session object itself, we can
reach all related objects from the session. This removes the need to
iterate over all DCBs to find the set of related DCBs.
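In rough terms the intention is something like the following sketch; the
type and member names are hypothetical, not the actual MaxScale ones:

    // Hypothetical sketch; the real names and types differ.
    #include <cstdint>
    #include <string>
    #include <vector>

    struct BackendDCB
    {
        uint64_t    id;           // the generic information currently
        std::string server_name;  // exposed per connection
    };

    struct Session
    {
        // Direct links to the backend connections of this session. With
        // these in place, finding the DCBs of a session is a lookup on
        // the session object instead of a scan over all DCBs.
        std::vector<BackendDCB*> backend_dcbs;
    };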
DCBs will only be used in conjunction with RoutingWorkers, and hence
RoutingWorker, not just Worker, must be used when looking up the
current worker and worker id.
The reason is that RoutingWorker cheats; the current worker id is
set to 0 at initialization time, which makes it look as if a worker
were running even though it isn't.
Since listeners are created before any worker has been started, that
arrangement ensures that listening DCBs are book-kept by the worker
running in the main thread.
Now, that is a kludge. It ought to be changed so that a, say,
MainRoutingWorker class is introduced that in its run function
initializes MaxScale and then continues running like any regular
RoutingWorker. In due time.
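The effect of the current cheat can be illustrated with a small sketch;
the names and the exact mechanism are hypothetical, only the behaviour
is the point:

    // Hypothetical sketch of the "cheat"; not the actual interfaces.
    #include <cassert>

    struct Worker
    {
        // Id of the worker running in the calling thread, -1 if none.
        static int get_current_id() { return s_current_id; }
        static thread_local int s_current_id;
    };
    thread_local int Worker::s_current_id = -1;

    struct RoutingWorker : Worker
    {
        // The current worker id defaults to 0, so code running in the
        // main thread before any worker has been started behaves as if
        // it were running in routing worker 0.
        static int get_current_id()
        {
            int id = Worker::get_current_id();
            return id == -1 ? 0 : id;
        }
    };

    int main()
    {
        // At startup, before any worker thread exists:
        assert(Worker::get_current_id() == -1);        // no worker running
        assert(RoutingWorker::get_current_id() == 0);  // listeners end up
                                                       // book-kept in worker 0
    }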
- If a client DCB is to be moved to a worker other than
  the current one (cli and maxinfo), and that fails, the
  thread id must be reset to that of the calling thread,
  as otherwise asserts will be triggered.
- If the creation of the first DCB fails, then the dcb list
for that thread will be NULL and thus must be accessed
with some caution.
When DCBs are hung up in dcb_hangup_foreach, the hangup event can be
processed directly. This prevents excessive use of the worker message
queue pipe, thus reducing the possibility of it becoming full.
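A rough sketch of the dispatch decision, with stand-in names and stubs
for the worker helpers:

    // Hypothetical sketch; the real MaxScale code is structured differently.
    #include <functional>
    #include <iostream>

    struct DCB
    {
        int owner_worker_id;
    };

    // Stubs standing in for the worker infrastructure.
    static int current_worker_id() { return 0; }
    static void deliver_hangup(DCB*) { std::cout << "hangup handled directly\n"; }
    static void post_to_worker(int worker_id, std::function<void()>)
    {
        // In reality the function would be queued and executed later on
        // the owning worker; here we only note that the queue was used.
        std::cout << "hangup posted to worker " << worker_id << "\n";
    }

    static void hangup_one(DCB* dcb)
    {
        if (dcb->owner_worker_id == current_worker_id())
        {
            // Owned by the current worker: handle the hangup right away
            // and keep the message queue pipe out of it entirely.
            deliver_hangup(dcb);
        }
        else
        {
            // Owned by another worker: it still has to go via that
            // worker's message queue.
            post_to_worker(dcb->owner_worker_id, [dcb] { deliver_hangup(dcb); });
        }
    }

    int main()
    {
        DCB local{0};   // owned by the current worker
        DCB remote{3};  // owned by some other worker
        hangup_one(&local);
        hangup_one(&remote);
    }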
A client dcb was immediately added to the epoll instance of the
relevant worker (possible, since that is thread-safe), but was added
to the book-keeping via the message mechanism (necessary, since that
is not thread-safe). Consequently, if the connection was closed before
the message was delivered, handling the message caused an access error.
Now the fd is also added to the epoll-instance via the messaging
mechanism, so the problem can no longer occur. The only fds this
affects are connections made to maxadmin or maxinfo as they are
always handled by the main thread due to deadlock issues.
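In outline the fix looks something like this; the helper names are
made up for the sketch:

    // Hypothetical sketch of the ordering fix; not the actual MaxScale code.
    #include <sys/epoll.h>
    #include <functional>

    struct DCB
    {
        int fd;
    };

    // Assumed helper: runs the function later, on the worker that owns
    // both the epoll instance and the DCB book-keeping.
    void post_to_worker(int worker_id, std::function<void()> work);

    // Not thread-safe, so it must run on the owning worker.
    void bookkeep(DCB* dcb);

    void add_client_dcb(int worker_id, int epoll_fd, DCB* dcb)
    {
        // Previously epoll_ctl() was called here, directly from the
        // accepting thread, and only bookkeep() was posted as a message.
        // The connection could then be closed before the message arrived.
        //
        // Now both steps run in the message handler on the owning worker,
        // so the book-keeping can never operate on an already closed DCB.
        post_to_worker(worker_id, [epoll_fd, dcb]() {
            epoll_event ev {};
            ev.events = EPOLLIN;
            ev.data.ptr = dcb;
            epoll_ctl(epoll_fd, EPOLL_CTL_ADD, dcb->fd, &ev);
            bookkeep(dcb);
        });
    }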
The same problem that caused maxadmin to lock up was also what caused
maxinfo to lock up. The concurrent access to the legacy administrative
functions caused deadlocks.
With the changes to the DCB handling, the service pointer of a client DCB
must always be assigned.
Also removed the unnecessary parentheses around the comparison.
Worker is now the base class of all workers. It has a message
queue and can be run in a thread of its own, or in the calling
thread. Worker cannot be used as such; a concrete worker
class must be derived from it. Currently there is only one
concrete class, RoutingWorker.
There is some overlap in functionality between Worker and
RoutingWorker, as there is e.g. a need for broadcasting a
message to all routing workers, but not to other workers.
Currently other workers cannot be created, as the array
holding the pointers to the workers is exactly as large as
the number of RoutingWorkers there will be. That will be changed
so that the maximum number of threads is hardwired to some
ridiculous value such as 128. That's the first step on the path
towards a situation where the number of worker threads can be
changed at runtime.
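In outline the arrangement is roughly the following; this is a
simplified sketch, not the actual interfaces:

    // Simplified sketch of the Worker/RoutingWorker split.
    #include <condition_variable>
    #include <functional>
    #include <mutex>
    #include <queue>
    #include <thread>
    #include <utility>

    class Worker
    {
    public:
        virtual ~Worker()
        {
            stop();
        }

        // Post a message; it is handled in this worker's message loop.
        void post(std::function<void()> msg)
        {
            {
                std::lock_guard<std::mutex> guard(m_lock);
                m_messages.push(std::move(msg));
            }
            m_signal.notify_one();
        }

        // Run the message loop in the calling thread ...
        void run()
        {
            while (true)
            {
                std::function<void()> msg;
                {
                    std::unique_lock<std::mutex> guard(m_lock);
                    m_signal.wait(guard, [this] {
                        return !m_messages.empty() || !m_running;
                    });
                    if (m_messages.empty())
                    {
                        break;  // stopped and drained
                    }
                    msg = std::move(m_messages.front());
                    m_messages.pop();
                }
                msg();
            }
        }

        // ... or in a thread of its own.
        void start()
        {
            m_thread = std::thread(&Worker::run, this);
        }

        void stop()
        {
            {
                std::lock_guard<std::mutex> guard(m_lock);
                m_running = false;
            }
            m_signal.notify_one();
            if (m_thread.joinable())
            {
                m_thread.join();
            }
        }

    protected:
        Worker() = default;  // usable only through a concrete subclass

    private:
        std::mutex m_lock;
        std::condition_variable m_signal;
        std::queue<std::function<void()>> m_messages;
        bool m_running = true;
        std::thread m_thread;
    };

    // The only concrete worker so far; the routing-specific parts (epoll
    // loop, DCB book-keeping, broadcasting to all routing workers, ...)
    // would live here.
    class RoutingWorker : public Worker
    {
    };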
A new class mxs::Worker will be introduced and mxs::RoutingWorker
will be derived from it. mxs::Worker will basically just be a
thread with a message loop.
Once available, all current non-worker threads (except the one
implicitly created by microhttpd) can be created by inheriting
from it; in practice that means the housekeeping thread, all
monitor threads and possibly the logging thread.
The benefit of this arrangement is that there will then be a general
mechanism for cross-thread communication without having to use any
shared data structures.
If maxadmin connections are handled by different workers, then
there may be a deadlock if some maxadmin command requires
communication with all workers.
Namely, in that case a message will be sent to all workers but the
current one, and that message will not be handled if the receiving
worker is at that point waiting on the debugcmd_lock spinlock
in debugcmd.c:execute_cmd().
We can prevent that deadlock from happening simply by ensuring
that all maxadmin connections are handled by one thread.
The old hkheartbeat variable was changed to the mxs_clock() function,
which simply wraps an atomic load of the variable. This allows it to be
correctly read by MaxScale, as well as opening up the possibility of
converting the value load to a relaxed memory order read.
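In essence the accessor is just an atomic load; a minimal sketch,
assuming the tick variable roughly as described:

    // Minimal sketch; the variable name and type are assumptions.
    #include <atomic>
    #include <cstdint>

    namespace
    {
    std::atomic<int64_t> this_unit_clock {0};  // ticked by the housekeeper
    }

    int64_t mxs_clock()
    {
        // A plain (sequentially consistent) load for now; since readers
        // only need an approximately current tick, this could later be
        // turned into a std::memory_order_relaxed load.
        return this_unit_clock.load();
    }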
Renamed the header and associated macros. Removed inclusion of the
heartbeat header from the housekeeper header and added it to the files
that were missing it.
We need to copy some data from an AF_UNIX based listener dcb
to the accepted client dcb, to prevent an assertion violation in
dcb_get_port(). Further, to be able to log the path in the case
of an authentication error, we need to copy that as well.
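Schematically the copy amounts to the following; the field names are
placeholders, not the real DCB members:

    // Hypothetical sketch; the real DCB fields are named differently.
    #include <string>

    struct DCB
    {
        std::string remote;  // peer address; for AF_UNIX, the socket path
        std::string path;    // path the AF_UNIX listener is bound to
    };

    void setup_unix_client(const DCB& listener_dcb, DCB& client_dcb)
    {
        // Without these copies dcb_get_port() would trip an assertion for
        // an AF_UNIX client, and an authentication error could not report
        // which socket path was used.
        client_dcb.remote = listener_dcb.remote;
        client_dcb.path   = listener_dcb.path;
    }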
When a double close is detected in a debug build, a debug assertion is
triggered. This will generate a core dump which should help investigate
the double close.
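The idea in miniature (hypothetical names; MaxScale uses its own
assertion macro and DCB state):

    // Sketch only.
    #include <cassert>

    struct DCB
    {
        bool closed = false;
    };

    void dcb_close_sketch(DCB* dcb)
    {
        // Debug build: abort and dump core so the double close can be
        // investigated. Release build: the assert compiles away.
        assert(!dcb->closed);

        if (!dcb->closed)
        {
            dcb->closed = true;
            // ... release the resources held by the DCB ...
        }
    }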
Since there is a concept called "listener", it is confusing that the
dcb "this" argument in some dcb functions is called 'listener' instead
of 'dcb', as it is called everywhere else.
The internal header directory conflicted with in-source builds, causing a
build failure. This is fixed by renaming the internal header directory to
something other than maxscale.
The renaming pointed out a few problems in a couple of source files that
appeared to include internal headers when the headers were in fact public
headers.
Fixed maxctrl in-source builds by making the copying of the sources
optional.
Only used in conjunction with queued connections, which are not
enabled anyway. Once that comes on the table again, better to use
some standard data structures.
Apart from listeners, all DCBs will be assigned to the current
thread. This simplifies the addition of DCBs to worker threads.
Also performed a small cleanup of poll_add_dcb to make it more readable.
Now it is also possible to ensure that a DCB stays alive while
a task referring to it is posted from one worker to another.
That will be implemented in a subsequent commit.
The polling mechanism can now optionally be used for managing
the lifetime of an object placed into the poll set.
If an MXS_POLL_DATA has a non-null 'free', then the reference count
of the data will be increased before calling the handler and
decreased after. In that case, if the reference count reaches 0,
the free function will be called.
Note that the reference counts of *all* MXS_POLL_DATAs returned
by 'epoll_wait' will be increased before the events are delivered
to the handlers individually for each MXS_POLL_DATA, and then once
all events have been delivered, the reference count of each
MXS_POLL_DATA will be decreased.
This ensures that if there are interdependencies between different
MXS_POLL_DATAs returned by one call to 'epoll_wait', using the
reference count for lifetime management avoids the case where an
MXS_POLL_DATA is deleted before its events have been delivered.
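Put together, the scheme looks roughly like this; the structures are
simplified stand-ins for MXS_POLL_DATA and the poll loop:

    // Simplified sketch of the reference counting; not the actual code.
    #include <sys/epoll.h>
    #include <atomic>
    #include <cstdint>

    struct PollData
    {
        void (*handler)(PollData* data, uint32_t events);
        // When non-null, the poll loop manages the lifetime of the object:
        // the count is adjusted around delivery and free() is called when
        // it drops to zero.
        void (*free)(PollData* data);
        std::atomic<int> refcount {1};
    };

    static void inc_ref(PollData* data)
    {
        if (data->free)
        {
            data->refcount.fetch_add(1, std::memory_order_relaxed);
        }
    }

    static void dec_ref(PollData* data)
    {
        if (data->free && data->refcount.fetch_sub(1, std::memory_order_acq_rel) == 1)
        {
            data->free(data);
        }
    }

    static void poll_once(int epoll_fd)
    {
        epoll_event events[1024];
        int n = epoll_wait(epoll_fd, events, 1024, -1);

        // Pin every returned object first, so handling the events of one
        // object cannot delete another whose events are still pending.
        for (int i = 0; i < n; ++i)
        {
            inc_ref(static_cast<PollData*>(events[i].data.ptr));
        }

        // Deliver the events; a reference is also held across each handler.
        for (int i = 0; i < n; ++i)
        {
            auto* data = static_cast<PollData*>(events[i].data.ptr);
            inc_ref(data);
            data->handler(data, events[i].events);
            dec_ref(data);
        }

        // Drop the batch references; an object whose count reaches zero is
        // freed only after all events of this round have been delivered.
        for (int i = 0; i < n; ++i)
        {
            dec_ref(static_cast<PollData*>(events[i].data.ptr));
        }
    }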
In subsequent commits, the reference count will be taken into use
in the lifetime management of DCBs.
When something fails inside dcb_connect, we rewind the situation
properly, without calling any of the close functions intended for
shutting down a properly created DCB. That way they can be simplified,
and once the reference counting is taken into use, it is sufficient to
call dcb_dec_ref(dcb) instead of dcb_free_all_memory().
If a fake event is added to the current dcb, we arrange things so
that it is delivered immediately after the handling of the event(s)
during which the fake event was added has completed.
Otherwise the event is delivered via the event loop.
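A stripped-down sketch of the mechanism, using made-up names:

    // Hypothetical sketch; the real code lives in the worker/poll handling.
    #include <cstdint>
    #include <deque>
    #include <utility>

    struct DCB;

    void deliver(DCB* dcb, uint32_t events);              // normal event handling
    void post_via_event_loop(DCB* dcb, uint32_t events);  // assumed helper

    // Per worker thread: the DCB whose events are currently being handled,
    // and the fake events generated while handling them.
    thread_local DCB* current_dcb = nullptr;
    thread_local std::deque<std::pair<DCB*, uint32_t>> pending_fake_events;

    void add_fake_event(DCB* dcb, uint32_t events)
    {
        if (dcb == current_dcb)
        {
            // Added to the DCB being processed: deliver it as soon as the
            // handling of the current event(s) has completed.
            pending_fake_events.emplace_back(dcb, events);
        }
        else
        {
            // Otherwise it takes the normal route via the event loop.
            post_via_event_loop(dcb, events);
        }
    }

    void handle_events(DCB* dcb, uint32_t events)
    {
        current_dcb = dcb;
        deliver(dcb, events);
        current_dcb = nullptr;

        // Drain the fake events that were added during the handling above.
        while (!pending_fake_events.empty())
        {
            auto fake = pending_fake_events.front();
            pending_fake_events.pop_front();
            deliver(fake.first, fake.second);
        }
    }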