The shutdown signal handlers were installed before the workers were
initialized and weren't removed before the workers were deleted. This
would lead to a debug assertion and an eventual crash when a SIGTERM
signal was received outside of the expected scope.
The proper way to do this is to install the handlers only after the system
is up and running and to disable them as soon as the shutdown process
starts.
This mostly happened with the mxs621_unreadable_cnf test as it seemed to
receive a SIGTERM during the execution of the at-exit handlers.
As long as the same thread never handles more than one fatal signal,
multiple fatal signals can be processed. This should guarantee that the
stacktrace is printed into the log while guaranteeing that recursion never
takes place if the handling of a fatal signal causes a fatal signal to be
emitted.
By stopping the REST API before the workers and moving the shutdown to the
same worker that handles REST API requests, we prevent the hang on
shutdown. This also makes the signal handler signal-safe.
Exclude systemd usage if the library is not installed.
Only excluding what is necessary. This keeps the object size the
same and still compiles most of the code.
Systemd wathdog notification at a little more than 2/3 of the
systemd configured time. In the service config (maxscale.service)
add e.g. WatchdogSec=30s to set and enable the watchdog.
For building: install libsystemd-dev.
The next commit will modify cmake configuration and code to
conditionally compile the new code based on existence of libsystemd-dev.
If the log file is successfully opened, both stdout and stderr are
redirected to it. This helps catch ASAN reports without having to read the
system journal files.
As the output is redirected to a file, some of the output was made visible
only in non-daemon mode. This helps keep the log file clean and readable.
See script directory for method. The script to run in the top level
MaxScale directory is called maxscale-uncrustify.sh, which uses
another script, list-src, from the same directory (so you need to set
your PATH). The uncrustify version was 0.66.
Given that worker.hh was public, it made sense to make routingworker.hh
public as well. This removes the need to include private headers in
modules and allows C++ constructs to be used in C++ code when previously
only the C API was available.
The maxscale_is_shutting_down function is used to detect when MaxScale
should stop. This fixes a race condition in the code where the workers has
not yet been initialized but a termination signal has been received. It
also replaces the misuse of the service_should_stop variable with a proper
function.
The print_log_n_stderr function used to implicitly initialize the log
manager. As this is no longer done, no logging must be done before the log
manager is initialized.
The executable name of the PID is now checked to be maxscale. This fixes
the problem where MaxScale would refuse to start if a stale PID file had a
PID of a process that's not a MaxScale process.
The atexit functions should be registered after there is a need for
them. This fixes the problem where cleanup functions were called before
the initialization for them was done. Also removed the header and footer
printint to stdout.
Now that the exit handlers work correctly, the output of `maxscale
--version` no longer consists of a single line but multiple lines in both
stdout and stderr. Removing the printing to stderr guarantees that the
correct output is produced.
Removed skygw_utils and relate files along with the old log manager
code. Also removed file flushing due to it being redundant; messages are
written to the file immediately. Adjusted tests to accommodate this
change.
The feature was rarely used and was only useful in extremely rare
cases. The functionality can still be emulated, if for some reason needed,
by pointing `logdir` to `/dev/shm` or another tmpfs mount.
The custom random number generator can now be replaced with a C++11
RNG. This greatly improves the reliability and trustworthiness of it.
In addition to this, the conversion of the RNG to a thread-local object
removes the race condition that was present in the previous
implementation. It also theoretically improves performance by a tiny bit.