When a DCB error occurs, the handleError entry point of the routers is
called. The caller of this entry point expects that the error handler
marks the DCB as handled. The aforementioned behavior is wrong as the
error handler should not keep track of whether the handler was already
called.
Closing the DCB and the backend reference that uses it at the same time
makes the error handling code clearer and removes some of the assumptions
that the code made. It will cause the DCB to be closed in multiple places
but the logic of why a DCB is being closed is more visible from the code.
This change should remove all cases where a DCB is closed without a
tightly coupled backend reference.
If the `error_on_write` mode is used when a master loses the master state,
the backend would not get closed. This would allow masters that appear
back to be used which is not intended.
Added a check for the validity of the backend DCBs before they are
closed. This should guarantee that only valid DCBs are closed by
readwritesplit.
However, this is not the correct solution for the problem. The DCB should
not be in an invalid state in the first place and this fix just removes
the bad side effects of the double closing.
With the added logging in the readwritesplit error handler, more detailed
information should become available.
If a DCB is passed to the error handler for which we cannot find the
corresponding backend reference, it should not be closed.
Added extra logging for situations where the backend reference can't be
found or it is in the wrong state.
MXS-1009. This commit adds a gwbuf_free after maxinfo_execute() to
free a buffer with an sql-query after it has been processed. Also,
the parse tree in maxinfo_execute_query() is now freed. The tree_free-
function was renamed to maxinfo_tree_free, since it is now globally
available.
This commit has additional changes (in relation to the 1.4.4 branch)
to remove errors caused by differences in the html and sql-sides of
MaxInfo.
This is not the optimal way to do error handling but it should solve all
problems that could rise from the multi-threaded model of MaxScale.
By taking a lock at the start of handleError, we'll be able to modify the
dcb error handling flag in a thread-safe manner. This should prevent
double error handling for all DCBs.
JSON does not have a concept of streams and a common way to stream JSON is
to separate each JSON object with a newline. Adding a newline makes it
easier to parse as JSON values do not natively contain newlines.
The prepared statements were router according to the real type instead of
being router to the master. This was caused by the change in the route
target function.
This commit adds a free() to null_auth_free_client_data, which plugs
the memory leak in maxinfo.
Also, this commit fixes some segfaults when multiple threads are
running status_row() or variable_row(). The functions use
statically allocated index variables, which often go out-of-bounds
in concurrent use. This fix changes the indexes to thread-specific
variables, with allocating and deallocating. This does seem to slow
the functions down somewhat.
With the use_sql_variables_in=master option, readwritesplit should route
all user variable modifications and reads with user variables to the
master.
Previously, the modification of user variables was grouped into generic
system variables which caused all modifications to system variables to go
to the master only. The router requires a finer grained distiction between
normal system variable modifications and user variable modifications.
With the improvements to the query classifier, readwritesplit now properly
routes all user variable operations to the master and other system
variable modifications to all servers.
When a connection to the master fails, readwritesplit should always treat
it the same way. Previously, if a connection to the master was lost but it
hadn't lost the master status, the failure would be treated like a slave
server failure.
The backend references now use a common closing function so that all
variables are reset to proper states. The stored queries weren't always
freed and they would leak memory if left open.
The master reference used by the readwritesplit sessions needs to be
reassigned if slave reconnection occurs. This happens because the
reference refers to a certain place in the backend reference array
instead of the actual backend reference and those places are mixed
when the array is sorted.
The error logging is now more detailed and tells why the connection is
being closed. This should help the user figure out what is happening when
write fails and the connection is closed.
Some of the master server status checks didn't check whether the server
was actually running. The macros in server.h should always be used instead
of manually inspecting the server status.
If a readwritesplit session is active, it should never connect to a new
master. This will lead to unexpected results as the session states aren't
consistent.
If an illegal DCB close is done with a backend DCB, it will log the server
where it was connected. This allows us to know whether the DCB was
connected to a master or a slave.
Added more debug assertions to readwritesplit code. The DCBs should never
enter the DCB_STATE_DISCONNECTED.
Removed useless debug log messages. The messages usually just flood the
logs with no use to the developers.
A debug assertion failed due to a NULL buffer but a non-zero packet
length. This was caused by a missing reset of the packet length after
freeing the buffers.
Some error messages were logged at INFO level and some had conditions that
prevent the logging. Removed these restrictions that an error situation is
always logged.
If a master_failure_mode was set to error_on_write, a reconnection to the
old master would happen after the following events:
- Master server fails and the connection is closed
- The master server recovers
- A slave fails and the connection is closed
- A replacement for the slave is searched
If these events took place, the master would be taken back into use with
an inconsistent session state.
Binlog server is already configured: if there is no pending transaction
a new binlog file is created after CHANGE MASTER.
If as START SLAVE is issued replication starts as usuale.
If maxscale is restarted the replication starts using the new created
file.
While configuring binlog server for the first time, master.ini not
existent, the specified MASTER_LOG_FILE is created in the $binlogdir.
If START SLAVE command is not issued the replication can start after
restarting maxscale as the binlog file exists.
When checksum is in use and there is an error in replication stream
master connection the blr_terminate_master_replication has no effect.
MXS-961: The checksum detection calls
blr_master_delayed_connect(router); and connection is scheduled again.
The fix will break the main loop as soon as the error indicator byte is
seen and no other computation will be done (such as checksum)
When the readwritesplit can't locate the master server when it's checking
the list of available servers, it logs an error if the original master
reference isn't in a valid state. This error should only be logged if the
server is in use but in an unexpected state.
The backend reference states should be cleared when a reconnection attempt
is made. Should the creation of a new DCB succeed, the backend should no
longer be closed.
When a backend causes an error and it should be sent to the client, the
backend reference was closed but the waiting results state was not
cleared. This caused a debug assertion to be hit.
The active operation counters are now closed every time a backend referece
is taken out of use. This should fix a few debug assertions that were hit
in tests.
When a client executes commands which do not return results (for example
inserting BLOB data via the C API), readwritesplit expects a result for
each sent packet. This is a somewhat of a false assumption but it clears
itself out when the session is closed normally. If the session is closed
due to an error, the counter is not decremented.
Each sesssion should only increase the number of active operation on a
server by one operation. By checking that the session is not already
executing an operation before incrementing the active operation count the
runtime operation count will be correct.
It's now possible to use both a Unix domain socket and host/port
when connecting with MaxAdmin to MaxScale.
By default MaxAdmin will attempt to use the default Unix domain
socket, but if host and/or port has been specified, then an inet
socket will be used.
maxscaled will authenticate the connection attempt differently
depending on whether a Unix domain socket is used or not. If
a Unix domain socket is used, then the Linux user id will be
used for the authorization, otherwise the 1.4.3 username/password
handshake will be performed.
adminusers has now been extended so that there is one set of
functions for local users (connecting locally over a Unix socket)
and one set of functions for remote users (connecting locally
or remotely over an Inet socket).
The local users are stored in the new .../maxscale-users and the
remote users in .../passwd. That is, the old users of a 1.4
installation will work as such in 2.0.
One difference is that there will be *no* default remote user.
That is, remote users will always have to be added manually using
a local user.
The implementation is shared; the local and remote alternatives
use common functions to which the hashtable and filename to be
used are forwarded.
The commands "[add|remove] user" behave now exactly like they did
in 1.4.3, and also all existing users work out of the box.
In addition there is now the commands "[enable|disable] account"
using which Linux accounts can be enabled for MaxAdmin usage.