Since there are only three possible categories, the map can be replaced
with a static array that needs no memory allocations. Making this array
thread-local allows it to be reused, which places an upper limit on the
number of memory allocations.
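A minimal sketch of the idea, with hypothetical names (Category,
category_value) standing in for the actual readwritesplit types:

    #include <array>
    #include <cstdint>

    // Hypothetical stand-in for the three real categories.
    enum Category : uint8_t {CAT_A, CAT_B, CAT_C, CAT_COUNT};

    int64_t& category_value(Category c)
    {
        // One array per thread, allocated statically and reused for
        // every lookup, so no per-query heap allocations are made.
        thread_local std::array<int64_t, CAT_COUNT> values{};
        return values[c];
    }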
The documentation stated that at most `max_sescmd_history` commands were
kept, but in reality the history held one command fewer than what was
documented.
This commit adds a new parameter that, when enabled, prunes the session
command history to a known length. This makes it possible to keep a
client-side pooled connection open indefinitely at the cost of making
reconnections theoretically unsafe. In practice, the maximum history
length can be set to a value that covers everything a single session does
over the pooled connection, with no risk to session state integrity. The
default history
length of 50 commands is quite likely to be adequate for the majority of
use-cases.
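A minimal sketch of the pruning, assuming a deque-like history container
(the real history stores parsed session commands, not plain strings):

    #include <cstddef>
    #include <deque>
    #include <string>

    void prune_history(std::deque<std::string>& history, std::size_t max_len)
    {
        // Discard the oldest commands once the configured limit is
        // exceeded so the history never grows without bound.
        while (history.size() > max_len)
        {
            history.pop_front();
        }
    }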
When the connection state is reset by executing a COM_CHANGE_USER or
COM_RESET_CONNECTION, readwritesplit does not need to store the session
command history that was executed before it. With this, pooled connections
will effectively behave like normal connections if the pooling mechanism
is smart enough to reset the connection. This also prevents unwanted
visibility into the session states of other connections.
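A sketch of the reset, assuming the MariaDB protocol command values; once
a command resets the connection state, the history gathered before it can
no longer affect the session:

    #include <cstdint>
    #include <deque>
    #include <string>

    void on_session_command(uint8_t command, std::deque<std::string>& history)
    {
        if (command == 0x11     // COM_CHANGE_USER
            || command == 0x1f) // COM_RESET_CONNECTION
        {
            history.clear();    // older commands are no longer relevant
        }
    }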
If the routing of a session command fails due to problems with the backend
connections, a more verbose error message is logged. The added status
information in the Backend class makes it much easier to track the
original cause of the problem, as it records where, when and why the
connection was closed.
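A hypothetical sketch of such status tracking (the names are
illustrative, not the actual Backend API):

    #include <cstdint>
    #include <string>

    class Backend
    {
    public:
        // Record where, when and why the connection was closed.
        void close(const std::string& reason, int64_t now)
        {
            m_open = false;
            m_close_reason = reason;
            m_closed_at = now;
        }

        // Later error messages can include the original cause.
        std::string status() const
        {
            return m_open ? std::string("open")
                          : "closed at " + std::to_string(m_closed_at)
                            + ": " + m_close_reason;
        }

    private:
        bool m_open = true;
        std::string m_close_reason;
        int64_t m_closed_at = 0;
    };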
If a server was not chosen as the target of a routing hint, the server
status would not be logged. Logging the server state in the message makes
it easier to figure out why a server wasn't chosen as the routing target.
Both the replication lag and the message printing state are saved in SERVER,
although the values are mostly used by readwritesplit. A log message is printed
both when a server goes over the limit and when it comes back below it.
Because of concurrency issues, a message may be printed multiple times before
different threads detect the new message state.
Documentation updated to explain the change.
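A sketch of the transition logging; the state flag is a plain bool rather
than an atomic, so two threads racing on the same transition may both log
it, which matches the caveat above:

    #include <cstdint>
    #include <cstdio>

    struct Server
    {
        int64_t rlag = 0;           // current replication lag in seconds
        bool over_limit = false;    // last reported state
    };

    void check_rlag(Server& srv, int64_t limit)
    {
        if (srv.rlag > limit && !srv.over_limit)
        {
            srv.over_limit = true;  // unsynchronized: duplicates possible
            std::printf("Server went over the replication lag limit\n");
        }
        else if (srv.rlag <= limit && srv.over_limit)
        {
            srv.over_limit = false;
            std::printf("Server is below the replication lag limit again\n");
        }
    }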
If the connection to the master is lost, knowing what type of error
caused the call to handleError helps deduce the real reason for it.
Logging the idle time of the connection helps detect when its
wait_timeout is exceeded.
By storing the server statistics object inside the session, the lookup
involved in getting a worker-local value is avoided. Since the lookup is
done multiple times for a single query, it is beneficial to store it in
the session.
As the worker-local value is never deleted, it is safe to store a
reference to it in the session. It is also never updated concurrently, so
no atomic operations are necessary.
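A sketch under those assumptions, with hypothetical names (ServerStats,
count_query):

    #include <cstdint>

    // Worker-local statistics object; never deleted and never updated
    // concurrently, per the assumptions stated above.
    struct ServerStats
    {
        int64_t n_queries = 0;
    };

    class Session
    {
    public:
        explicit Session(ServerStats& stats)
            : m_stats(stats)    // looked up once when the session starts
        {
        }

        void count_query()
        {
            ++m_stats.n_queries;    // no lookup and no atomics per query
        }

    private:
        ServerStats& m_stats;   // safe: the object outlives the session
    };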
The code now checks whether a keepalive ping is needed only once every
keepalive interval. Reduced the number of mxs_clock calls to one so that
all servers use the same value.
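A sketch of the check, with clock_ticks() standing in for mxs_clock() and
send_ping() for the real ping logic:

    #include <cstdint>
    #include <vector>

    struct BackendConn
    {
        int64_t last_ping = 0;  // tick of the previous keepalive ping
    };

    int64_t clock_ticks() { return 0; }     // hypothetical stand-in
    void send_ping(BackendConn&) {}         // hypothetical stand-in

    void check_keepalive(std::vector<BackendConn>& conns, int64_t interval)
    {
        int64_t now = clock_ticks();    // read once; shared by all servers

        for (auto& conn : conns)
        {
            // Only connections idle past the interval are pinged.
            if (now - conn.last_ping >= interval)
            {
                send_ping(conn);
                conn.last_ping = now;
            }
        }
    }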
The information stored for each prepared statement would not be cleared
until the end of the session. This is a problem if the sessions last for a
very long time, as the stored information is unused once a COM_STMT_CLOSE
has been received.
In addition to this, the session command response maps were not cleared
correctly if all backends had processed all session commands.
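A sketch of the cleanup for the first issue, using a hypothetical
metadata map keyed by the statement ID:

    #include <cstdint>
    #include <map>
    #include <string>

    void on_stmt_close(std::map<uint32_t, std::string>& ps_info, uint32_t ps_id)
    {
        // The stored information is dead weight once the statement is
        // closed, so erase it instead of keeping it for the session's
        // whole lifetime.
        ps_info.erase(ps_id);
    }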
By resetting the replay state, the transaction replay can start again on
a new server. This allows the replay process to work when a master server
is shutting down.
By delaying the replay for a second, we give the monitor a small chance
to adapt to master failures. It'll also prevent rapid re-querying if
multiple
transaction replays are supported.
A transaction that just completed will go through the start_trx_replay
function, as from the client protocol's point of view the transaction is
still open. The debug assertion did not take this into account and would
fail if a successful commit was the last thing executed on the failed
master.
Also fixed the formatting.
When a server is stopping, it'll send an error to the client before
terminating the TCP connection. The code in readwritesplit would detect
this error and create a hangup event on the DCB. This would cause it to
appear as if the TCP connection was broken and the router would
immediately try to reconnect to the same server.
By ignoring the error and allowing the connection to die on its own, we
avoid immediately reconnecting and retrying any transactions on the
stopping server. This increases the chances that the monitor will see it
first and assign the server states correctly before the transaction replay
is attempted.
The assertion would hold true for a single worker but it can't be
guaranteed to be true on a multi-worker system where the statistics are
distributed across the workers.
Enabling the feature by default prevents the master connection from dying
during periods when there are very few or no writes. Having a modest ping
interval of 300 seconds serves to minimize the amount of extra work that
both MaxScale and the server have to do while still keeping the
connections in good shape.
If the server where a query is being executed is shutting down,
readwritesplit should treat it as an error to make retrying of the query
possible.
By treating server shutdowns as network errors, the same code path that is
used for actual network errors can be taken. This removes the need for any
extra retrying logic for this particular case.
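A sketch of the mapping, assuming the standard MariaDB/MySQL error codes:

    #include <cstdint>

    bool is_network_error(uint16_t error_code)
    {
        switch (error_code)
        {
        case 1053:  // ER_SERVER_SHUTDOWN: treated like a network error
        case 1927:  // ER_CONNECTION_KILLED (MariaDB)
        case 2006:  // CR_SERVER_GONE_ERROR
        case 2013:  // CR_SERVER_LOST
            return true;

        default:
            return false;
        }
    }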
If the master succeeds in executing a session command but the slave fails,
the error message could help explain why it failed. At the moment this is
mainly relevant for inspection of test results.
The transaction replay could get mixed up with new queries if the client
managed to perform one while the delayed routing was taking place. A
proper way to solve this would be to cork the client DCB until the
transaction is fully replayed. As this change would be more complex than
simply labeling queries that are being retried, the corking
implementation is left for later, when a more complete solution can be
designed.
This commit also adds some of the missing info logging for the transaction
replaying which makes analysis of failures easier.
Servers with a zero weight would always be used over ones that have a
weight. This means that the behavior was inverted and caused the
mxs2054_hybrid_cluster test to fail in 2.3.
Also fixed a typo in the deprecation message.
Commit a9e236497963251f8b4afa07484b88ad97e73a03 changed where the PS ID
for a binary protocol command is replaced with the internal form. This
caused prepared statements that are also session commands to always be
routed with the external ID.
As the external ID is almost always the master's ID, the aforementioned
bug resulted in odd side-effects and the true cause of these was only
revealed when the error message sent by the slave was included in the log
messages.
If a PS command was routed multiple times, the ID would not be reverted
to the external ID in the failure cases. This prevented prepared
statements
from being re-routed correctly.
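A sketch of the translation with hypothetical helpers: the client-visible
(external) ID is swapped for the internal ID before routing and must be
restored on failure so that a later retry can translate it again:

    #include <cstdint>
    #include <map>

    struct PsCommand
    {
        uint32_t id;    // PS ID carried in the binary protocol packet
    };

    // Hypothetical stand-in for the actual routing call.
    bool route_to_backend(const PsCommand&) { return true; }

    bool route_ps(PsCommand& cmd, const std::map<uint32_t, uint32_t>& ext_to_int)
    {
        uint32_t external_id = cmd.id;
        cmd.id = ext_to_int.at(external_id);    // use the internal form

        if (!route_to_backend(cmd))
        {
            cmd.id = external_id;   // revert so re-routing works
            return false;
        }

        return true;
    }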
When the connection to the master is broken, the session is not
configured to use the read-only modes, and the monitor can still connect
to the server, the connection will be closed and an error is sent to the
client. To leave some trace of this problem in the MaxScale logs, a
message should always be logged when a network error occurs.
As the router is the only one that knows what backends a particular
statement has been sent to, it is the responsibility of the router
to keep the session bookkeeping up to date. If it doesn't, we will
know what statements a session has received (provided at least some
component in the routing chain has RCAP_TYPE_STMT_INPUT capability),
but not how long their processing took. Currently only readwritesplit
does that.
All queries are stored, not just COM_QUERY, as that makes the overall
bookkeeping simpler; at clientReply() time we do not need to know whether
or not to record the information, we can just do it.
When session information is queried for, we report as much information
as we have available.
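A sketch of the bookkeeping with hypothetical names: every statement is
recorded when it is routed and its duration is filled in at clientReply()
time, with no filtering by command type:

    #include <cstdint>
    #include <deque>
    #include <string>

    struct StmtInfo
    {
        std::string sql;
        int64_t started;
        int64_t duration;   // stays negative until the reply arrives
    };

    class SessionBookkeeping
    {
    public:
        void on_route_query(const std::string& sql, int64_t now)
        {
            m_stmts.push_back({sql, now, -1});
        }

        void on_client_reply(int64_t now)
        {
            // Since all queries are stored, the reply belongs to the
            // oldest statement that has no duration yet.
            for (auto& stmt : m_stmts)
            {
                if (stmt.duration < 0)
                {
                    stmt.duration = now - stmt.started;
                    break;
                }
            }
        }

    private:
        std::deque<StmtInfo> m_stmts;
    };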
The causal_reads_timeout default value is too long considering the
behavioral changes that MXS-2141 introduced. With a 10 second default
value, a result is returned to the client in a reasonable amount of time.
With causal_reads enabled, the query would return with an error if the
slave was not able to catch up to the master fast enough. By automatically
retrying the query on the master, we're guaranteed that a valid result is
always returned to the client.