The duplicate event error message now logs the length of the slave's
write queue. This will tell how much data is still buffered inside MaxScale
when duplicate events are detected.
If a duplicate event is detected the state of the slave is set
to BLRS_ERRORED and the connection is closed. That way the
duplicate event will not break the slave, and it will pick
up its state when it reconnects.
When an event is sent to a slave, we store information about the
event and who sent it, so that we can detect if the same event is
sent twice. If a duplicate event is detected, we log information
about it.
If transaction safety was disabled and a large event sent in multiple SQL
packets was received, the distribution of that event to the slaves would fail.
The empty packet sent after a large event which fits into exactly one packet
was written to disk and the writing of no bytes caused it to be treated as
an error.
The router->last_written is used to store the position where the last event was
written. The replication header is also stored in a separate structure in
the router which is used later when the last packet of a multi-packet event
arrives.
In blr_slave_callback the bits of slave->cstate are reset and
set as one transaction. Earlier they were reset in one and
set in another, leading to a situation where slave->cstate did
not contain a sensible value for a short period of time.
Further, it is now explicitly checked in blr_distribute_binlog_record
that slave->cstate indeed contains a meaningful value.
The unsafe slave position is no longer an error and will be treated the
same way if no events are available i.e. the slaves are no longer disconnected.
The log messages now have more information such as the current committed
transaction event being processed and the number of events sent by the
current thread.
Changed burst_size to long instead of unsigned long.
This way check burst_size > 0 is now effective.
Setting "burstsize" option in router_options may be required.
i.e.: burstsize=10M
Slave request for a log_pos behind binlog file size may result in a
disconnection or replication error:
if binlog file is latest one slave get disconnected otherwise an error
message is returned and replication stops
The binlog file is now always opened when it is needed and closed
when we are finished with it. That will remove any potential
file concurrency issues between different threads dealing with
the same slave.
In blr_open_binlog the refcnt increase of file which is already
open is protected by router->fileslock. In blr_close_binlog the
decrease of the refcnt was protected by file->lock.
This lead to a situation where it was possible that a file was
closed and the file instance freed, even though it just had been
taken into use by somebody else.
This is now fixed by solely using the router->fileslock for protecting
the increase and decrease of the refcnt.
In blr_slave.c under certain conditions, two locks were not released.
That was fixed in another change, and with this change a notice will be
logged if that branch is entered. That way it will be possible to find
out whether this may have been the cause of earlier lock-ups.