4450 Commits

Author SHA1 Message Date
Markus Makela
0ce48474eb Added logging for safe event and current event mismatch
If the position being currently processed is not the current safe event,
a log message is written.
2016-04-05 16:57:39 +03:00
Markus Makela
ed9356562c Added DCB write queue to error message
The duplicate event error message now logs the length of the slave's
write queue. This will tell how much data is still buffered inside MaxScale
when duplicate events are detected.
2016-04-05 15:01:30 +03:00
Johan Wikman
0493196ecc Disconnect slave if duplicate event detected
If a duplicate event is detected the state of the slave is set
to BLRS_ERRORED and the connection is closed. That way the
duplicate event will not break the slave, and it will pick
up its state when it reconnects.
2016-03-31 10:57:23 +03:00
Johan Wikman
51e60000dd Add duplicate event detection & logging.
When an event is sent to a slave, we store information about the
event and who sent it, so that we can detect if the same event is
sent twice. If a duplicate event is detected, we log information
about it.
2016-03-31 10:50:47 +03:00
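The duplicate detection described in the commit above can be sketched roughly as follows. This is only an illustration: the `last_sent_t` structure and `check_and_record` function are hypothetical names, not taken from the MaxScale source, and the sketch leaves out the "who sent it" information the commit mentions.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical per-slave record of the last event that was sent. */
typedef struct {
    char     file[64];  /* binlog file name of the last event sent */
    uint32_t pos;       /* position of that event in the file */
    int      valid;     /* 0 until the first event has been recorded */
} last_sent_t;

/* Returns 1 if (file, pos) matches the previously recorded event,
 * otherwise records it and returns 0. */
static int check_and_record(last_sent_t *last, const char *file, uint32_t pos)
{
    if (last->valid && last->pos == pos && strcmp(last->file, file) == 0) {
        return 1;   /* same event about to be sent twice */
    }
    snprintf(last->file, sizeof(last->file), "%s", file);
    last->pos = pos;
    last->valid = 1;
    return 0;
}
```

A caller would log the duplicate when the function returns 1 instead of sending the event again.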
Johan Wikman
f551099af9 Reformat blr.h
By oversight, blr.h was not reformatted when the rest of the source was.
2016-03-17 17:06:23 +02:00
Johan Wikman
5070b81473 Reformat binlog router. 2016-03-15 12:56:41 +02:00
Markus Makela
59f5880898 Added missing OK byte to payload size calculation
The OK byte was not taken into account when the total size of all the payloads
in all the packets was calculated.
2016-02-17 08:30:50 +02:00
Markus Makela
63ce9fe6bc Fixed formatting and added more error checks
Added log messages when ftruncate fails and cleaned up formatting.
2016-02-16 13:06:25 +02:00
Markus Makela
cd2af6ffef Cleaned up the code based on the code review
Added missing error condition checks and cleaned up code.
2016-02-16 13:06:25 +02:00
Markus Makela
a55f017c75 Fixed packets with a length of one being ignored
The packets were not written into the binlogs which caused binlog corruption.
2016-02-16 13:06:25 +02:00
Markus Makela
9306b9d68c Added detection of checksums split across two packets
The checksums should now be processed properly even if the event is in more than
one packet.
2016-02-16 13:06:25 +02:00
Markus Makela
36896afcbd Fixed missing NULL check when reading records
If the binlog record was not found, a NULL pointer is returned. There was no
check for the return value and it assumed that it was always non-NULL.
2016-02-16 13:06:25 +02:00
Markus Makela
74c8b5e296 Fixed events larger than 2^24 failing without transaction safety
If transaction safety was disabled and a large event sent in multiple SQL
packets was received, the distribution of that event to the slaves would fail.
2016-02-16 13:06:25 +02:00
Markus Makela
2b7e2d3043 Added checksum calculations for events larger than 2^24 bytes
The checksums are now properly calculated for large events that span multiple
SQL packets.
2016-02-16 13:06:25 +02:00
Markus Makela
3609f97ba0 Fixed events which are exactly 0x00fffffe bytes long failing to replicate
The empty packet sent after a large event which fits into exactly one packet
was written to disk and the writing of no bytes caused it to be treated as
an error.
2016-02-16 13:06:25 +02:00
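The edge case in the commit above follows from MySQL packet framing: a payload of 0xffffff bytes signals that another packet follows, so data that fills a packet exactly is terminated by an empty packet. A minimal sketch of the arithmetic, assuming one OK byte precedes the event in the stream; the function names are hypothetical:

```c
#include <stddef.h>
#include <stdint.h>

/* Largest payload a single MySQL packet can carry (2^24 - 1 bytes). */
#define EX_MAX_PAYLOAD 0xffffffUL

/* Number of packets needed for one event, counting the leading OK byte. */
static size_t packets_for_event(uint64_t event_len)
{
    uint64_t total = event_len + 1;             /* event plus OK byte */
    return (size_t)(total / EX_MAX_PAYLOAD) + 1;
}

/* Payload length of the final packet; 0 means an empty trailing packet. */
static uint32_t last_packet_payload(uint64_t event_len)
{
    return (uint32_t)((event_len + 1) % EX_MAX_PAYLOAD);
}
```

For an event of 0x00fffffe bytes the total is exactly 0xffffff, so two packets are sent and the second one carries no payload.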
MassimilianoPinto
476691eda1 Removed log message for event larger than 16MB
The log message used during tests is now removed
2016-02-16 13:06:25 +02:00
Markus Makela
3e04a36ac3 Added support for distribution of packets larger than 2^24 bytes
Moved the sending of the replication events to a different function
and added support for events that span multiple MySQL packets.
2016-02-16 13:06:25 +02:00
Markus Makela
12ee568978 Fixed last_written being set to the size of the event
The addition used =+ instead of += which caused it to be an assignment.
2016-02-16 13:06:25 +02:00
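The difference between the two operators in the commit above can be shown with a small sketch. The struct is a hypothetical stand-in that models only the `last_written` field named in the commit message:

```c
#include <stdint.h>

/* Hypothetical stand-in for the router state. */
typedef struct {
    uint64_t last_written;
} ex_router_t;

/* Buggy form: "=+" parses as "= (+len)", so the position is overwritten. */
static void advance_buggy(ex_router_t *r, uint32_t len)
{
    r->last_written =+ len;
}

/* Fixed form: "+=" adds the event length to the current position. */
static void advance_fixed(ex_router_t *r, uint32_t len)
{
    r->last_written += len;
}
```

Starting from position 100 with a 19-byte event, the buggy form leaves `last_written` at 19 while the fixed form advances it to 119.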
Markus Makela
d2b4713d27 Added missing condition to else clause
This fixes all packets being considered as large packets.
2016-02-16 13:06:25 +02:00
Markus Makela
ae33df3cbc Large events are now processed in chunks
The router->last_written is used to store the position where the last event was
written. The replication header is also stored in a separate structure in
the router which is used later when the last packet of a multi-packet event
arrives.
2016-02-16 13:06:24 +02:00
MassimilianoPinto
d3e1d4dd2f First fix for 16MB handling in the master part
Distribution of events to up-to-date slaves is not included yet.
2016-02-16 13:06:24 +02:00
MassimilianoPinto
ab1fb90d86 Fix for 16MB transmission in the slave part
2016-02-16 13:06:24 +02:00
MassimilianoPinto
4eccc1acb6 When creating heartbeat packet too many bytes were copied.
The memory area that ‘ptr’ points to now contains the right data.
2016-02-11 17:06:13 +01:00
MassimilianoPinto
0ab9733393 The router->rotating is no longer part of Unsafe Pos check
In blr_read_binlog the router->rotating is no longer used for Unsafe
Pos check
2016-02-01 09:12:48 +01:00
Johan Wikman
d9b022db10 Protect updating of router when rotating.
When rotating, all state variables of router are now updated while
protected by the router->binlog_lock lock.
2016-01-28 15:23:22 +02:00
Johan Wikman
0deffbf2f2 Ensure that slave->cstate contains meaningful value.
In blr_slave_callback the bits of slave->cstate are reset and
set as one transaction. Earlier they were reset in one and
set in another, leading to a situation where slave->cstate did
not contain a sensible value for a short period of time.

Further, it is now explicitly checked in blr_distribute_binlog_record
that slave->cstate indeed contains a meaningful value.
2016-01-28 11:00:07 +02:00
MassimilianoPinto
1a4fc56c67 Unsafe Pos detection moved into blr_slave_catchup and removed router->rotating check
2016-01-25 12:27:57 +01:00
MassimilianoPinto
0a3f20f8af Variable moved
2016-01-11 09:59:22 +01:00
MassimilianoPinto
1b0c7d0d90 Force slave disconnection when requesting an unsafe pos with blr_slave_binlog_dump
2016-01-08 18:51:36 +01:00
MassimilianoPinto
2715d3f8e4 Removed the 16 chars limitation for binlog file name
2016-01-07 15:30:57 +01:00
Markus Makela
a5ccf09ac5 Unsafe position is no longer an error
The unsafe slave position is no longer an error and is treated the same way
as when no events are available, i.e. the slaves are no longer disconnected.

The log messages now have more information such as the current committed
transaction event being processed and the number of events sent by the
current thread.
2015-12-30 18:13:07 +02:00
MassimilianoPinto
23809af02e Changed burst_size to long instead of unsigned long
This way the check burst_size > 0 is now effective.

Setting the "burstsize" option in router_options may be required,
e.g. burstsize=10M
2015-12-30 16:03:30 +01:00
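Why the signedness matters for a `> 0` check can be sketched as follows. The assumption here, not confirmed by the commit message, is that the remaining burst is computed by subtracting the bytes already sent; both function names are hypothetical:

```c
/* With unsigned arithmetic the subtraction wraps around to a huge value
 * when sent > burst_size, so the "> 0" test still succeeds. */
static int keeps_sending_unsigned(unsigned long burst_size, unsigned long sent)
{
    unsigned long remaining = burst_size - sent;
    return remaining > 0;
}

/* With signed arithmetic the result goes negative and the test fails,
 * which ends the burst as intended. */
static int keeps_sending_signed(long burst_size, long sent)
{
    long remaining = burst_size - sent;
    return remaining > 0;
}
```

With a burst size of 10 and 15 bytes sent, the unsigned variant still reports that sending should continue, while the signed variant stops.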
MassimilianoPinto
82914d43d2 Removed extra brace
2015-12-17 16:25:04 +01:00
MassimilianoPinto
b55f100e1f Changed behaviour for a slave requesting master_log_pos beyond binlog file size
A slave request for a log_pos beyond the binlog file size may result in a
disconnection or a replication error:

if the binlog file is the latest one, the slave is disconnected; otherwise an
error message is returned and replication stops.
2015-12-17 15:45:16 +01:00
MassimilianoPinto
162e73f083 Added MaxScale version numbers into the CMake cache
This will allow custom version numbers without modifying the source
code.
2015-12-17 15:28:28 +01:00
Johan Wikman
40cfacfec4 Remove file from slave
The binlog file is now always opened when it is needed and closed
when we are finished with it. That will remove any potential
file concurrency issues between different threads dealing with
the same slave.
2015-12-11 17:25:27 +01:00
Markus Makela
28f05198dd Fixed SHOW SLAVE STATUS showing obsolete slaves
If SHOW SLAVE STATUS was executed after DISCONNECT ALL it was possible that
some of the disconnected slaves were used when printing slave hosts.
2015-12-10 14:50:02 +01:00
Markus Makela
992a8e2300 Slaves are set to unregistered state once disconnected
It was possible that the same slave was disconnected multiple times
before the slave DCB was closed.
2015-12-10 14:49:51 +01:00
Markus Makela
c8a9eafdc0 Replaced explicit closeSession calls with dcb_close
The closeSession entry point shouldn't be called directly and dcb_close
should be used instead.
2015-12-10 14:49:40 +01:00
Johan Wikman
2f54f33cfb Make state-change logging conditional. 2015-12-03 09:54:31 +02:00
Johan Wikman
af7a19b7b3 Reduce logging of binlog server
Only the true state changes of a slave (up-to-date to catch-up, or
catch-up to up-to-date) are logged.
2015-12-02 15:23:55 +02:00
MassimilianoPinto
3d8adefa73 Removed useless spaces
2015-12-01 16:21:53 +01:00
MassimilianoPinto
a53213093a Addition of slave transition to catchup mode in logging
2015-12-01 16:16:14 +01:00
MassimilianoPinto
6367ac7148 Changed log level for up to date transition
2015-11-30 19:23:36 +01:00
MassimilianoPinto
592e4d06cb Changed name for bad fd
2015-11-30 10:22:47 +01:00
MassimilianoPinto
ba135c5548 Log messages fix with slave ip:port and id
2015-11-30 10:22:12 +01:00
Johan Wikman
e38334c457 Fix locking issue in blr_close_binlog
In blr_open_binlog the refcnt increase of file which is already
open is protected by router->fileslock. In blr_close_binlog the
decrease of the refcnt was protected by file->lock.

This led to a situation where it was possible that a file was
closed and the file instance freed, even though it had just been
taken into use by somebody else.

This is now fixed by solely using the router->fileslock for protecting
the increase and decrease of the refcnt.
2015-11-26 10:34:34 +02:00
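The locking rule from the commit above, one lock guarding both the increase and the decrease of the reference count, can be sketched like this. A pthread mutex is swapped in for whatever lock type the router actually uses, and all type and function names are hypothetical:

```c
#include <pthread.h>

/* Hypothetical file handle with a reference count. */
typedef struct {
    int refcnt;
} ex_file_t;

/* Hypothetical router holding the single lock that guards every refcnt. */
typedef struct {
    pthread_mutex_t fileslock;
} ex_files_t;

/* Take a reference under the shared lock. */
static void ex_file_ref(ex_files_t *router, ex_file_t *file)
{
    pthread_mutex_lock(&router->fileslock);
    file->refcnt++;
    pthread_mutex_unlock(&router->fileslock);
}

/* Drop a reference under the same lock. Returns 1 when the last reference
 * is gone and the caller may close and free the file, otherwise 0. */
static int ex_file_unref(ex_files_t *router, ex_file_t *file)
{
    pthread_mutex_lock(&router->fileslock);
    int last = (--file->refcnt == 0);
    pthread_mutex_unlock(&router->fileslock);
    return last;
}
```

Because both paths take the same lock, a thread cannot observe a zero count and free the file while another thread is in the middle of taking a new reference.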
Johan Wikman
c167499c7b Add notice about previous failure to unlock.
In blr_slave.c under certain conditions, two locks were not released.
That was fixed in another change, and with this change a notice will be
logged if that branch is entered. That way it will be possible to find
out whether this may have been the cause of earlier lock-ups.
2015-11-23 09:51:59 +02:00
MassimilianoPinto
99fdf9cdec Fixed reference to LOGIF macro
2015-11-19 17:54:23 +01:00
MassimilianoPinto
023d4bc588 Develop merge
2015-11-19 17:06:30 +01:00