issue#I5UDM6 LogAccessExclusiveLock() when primary node acquires AccessExclusiveLock for systable

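Vacuuming a table can trigger a truncate (for example, on the system table
pg_database after a large number of DROP DATABASE operations). While the standby
replays that truncate, a monitoring program may query the same table.
xlog_block_smgr_redo_truncate locks the table's relfilenode with
LockRelFileNode, but a scan locks the relation's OID with LockRelationOid, so
the two locks do not conflict and the truncate replay runs in parallel with the
SELECT. Buffers invalidated by the truncate are then loaded back into the buffer
pool, and an error occurs later when new data is inserted into a page that
should already have been invalid. In detail: the query obtains nblocks through
smgrnblocks in initscan; if replay of the truncate log has already done
InvalidateBuffer at that point and only afterwards does CacheInvalidateSmgr and
smgr_truncate, the scan loads the invalidated buffers into the buffer pool
again, which causes the error.
Looking back at the PG code: when a relation is locked, the primary writes an
XLOG_STANDBY_LOCK record whenever it takes an AccessExclusiveLock, and the
standby takes the AccessExclusiveLock when it replays that record. In og
(openGauss), however, because of leftover code from the distributed version,
the record is written only for user tables, so system tables hit the problem
described above.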
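
The effect of the change on the logging decision can be sketched as a small standalone C model. This is an illustrative sketch, not openGauss code: the `Sketch`-prefixed names, the reduced lock tag struct, and the boolean parameters standing in for `RecoveryInProgress()` and `XLogStandbyInfoActive()` are assumptions, and the constant values are assumed to mirror PostgreSQL's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in constants; values are assumed to mirror PostgreSQL's definitions. */
enum { SKETCH_ACCESS_EXCLUSIVE_LOCK = 8 };
enum { SKETCH_LOCKTAG_RELATION = 0 };
enum { SKETCH_FIRST_NORMAL_OBJECT_ID = 16384 };

/* Hypothetical, reduced lock tag: only the fields the check reads. */
typedef struct SketchLockTag {
    uint32_t rel_oid;      /* plays the role of locktag_field2 */
    int      locktag_type; /* plays the role of locktag_type */
} SketchLockTag;

/* Shared outer condition: strong relation lock taken on a primary
 * that is generating standby information. */
static bool sketch_lock_is_loggable(const SketchLockTag *tag, int lockmode,
                                    bool in_recovery, bool standby_info_active)
{
    assert(tag != NULL);
    return lockmode >= SKETCH_ACCESS_EXCLUSIVE_LOCK &&
           tag->locktag_type == SKETCH_LOCKTAG_RELATION &&
           !in_recovery && standby_info_active;
}

/* Before the change: the lock record is written for user tables only. */
static bool sketch_log_lock_before(const SketchLockTag *tag, int lockmode,
                                   bool in_recovery, bool standby_info_active)
{
    return sketch_lock_is_loggable(tag, lockmode, in_recovery, standby_info_active) &&
           tag->rel_oid > SKETCH_FIRST_NORMAL_OBJECT_ID;
}

/* After the change: the lock record is written for every relation. */
static bool sketch_log_lock_after(const SketchLockTag *tag, int lockmode,
                                  bool in_recovery, bool standby_info_active)
{
    return sketch_lock_is_loggable(tag, lockmode, in_recovery, standby_info_active);
}
```

For a relation OID below the first-normal-object boundary (a system catalog), the pre-change predicate returns false and the post-change predicate returns true; for a user table both return true, and on a node in recovery both return false.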
Chunling Wang
2022-09-30 16:34:36 +08:00
parent 6f87123c7a
commit ecfc320dd6

@@ -754,30 +754,8 @@ static LockAcquireResult LockAcquireExtendedXC(const LOCKTAG *locktag, LOCKMODE
      */
     if (lockmode >= AccessExclusiveLock && locktag->locktag_type == LOCKTAG_RELATION && !RecoveryInProgress() &&
         XLogStandbyInfoActive()) {
-        /*
-         * In a scenario like:
-         *
-         * 1, openGauss run vacuum full or autovacuum pg_class, insert AccessExclusiveLock xlog.
-         * 2, datanode crash, vacuum full abort.
-         * 3, datanode restart in pending mode, start recovery.
-         * 4, startup thread acquire pg_class's AccessExclusiveLock.
-         * 5, startup thread complete recovery and wait for notify.
-         * 6, cm agent connect datanode, need to init relcache file.
-         * 7, cm agent connect want to acquire pg_class's AccessShareLock,
-         * but the AccessExclusiveLock lock is hold by startup thread.
-         * 8, dead lock, datanode hang.
-         *
-         * Other system tables like pg_attribute/pg_type.. also have such problem.
-         *
-         * To solve this problem, we don't insert AccessExclusiveLock xlog for system tables.
-         * This change may cause exception when primary run vacuum full system table while
-         * standby access the system table at the same time.
-         *
-         */
-        if (locktag->locktag_field2 > FirstNormalObjectId) {
-            LogAccessExclusiveLockPrepare();
-            log_lock = true;
-        }
+        LogAccessExclusiveLockPrepare();
+        log_lock = true;
     }
     /*