Restore the portal-level snapshot after procedure COMMIT/ROLLBACK.

COMMIT/ROLLBACK necessarily destroys all snapshots within the session.
The original implementation of intra-procedure transactions just
cavalierly did that, ignoring the fact that this left us executing in
a rather different environment than normal.  In particular, it turns
out that handling of toasted datums depends rather critically on there
being an outer ActiveSnapshot: otherwise, when SPI or the core
executor pop whatever snapshot they used and return, it's unsafe to
dereference any toasted datums that may appear in the query result.
It's possible to demonstrate "no known snapshots" and "missing chunk
number N for toast value" errors as a result of this oversight.

Historically this outer snapshot has been held by the Portal code,
and that seems like a good plan to preserve.  So add infrastructure
to pquery.c to allow re-establishing the Portal-owned snapshot if it's
not there anymore, and add enough bookkeeping support that we can tell
whether it is or not.

We can't, however, just re-establish the Portal snapshot as part of
COMMIT/ROLLBACK.  As in normal transaction start, acquiring the first
snapshot should wait until after SET and LOCK commands.  Hence, teach
spi.c about doing this at the right time.  (Note that this patch
doesn't fix the problem for any PLs that try to run intra-procedure
transactions without using SPI to execute SQL commands.)

This makes SPI's no_snapshots parameter rather a misnomer, so in HEAD,
rename that to allow_nonatomic.

replication/logical/worker.c also needs some fixes, because it wasn't
careful to hold a snapshot open around AFTER trigger execution.
That code doesn't use a Portal, which I suspect someday we're gonna
have to fix.  But for now, just rearrange the order of operations.
This includes back-patching the recent addition of finish_estate()
to centralize the cleanup logic there.

This also back-patches commit 2ecfeda3e into v13, to improve the
test coverage for worker.c (it was that test that exposed that
worker.c's snapshot management is wrong).

Per bug #15990 from Andreas Wicht.  Back-patch to v11 where
intra-procedure COMMIT was added.

Discussion: https://postgr.es/m/15990-eee2ac466b11293d@postgresql.org
This commit is contained in:
Tom Lane
2021-05-21 14:03:53 -04:00
parent 71787b23e3
commit ef94805096
11 changed files with 323 additions and 112 deletions

View File

@ -192,6 +192,14 @@ typedef struct PortalData
/* Presentation data, primarily used by the pg_cursors system view */
TimestampTz creation_time; /* time at which this portal was defined */
bool visible; /* include this portal in pg_cursors? */
/*
* Outermost ActiveSnapshot for execution of the portal's queries. For
* all but a few utility commands, we require such a snapshot to exist.
* This ensures that TOAST references in query results can be detoasted,
* and helps to reduce thrashing of the process's exposed xmin.
*/
Snapshot portalSnapshot; /* active snapshot, or NULL if none */
} PortalData;
/*
@ -235,5 +243,6 @@ extern void PortalCreateHoldStore(Portal portal);
extern void PortalHashTableDeleteAll(void);
extern bool ThereAreNoReadyPortals(void);
extern void HoldPinnedPortals(void);
extern void ForgetPortalSnapshots(void);
#endif /* PORTAL_H */