We fail to apply updates and deletes when the REPLICA IDENTITY FULL is
used for the table having generated columns. We didn't use to ignore
generated columns while doing tuple comparison among the tuples from
the publisher and subscriber during apply of updates and deletes.
Author: Onder Kalaci
Reviewed-by: Shi yu, Amit Kapila
Backpatch-through: 12
Discussion: https://postgr.es/m/CACawEhVQC9WoofunvXg12aXtbqKnEgWxoRx3+v8q32AWYsdpGg@mail.gmail.com
We fail to apply updates and deletes when the REPLICA IDENTITY FULL is
used for the table having dropped columns. We didn't use to ignore dropped
columns while doing tuple comparison among the tuples from the publisher
and subscriber during apply of updates and deletes.
Author: Onder Kalaci, Shi yu
Reviewed-by: Amit Kapila
Discussion: https://postgr.es/m/CACawEhVQC9WoofunvXg12aXtbqKnEgWxoRx3+v8q32AWYsdpGg@mail.gmail.com
With unlucky timing and parallel_leader_participation=off (not the
default), PHJ could attempt to access per-batch shared state just as it
was being freed. There was code intended to prevent that by checking
for a cleared pointer, but it was racy. Fix, by introducing an extra
barrier phase. The new phase PHJ_BUILD_RUNNING means that it's safe to
access the per-batch state to find a batch to help with, and
PHJ_BUILD_DONE means that it is too late. The last to detach will free
the array of per-batch state as before, but now it will also atomically
advance the phase, so that late attachers can avoid the hazard. This
mirrors the way per-batch hash tables are freed (see phases
PHJ_BATCH_PROBING and PHJ_BATCH_DONE).
An earlier attempt to fix this (commit 3b8981b6, later reverted) missed
one special case. When the inner side is empty (the "empty inner
optimization), the build barrier would only make it to
PHJ_BUILD_HASHING_INNER phase before workers attempted to detach from
the hashtable. In that case, fast-forward the build barrier to
PHJ_BUILD_RUNNING before proceeding, so that our later assertions hold
and we can still negotiate who is cleaning up.
Revealed by build farm failures, where BarrierAttach() failed a sanity
check assertion, because the memory had been clobbered by dsa_free().
In non-assert builds, the result could be a segmentation fault.
Back-patch to all supported releases.
Author: Thomas Munro <thomas.munro@gmail.com>
Author: Melanie Plageman <melanieplageman@gmail.com>
Reported-by: Michael Paquier <michael@paquier.xyz>
Reported-by: David Geier <geidav.pg@gmail.com>
Tested-by: David Geier <geidav.pg@gmail.com>
Discussion: https://postgr.es/m/20200929061142.GA29096%40paquier.xyz
When calculating distance in brin_minmax_multi_distance_inet(), the
netmask was applied incorrectly. This results in (seemingly) incorrect
ordering of values, triggering an assert.
For builds without asserts this is mostly harmless - we may merge other
ranges, possibly resulting in slightly less efficient index. But it's
still correct and the greedy algorithm doesn't guarantee optimality
anyway.
Backpatch to 14, where minmax-multi indexes were introduced.
Reported by Dmitry Dolgov, investigation and fix by me.
Reported-by: Dmitry Dolgov
Backpatch-through: 14
Discussion: https://postgr.es/m/17774-c6f3e36dd4471e67@postgresql.org
When probing the Memoize cache to check if the current cache key values
exist in the cache, we perform an evaluation of the expressions making up
the cache key before probing the hash table for those values. This
operation could leak memory as it is possible that the cache key is an
expression which requires allocation of memory, as was the case in bug
17844.
Here we fix this by correctly switching to the per tuple context before
evaluating the cache expressions so that the memory is freed next time the
per tuple context is reset.
Bug: 17844
Reported-by: Alexey Ermakov
Discussion: https://postgr.es/m/17844-d2f6f9e75a622bed@postgresql.org
Backpatch-through: 14, where Memoize was introduced
Hash partitioning on an enum is problematic because the hash codes are
derived from the OIDs assigned to the enum values, which will almost
certainly be different after a dump-and-reload than they were before.
This means that some rows probably end up in different partitions than
before, causing restore to fail because of partition constraint
violations. (pg_upgrade dodges this problem by using hacks to force
the enum values to keep the same OIDs, but that's not possible nor
desirable for pg_dump.)
Users can work around that by specifying --load-via-partition-root,
but since that's a dump-time not restore-time decision, one might
find out the need for it far too late. Instead, teach pg_dump to
apply that option automatically when dealing with a partitioned
table that has hash-on-enum partitioning.
Also deal with a pre-existing issue for --load-via-partition-root
mode: in a parallel restore, we try to TRUNCATE target tables just
before loading them, in order to enable some backend optimizations.
This is bad when using --load-via-partition-root because (a) we're
likely to suffer deadlocks from restore jobs trying to restore rows
into other partitions than they came from, and (b) if we miss getting
a deadlock we might still lose data due to a TRUNCATE removing rows
from some already-completed restore job.
The fix for this is conceptually simple: just don't TRUNCATE if we're
dealing with a --load-via-partition-root case. The tricky bit is for
pg_restore to identify those cases. In dumps using COPY commands we
can inspect each COPY command to see if it targets the nominal target
table or some ancestor. However, in dumps using INSERT commands it's
pretty impractical to examine the INSERTs in advance. To provide a
solution for that going forward, modify pg_dump to mark TABLE DATA
items that are using --load-via-partition-root with a comment.
(This change also responds to a complaint from Robert Haas that
the dump output for --load-via-partition-root is pretty confusing.)
pg_restore checks for the special comment as well as checking the
COPY command if present. This will fail to identify the combination
of --load-via-partition-root and --inserts in pre-existing dump files,
but that should be a pretty rare case in the field. If it does
happen you will probably get a deadlock failure that you can work
around by not using parallel restore, which is the same as before
this bug fix.
Having done this, there seems no remaining reason for the alarmism
in the pg_dump man page about combining --load-via-partition-root
with parallel restore, so remove that warning.
Patch by me; thanks to Julien Rouhaud for review. Back-patch to
v11 where hash partitioning was introduced.
Discussion: https://postgr.es/m/1376149.1675268279@sss.pgh.pa.us
Unfortunately it turns out that the logfile-only option added in b9f8d1cbad7
is only available in openldap starting in 2.6.
Luckily the option to control the log level (loglevel/-s) have been around for
much longer. As it turns out loglevel/-s only control what goes into syslog,
not what ends up in the file specified with 'logfile' and stderr.
While we currently are specifying 'logfile', nothing ends up in it, as the
option only controls debug messages, and we didn't set a debug level. The
debug level can only be configured on the commandline and also prevents
forking. That'd require larger changes, so this commit doesn't tackle that
issue.
Specify the syslog level when starting slapd using -s, as that allows to
prevent all syslog messages if one uses '0' instead of 'none', while loglevel
doesn't prevent the first message.
Discussion: https://postgr.es/m/20230311233708.3yjdbjkly2q4gq2j@awork3.anarazel.de
Backpatch: 11-
Until now the tests using slapd spammed syslog for every connection /
query. Use logfile-only to prevent syslog activity. Unfortunately that only
takes effect after logging the first message, but that's still much better
than the prior situation.
Discussion: https://postgr.es/m/20230311233708.3yjdbjkly2q4gq2j@awork3.anarazel.de
Backpatch: 11-
Further to commit 6a9229da, checking for NULL is now redundant. An "out
of memory" error would have been thrown already by palloc() and treated
as FATAL, so we can delete a few more lines.
Back-patch to all releases, like those other commits.
Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/4040668.1679013388%40sss.pgh.pa.us
gcc 12+ has complaints like the following:
../../../../../pgsql/src/backend/utils/adt/network.c: In function 'inetnot':
../../../../../pgsql/src/backend/utils/adt/network.c:1893:34: warning: writing 1 byte into a region of size 0 [-Wstringop-overflow=]
1893 | pdst[nb] = ~pip[nb];
| ~~~~~~~~~^~~~~~~~~~
../../../../../pgsql/src/include/utils/inet.h:27:23: note: at offset -1 into destination object 'ipaddr' of size 16
27 | unsigned char ipaddr[16]; /* up to 128 bits of address */
| ^~~~~~
../../../../../pgsql/src/include/utils/inet.h:27:23: note: at offset -1 into destination object 'ipaddr' of size 16
This is due to a compiler bug:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104986
It has been a year since the bug has been reported without getting fixed. As
the warnings are verbose and use of gcc 12 is becoming more common, it seems
worth working around the bug. Particularly because a simple reformulation of
the loop condition fixes the issue and isn't any less readable.
Author: Tom Lane <tgl@sss.pgh.pa.us>
Author: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/144536.1648326206@sss.pgh.pa.us
Backpatch: 11-
A comment was left behind claiming that we needed to use malloc() rather
than palloc() because the corresponding free would run in another
thread, but that's not true anymore. Remove that comment. And, with
the reason being gone, we might as well actually use palloc().
Back-patch to supported releases, like d41a178b.
Discussion: https://postgr.es/m/CA%2BhUKG%2BpdM9v3Jv4tc2BFx2jh_daY3uzUyAGBhtDkotEQDNPYw%40mail.gmail.com
The current implementation of _pgfstat64() is ineffective in detecting a
terminal handle or an anonymous named pipe. This commit improves our
port of fstat() to detect more efficiently such cases by relying on
GetFileType(), and returning more correct data when the type found is
either a FILE_TYPE_PIPE (_S_IFIFO) or a FILE_TYPE_CHAR (_S_IFCHR).
This is part of a more global fix to address failures when feeding the
output generated by pg_dump to pg_restore through a pipe, for example,
but not all of it. We are also going to need to do something about
fseek() and ftello() which are not reliable on WIN32 for the same cases
where fstat() was incorrect. Fixing fstat() is independent of the rest,
though, which is why both fixes are handled separately, and this is the
first part of it.
Reported-by: Daniel Watzinger
Author: Daniel Watzinger, Juan José Santamaría Flecha
Discussion: https://postgr.es/m/b1448cd7-871e-20e3-8398-895e2d1d3bf9@gmail.com
Backpatch-through: 14
Commit 4753ef37 changed vacuum_delay_point() to use the WaitLatch() API,
to fix the problem that vacuum could keep running for a very long time
after the postmaster died.
Unfortunately, that broke commit caf626b2's support for fractional
vacuum_cost_delay, which shipped in PostgreSQL 12. WaitLatch() works in
whole milliseconds.
For now, revert the change from commit 4753ef37, but add an explicit
check for postmaster death. That's an extra system call on systems
other than Linux and FreeBSD, but that overhead doesn't matter much
considering that we willingly went to sleep and woke up again. (In
later work, we might add higher resolution timeouts to the latch API so
that we could do this with our standard programming pattern, but that
wouldn't be back-patched.)
Back-patch to 14, where commit 4753ef37 arrived.
Reported-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/CAAKRu_b-q0hXCBUCAATh0Z4Zi6UkiC0k2DFgoD3nC-r3SkR3tg%40mail.gmail.com
Our waitpid() emulation didn't prevent a PID from being recycled by the
OS before the call to waitpid(). The postmaster could finish up
tracking more than one child process with the same PID, and confuse
them.
Fix, by moving the guts of pgwin32_deadchild_callback() into waitpid(),
so that resources are released synchronously. The process and PID
continue to exist until we close the process handle, which only happens
once we're ready to adjust our book-keeping of running children.
This seems to explain a couple of failures on CI. It had never been
reported before, despite the code being as old as the Windows port.
Perhaps Windows started recycling PIDs more rapidly, or perhaps timing
changes due to commit 7389aad6 made it more likely to break.
Thanks to Alexander Lakhin for analysis and Andres Freund for tracking
down the root cause.
Back-patch to all supported branches.
Reported-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20230208012852.bvkn2am4h4iqjogq%40awork3.anarazel.de
The band-aid applied in commit f0bedf3e4 turns out to still need
some work: it made sure we didn't set Np->last_relevant too small
(to the left of the decimal point), but it didn't prevent setting
it too large (off the end of the partially-converted string).
This could result in fetching data beyond the end of the allocated
space, which with very bad luck could cause a SIGSEGV, though
I don't see any hazard of interesting memory disclosure.
Per bug #17839 from Thiago Nunes. The bug's pretty ancient,
so back-patch to all supported versions.
Discussion: https://postgr.es/m/17839-aada50db24d7b0da@postgresql.org
Scanning the expression for compatible Vars isn't really necessary,
because the subsequent match against StatisticExtInfo entries will
eliminate expressions containing other Vars just fine. Moreover,
this code hadn't stopped to think about what to do with
PlaceHolderVars or Aggrefs in the clause; and at least for the PHV
case, that demonstrably leads to failures. Rather than work out
whether it's reasonable to ignore those, let's just remove the
whole stanza.
Per report from Richard Guo. Back-patch to v14 where this code
was added.
Discussion: https://postgr.es/m/CAMbWs48Mmvm-acGevXuwpB=g5JMqVSL6i9z5UaJyLGJqa-XPAA@mail.gmail.com
The majority of error exit cases in json_lex_string() failed to
set lex->token_terminator, causing problems for the error context
reporting code: it would see token_terminator less than token_start
and do something more or less nuts. In v14 and up the end result
could be as bad as a crash in report_json_context(). Older
versions accidentally avoided that fate; but all versions produce
error context lines that are far less useful than intended,
because they'd stop at the end of the prior token instead of
continuing to where the actually-bad input is.
To fix, invent some macros that make it less notationally painful
to do the right thing. Also add documentation about what the
function is actually required to do; and in >= v14, add an assertion
in report_json_context about token_terminator being sufficiently
far advanced.
Per report from Nikolay Shaplov. Back-patch to all supported
versions.
Discussion: https://postgr.es/m/7332649.x5DLKWyVIX@thinkpad-pgpro
check_agg_arguments_walker() supposed that it needn't descend into
the arguments of a lower-level aggregate function, but this is
just wrong in the presence of multiple levels of sub-select. The
oversight would lead to executor failures on queries that should
be rejected. (Prior to v11, they actually were rejected, thanks
to a "redundant" execution-time check.)
Per bug #17835 from Anban Company. Back-patch to all supported
branches.
Discussion: https://postgr.es/m/17835-4f29f3098b2d0ba4@postgresql.org
The error cases for TLS and GSS encryption were inconsistent. After TLS
fails, the connection is marked as dead and follow-up calls of
PQconnectPoll() would return immediately, but GSS encryption was not
doing that, so the connection would still have been allowed to enter the
GSS handling code. This was handled incorrectly when gssencmode was set
to "require". "prefer" was working correctly, and this could not happen
under "disable" as GSS encryption would not be attempted.
This commit makes the error handling of GSS encryption on par with TLS
portion, fixing the case of gssencmode=require.
Reported-by: Jacob Champion
Author: Michael Paquier
Reviewed-by: Jacob Champion, Stephen Frost
Discussion: https://postgr.es/m/23787477-5fe1-a161-6d2a-e459f74c4713@timescale.com
Backpatch-through: 12
This was an omission in the original creation of the module.
Also slightly adjust some wording to avoid a double "is".
Backpatch the non-meson piece of this to release 12, where the module
was introduced.
Discussion: https://postgr.es/m/be869e1c-8e3f-4cde-8609-212c899cccf9@dunslane.net
64bit xids can't represent xids before epoch 0 (see also be504a3e974). When
FullTransactionIdFromXidAndCtx() was passed such an xid, it'd create a 64bit
xid far into the future. Noticed while adding assertions in the course of
investigating be504a3e974, as amcheck's test create such xids.
To fix the issue, just return FirstNormalFullTransactionId in this case. A
freshly initdb'd cluster already has a newer horizon. The most minimal version
of this would make the messages for some detected corruptions differently
inaccurate. To make those cases accurate, switch
FullTransactionIdFromXidAndCtx() to use the 32bit modulo difference between
xid and nextxid to compute the 64bit xid, yielding sensible "in the future" /
"in the past" answers.
Reviewed-by: Mark Dilger <mark.dilger@enterprisedb.com>
Discussion: https://postgr.es/m/20230108002923.cyoser3ttmt63bfn@awork3.anarazel.de
Backpatch: 14-, where heapam verification was introduced
The COPY documentation is quite clear that "COPY relation TO" copies
rows from only the named table, not any inheritance children it may
have. However, if you enabled row-level security on the table then
this stopped being true, because the code forgot to apply the ONLY
modifier in the "SELECT ... FROM relation" query that it constructs
in order to allow RLS predicates to be attached. Fix that.
Report and patch by Antonin Houska (comment adjustments and test case
by me). Back-patch to all supported branches.
Discussion: https://postgr.es/m/3472.1675251957@antos
Commit bdaabb9b started skipping doomed transactions when building the
list of possible conflicts for SERIALIZABLE READ ONLY. That makes
sense, because doomed transactions won't commit, but a couple of subtle
things broke:
1. If all uncommitted r/w transactions are doomed, a READ ONLY
transaction would arbitrarily not benefit from the safe snapshot
optimization. It would not be taken immediately, and yet no other
transaction would set SXACT_FLAG_RO_SAFE later.
2. In the same circumstances but with DEFERRABLE, GetSafeSnapshot()
would correctly exit its wait loop without sleeping and then take the
optimization in non-assert builds, but assert builds would fail a sanity
check that SXACT_FLAG_RO_SAFE had been set by another transaction.
This is similar to the case for PredXact->WritableSxactCount == 0. We
should opt out immediately if our possibleUnsafeConflicts list is empty
after filtering.
The code to maintain the serializable global xmin is moved down below
the new opt out site, because otherwise we'd have to reverse its effects
before returning.
Back-patch to all supported releases. Bug #17368.
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/17116-d6ca217acc180e30%40postgresql.org
Discussion: https://postgr.es/m/20110707212159.GF76634%40csail.mit.edu
When vacuum_defer_cleanup_age is bigger than the current xid, including the
epoch, the subtraction of vacuum_defer_cleanup_age would lead to a wrapped
around xid. While that normally is not a problem, the subsequent conversion to
a 64bit xid results in a 64bit-xid very far into the future. As that xid is
used as a horizon to detect whether rows versions are old enough to be
removed, that allows removal of rows that are still visible (i.e. corruption).
If vacuum_defer_cleanup_age was never changed from the default, there is no
chance of this bug occurring.
This bug was introduced in dc7420c2c92. A lesser version of it exists in
12-13, introduced by fb5344c969a, affecting only GiST.
The 12-13 version of the issue can, in rare cases, lead to pages in a gist
index getting recycled too early, potentially causing index entries to be
found multiple times.
The fix is fairly simple - don't allow vacuum_defer_cleanup_age to retreat
further than FirstNormalTransactionId.
Patches to make similar bugs easier to find, by adding asserts to the 64bit
xid infrastructure, have been proposed, but are not suitable for backpatching.
Currently there are no tests for vacuum_defer_cleanup_age. A patch introducing
infrastructure to make writing a test easier has been posted to the list.
Reported-by: Michail Nikolaev <michail.nikolaev@gmail.com>
Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com>
Author: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20230108002923.cyoser3ttmt63bfn@awork3.anarazel.de
Backpatch: 12-, but impact/fix is smaller for 12-13
If a view is defined atop another view, and then CREATE OR REPLACE
VIEW is used to add columns to the lower view, then when the upper
view's referencing RTE is expanded by ApplyRetrieveRule we will have
a subquery RTE with fewer eref->colnames than output columns. This
confuses various code that assumes those lists are always in sync,
as they are in plain parser output.
We have seen such problems before (cf commit d5b760ecb), and now
I think the time has come to do what was speculated about in that
commit: let's make ApplyRetrieveRule synthesize some column names to
preserve the invariant that holds in parser output. Otherwise we'll
be chasing this class of bugs indefinitely. Moreover, it appears from
testing that this actually gives us better results in the test case
d5b760ecb added, and likely in other corner cases that we lack
coverage for.
In HEAD, I replaced d5b760ecb's hack to make expandRTE exit early with
an elog(ERROR) call, since the case is now presumably unreachable.
But it seems like changing that in back branches would bring more risk
than benefit, so there I just updated the comment.
Per bug #17811 from Alexander Lakhin. Back-patch to all supported
branches.
Discussion: https://postgr.es/m/17811-d31686b78f0dffc9@postgresql.org
If UPDATE is forced to retry after an EvalPlanQual check, it neglected
to repeat GENERATED-column computations, even though those might well
have changed since we're dealing with a different tuple than before.
Fixing this is mostly a matter of looping back a bit further when
we retry. In v15 and HEAD that's most easily done by altering the API
of ExecUpdateAct so that it includes computing GENERATED expressions.
Also, if an UPDATE in a partitioned table turns into a cross-partition
INSERT operation, we failed to recompute GENERATED columns. That's a
bug since 8bf6ec3ba allowed partitions to have different generation
expressions; although it seems to have no ill effects before that.
Fixing this is messier because we can now have situations where the same
query needs both the UPDATE-aligned set of GENERATED columns and the
INSERT-aligned set, and it's unclear which set will be generated first
(else we could hack things by forcing the INSERT-aligned set to be
generated, which is indeed how fe9e658f4 made it work for MERGE).
The best fix seems to be to build and store separate sets of expressions
for the INSERT and UPDATE cases. That would create ABI issues in the
back branches, but so far it seems we can leave this alone in the back
branches.
Per bug #17823 from Hisahiro Kauchi. The first part of this affects all
branches back to v12 where GENERATED columns were added.
Discussion: https://postgr.es/m/17823-b64909cf7d63de84@postgresql.org
1. Make sure that we don't decrement SxactGlobalXminCount twice when
the SXACT_FLAG_RO_SAFE optimization is reached in a parallel query.
This could trigger a sanity check failure in assert builds. Non-assert
builds recompute the count in SetNewSxactGlobalXmin(), so the problem
was hidden, explaining the lack of field reports. Add a new isolation
test to exercise that case.
2. Remove an assertion that the DOOMED flag can't be set on a partially
released SERIALIZABLEXACT. Instead, ignore the flag (our transaction
was already determined to be read-only safe, and DOOMED is in fact set
during partial release, and there was already an assertion that it
wasn't set sooner). Improve an existing isolation test so that it
reaches that case (previously it wasn't quite testing what it was
supposed to be testing; see discussion).
Back-patch to 12. Bug #17116. Defects in commit 47a338cf.
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/17116-d6ca217acc180e30%40postgresql.org
This is usually harmless, but if you were very unlucky it could
provoke a segfault due to the "to" string being right up against
the end of memory. Found via valgrind testing (so we might've
found it earlier, except that our regression tests lacked any
exercise of translate()'s deletion feature).
Fix by switching the order of the test-for-end-of-string and
advance-pointer steps. While here, compute "to_ptr + tolen"
just once. (Smarter compilers might figure that out for
themselves, but let's just make sure.)
Report and fix by Daniil Anisimov, in bug #17816.
Discussion: https://postgr.es/m/17816-70f3d2764e88a108@postgresql.org
It's been this way for a very long time, but it appears to have been
masking an issue that only manifests with different settings. Therefore,
run the tests in the installation's default encoding/locale.
Backpatch to all live branches.
We already tried to fix this in commits 3f7323cbb et al (and follow-on
fixes), but now it emerges that there are still unfixed cases;
moreover, these cases affect all branches not only pre-v14. I thought
we had eliminated all cases of making multiple clones of an UPDATE's
target list when we nuked inheritance_planner. But it turns out we
still do that in some partitioned-UPDATE cases, notably including
INSERT ... ON CONFLICT UPDATE, because ExecInitPartitionInfo thinks
it's okay to clone and modify the parent's targetlist.
This fix is based on a suggestion from Andres Freund: let's stop
abusing the ParamExecData.execPlan mechanism, which was only ever
meant to handle initplans, and instead solve the execution timing
problem by having the expression compiler move MULTIEXPR_SUBLINK steps
to the front of their expression step lists. This is feasible because
(a) all branches still in support compile the entire targetlist of
an UPDATE into a single ExprState, and (b) we know that all
MULTIEXPR_SUBLINKs do need to be evaluated --- none could be buried
inside a CASE, for example. There is a minor semantics change
concerning the order of execution of the MULTIEXPR's subquery versus
other parts of the parent targetlist, but that seems like something
we can get away with. By doing that, we no longer need to worry
about whether different clones of a MULTIEXPR_SUBLINK share output
Params; their usage of that data structure won't overlap.
Per bug #17800 from Alexander Lakhin. Back-patch to all supported
branches. In v13 and earlier, we can revert 3f7323cbb and follow-on
fixes; however, I chose to keep the SubPlan.subLinkId field added
in ccbb54c72. We don't need that anymore in the core code, but it's
cheap enough to fill, and removing a plan node field in a minor
release seems like it'd be asking for trouble.
Andres Freund and Tom Lane
Discussion: https://postgr.es/m/17800-ff90866b3906c964@postgresql.org
If a rule action contains a subquery that refers to columns from OLD
or NEW, then those are really lateral references, and the planner will
complain if it sees such things in a subquery that isn't marked as
lateral. However, at rule-definition time, the user isn't required to
mark the subquery with LATERAL, and so it can fail when the rule is
used.
Fix this by marking such subqueries as lateral in the rewriter, at the
point where they're used.
Dean Rasheed and Tom Lane, per report from Alexander Lakhin.
Back-patch to all supported branches.
Discussion: https://postgr.es/m/5e09da43-aaba-7ea7-0a51-a2eb981b058b%40gmail.com
Multiple cycles of starting up and shutting down the plugin within a
single session would eventually lead to "out of relcache_callback_list
slots", because pgoutput_startup blindly re-registered its cache
callbacks each time. Fix it to register them only once, as all other
users of cache callbacks already take care to do.
This has been broken all along, so back-patch to all supported branches.
Shi Yu
Discussion: https://postgr.es/m/OSZPR01MB631004A78D743D68921FFAD3FDA79@OSZPR01MB6310.jpnprd01.prod.outlook.com
Given an updatable view with a DO ALSO INSERT ... SELECT rule, a
multi-row INSERT ... VALUES query on the view fails if the VALUES list
contains any DEFAULTs that are not replaced by view defaults. This
manifests as an "unrecognized node type" error, or an Assert failure,
in an assert-enabled build.
The reason is that when RewriteQuery() attempts to replace the
remaining DEFAULT items with NULLs in any product queries, using
rewriteValuesRTEToNulls(), it assumes that the VALUES RTE is located
at the same rangetable index in each product query. However, if the
product query is an INSERT ... SELECT, then the VALUES RTE is actually
in the SELECT part of that query (at the same index), rather than the
top-level product query itself.
Fix, by descending to the SELECT in such cases. Note that we can't
simply use getInsertSelectQuery() for this, since that expects to be
given a raw rule action with OLD and NEW placeholder entries, so we
duplicate its logic instead.
While at it, beef up the checks in getInsertSelectQuery() by checking
that the jointree->fromlist node is indeed a RangeTblRef, and that the
RTE it points to has rtekind == RTE_SUBQUERY.
Per bug #17803, from Alexander Lakhin. Back-patch to all supported
branches.
Dean Rasheed, reviewed by Tom Lane.
Discussion: https://postgr.es/m/17803-53c63ed4ecb4eac6%40postgresql.org
Whe decoding a transactional logical message, logicalmsg_decode called
SnapBuildGetOrBuildSnapshot. But we may not have a consistent snapshot
yet at that point. We don't actually need the snapshot in this case
(during replay we'll have the snapshot from the transaction), so in
practice this is harmless. But in assert-enabled build this crashes.
Fixed by requesting the snapshot only in non-transactional case, where
we are guaranteed to have SNAPBUILD_CONSISTENT.
Backpatch to 11. The issue exists since 9.6.
Backpatch-through: 11
Reviewed-by: Andres Freund
Discussion: https://postgr.es/m/84d60912-6eab-9b84-5de3-41765a5449e8@enterprisedb.com
If asked to decrease the size of a large (>8K) palloc chunk,
AllocSetRealloc could improperly change the Valgrind state of memory
beyond the new end of the chunk: it would mark data UNDEFINED as far
as the old end of the chunk after having done the realloc(3) call,
thus tromping on the state of memory that no longer belongs to it.
One would normally expect that memory to now be marked NOACCESS,
so that this mislabeling might prevent detection of later errors.
If realloc() had chosen to move the chunk someplace else (unlikely,
but well within its rights) we could also mismark perfectly-valid
DEFINED data as UNDEFINED, causing false-positive valgrind reports
later. Also, any malloc bookkeeping data placed within this area
might now be wrongly marked, causing additional problems.
Fix by replacing relevant uses of "oldsize" with "Min(size, oldsize)".
It's sufficient to mark as far as "size" when that's smaller, because
whatever remains in the new chunk size will be marked NOACCESS below,
and we expect realloc() to have taken care of marking the memory
beyond the new official end of the chunk.
While we're here, also rename the function's "oldsize" variable
to "oldchksize" to more clearly explain what it actually holds,
namely the distance to the end of the chunk (that is, requested size
plus trailing padding). This is more consistent with the use of
"size" and "chksize" to hold the new requested size and chunk size.
Add a new variable "oldsize" in the one stanza where we're actually
talking about the old requested size.
Oversight in commit c477f3e44. Back-patch to all supported branches,
as that was, just in case anybody wants to do valgrind testing on back
branches.
Karina Litskevich
Discussion: https://postgr.es/m/CACiT8iaAET-fmzjjZLjaJC4zwSJmrFyL7LAdHwaYyjjQOQ4hcg@mail.gmail.com
Failing to do so results in an error when a pgbench script tries to
start a serializable transaction inside a pipeline, because by the time
BEGIN ISOLATION LEVEL SERIALIZABLE is executed, we're already in a
transaction that has acquired a snapshot, so the server rightfully
complains.
We can work around that by preparing all commands in the pipeline before
actually starting the pipeline. This changes the existing code in two
aspects: first, we now prepare each command individually at the point
where that command is about to be executed; previously, we would prepare
all commands in a script as soon as the first command of that script
would be executed. It's hard to see that this would make much of a
difference (particularly since it only affects the first time to execute
each script in a client), but I didn't actually try to measure it.
Secondly, we no longer use PQsendPrepare() in pipeline mode, but only
PQprepare. There's no specific reason for this change other than no
longer needing to do differently in pipeline mode. (Previously we had
no choice, because in pipeline mode PQprepare could not be used.)
Backpatch to 14, where pgbench got support for pipeline mode.
Reported-by: Yugo NAGATA <nagata@sraoss.co.jp>
Discussion: https://postgr.es/m/20210716153013.fc53b1c780b06fccc07a7f0d@sraoss.co.jp
When evaluating clauses on multiple scan keys of a multi-column BRIN
index, we can stop processing as soon as we find a scan key eliminating
the range, and the range should not be added to tbe bitmap.
That's how it worked before 14, but since a681e3c107a the code treated
the range as matching if it matched at least the last scan key.
Backpatch to 14, where this code was introduced.
Backpatch-through: 14
Discussion: https://postgr.es/m/ebc18613-125e-60df-7520-fcbe0f9274fc%40enterprisedb.com
ruleutils.c blindly printed the user-given alias (or nothing if there
hadn't been one) for the target table of INSERT/UPDATE/DELETE queries.
That works a large percentage of the time, but not always: for queries
appearing in WITH, it's possible that we chose a different alias to
avoid conflict with outer-scope names. Since the chosen alias would
be used in any Var references to the target table, this'd lead to an
inconsistent printout with consequences such as dump/restore failures.
The correct logic for printing (or not) a relation alias was embedded
in get_from_clause_item. Factor it out to a separate function so that
we don't need a jointree node to use it. (Only a limited part of that
function can be reached from these new call sites, but this seems like
the cleanest non-duplicative factorization.)
In passing, I got rid of a redundant "\d+ rules_src" step in rules.sql.
Initial report from Jonathan Katz; thanks to Vignesh C for analysis.
This has been broken for a long time, so back-patch to all supported
branches.
Discussion: https://postgr.es/m/e947fa21-24b2-f922-375a-d4f763ef3e4b@postgresql.org
Discussion: https://postgr.es/m/CALDaNm1MMntjmT_NJGp-Z=xbF02qHGAyuSHfYHias3TqQbPF2w@mail.gmail.com
OpenSSL 1.1.1 and newer versions have added support for RSA-PSS
certificates, which requires the use of a specific routine in OpenSSL to
determine which hash function to use when compiling it when using
channel binding in SCRAM-SHA-256. X509_get_signature_nid(), that is the
original routine the channel binding code has relied on, is not able to
determine which hash algorithm to use for such certificates. However,
X509_get_signature_info(), new to OpenSSL 1.1.1, is able to do it. This
commit switches the channel binding logic to rely on
X509_get_signature_info() over X509_get_signature_nid(), which would be
the choice when building with 1.1.1 or newer.
The error could have been triggered on the client or the server, hence
libpq and the backend need to have their related code paths patched.
Note that attempting to load an RSA-PSS certificate with OpenSSL 1.1.0
or older leads to a failure due to an unsupported algorithm.
The discovery of relying on X509_get_signature_info() comes from Jacob,
the tests have been written by Heikki (with few tweaks from me), while I
have bundled the whole together while adding the bits needed for MSVC
and meson.
This issue exists since channel binding exists, so backpatch all the way
down. Some tests are added in 15~, triggered if compiling with OpenSSL
1.1.1 or newer, where the certificate and key files can easily be
generated for RSA-PSS.
Reported-by: Gunnar "Nick" Bluth
Author: Jacob Champion, Heikki Linnakangas
Discussion: https://postgr.es/m/17760-b6c61e752ec07060@postgresql.org
Backpatch-through: 11
When an aggregate function is used as a WindowFunc and a tuple transitions
out of the window frame, we ordinarily try to make use of the aggregate
function's inverse transition function to "unaggregate" the exiting tuple.
This optimization is disabled for various cases, including when the
aggregate contains a volatile function. In such a case we'd be unable to
ensure that the transition value was calculated to the same value during
transitions and inverse transitions. Unfortunately, we did this check by
calling contain_volatile_functions() which does not recursively search
SubPlans for volatile functions. If the aggregate function's arguments or
its FILTER clause contained a subplan with volatile functions then we'd
fail to notice this.
Here we fix this by just disabling the optimization when the WindowFunc
contains any subplans. Volatile functions are not the only reason that a
subplan may have nonrepeatable results.
Bug: #17777
Reported-by: Anban Company
Discussion: https://postgr.es/m/17777-860b739b6efde977%40postgresql.org
Reviewed-by: Tom Lane
Backpatch-through: 11
Try to disable ASLR when building in EXEC_BACKEND mode, to avoid random
memory mapping failures while testing. For developer use only, no
effect on regular builds.
This has been originally applied as of f3e7806 for v15~, but
recently-added buildfarm member gokiburi tests this configuration on
older branches as well, causing it to fail randomly as ASLR would be
enabled.
Suggested-by: Andres Freund <andres@anarazel.de>
Tested-by: Bossart, Nathan <bossartn@amazon.com>
Discussion: https://postgr.es/m/20210806032944.m4tz7j2w47mant26%40alap3.anarazel.de
Backpatch-through: 12
pqsecure_open_gss() includes a code path handling error messages with
v2-style protocol messages coming from the server. The client-side
buffer holding the error message does not force a NULL-termination, with
the data of the server getting copied to the errorMessage of the
connection. Hence, it would be possible for a server to send an
unterminated string and copy arbitrary bytes in the buffer receiving the
error message in the client, opening the door to a crash or even data
exposure.
As at this stage of the authentication process the exchange has not been
completed yet, this could be abused by an attacker without Kerberos
credentials. Clients that have a valid kerberos cache are vulnerable as
libpq opportunistically requests for it except if gssencmode is
disabled.
Author: Jacob Champion
Backpatch-through: 12
Security: CVE-2022-41862
The prior coding of int64_div_fast_to_numeric() had a number of bugs
that would cause it to fail under different circumstances, such as
with log10val2 <= 0, or log10val2 a multiple of 4, or in the "slow"
numeric path with log10val2 >= 10.
None of those could be triggered by any of our current code, which
only uses log10val2 = 3 or 6. However, they made it a hazard for any
future code that might use it. Also, since this is exported by
numeric.c, users writing their own C code might choose to use it.
Therefore fix, and back-patch to v14, where it was introduced.
Dean Rasheed, reviewed by Tom Lane.
Discussion: https://postgr.es/m/CAEZATCW8gXgW0tgPxPgHDPhVX71%2BSWFRkhnXy%2BTfGDsKLepu2g%40mail.gmail.com
DST law changes in Greenland and Mexico. Notably, a new timezone
America/Ciudad_Juarez has been split off from America/Ojinaga.
Historical corrections for northern Canada, Colombia, and Singapore.
This test has been added as of 857ee8e that has introduced the SQL
function txid_status(), with the purpose of checking that a transaction
ID still in-progress during a crash is correctly marked as aborted after
recovery finishes.
This test is unstable, and some configuration scenarios may that easier
to reproduce (wal_level=minimal, wal_compression=on) because the WAL
holding the information about the in-progress transaction ID may not
have made it to disk yet, hence a post-crash recovery may cause the same
XID to be reused, triggering a test failure.
We have discussed a few approaches, like making this function force a
WAL flush to make it reliable across crashes, but we don't want to pay a
performance penalty in some scenarios, as well. The test could have
been tweaked to enforce a checkpoint but that actually breaks the
promise of the test to rely on a stable result of txid_status() after
a crash.
This issue has been reported a few times across the past years, with an
original report from Kyotaro Horiguchi. The buildfarm machines tanager,
hachi and gokiburi enable wal_compression, and fail on this test
periodically.
Discussion: https://postgr.es/m/3163112.1674762209@sss.pgh.pa.us
Discussion: https://postgr.es/m/20210305.115011.558061052471425531.horikyota.ntt@gmail.com
Backpatch-through: 11
If the final chunk of an oversized tuple being written out to disk was
exactly 32760 bytes, it would be corrupted due to a fencepost bug.
Bug #17619. Back-patch to 11 where the code arrived.
While testing that (see test module in archives), I (tmunro) noticed
that the per-participant page counter was not initialized to zero as it
should have been; that wasn't a live bug when it was written since DSM
memory was originally always zeroed, but since 14
min_dynamic_shared_memory might be configured and it supplies non-zeroed
memory, so that is also fixed here.
Author: Dmitry Astapov <dastapov@gmail.com>
Discussion: https://postgr.es/m/17619-0de62ceda812b8b5%40postgresql.org
The drop database command waits for the logical replication sync worker to
accept ProcSignalBarrier and the worker's slot creation waits for the drop
database to finish which leads to a deadlock. This happens because the
tablesync worker holds interrupts while creating a slot.
We prevent cancel/die interrupts while creating a slot in the table sync
worker because it is possible that before the server finishes this
command, a concurrent drop subscription happens which would complete
without removing this slot and that leads to the slot existing until the
end of walsender. However, the slot will eventually get dropped at the
walsender exit time, so there is no danger of the dangling slot.
This patch reallows cancel/die interrupts while creating a slot and
modifies the test to wait for slots to become zero to prevent finding an
ephemeral slot.
The reported hang doesn't happen in PG14 as the drop database starts to
wait for ProcSignalBarrier with PG15 (commits 4eb2176318 and e2f65f4255)
but it is good to backpatch this till PG14 as it is not a good idea to
prevent interrupts during a network call that could block indefinitely.
Reported-by: Lakshmi Narayanan Sreethar
Diagnosed-by: Andres Freund
Author: Hou Zhijie
Reviewed-by: Vignesh C, Amit Kapila
Backpatch-through: 14, where it was introduced in commit 6b67d72b60
Discussion: https://postgr.es/m/CA+kvmZELXQ4ZD3U=XCXuG3KvFgkuPoN1QrEj8c-rMRodrLOnsg@mail.gmail.com
When libpqrcv_connect (also known as walrcv_connect()) failed, it leaked the
libpq connection. In most paths that's fairly harmless, as the calling process
will exit soon after. But e.g. CREATE SUBSCRIPTION could lead to a somewhat
longer lived leak.
Fix by releasing resources, including the libpq connection, on error.
Add a test exercising the error code path. To make it reliable and safe, the
test tries to connect to port=-1, which happens to fail during connection
establishment, rather than during connection string parsing.
Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/20230121011237.q52apbvlarfv6jm6@awork3.anarazel.de
Backpatch: 11-