Commit Graph

47038 Commits

Author SHA1 Message Date
1eb09ed63a Use C99 designated designators in a couple of places
This makes the arrays somewhat easier to read.

Author: Álvaro Herrera <alvherre@kurilemu.de>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl>
Discussion: https://postgr.es/m/202601281204.sdxbr5qvpunk@alvherre.pgsql
2026-01-30 10:11:04 +01:00
bb26a81ee2 Remove unused argument from ApplyLogicalMappingFile().
Author: Yugo Nagata <nagata@sraoss.co.jp>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Discussion: https://postgr.es/m/20260128120056.b2a3e8184712ab5a537879eb@sraoss.co.jp
2026-01-30 09:05:35 +09:00
87f7b824f2 tableam: Perform CheckXidAlive check once per scan
Previously, the CheckXidAlive check was performed within the table_scan*next*
functions. This caused the check to be executed for every fetched tuple, an
unnecessary overhead.

To fix, move the check to table_beginscan* so it is performed once per scan
rather than once per row.

Note: table_tuple_fetch_row_version() does not use a scan descriptor;
therefore, the CheckXidAlive check is retained in that function. The overhead
is unlikely to be relevant for the existing callers.

Reported-by: Andres Freund <andres@anarazel.de>
Author: Dilip Kumar <dilipbalaut@gmail.com>
Suggested-by: Andres Freund <andres@anarazel.de>
Suggested-by: Amit Kapila <akapila@postgresql.org>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/tlpltqm5jjwj7mp66dtebwwhppe4ri36vdypux2zoczrc2i3mp%40dhv4v4nikyfg
2026-01-29 17:52:07 -05:00
333f586372 bufmgr: Allow conditionally locking of already locked buffer
In fcb9c977aa5 I included an assertion in BufferLockConditional() to detect if
a conditional lock acquisition is done on a buffer that we already have
locked. The assertion was added in the course of adding other assertions.
Unfortunately I failed to realize that some of our code relies on such lock
acquisitions to silently fail. E.g. spgist and nbtree may try to conditionally
lock an already locked buffer when acquiring a empty buffer.

LWLockAcquireConditional(), which was previously used to implement
ConditionalLockBuffer(), does not have such an assert.

Instead of just removing the assert, and relying on the lock acquisition to
fail due to the buffer already locked, this commit changes the behaviour of
conditional content lock acquisition to fail if the current backend has any
pre-existing lock on the buffer, even if the lock modes would not
conflict. The reason for that is that we currently do not have space to track
multiple lock acquisitions on a single buffer. Allowing multiple locks on the
same buffer by a backend also seems likely to lead to bugs.

There is only one non-self-exclusive conditional content lock acquisition, in
GetVictimBuffer(), but it only is used if the target buffer is not pinned and
thus can't already be locked by the current backend.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/90bd2cbb-49ce-4092-9f61-5ac2ab782c94@gmail.com
2026-01-29 16:49:01 -05:00
bd9dfac8b1 Further fix extended alignment for older g++.
Commit 6ceef9408 was still one brick shy of a load, because it caused
any usage at all of PGIOAlignedBlock or PGAlignedXLogBlock to fail
under older g++.  Notably, this broke "headerscheck --cplusplus".
We can permit references to these structs as abstract structs though;
only actual declaration of such a variable needs to be forbidden.

Discussion: https://www.postgresql.org/message-id/3119480.1769189606@sss.pgh.pa.us
2026-01-29 16:16:36 -05:00
de90bb7db1 Fix theoretical memory leaks in pg_locale_libc.c.
The leaks were hard to reach in practice and the impact was low.

The callers provide a buffer the same number of bytes as the source
string (plus one for NUL terminator) as a starting size, and libc
never increases the number of characters. But, if the byte length of
one of the converted characters is larger, then it might need a larger
destination buffer. Previously, in that case, the working buffers
would be leaked.

Even in that case, the call typically happens within a context that
will soon be reset. Regardless, it's worth fixing to avoid such
assumptions, and the fix is simple so it's worth backporting.

Discussion: https://postgr.es/m/e2b7a0a88aaadded7e2d19f42d5ab03c9e182ad8.camel@j-davis.com
Backpatch-through: 18
2026-01-29 10:14:55 -08:00
ec31744071 Replace literal 0 with InvalidXLogRecPtr for XLogRecPtr assignments
Use the proper constant InvalidXLogRecPtr instead of literal 0 when
assigning XLogRecPtr variables and struct fields.

This improves code clarity by making it explicit that these are
invalid LSN values rather than ambiguous zero literals.

Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/aRtd2dw8FO1nNX7k@ip-10-97-1-34.eu-west-3.compute.internal
2026-01-29 18:37:09 +01:00
71c1136989 Fix mistakes in commit 4020b370f214315b8c10430301898ac21658143f
cost_tidrangescan() was setting the disabled_nodes value correctly,
and then immediately resetting it to zero, due to poor code editing on
my part.

materialized_finished_plan correctly set matpath.parent to
zero, but forgot to also set matpath.parallel_workers = 0, causing
an access to uninitialized memory in cost_material. (This shouldn't
result in any real problem, but it makes valgrind unhappy.)

reparameterize_path was dereferencing a variable before verifying that
it was not NULL.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us> (issue #1)
Reported-by: Michael Paquier <michael@paquier.xyz> (issue #1)
Diagnosed-by: Lukas Fittl <lukas@fittl.com> (issue #1)
Reported-by: Zsolt Parragi <zsolt.parragi@percona.com> (issue #2)
Reported-by: Richard Guo <guofenglinux@gmail.com> (issue #3)
Discussion: http://postgr.es/m/CAN4CZFPvwjNJEZ_JT9Y67yR7C=KMNa=LNefOB8ZY7TKDcmAXOA@mail.gmail.com
Discussion: http://postgr.es/m/aXrnPgrq6Gggb5TG@paquier.xyz
2026-01-29 08:04:47 -05:00
20a8f783e1 Wake LSN waiters before recovery target stop
Move WaitLSNWakeup() immediately after ApplyWalRecord() so waiters are
signaled even when recoveryStopsAfter() breaks out for pause/promotion
targets.

Discussion: https://postgr.es/m/9533608f-f289-44bd-b881-9e5a73203c5b%40iki.fi
Discussion: https://postgr.es/m/CABPTF7Wdq6KbvC3EhLX3Pz%3DODCCPEX7qVQ%2BE%3DcokkB91an2E-A%40mail.gmail.com
Reported-by: Heikki Linnakangas <hlinnaka@iki.fi>
Author: Xuneng Zhou <xunengzhou@gmail.com>
2026-01-29 09:47:09 +02:00
4b77282f25 psql: Disable %P (pipeline status) for non-active connection
In the psql prompt, %P prompt shows the current pipeline status.  Unlike
most of the other options, its status was showing up in the output
generated even if psql was not connected to a database.  This was
confusing, because without a connection a pipeline status makes no
sense.

Like the other options, %P is updated so as its data is now hidden
without an active connection.

Author: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/86EF76B5-6E62-404D-B9EC-66F4714D7D5F@gmail.com
Backpatch-through: 18
2026-01-29 16:20:45 +09:00
740a1494f4 Fix two error messages in extended_stats_funcs.c
These have been fat-fingered in 0e80f3f88dea and 302879bd68d1.  The
error message for ndistinct had an incorrect grammar, while the one for
dependencies had finished with a period (incorrect based on the project
guidelines).

Discussion: https://postgr.es/m/aXrsjZQbVuB6236u@paquier.xyz
2026-01-29 14:57:47 +09:00
fc365e4fcc Add test doing some cloning of extended statistics data
The test added in this commit copies the data of an ANALYZE run on one
relation to a secondary relation with the same attribute definitions and
extended statistics objects.  Once the clone is done, the target and
origin should have the same extended statistics information, with no
differences.

This test would have been able to catch e3094679b983, for example, as we
expect the full range of statistics to be copied over, with no
differences generated between the results of an ANALYZE and the data
copied to the cloned relation.

Note that this new test should remain at the bottom of stats_import.sql,
so as any additions in the main relation and its clone are automatically
covered when copying their statistics, so as it would work as a sanity
check in the future.

Author: Corey Huinker <corey.huinker@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CADkLM=dpz3KFnqP-dgJ-zvRvtjsa8UZv8wDAQdqho=qN3kX0Zg@mail.gmail.com
2026-01-29 13:22:07 +09:00
0b7beec42a Add test for pg_restore_extended_stats() with multiranges
The restore of extended statistics has some paths dedicated to
multirange types and expressions for all the stats kinds supported, and
we did not have coverage for the case where an extended stats object
uses a multirange attribute with or without an expression.

Extracted from a larger patch by the same author, with a couple of
tweaks from me regarding the format of the output generated, to make it
more readable to the eye.

Author: Corey Huinker <corey.huinker@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CADkLM=dpz3KFnqP-dgJ-zvRvtjsa8UZv8wDAQdqho=qN3kX0Zg@mail.gmail.com
2026-01-29 12:38:10 +09:00
3a98f989e8 Fix CI failure introduced in commit 851f6649cc.
The test added in commit 851f6649cc uses a backup taken from a node
created by the previous test to perform standby related checks. On
Windows, however, the standby failed to start with the following error:
FATAL:  could not rename file "backup_label" to "backup_label.old": Permission denied

This occurred because some background sessions from the earlier test were
still active. These leftover processes continued accessing the parent
directory of the backup_label file, likely preventing the rename and
causing the failure. Ensuring that these sessions are cleanly terminated
resolves the issue in local testing.

Additionally, the has_restoring => 1 option has been removed, as it was
not required by the new test.

Reported-by: Robert Haas <robertmhaas@gmail.com>
Backpatch-through: 17
Discussion: https://postgr.es/m/CA+TgmobdVhO0ckZfsBZ0wqDO4qHVCwZZx8sf=EinafvUam-dsQ@mail.gmail.com
2026-01-29 03:22:02 +00:00
efbebb4e85 Add support for "mcv" in pg_restore_extended_stats()
This commit adds support for the restore of extended statistics of the
kind "mcv", aka most-common values.

This format is different from n_distinct and dependencies stat types in
that it is the combination of three of the four different arrays from the
pg_stats_ext view which in turn require three different input parameters
on pg_restore_extended_statistics().  These are translated into three
input arguments for the function:
- "most_common_vals", acting as a leader of the others.  It is a
2-dimension array, that includes the common values.
- "most_common_freqs", 1-dimension array of float8[], with a number of
elements that has to match with "most_common_vals".
- "most_common_base_freqs", 1-dimension array of float8[], with a number
of elements that has to match with "most_common_vals".

All three arrays are required to achieve the restore of this type of
extended statistics (if "most_common_vals" happens to be NULL in the
catalogs, the rest is NULL by design).

Note that "most_common_val_nulls" is not required in input, its data is
rebuilt from the decomposition of the "most_common_vals" array based on
its text[] representation.  The initial versions of the patch provided
this option in input, but we do not require it and it simplifies a lot
the result.

Support in pg_dump is added down to v13 which is where the support for
this type of extended statistics has been added, when --statistics is
used.  This means that upgrade and dumps can restore extended statistics
data transparently, like "dependencies", "ndistinct", attribute and
relation statistics.  For MCV, the values are directly queried from the
relevant catalogs.

Author: Corey Huinker <corey.huinker@gmail.com>
Co-authored-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CADkLM=dpz3KFnqP-dgJ-zvRvtjsa8UZv8wDAQdqho=qN3kX0Zg@mail.gmail.com
2026-01-29 12:14:08 +09:00
e09e5ad69a Fix whitespace issue in regression test stats_import
Issue noticed while playing this area of the tests for a different
patch.
2026-01-29 11:59:43 +09:00
8f1e2dfe03 Consolidate replication origin session globals into a single struct.
This commit moves the separate global variables for replication origin
state into a single ReplOriginXactState struct. This groups logically
related variables, which improves code readability and simplifies
state management (e.g., resetting the state) by handling them as a
unit.

Author: Chao Li <lic@highgo.com>
Suggested-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Discussion: https://postgr.es/m/CAEoWx2=pYvfRthXHTzSrOsf5_FfyY4zJyK4zV2v4W=yjUij1cA@mail.gmail.com
2026-01-28 12:26:22 -08:00
227eb4eea2 Refactor replication origin state reset helpers.
Factor out common logic for clearing replorigin_session_* variables
into a dedicated helper function, replorigin_xact_clear().

This removes duplicated assignments of these variables across multiple
call sites, and makes the intended scope of each reset explicit.

Author: Chao Li <lic@highgo.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Discussion: https://postgr.es/m/CAEoWx2=pYvfRthXHTzSrOsf5_FfyY4zJyK4zV2v4W=yjUij1cA@mail.gmail.com
2026-01-28 11:45:26 -08:00
1fdbca159e Standardize replication origin naming to use "ReplOrigin".
The replication origin code was using inconsistent naming
conventions. Functions were typically prefixed with 'replorigin',
while typedefs and constants used "RepOrigin".

This commit unifies the naming convention by renaming RepOriginId to
ReplOriginId.

Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/CAD21AoBDgm3hDqUZ+nqu=ViHmkCnJBuJyaxG_yvv27BAi2zBmQ@mail.gmail.com
2026-01-28 11:03:29 -08:00
4020b370f2 Allow for plugin control over path generation strategies.
Each RelOptInfo now has a pgs_mask member which is a mask of acceptable
strategies. For most rels, this is populated from PlannerGlobal's
default_pgs_mask, which is computed from the values of the enable_*
GUCs at the start of planning.

For baserels, get_relation_info_hook can be used to adjust pgs_mask for
each new RelOptInfo, at least for rels of type RTE_RELATION. Adjusting
pgs_mask is less useful for other types of rels, but if it proves to
be necessary, we can revisit the way this hook works or add a new one.

For joinrels, two new hooks are added. joinrel_setup_hook is called each
time a joinrel is created, and one thing that can be done from that hook
is to manipulate pgs_mask for the new joinrel. join_path_setup_hook is
called each time we're about to add paths to a joinrel by considering
some particular combination of an outer rel, an inner rel, and a join
type. It can modify the pgs_mask propagated into JoinPathExtraData to
restrict strategy choice for that particular combination of rels.

To make joinrel_setup_hook work as intended, the existing calls to
build_joinrel_partition_info are moved later in the calling functions;
this is because that function checks whether the rel's pgs_mask includes
PGS_CONSIDER_PARTITIONWISE, so we want it to only be called after
plugins have had a chance to alter pgs_mask.

Upper rels currently inherit pgs_mask from the input relation. It's
unclear that this is the most useful behavior, but at the moment there
are no hooks to allow the mask to be set in any other way.

Reviewed-by: Lukas Fittl <lukas@fittl.com>
Reviewed-by: Jakub Wartak <jakub.wartak@enterprisedb.com>
Reviewed-by: Greg Burd <greg@burd.me>
Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com>
Reviewed-by: Amit Langote <amitlangote09@gmail.com>
Reviewed-by: Haibo Yan <tristan.yim@gmail.com>
Discussion: http://postgr.es/m/CA+TgmoZ-Jh1T6QyWoCODMVQdhTUPYkaZjWztzP1En4=ZHoKPzw@mail.gmail.com
2026-01-28 11:28:34 -05:00
e6d6e32f42 Fix duplicate arbiter detection during REINDEX CONCURRENTLY on partitions
Commit 90eae926a fixed ON CONFLICT handling during REINDEX CONCURRENTLY
on partitioned tables by treating unparented indexes as potential
arbiters.  However, there's a remaining race condition: when pg_inherits
records are swapped between consecutive calls to get_partition_ancestors(),
two different child indexes can appear to have the same parent, causing
duplicate entries in the arbiter list and triggering "invalid arbiter
index list" errors.

Note that this is not a new problem introduced by 90eae926a.  The same
error could occur before that commit in a slightly different scenario:
an index is selected during planning, then index_concurrently_swap()
commits, and a subsequent call to get_partition_ancestors() uses a new
catalog snapshot that sees zero ancestors for that index.

Fix by tracking which parent indexes have already been processed.  If a
subsequent call to get_partition_ancestors() returns a parent we've
already seen, treat that index as unparented instead, allowing it to be
matched via IsIndexCompatibleAsArbiter() like other concurrent reindex
scenarios.

Author: Mihail Nikalayeu <mihailnikalayeu@gmail.com>
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Discussion: https://postgr.es/m/e5a8c1df-04e5-4343-85ef-5df2a7e3d90c@gmail.com
2026-01-28 14:38:53 +01:00
e3094679b9 Fix pg_restore_extended_stats() with expressions
This commit fixes an issue with the restore of ndistinct and
dependencies statistics, causing the operation to fail when any of these
kinds included expressions.

In extended statistics, expressions use strictly negative attribute
numbers, decremented from -1.  For example, let's imagine an object
defined as follows:
CREATE STATISTICS stats_obj (dependencies) ON lower(name), upper(name)
  FROM tab_obj;

This object would generate dependencies stats using -1 and -2 as
attribute numbers, like that:
[{"attributes": [-1], "dependency": -2, "degree": 1.000000},
 {"attributes": [-2], "dependency": -1, "degree": 1.000000}]

However, pg_restore_extended_stats() forgot to account for the number of
expressions defined in an extended statistics object.  This would cause
the validation step of ndistinct and dependencies data to fail,
preventing a restore of their stats even if the input is valid.

This issue has come up due to an incorrect split of the patch set.  Some
tests are included to cover this behavior.

Author: Corey Huinker <corey.huinker@gmail.com>
Co-authored-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/aXl4bMfSTQUxM_yy@paquier.xyz
2026-01-28 11:48:45 +09:00
f9562b95c6 Add output test for pg_dependencies statistics import
Commit 302879bd68d115 has added the ability to restore extended stats of
the type "dependencies", but it has forgotten the addition of a test to
verify that the value restored was actually set.

This test is the pg_dependencies equivalent of the test added for
pg_ndistinct in 0e80f3f88dea.

Author: Corey Huinker <corey.huinker@gmail.com>
Discussion: https://postgr.es/m/CADkLM=dZr_Ut3jKw94_BisyyDtNZPRJWeOALXVzcJz=ZFTAhvQ@mail.gmail.com
2026-01-28 08:37:46 +09:00
c6a10a89fe oauth: Correct test dependency on oauth_hook_client
The oauth_validator tests missed the lessons of c89525d57 et al, so
certain combinations of command-line build order and `meson test`
options can result in

    Command 'oauth_hook_client' not found in [...] at src/test/perl/PostgreSQL/Test/Utils.pm line 427.

Add the missing dependency on the test executable. This fixes, for
example,

    $ ninja clean && ninja meson-test-prereq && PG_TEST_EXTRA=oauth meson test --no-rebuild

Reported-by: Jonathan Gonzalez V. <jonathan.abdiel@gmail.com>
Author: Jonathan Gonzalez V. <jonathan.abdiel@gmail.com>
Discussion: https://postgr.es/m/6e8f4f7c23faf77c4b6564c4b7dc5d3de64aa491.camel@gmail.com
Discussion: https://postgr.es/m/qh4c5tvkgjef7jikjig56rclbcdrrotngnwpycukd2n3k25zi2%4044hxxvtwmgum
Backpatch-through: 18
2026-01-27 11:56:44 -08:00
9a446d0256 pg_waldump: Remove file-level global WalSegSz.
It's better style to pass the value around to just the places that
need it. This makes it easier to determine whether the value is
always properly initialized before use.

Reviewed-by: Amul Sul <sulamul@gmail.com>
Discussion: http://postgr.es/m/CAAJ_b94+wObPn-z1VECipnSFhjMJ+R2cpTmKVYLjyQuVn+B5QA@mail.gmail.com
2026-01-27 08:33:20 -05:00
851f6649cc Prevent invalidation of newly synced replication slots.
A race condition could cause a newly synced replication slot to become
invalidated between its initial sync and the checkpoint.

When syncing a replication slot to a standby, the slot's initial
restart_lsn is taken from the publisher's remote_restart_lsn. Because slot
sync happens asynchronously, this value can lag behind the standby's
current redo pointer. Without any interlocking between WAL reservation and
checkpoints, a checkpoint may remove WAL required by the newly synced
slot, causing the slot to be invalidated.

To fix this, we acquire ReplicationSlotAllocationLock before reserving WAL
for a newly synced slot, similar to commit 006dd4b2e5. This ensures that
if WAL reservation happens first, the checkpoint process must wait for
slotsync to update the slot's restart_lsn before it computes the minimum
required LSN.

However, unlike in ReplicationSlotReserveWal(), this lock alone cannot
protect a newly synced slot if a checkpoint has already run
CheckPointReplicationSlots() before slotsync updates the slot. In such
cases, the remote restart_lsn may be stale and earlier than the current
redo pointer. To prevent relying on an outdated LSN, we use the oldest
WAL location available if it is greater than the remote restart_lsn.

This ensures that newly synced slots always start with a safe, non-stale
restart_lsn and are not invalidated by concurrent checkpoints.

Author: Zhijie Hou <houzj.fnst@fujitsu.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Vitaly Davydov <v.davydov@postgrespro.ru>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Backpatch-through: 17
Discussion: https://postgr.es/m/TY4PR01MB16907E744589B1AB2EE89A31F94D7A%40TY4PR01MB16907.jpnprd01.prod.outlook.com
2026-01-27 05:06:29 +00:00
c32fb29e97 Include extended statistics data in pg_dump
This commit integrates the new pg_restore_extended_stats() function into
pg_dump, so as the data of extended statistics is detected and included
in dumps when the --statistics switch is specified.  Currently, the same
extended stats kinds as the ones supported by the SQL function can be
dumped: "n_distinct" and "dependencies".

The extended statistics data can be dumped down to PostgreSQL 10, with
the following changes depending on the backend version dealt with:
- In v19 and newer versions, the format of pg_ndistinct and
pg_dependencies has changed, catalogs can be directly queried.
- In v18 and older versions, the format is translated to the new format
supported by the backend.
- In v14 and older versions, inherited extended statistics are not
supported.
- In v11 and older versions, the data for ndistinct and dependencies
was stored in pg_statistic_ext.  These have been moved to pg_stats_ext
in v12.
- Extended Statistics have been introduced in v10, no support is needed
for versions older than that.

The extended statistics data is dumped if it can be found in the
catalogs.  If the catalogs are empty, then no restore of the stats data
is attempted.

Author: Corey Huinker <corey.huinker@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CADkLM=dpz3KFnqP-dgJ-zvRvtjsa8UZv8wDAQdqho=qN3kX0Zg@mail.gmail.com
2026-01-27 13:42:32 +09:00
1ea44d7ddf Remove unnecessary abort() from WalSndShutdown().
WalSndShutdown() previously called abort() after proc_exit(0) to
silence compiler warnings. This is no longer needed, because both
WalSndShutdown() and proc_exit() are declared pg_noreturn,
allowing the compiler to recognize that the function does not return.
Also there are already other functions, such as CheckpointerMain(),
that call proc_exit() without an abort(), and they do not produce warnings.

Therefore this abort() call in WalSndShutdown() is useless and
this commit removes it.

Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://postgr.es/m/CAHGQGwHPX1yoixq+YB5rF4zL90TMmSEa3FpHURtqW3Jc5+=oSA@mail.gmail.com
2026-01-27 11:55:32 +09:00
3fccbd94cb Handle ENOENT status when querying NUMA node
We've assumed that touching the memory is sufficient for a page to be
located on one of the NUMA nodes. But a page may be moved to a swap
after we touch it, due to memory pressure.

We touch the memory before querying the status, but there is no
guarantee it won't be moved to the swap in the meantime. The touching
happens only on the first call, so later calls are more likely to be
affected. And the batching increases the window too.

It's up to the kernel if/when pages get moved to swap. We have to accept
ENOENT (-2) as a valid result, and handle it without failing. This patch
simply treats it as an unknown node, and returns NULL in the two
affected views (pg_shmem_allocations_numa and pg_buffercache_numa).

Hugepages cannot be swapped out, so this affects only regular pages.

Reported by Christoph Berg, investigation and fix by me. Backpatch to
18, where the two views were introduced.

Reported-by: Christoph Berg <myon@debian.org>
Discussion: 18
Backpatch-through: https://postgr.es/m/aTq5Gt_n-oS_QSpL@msg.df7cb.de
2026-01-27 00:21:40 +01:00
302879bd68 Add support for "dependencies" in pg_restore_extended_stats()
This commit adds support for the restore of extended statistics of the
kind "dependencies", for the following input data:
[{"attributes": [2], "dependency": 3, "degree": 1.000000},
 {"attributes": [3], "dependency": 2, "degree": 1.000000}]

This relies on the existing routines of "dependencies" to cross-check
the input data with the definition of the extended statistics objects
for the attribute numbers.  An input argument of type "pg_dependencies"
is required for this new option.

Thanks to the work done in 0e80f3f88dea for the restore function and
e1405aa5e3ac for the input handling of data type pg_dependencies, this
addition is straight-forward.  This will be used so as it is possible to
transfer these statistics across dumps and upgrades, removing the need
for a post-operation ANALYZE for these kinds of statistics.

Author: Corey Huinker <corey.huinker@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CADkLM=dpz3KFnqP-dgJ-zvRvtjsa8UZv8wDAQdqho=qN3kX0Zg@mail.gmail.com
2026-01-27 08:20:13 +09:00
19af794b66 Refactor lazy_scan_prune() VM clear logic into helper
Encapsulating the cases that clear the visibility map after vacuum phase
I, when corruption is detected, into in a helper makes the code cleaner
and enables further refactoring in future commits.

Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru>
Discussion: https://postgr.es/m/7ib3sa55sapwjlaz4sijbiq7iezna27kjvvvar4dpgkmadml6t%40gfpkkwmdnepx
2026-01-26 17:12:05 -05:00
648a7e28d7 Eliminate use of cached VM value in lazy_scan_prune()
lazy_scan_prune() takes a parameter from lazy_scan_heap() indicating
whether the page was marked all-visible in the VM at the time it was
last checked in find_next_unskippable_block(). This behavior is
historical, dating back to commit 608195a3a365, when we did not pin the
VM page until deciding we must read it. Now that the VM page is already
pinned, there is no meaningful benefit to relying on a cached VM status.

Removing this cached value simplifies the logic in both lazy_scan_heap()
and lazy_scan_prune(). It also clarifies future work that will set the
visibility map on-access: such paths will not have a cached value
available, which would make the logic harder to reason about. And
eliminating it enables us to detect and repair VM corruption on-access.

Along with removing the cached value and unconditionally checking the
visibility status of the heap page, this commit also moves the VM
corruption handling to occur first. This reordering should have no
performance impact, since the checks are inexpensive and performed only
once per page. It does, however, make the control flow easier to
understand. The new restructuring also makes it possible to set the VM
after fixing corruption (if pruning found the page all-visible).

Now that no callers of visibilitymap_set() use its return value, change
its (and visibilitymap_set_vmbits()) return type to void.

Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru>
Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com>
Discussion: https://postgr.es/m/5CEAA162-67B1-44DA-B60D-8B65717E8B05%40gmail.com
2026-01-26 17:00:13 -05:00
21796c267d Combine visibilitymap_set() cases in lazy_scan_prune()
lazy_scan_prune() previously had two separate cases that called
visibilitymap_set() after pruning and freezing. These branches were
nearly identical except that one attempted to avoid dirtying the heap
buffer. However, that situation can never occur — the heap buffer cannot
be clean at that point (and we would hit an assertion if it were).

In lazy_scan_prune(), when we change a previously all-visible page to
all-frozen and the page was recorded as all-visible in the visibility
map by find_next_unskippable_block(), the heap buffer will always be
dirty. Either we have just frozen a tuple and already dirtied the
buffer, or the buffer was modified between find_next_unskippable_block()
and heap_page_prune_and_freeze() and then pruned in
heap_page_prune_and_freeze().

Additionally, XLogRegisterBuffer() asserts that the buffer is dirty, so
attempting to add a clean heap buffer to the WAL chain would assert out
anyway.

Since the “clean heap buffer with already set VM” case is impossible,
the two visibilitymap_set() branches in lazy_scan_prune() can be merged.
Doing so makes the intent clearer and emphasizes that the heap buffer
must always be marked dirty before being added to the WAL chain.

This commit also adds a test case for vacuuming when no heap
modifications are required. Currently this ensures that the heap buffer
is marked dirty before it is added to the WAL chain, but if we later
remove the heap buffer from the VM-set WAL chain or pass it with the
REGBUF_NO_CHANGES flag, this test would guard that behavior.

Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Srinath Reddy Sadipiralla <srinath2133@gmail.com>
Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>
Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com>
Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru>
Discussion: https://postgr.es/m/5CEAA162-67B1-44DA-B60D-8B65717E8B05%40gmail.com
Discussion: https://postgr.es/m/flat/CAAKRu_ZWx5gCbeCf7PWCv8p5%3D%3Db7EEws0VD2wksDxpXCvCyHvQ%40mail.gmail.com
2026-01-26 16:03:32 -05:00
f6e5d21bf7 Exercise parallel GIN builds in regression tests
Modify two places creating GIN indexes in regression tests, so that the
build is parallel. This provides a basic test coverage, even if the
amounts of data are fairly small.

Reported-by: Kirill Reshke <reshkekirill@gmail.com>
Backpatch-through: 18
Discussion: https://postgr.es/m/CALdSSPjUprTj+vYp1tRKWkcLYzdy=N=O4Cn4y_HoxNSqQwBttg@mail.gmail.com
2026-01-26 20:05:17 +01:00
db14dcdec6 Lookup the correct ordering for parallel GIN builds
When building a tuplesort during parallel GIN builds, the function
incorrectly looked up the default B-Tree operator, not the function
associated with the GIN opclass (through GIN_COMPARE_PROC).

Fixed by using the same logic as initGinState(), and the other place
in parallel GIN builds.

This could cause two types of issues. First, a data type might not have
a B-Tree opclass, in which case the PrepareSortSupportFromOrderingOp()
fails with an ERROR. Second, a data type might have both B-Tree and GIN
opclasses, defining order/equality in different ways. This could lead to
logical corruption in the index.

Backpatch to 18, where parallel GIN builds were introduced.

Discussion: https://postgr.es/m/73a28b94-43d5-4f77-b26e-0d642f6de777@iki.fi
Reported-by: Heikki Linnakangas <hlinnaka@iki.fi>
Backpatch-through: 18
2026-01-26 20:05:17 +01:00
4cbaf4dcd2 Reduce length of TAP test file name.
Buildfarm member fairywren hit the Windows limitation on the length of a
file path. While there may be other things we should also do to prevent
this from happening, it's certainly the case that the length of this
test file name is much longer than others in the same directory, so make
it shorter.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: http://postgr.es/m/274e0a1a-d7d2-4bc8-8b56-dd09f285715e@gmail.com
Backpatch-through: 17
2026-01-26 12:43:52 -05:00
5ca5f12c2c Fix accidentally cast away qualifiers
This fixes cases where a qualifier (const, in all cases here) was
dropped by a cast, but the cast was otherwise necessary or desirable,
so the straightforward fix is to add the qualifier into the cast.

Co-authored-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/b04f4d3a-5e70-4e73-9ef2-87f777ca4aac%40eisentraut.org
2026-01-26 16:02:31 +01:00
7027dd499d Remove deduplication logic from find_window_functions
This code thought it was optimizing WindowAgg evaluation by getting rid
of duplicate WindowFuncs, but it turns out all it does today is lead to
cost-underestimations and makes it possible that optimize_window_clauses
could miss some of the WindowFuncs that must receive an updated winref.

The deduplication likely was useful when it was first added, but since
the projection code was changed in b8d7f053c, the list of WindowFuncs
gathered by find_window_functions isn't used during execution.  Instead,
the expression evaluation code will process the node's targetlist to find
the WindowFuncs.

The reason the deduplication could cause issues for
optimize_window_clauses() is because if a WindowFunc is moved to another
WindowClause, the winref is adjusted to reference the new WindowClause.
If any duplicate WindowFuncs were discarded in find_window_functions()
then the WindowFuncLists may not include all the WindowFuncs that need
their winref adjusted.  This could lead to an error message such as:

ERROR:  WindowFunc with winref 2 assigned to WindowAgg with winref 1

The back-branches will receive a different fix so that the WindowAgg costs
are not affected.

Author: Meng Zhang <mza117jc@gmail.com>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/CAErYLFAuxmW0UVdgrz7iiuNrxGQnFK_OP9hBD5CUzRgjrVrz=Q@mail.gmail.com
2026-01-26 23:27:15 +13:00
6ceef9408c Disable extended alignment uses on older g++
Fix for commit a9bdb63bba8.  The previous plan of redefining alignas
didn't work, because it interfered with other C++ header files (e.g.,
LLVM).  So now the new workaround is to just disable the affected
typedefs under the affected compilers.  These are not typically used
in extensions anyway.

Discussion: https://www.postgresql.org/message-id/3119480.1769189606%40sss.pgh.pa.us
2026-01-26 10:23:14 +01:00
d9abd9e105 Add test for MAINTAIN permission with pg_restore_extended_stats()
Like its cousin functions for the restore of relation and attribute
stats, pg_restore_extended_stats() needs to be run by a user that is the
database owner or has MAINTAIN privileges on the table whose stats are
restored.  This commit adds a regression test ensuring that MAINTAIN is
required when calling the function.  This test also checks that a
ShareUpdateExclusive lock is taken on the table whose stats are
restored.

This has been split from the commit that has introduced
pg_restore_extended_stats(), for clarity.

Author: Corey Huinker <corey.huinker@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CADkLM=dpz3KFnqP-dgJ-zvRvtjsa8UZv8wDAQdqho=qN3kX0Zg@mail.gmail.com
2026-01-26 16:32:33 +09:00
114e84c532 Fix missing initialization in pg_restore_extended_stats()
The tuple data upserted into pg_statistic_ext_data was missing an
initialization for the nulls flag of stxoid and stxdinherit.  This would
cause an incorrect handling of the stats data restored.

This issue has been spotted by CatalogTupleCheckConstraints(),
translating to a NOT NULL constraint inconsistency, while playing more
with the follow-up portions of the patch set.

Oversight in 0e80f3f88dea (mea culpa).  Surprisingly, the buildfarm did
not complain yet.

Discussion: https://postgr.es/m/CADkLM=c7DY3Jv6ef0n_MGUJ1FyTMUoT697LbkST05nraVGNHYg@mail.gmail.com
2026-01-26 16:13:41 +09:00
0e80f3f88d Add pg_restore_extended_stats()
This function closely mirror its relation and attribute counterparts,
but for extended statistics (i.e. CREATE STATISTICS) objects, being
able to restore extended statistics for an extended stats object.  Like
the other functions, the goal of this feature is to ease the dump or
upgrade of clusters so as ANALYZE would not be required anymore after
these operations, stats being directly loaded into the target cluster
without any post-dump/upgrade computation.

The caller of this function needs the following arguments for the
extended stats to restore:
- The name of the relation.
- The schema name of the relation.
- The name of the extended stats object.
- The schema name of the extended stats object.
- If the stats are inherited or not.
- One or more extended stats kind with its data.

This commit adds only support for the restore of the extended statistics
kind "n_distinct", building the basic infrastructure for the restore
of more extended statistics kinds in follow-up commits, including MVC
and dependencies.

The support for "n_distinct" is eased in this commit thanks to the
previous work done particularly in commits 1f927cce4498 and
44eba8f06e55, that have added the input function for the type
pg_ndistinct, used as data type in input of this new restore function.

Bump catalog version.

Author: Corey Huinker <corey.huinker@gmail.com>
Co-authored-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/CADkLM=dpz3KFnqP-dgJ-zvRvtjsa8UZv8wDAQdqho=qN3kX0Zg@mail.gmail.com
2026-01-26 15:08:15 +09:00
d4504d6f60 Add test with multirange type for pg_restore_attribute_stats()
This commit adds a test for pg_restore_attribute_stats() with the
injection of statistics related to a multirange type.  This case is
supported in statatt_get_type() since its introduction in ce207d2a7901,
but there was no test in the main regression test suite to check for the
case where attribute stats is restored for a multirange type, as done by
multirange_typanalyze().

Author: Corey Huinker <corey.huinker@gmail.com>
Discussion: https://postgr.es/m/CADkLM=c3JivzHNXLt-X_JicYknRYwLTiOCHOPiKagm2_vdrFUg@mail.gmail.com
2026-01-26 13:32:17 +09:00
c100340729 Remove PG_MMAP_FLAGS from mem.h
Based on name of the macro, it was implied that it could be used for all
mmap() calls on portability grounds.  However, its use is limited to
sysv_shmem.c, for CreateAnonymousSegment().  This commit removes the
declaration, reducing the confusion around it as a portability tweak,
being limited to SysV-style shared memory.

This macro has been introduced in b0fc0df9364d for sysv_shmem.c,
originally.  It has been moved to mem.h in 0ac5e5a7e152 a bit later.

Suggested by: Peter Eisentraut <peter@eisentraut.org>
Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Discussion: https://postgr.es/m/CAExHW5vTWABxuM5fbQcFkGuTLwaxuZDEE2vtx2WuMUWk6JnF4g@mail.gmail.com
Discussion: https://postgr.es/m/12add41a-7625-4639-a394-a5563e349322@eisentraut.org
2026-01-26 10:52:02 +09:00
83a53572a6 Always inline SeqNext and SeqRecheck
The intention of the work done in fb9f95502 was that these functions are
inlined.  I noticed my compiler isn't doing this on -O2 (gcc version
15.2.0).  Also, clang version 20.1.8 isn't inlining either.  Fix by
marking both of these functions as pg_attribute_always_inline to avoid
leaving this up to the compiler's heuristics.

A quick test with a Seq Scan on a table with a single int column running
a query that filters all 1 million rows in the WHERE clause yields a
3.9% speedup on my Zen4 machine.

Author: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/CAApHDvrL7Q41B=gv+3wc8+AJGKZugGegUbBo8FPQ+3+NGTPb+w@mail.gmail.com
2026-01-26 14:29:10 +13:00
168765e5d4 Add more tests with clause STORAGE on table and TOAST interactions
This commit adds more tests to cover STORAGE MAIN and EXTENDED, checking
how these use TOAST tables.  EXTENDED is already widely tested as the
default behavior, but there were no tests where the clause pattern is
directly specified.  STORAGE MAIN and its interactions with TOAST was
not covered at all.

This hole in the tests has been noticed for STORAGE MAIN (inline
compressible varlenas), where I have managed to break the backend
without the tests able to notice the breakage while playing with the
varlena structures.

Reviewed-by: Nikhil Kumar Veldanda <veldanda.nikhilkumar17@gmail.com>
Discussion: https://postgr.es/m/aXMdX1UTHnzYPkHk@paquier.xyz
2026-01-26 09:30:22 +09:00
a9bdb63bba Work around buggy alignas in older g++
Older g++ (<9.3) mishandle the alignas specifier (raise warnings that
the alignment is too large), but the more or less equivalent attribute
works.  So as a workaround, #define alignas to that attribute for
those versions.

see <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89357>

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/3119480.1769189606%40sss.pgh.pa.us
2026-01-25 11:32:47 +01:00
b4307ae2e5 Fix trigger transition table capture for MERGE in CTE queries.
When executing a data-modifying CTE query containing MERGE and some
other DML operation on a table with statement-level AFTER triggers,
the transition tables passed to the triggers would fail to include the
rows affected by the MERGE.

The reason is that, when initializing a ModifyTable node for MERGE,
MakeTransitionCaptureState() would create a TransitionCaptureState
structure with a single "tcs_private" field pointing to an
AfterTriggersTableData structure with cmdType == CMD_MERGE. Tuples
captured there would then not be included in the sets of tuples
captured when executing INSERT/UPDATE/DELETE ModifyTable nodes in the
same query.

Since there are no MERGE triggers, we should only create
AfterTriggersTableData structures for INSERT/UPDATE/DELETE. Individual
MERGE actions should then use those, thereby sharing the same capture
tuplestores as any other DML commands executed in the same query.

This requires changing the TransitionCaptureState structure, replacing
"tcs_private" with 3 separate pointers to AfterTriggersTableData
structures, one for each of INSERT, UPDATE, and DELETE. Nominally,
this is an ABI break to a public structure in commands/trigger.h.
However, since this is a private field pointing to an opaque data
structure, the only way to create a valid TransitionCaptureState is by
calling MakeTransitionCaptureState(), and no extensions appear to be
doing that anyway, so it seems safe for back-patching.

Backpatch to v15, where MERGE was introduced.

Bug: #19380
Reported-by: Daniel Woelfel <dwwoelfel@gmail.com>
Author: Dean Rasheed <dean.a.rasheed@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/19380-4e293be2b4007248%40postgresql.org
Backpatch-through: 15
2026-01-24 11:30:48 +00:00
9b9eaf08ab libpq_pipeline: Test the default protocol version
In preparation for a future change to libpq's default protocol version,
pin today's default (3.0) in the libpq_pipeline tests.

Patch by Jelte Fennema-Nio, with some additional refactoring of the
PQconnectdbParams() logic by me.

Author: Jelte Fennema-Nio <postgres@jeltef.nl>
Discussion: https://postgr.es/m/DDPR5BPWH1RJ.1LWAK6QAURVAY%40jeltef.nl
2026-01-23 12:59:03 -08:00
f7521bf721 pqcomm.h: Explicitly reserve protocol v3.1
Document this unused version alongside the other special protocol
numbers.

Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAOYmi%2BkKyw%3Dh-5NKqqpc7HC5M30_QmzFx3kgq2AdipyNj47nUw%40mail.gmail.com
2026-01-23 12:57:15 -08:00