Commit Graph

11923 Commits

Author SHA1 Message Date
ea5ff5833c Be clearer about when jsonapi's need_escapes is needed
Most operations beyond pure json parsing need to set need_escapes to
true to get access to field names and string scalars. Document this
fact more explicitly.

Slightly tweaked patch from:

Author: Corey Huinker <corey.huinker@gmail.com>

Discussion: https://postgr.es/m/CADkLM=c49Vkfg2+A8ubSuEtaGEjuaKZXCA6SrXA8kdwHjx3uxQ@mail.gmail.com
2025-01-19 09:09:58 -05:00
d3d0983169 Support PG_UNICODE_FAST locale in the builtin collation provider.
The PG_UNICODE_FAST locale uses code point sort order (fast,
memcmp-based) combined with Unicode character semantics. The character
semantics are based on Unicode full case mapping.

Full case mapping can map a single codepoint to multiple codepoints,
such as "ß" uppercasing to "SS". Additionally, it handles
context-sensitive mappings like the "final sigma", and it uses
titlecase mappings such as "Dž" when titlecasing (rather than plain
uppercase mappings).

Importantly, the uppercasing of "ß" as "SS" is specifically mentioned
by the SQL standard. In Postgres, UCS_BASIC uses plain ASCII semantics
for case mapping and pattern matching, so if we changed it to use the
PG_UNICODE_FAST locale, it would offer better compliance with the
standard. For now, though, do not change the behavior of UCS_BASIC.

Discussion: https://postgr.es/m/ddfd67928818f138f51635712529bc5e1d25e4e7.camel@j-davis.com
Discussion: https://postgr.es/m/27bb0e52-801d-4f73-a0a4-02cfdd4a9ada@eisentraut.org
Reviewed-by: Peter Eisentraut, Daniel Verite
2025-01-17 15:56:30 -08:00
286a365b9c Support Unicode full case mapping and conversion.
Generate tables from Unicode SpecialCasing.txt to support more
sophisticated case mapping behavior:

 * support case mappings to multiple codepoints, such as "ß"
   uppercasing to "SS"
 * support conditional case mappings, such as the "final sigma"
 * support titlecase variants, such as "dž" uppercasing to "DŽ" but
   titlecasing to "Dž"

Discussion: https://postgr.es/m/ddfd67928818f138f51635712529bc5e1d25e4e7.camel@j-davis.com
Discussion: https://postgr.es/m/27bb0e52-801d-4f73-a0a4-02cfdd4a9ada@eisentraut.org
Reviewed-by: Peter Eisentraut, Daniel Verite
2025-01-17 15:56:20 -08:00
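The difference between a one-to-one case mapping and the full mapping described in 286a365b9c can be sketched with a tiny hand-written table. This is only an illustration: the type `ToyCaseEntry`, the table, and `toy_map_case()` are hypothetical names, not the generated tables or functions from the commit, and conditional mappings such as the final sigma are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_CASE_EXPANSION 3

/* One hand-written row; the real tables are generated from Unicode data. */
typedef struct ToyCaseEntry
{
    uint32_t    codepoint;
    uint32_t    upper[TOY_MAX_CASE_EXPANSION];  /* zero-terminated */
    uint32_t    title[TOY_MAX_CASE_EXPANSION];  /* zero-terminated */
} ToyCaseEntry;

static const ToyCaseEntry toy_case_table[] = {
    /* U+00DF "ß": uppercases to "SS", titlecases to "Ss" */
    {0x00DF, {0x0053, 0x0053, 0}, {0x0053, 0x0073, 0}},
    /* U+01C6 "dž": uppercases to U+01C4 "DŽ", titlecases to U+01C5 "Dž" */
    {0x01C6, {0x01C4, 0, 0}, {0x01C5, 0, 0}},
};

/*
 * Write the mapping of one code point into out[] and return how many
 * code points were produced (possibly more than one).
 */
static size_t
toy_map_case(uint32_t cp, int titlecase, uint32_t *out)
{
    size_t      nentries = sizeof(toy_case_table) / sizeof(toy_case_table[0]);
    size_t      n = 0;

    assert(out != NULL);
    for (size_t i = 0; i < nentries; i++)
    {
        if (toy_case_table[i].codepoint == cp)
        {
            const uint32_t *src = titlecase ? toy_case_table[i].title
                                            : toy_case_table[i].upper;

            while (n < TOY_MAX_CASE_EXPANSION && src[n] != 0)
            {
                out[n] = src[n];
                n++;
            }
            return n;
        }
    }

    /* Fallback for this sketch: ASCII letters only, all else unchanged. */
    out[0] = (cp >= 'a' && cp <= 'z') ? cp - 32 : cp;
    return 1;
}
```

The caller has to provide room for more output code points than input code points, which is the main practical consequence of full case mapping.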
b0eff10988 Add pg_nodiscard decorations to base64 functions
The result of pg_b64_encode() and pg_b64_decode() should be checked
for errors.  This attribute could detect mistakes such as those fixed
in commit ff030ebe250 and d278541be42.

Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Ranier Vilela <ranier.vf@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAEudQAq-3yHsSdWoOOaw%2BgAQYgPMpMGuB5pt2yCXgv-YuxG2Hg%40mail.gmail.com
2025-01-17 08:21:32 +01:00
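The kind of attribute added in b0eff10988 can be sketched as follows. The macro `toy_nodiscard` and the function `toy_hex_encode()` are hypothetical stand-ins (a hex encoder is used instead of base64 to keep the example short); the sketch assumes a GCC- or Clang-compatible compiler for the warning to be emitted.

```c
#include <assert.h>

/* Hypothetical stand-in for a "result must be used" decoration. */
#if defined(__GNUC__) || defined(__clang__)
#define toy_nodiscard __attribute__((warn_unused_result))
#else
#define toy_nodiscard
#endif

/*
 * Hex-encode len bytes of src into dst.  Returns the number of bytes
 * written, or -1 if dst is too small.  Ignoring the result would hide
 * the error case, which is what the attribute is meant to catch.
 */
static toy_nodiscard int
toy_hex_encode(const unsigned char *src, int len, char *dst, int dstlen)
{
    static const char hexdigits[] = "0123456789abcdef";

    assert(len >= 0);
    if (dstlen < 2 * len)
        return -1;

    for (int i = 0; i < len; i++)
    {
        dst[2 * i] = hexdigits[src[i] >> 4];
        dst[2 * i + 1] = hexdigits[src[i] & 0x0F];
    }
    return 2 * len;
}
```

With the attribute in place, a bare call such as `toy_hex_encode(src, len, dst, dstlen);` draws a compiler warning, while assigning or testing the result does not.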
0dc9c7d200 Remove redefinitions of SIG_* macros in win32_port.h.
It is not clear why these were originally added.  One hypothesis is
that an ancient version of MinGW didn't define them.  In any case,
they appear to now be superfluous, so let's remove them.  If
nothing else, the buildfarm might offer us clues to their origins.

Reviewed-by: Thomas Munro
Discussion: https://postgr.es/m/Z4chOKfnthRH71mw%40nathan
2025-01-16 20:55:24 -06:00
f7a8fc10cc Add and use BitmapHeapScanDescData struct
Move the several members of HeapScanDescData which are specific to
Bitmap Heap Scans into a new struct, BitmapHeapScanDescData, which
inherits from HeapScanDescData.

This reduces the size of the HeapScanDescData for other types of scans
and will allow us to add additional bitmap heap scan-specific members in
the future without fear of bloating the HeapScanDescData.

Reviewed-by: Tomas Vondra
Discussion: https://postgr.es/m/c736f6aa-8b35-4e20-9621-62c7c82e2168%40vondra.me
2025-01-16 18:42:39 -05:00
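The struct "inheritance" mentioned in f7a8fc10cc is the usual C idiom of embedding the base struct as the first member. A minimal sketch, with hypothetical type and field names rather than the actual PostgreSQL definitions:

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for the generic scan descriptor. */
typedef struct ToyHeapScanDesc
{
    int         nblocks;
    int         cblock;
} ToyHeapScanDesc;

/* Toy stand-in for the bitmap-specific descriptor. */
typedef struct ToyBitmapHeapScanDesc
{
    ToyHeapScanDesc base;       /* must be the first member */
    int         vis_pages;      /* hypothetical bitmap-only state */
    int         prefetch_pages; /* hypothetical bitmap-only state */
} ToyBitmapHeapScanDesc;

static_assert(offsetof(ToyBitmapHeapScanDesc, base) == 0,
              "base descriptor must come first");

/* Generic code only ever sees the base descriptor. */
static int
toy_blocks_remaining(const ToyHeapScanDesc *scan)
{
    return scan->nblocks - scan->cblock;
}

/* Bitmap-specific code downcasts; valid only for scans created as bitmap scans. */
static ToyBitmapHeapScanDesc *
toy_as_bitmap_scan(ToyHeapScanDesc *scan)
{
    return (ToyBitmapHeapScanDesc *) scan;
}
```

Other scan types allocate only the smaller base struct, which is where the size reduction described in the commit message comes from.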
7b6468cc95 Rework macro pgstat_is_ioop_tracked_in_bytes()
As written, it was triggering a compilation warning for old versions of
clang, as reported by buildfarm members ayu, batfish and demoiselle.
Forcing a cast with "unsigned int" should fix the warning.

While at it, the macro is moved to pgstat.h, closer to the declaration
of IOOp, per suggestion from Tom Lane.

Reported-by: Tom Lane
Reviewed-by: Bertrand Drouvot, Tom Lane, Nazir Bilal Yavuz
Discussion: https://postgr.es/m/1272824.1736961543@sss.pgh.pa.us
2025-01-17 08:26:17 +09:00
d4a43b2837 Convert libpgport's pqsignal() to a void function.
The protections added by commit 3b00fdba9f introduced race
conditions to this function that can lead to bogus return values.
Since nobody seems to inspect the return value, this is of little
consequence, but it would still be nice to convert it to a void
function to avoid any possibility of a bogus return value.  I
originally thought that doing so would have required also modifying
legacy-pqsignal.c's version of the function (which would've
required an SONAME bump), but commit 9a45a89c38 gave
legacy-pqsignal.c its own dedicated extern for pqsignal(), thereby
decoupling it enough that libpgport's pqsignal() can be modified.

This commit also adds an assertion for the return value of
sigaction()/signal().  Since a failure most likely indicates a
coding error, and nobody has ever bothered to check pqsignal()'s
return value, it's probably not worth the effort to do anything
fancier.

Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/Z4chOKfnthRH71mw%40nathan
2025-01-16 16:41:05 -06:00
d7674c9fab Seek zone abbreviations in the IANA data before timezone_abbreviations.
If a time zone abbreviation used in datetime input is defined in
the currently active timezone, use that definition in preference
to looking in the timezone_abbreviations list.  That allows us to
correctly handle abbreviations that have different meanings in
different timezones.  Also, it eliminates an inconsistency between
datetime input and datetime output: the non-ISO datestyles for
timestamptz have always printed abbreviations taken from the IANA
data, not from timezone_abbreviations.  Before this fix, it was
possible to demonstrate cases where casting a timestamp to text
and back fails or changes the value significantly because of that
inconsistency.

While this change removes the ability to override the IANA data about
an abbreviation known in the current zone, it's not clear that there's
any real use-case for doing so.  But it is clear that this makes life
a lot easier for dealing with abbreviations that have conflicts across
different time zones.

Also update the pg_timezone_abbrevs view to report abbreviations
that are recognized via the IANA data, and *not* report any
timezone_abbreviations entries that are thereby overridden.
Under the hood, there are now two SRFs, one that pulls the IANA
data and one that pulls timezone_abbreviations entries.  They're
combined by logic in the view.  This approach was useful for
debugging (since the functions can be called on their own).
While I don't intend to document the functions explicitly,
they might be useful to call directly.

Also improve DecodeTimezoneAbbrev's caching logic so that it can
cache zone abbreviations found in the IANA data.  Without that,
this patch would have caused a noticeable degradation of the
runtime of timestamptz_in.

Per report from Aleksander Alekseev and additional investigation.

Discussion: https://postgr.es/m/CAJ7c6TOATjJqvhnYsui0=CO5XFMF4dvTGH+skzB--jNhqSQu5g@mail.gmail.com
2025-01-16 14:11:19 -05:00
80feb727c8 Add OLD/NEW support to RETURNING in DML queries.
This allows the RETURNING list of INSERT/UPDATE/DELETE/MERGE queries
to explicitly return old and new values by using the special aliases
"old" and "new", which are automatically added to the query (if not
already defined) while parsing its RETURNING list, allowing things
like:

  RETURNING old.colname, new.colname, ...

  RETURNING old.*, new.*

Additionally, a new syntax is supported, allowing the names "old" and
"new" to be changed to user-supplied alias names, e.g.:

  RETURNING WITH (OLD AS o, NEW AS n) o.colname, n.colname, ...

This is useful when the names "old" and "new" are already defined,
such as inside trigger functions, allowing backwards compatibility to
be maintained -- the interpretation of any existing queries that
happen to already refer to relations called "old" or "new", or use
those as aliases for other relations, is not changed.

For an INSERT, old values will generally be NULL, and for a DELETE,
new values will generally be NULL, but that may change for an INSERT
with an ON CONFLICT ... DO UPDATE clause, or if a query rewrite rule
changes the command type. Therefore, we put no restrictions on the use
of old and new in any DML queries.

Dean Rasheed, reviewed by Jian He and Jeff Davis.

Discussion: https://postgr.es/m/CAEZATCWx0J0-v=Qjc6gXzR=KtsdvAE7Ow=D=mu50AgOe+pvisQ@mail.gmail.com
2025-01-16 14:57:35 +00:00
d5221c49a3 Fix cpluspluscheck for "Change gist stratnum function to use CompareType"
Commit 630f9a43cec introduced an enum forward declaration, which
doesn't work in C++.  To fix, just include the header file to get the
type.
2025-01-15 23:11:08 +01:00
fecc8021e1 IWYU pragmas for catalog headers
Add "IWYU pragma: export" annotations in each catalog header file so
that, for instance, including "catalog/pg_aggregate.h" is considered
acceptable in place of "catalog/pg_aggregate_d.h".  This is very
common and it seems better to silence IWYU about it than to try to fix
this up.

Discussion: https://www.postgresql.org/message-id/flat/9395d484-eff4-47c2-b276-8e228526c8ae@eisentraut.org
2025-01-15 18:57:53 +01:00
74938d1320 IWYU widely useful pragmas
Add various widely useful "IWYU pragma" annotations, such as

- Common header files such as c.h, postgres.h should be "always_keep".

- System headers included in c.h, postgres.h etc. should be considered
  "export".

- Some portability headers such as getopt_long.h should be
  "always_keep", so they are not considered superfluous on some
  platforms.

- Certain system headers included from portability headers should be
  considered "export" because the purpose of the portability header is
  to wrap them.

- Superfluous includes marked as "for backward compatibility" get a
  formal IWYU annotation.

- Generated header included in utils/syscache.h is marked exported.
  This is a very commonly used include and this avoids lots of
  complaints.

Discussion: https://www.postgresql.org/message-id/flat/9395d484-eff4-47c2-b276-8e228526c8ae@eisentraut.org
2025-01-15 18:57:53 +01:00
761c79508e postgres_fdw: SCRAM authentication pass-through
This enables SCRAM authentication for postgres_fdw when connecting to
a foreign server without having to store a plain-text password on user
mapping options.

This is done by saving the SCRAM ClientKey and ServerKey from the
client authentication and using those instead of the plain-text
password for the server-side SCRAM exchange.  The new foreign-server
or user-mapping option "use_scram_passthrough" enables this.

Co-authored-by: Matheus Alcantara <mths.dev@pm.me>
Co-authored-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://www.postgresql.org/message-id/flat/27b29a35-9b96-46a9-bc1a-914140869dac@gmail.com
2025-01-15 17:58:05 +01:00
630f9a43ce Change gist stratnum function to use CompareType
This changes commit 7406ab623fe in that the gist strategy number
mapping support function is changed to use the CompareType enum as
input, instead of the "well-known" RT*StrategyNumber strategy numbers.

This is a bit cleaner, since you are not dealing with two sets of
strategy numbers.  Also, this will enable us to subsume this system
into a more general system of using CompareType to define operator
semantics across index methods.

Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com
2025-01-15 11:34:04 +01:00
6339f6468e Rename RowCompareType to CompareType
RowCompareType served as a way to describe the fundamental meaning of
an operator, notionally independent of an operator class (although so
far this was only really supported for btrees).  Its original purpose
was for use inside RowCompareExpr, and it has also found some small
use outside, such as for get_op_btree_interpretation().

We want to expand this now, as a more general way to describe operator
semantics for other index access methods, including gist (to improve
GistTranslateStratnum()) and others not written yet.  To avoid future
confusion, we rename the type to CompareType and the symbols from
ROWCOMPARE_XXX to COMPARE_XXX to reflect their more general purpose.

Reviewed-by: Mark Dilger <mark.dilger@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com
2025-01-15 08:44:01 +01:00
9a45a89c38 Avoid symbol collisions between pqsignal.c and legacy-pqsignal.c.
In the name of ABI stability (that is, to avoid a library major
version bump for libpq), libpq still exports a version of pqsignal()
that we no longer want to use ourselves.  However, since that has
the same link name as the function exported by src/port/pqsignal.c,
there is a link ordering dependency determining which version will
actually get used by code that uses libpq as well as libpgport.a.

It now emerges that the wrong version has been used by pgbench and
psql since commit 06843df4a rearranged their link commands.  This
can result in odd failures in pgbench with the -T switch, since its
SIGALRM handler will now not be marked SA_RESTART.  psql may have
some edge-case problems in \watch, too.

Since we don't want to depend on link ordering effects anymore,
let's fix this in the same spirit as b6c7cfac8: use macros to change
the actual link names of the competing functions.  We cannot change
legacy-pqsignal.c's exported name of course, so the victim has to be
src/port/pqsignal.c.

In master, rename its exported name to be pqsignal_fe in frontend or
pqsignal_be in backend.  (We could perhaps have gotten away with using
the same symbol in both cases, but since the FE and BE versions now
work a little differently, it seems advisable to use different names.)

In back branches, rename to pqsignal_fe in frontend but keep it as
pqsignal in backend.  The frontend change could affect third-party
code that is calling pqsignal from libpgport.a or libpgport_shlib.a,
but only if the code is compiled against port.h from a different minor
release than libpgport.  Since we don't support using libpgport as a
shared library, it seems unlikely that there will be such a problem.
I left the backend symbol unchanged to avoid an ABI break for
extensions.  This means that the link ordering hazard still exists
for any extension that links against libpq.  However, none of our own
extensions use both pqsignal() and libpq, and we're not making things
any worse for third-party extensions that do.

Report from Andy Fan, diagnosis by Fujii Masao, patch by me.
Back-patch to all supported branches, as 06843df4a was.

Discussion: https://postgr.es/m/87msfz5qv2.fsf@163.com
2025-01-14 18:50:24 -05:00
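The macro-based renaming described in 9a45a89c38 can be illustrated in a single translation unit. This is a sketch only: `toysignal`, `toysignal_fe`, and the two callers are hypothetical, and the real case involves two separate libraries rather than one file.

```c
/* --- "legacy library": its exported name must stay as it is. --- */
int
toysignal(int signo)
{
    return -signo;              /* marker value to tell the versions apart */
}

static int
call_legacy(int signo)
{
    return toysignal(signo);    /* no macro yet: binds to the legacy symbol */
}

/* --- "port header": rename the competing function at the source level. --- */
#define toysignal toysignal_fe

int
toysignal(int signo)            /* actually defines toysignal_fe() */
{
    return signo;
}

static int
call_port(int signo)
{
    return toysignal(signo);    /* expands to toysignal_fe(signo) */
}
```

Because the two functions now have different link names, which one a caller gets no longer depends on the order of libraries on the link line.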
2ae98ea5ab Synchronize guc_tables.c categories with vacuum docs categories
ca9c6a5680d consolidated most of the vacuum-related GUCs' documentation
into a new subsection. af2317652d5daf8b then enforced this order in
postgresql.conf.sample. This commit reorganizes the GUC groups in
guc_tables.c/h to match the updated ordering in the docs.

Reported-by: Álvaro Herrera
Reviewed-by: Álvaro Herrera, Alena Rybakina
Discussion: https://postgr.es/m/202501132046.m4mcvxxswznu%40alvherre.pgsql
2025-01-14 15:31:00 -05:00
4cb560b53f Consistently spell "leakproof" without a hyphen.
The overwhelming majority of places already did this, but a small
handful of places had a hyphen.

Yugo Nagata.

Discussion: https://postgr.es/m/CAEZATCXnnuORE2BoGwHw2zbtVvsPOLhbfVmEk9GxRzK%2Bx3OW-Q%40mail.gmail.com
2025-01-14 13:50:54 +00:00
af8cd1639a Fix catcache invalidation of a list entry that's being built
If a new catalog tuple is inserted that belongs to a catcache list
entry, and cache invalidation happens while the list entry is being
built, the list entry might miss the newly inserted tuple.

To fix, change the way we detect concurrent invalidations while a
catcache entry is being built. Keep a stack of entries that are being
built, and apply cache invalidation to those entries in addition to
the real catcache entries. This is similar to the in-progress list in
relcache.c.

Back-patch to all supported versions.

Reviewed-by: Noah Misch
Discussion: https://www.postgresql.org/message-id/2234dc98-06fe-42ed-b5db-ac17384dc880@iki.fi
2025-01-14 14:28:49 +02:00
ce9a74707d Bump PGSTAT_FILE_FORMAT_ID
Oversight in f92c854cf406, which changed the definition of
PgStat_BktypeIO, impacting PgStat_IO, which is the on-disk data for IO
pgstats data.
2025-01-14 15:17:22 +09:00
d35ea27e51 Move information about pgstats kinds into its own header pgstat_kind.h
This includes all the definitions for the various PGSTAT_KIND_* values,
the range allowed for custom stats kinds and some macros related to all
that.

One use-case behind this split is the possibility to use this
information for frontend tools, without having to rely on pgstat.h and a
backend footprint.

Author: Michael Paquier
Reviewed-by: Bertrand Drouvot
Discussion: https://postgr.es/m/Z24fyb3ipXKR38oS@paquier.xyz
2025-01-14 12:43:07 +09:00
f92c854cf4 Make pg_stat_io count IOs as bytes instead of blocks for some operations
Currently in pg_stat_io view, IOs are counted as blocks of size
BLCKSZ.  There are two limitations with this design:
* The actual number of I/O requests sent to the kernel is lower because
I/O requests may be merged before being sent.  Additionally, it gives
the impression that all I/Os are done in block size, which shadows the
benefits of merging I/O requests.
* Some patches are in the works to extend pg_stat_io for the tracking of
operations that may not be linked to the block size.  For example, WAL
read IOs are done in variable bytes and it is not possible to correctly
show these IOs in pg_stat_io view, and we want to keep all this data in
a single system view rather than spread it across multiple relations to
ease monitoring.

WaitReadBuffers() can now be tracked as a single read operation
worth N blocks.  Same for ExtendBufferedRelShared() and
ExtendBufferedRelLocal() for extensions.

Three columns are added to pg_stat_io for reads, writes and extensions
for the byte calculations.  op_bytes, which was always hardcoded to
BLCKSZ, is removed.  IO backend statistics are updated to reflect these
changes.

Bump catalog version.

Author: Nazir Bilal Yavuz
Reviewed-by: Bertrand Drouvot, Melanie Plageman
Discussion: https://postgr.es/m/CAN55FZ0oqxBaaHAEsj=xFqkzE3n5P=3RA1V_igXwL-RV7QRzyw@mail.gmail.com
2025-01-14 12:14:29 +09:00
b4a07f532b Revert "TupleHashTable: store additional data along with tuple."
This reverts commit e0ece2a981ee9068f50c4423e303836c2585eb02 due to
performance regressions.

Reported-by: David Rowley
2025-01-13 14:14:33 -08:00
1c854eb893 Add BTOPTIONS_PROC comments to nbtree.h.
Add comments explaining the purpose of B-Tree support function 5 to
nbtree.h for consistency (all other support functions were already
described by nearby comments).

This fixes what was arguably an oversight in commit 911e702077, or in
follow-up doc commit 15cb2bd2 (which documented support function 5 in
btree.sgml, but neglected to add anything to nbtree.h).
2025-01-13 15:02:14 -05:00
597b1ffbf1 Move nbtree preprocessing into new .c file.
Quite a bit of code within nbtutils.c is only called during nbtree
preprocessing.  Move that code into a new .c file, nbtpreprocesskeys.c.
Also reorder some of the functions within the new file for clarity.

This commit has no functional impact.  It is strictly mechanical.

Author: Peter Geoghegan <pg@bowt.ie>
Suggested-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://postgr.es/m/CAH2-WznwNn1BDOpWxHBUK1f3Rdw8pO9UCenWXnvT=n9GO8GnLA@mail.gmail.com
Discussion: https://postgr.es/m/86930045-5df5-494a-b4f1-815bc3fbcce0%40iki.fi
2025-01-13 12:15:00 -05:00
6e826278f1 Fix pgindent damage
Oversight in commit e0ece2a98.
2025-01-13 11:27:32 +09:00
ca87c415e2 Add support for NOT ENFORCED in CHECK constraints
This adds support for the NOT ENFORCED/ENFORCED flag for constraints,
with support for check constraints.

The plan is to eventually support this for foreign key constraints,
where it is typically more useful.

Note that CHECK constraints do not currently support ALTER operations,
so changing the enforceability of an existing constraint isn't
possible without dropping and recreating it.  This could be added
later.

Author: Amul Sul <amul.sul@enterprisedb.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: jian he <jian.universality@gmail.com>
Tested-by: Triveni N <triveni.n@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/CAAJ_b962c5AcYW9KUt_R_ER5qs3fUGbe4az-SP-vuwPS-w-AGA@mail.gmail.com
2025-01-11 10:52:30 +01:00
e0ece2a981 TupleHashTable: store additional data along with tuple.
Previously, the caller needed to allocate the memory and the
TupleHashTable would store a pointer to it. That wastes space for the
palloc overhead as well as the size of the pointer itself.

Now, the TupleHashTable relies on the caller to correctly specify the
additionalsize, and allocates that amount of space. The caller can
then request a pointer into that space.

Discussion: https://postgr.es/m/b9cbf0219a9859dc8d240311643ff4362fd9602c.camel@j-davis.com
Reviewed-by: Heikki Linnakangas
2025-01-10 17:14:37 -08:00
34c6e65242 Make verify_compact_attribute available in non-assert builds
6f3820f37 adjusted the assert-enabled validation of the CompactAttribute
to call a new external function to perform the validation.  That commit
made it so the function was only available when building with
USE_ASSERT_CHECKING, and because TupleDescCompactAttr() is a static
inline function, the call to verify_compact_attribute() was compiled
into any extension which uses TupleDescCompactAttr().  This caused issues
for such extensions when loading the assert-enabled extension into
PostgreSQL versions without asserts enabled due to that function being
unavailable in core.

To fix this, make verify_compact_attribute() available unconditionally,
but make it do nothing unless building with USE_ASSERT_CHECKING.

Author: Andrew Kane <andrew@ankane.org>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAOdR5yHfMEMW00XGo=v1zCVUS6Huq2UehXdvKnwtXPTcZwXhmg@mail.gmail.com
2025-01-11 13:45:54 +13:00
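The fix in 34c6e65242 follows a common pattern: keep the function's symbol in every build and make only its body conditional. A minimal sketch, where `TOY_ASSERT_CHECKING`, `toy_verify_attr()`, and `toy_attr_len()` are hypothetical stand-ins for the real macro and functions:

```c
#include <assert.h>

/* Always declared and always defined, so the symbol exists in every build. */
void        toy_verify_attr(int attlen);

void
toy_verify_attr(int attlen)
{
#ifdef TOY_ASSERT_CHECKING
    assert(attlen != 0);        /* real checks only in assert-enabled builds */
#else
    (void) attlen;              /* otherwise a no-op */
#endif
}

/* Inline accessor as it might appear in a header shared with extensions. */
static inline int
toy_attr_len(const int *attlens, int i)
{
#ifdef TOY_ASSERT_CHECKING
    toy_verify_attr(attlens[i]);    /* this call is compiled into the caller */
#endif
    return attlens[i];
}
```

An extension built with the checking macro defined embeds the call in its own object code, so the core build it is loaded into has to export the function whether or not core itself was built with asserts.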
a9dcbb4d5c Add new StringInfo APIs to allow callers to specify the buffer size.
Previously, StringInfo APIs allocated buffers with a fixed initial
allocation size of 1024 bytes. This may be too large and inappropriate
for some callers that can do with smaller memory buffers. To fix this,
introduce new APIs that allow callers to specify initial buffer size.

extern StringInfo makeStringInfoExt(int initsize);
extern void initStringInfoExt(StringInfo str, int initsize);

Existing APIs (makeStringInfo() and initStringInfo()) are changed to
call makeStringInfoExt and initStringInfoExt respectively (via inline
helper functions makeStringInfoInternal and initStringInfoInternal),
with the default buffer size of 1024.

Reviewed-by: Nathan Bossart, David Rowley, Michael Paquier, Gurjeet Singh
Discussion: https://postgr.es/m/20241225.123704.1194662271286702010.ishii%40postgresql.org
2025-01-11 08:23:46 +09:00
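The layering described in a9dcbb4d5c (a default-size initializer implemented on top of one that takes an explicit size) can be sketched with a toy buffer. `ToyStringInfo` and the `toy_*` functions are hypothetical, use malloc() rather than palloc(), and are not the StringInfo implementation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_DEFAULT_SIZE 1024

typedef struct ToyStringInfo
{
    char       *data;
    int         len;            /* bytes used, excluding the trailing NUL */
    int         maxlen;         /* bytes allocated */
} ToyStringInfo;

/* Initialize with a caller-chosen initial allocation. */
static void
toy_init_ext(ToyStringInfo *str, int initsize)
{
    assert(initsize >= 1);
    str->data = malloc((size_t) initsize);
    assert(str->data != NULL);
    str->maxlen = initsize;
    str->len = 0;
    str->data[0] = '\0';
}

/* The pre-existing entry point keeps its behavior by passing the default. */
static void
toy_init(ToyStringInfo *str)
{
    toy_init_ext(str, TOY_DEFAULT_SIZE);
}

/* Append a C string, doubling the allocation until it fits. */
static void
toy_append(ToyStringInfo *str, const char *s)
{
    int         slen = (int) strlen(s);
    int         needed = str->len + slen + 1;

    if (needed > str->maxlen)
    {
        int         newlen = str->maxlen;

        while (newlen < needed)
            newlen *= 2;
        str->data = realloc(str->data, (size_t) newlen);
        assert(str->data != NULL);
        str->maxlen = newlen;
    }
    memcpy(str->data + str->len, s, (size_t) slen + 1);
    str->len += slen;
}
```

A caller that expects short strings can start small and still grow on demand, while existing callers of the default initializer see no change.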
3d0b4b1068 Use a non-locking initial test in TAS_SPIN on AArch64.
Our testing showed that this is helpful at sufficiently high
contention levels and doesn't hurt performance on smaller machines.
The new TAS_SPIN macro for AArch64 is identical to the ones added
for PPC and x86_64 (see commits bc2a050d40 and b03d196be0).

Reported-by: Salvatore Dipietro
Reviewed-by: Jingtang Zhang, Andres Freund
Tested-by: Tom Lane
Discussion: https://postgr.es/m/ZxgDEb_VpWyNZKB_%40nathan
2025-01-10 13:18:04 -06:00
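The "non-locking initial test" in 3d0b4b1068 is the test-and-test-and-set idea: look at the lock with a plain load before attempting the atomic operation. The sketch below uses C11 atomics and hypothetical names; unlike PostgreSQL's TAS(), these toy functions return true when the lock was acquired.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

typedef atomic_int toy_slock_t; /* 0 = free, 1 = held */

/* Plain test-and-set: always performs the atomic exchange. */
static bool
toy_tas(toy_slock_t *lock)
{
    return atomic_exchange_explicit(lock, 1, memory_order_acquire) == 0;
}

/*
 * Variant for use while spinning: a cheap relaxed load first, and the
 * atomic exchange only if the lock looks free.  Waiters that see the
 * lock held do not perform a write to the contended cache line.
 */
static bool
toy_tas_spin(toy_slock_t *lock)
{
    if (atomic_load_explicit(lock, memory_order_relaxed) != 0)
        return false;
    return toy_tas(lock);
}

static void
toy_unlock(toy_slock_t *lock)
{
    assert(atomic_load(lock) == 1); /* releasing a free lock is a bug */
    atomic_store_explicit(lock, 0, memory_order_release);
}
```

The first acquisition attempt can still use the plain variant; the extra load only pays off once a backend is already spinning under contention.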
cc811f92ba Adjust signature of cluster_rel() and its subroutines
cluster_rel() receives the OID of the relation to process, which it
opens and locks; but then its subroutine copy_table_data() also receives
the relation OID and opens it by itself.  This is a bit wasteful.  It's
better to have cluster_rel() receive the relation already open, and pass
it down to its subroutines as necessary; then cluster_rel closes the rel
before returning.  This simplifies things.

But a better motivation to make this change is that a future command to
do logical-decoding-based "concurrent VACUUM FULL" will need to release
all locks on the relation (and possibly on the clustering index) at some
point.  Since it makes little sense to keep the relation reference
without the lock, the cluster_rel() function will also close it (and
the index).  With this arrangement, neither the function nor its
subroutines need open extra references, which, again, makes things simpler.

Author: Antonin Houska <ah@cybertec.at>
Discussion: https://postgr.es/m/82651.1720540558@antos
2025-01-10 13:09:38 +01:00
f0bf7857be Merge pgstat_count_io_op_n() and pgstat_count_io_op()
The pgstat_count_io_op() function, which counts a single I/O operation,
wraps pgstat_count_io_op_n() with a counter value of 1.  The latter is
declared in pgstat.h and used nowhere in the code, so let's remove it in
favor of the former.

This change also makes the code more symmetric with
pgstat_count_io_op_time(), which already uses a similar set of arguments,
except that it also counts the I/O time.  This will somewhat ease the
integration of a follow-up patch that adds byte-level tracking in
pg_stat_io for some of its attributes, lifting the current restriction
based on BLCKSZ as all I/O operations are assumed to be block-based.

Author: Nazir Bilal Yavuz
Reviewed-by: Bertrand Drouvot
Discussion: https://postgr.es/m/CAN55FZ32ze812=yjyZg1QeXhKvACUM_Nu0_gyPQcUKKuVHL5xA@mail.gmail.com
2025-01-10 09:57:27 +09:00
2c14037bb5 Refactor some code related to backend statistics
This commit changes the way pending backend statistics are tracked by
moving them into a new structure called PgStat_BackendPending, removing
PgStat_BackendPendingIO.  PgStat_BackendPending currently only includes
PgStat_PendingIO for the pending I/O stats.

pgstat_flush_backend() is extended with a "flags" argument to control
which parts of the stats of a backend should be flushed.

With this refactoring, it becomes easier to plug more data into backend
statistics.  A patch to add information related to WAL in this stats kind
is under discussion.

Author: Bertrand Drouvot
Discussion: https://postgr.es/m/Z3zqc4o09dM/Ezyz@ip-10-97-1-34.eu-west-3.compute.internal
2025-01-10 09:00:48 +09:00
69ab446514 Fix SLRU bank selection code
The originally submitted code (using bit masking) was correct when the
number of slots was restricted to be a power of two -- but that
limitation was removed during development that led to commit
53c2a97a9266, which made the bank selection code incorrect.  This led to
always using a smaller number of banks than available.  Change said code
to use integer modulo instead, which works correctly with an arbitrary
number of banks.

It's likely that we could improve on this to avoid runtime use of
integer division.  But with this change we're, at least, not wasting
memory on unused banks, and more banks mean less contention, which is
likely to have a much higher performance impact than a single
instruction's latency.

Author: Yura Sokolov <y.sokolov@postgrespro.ru>
Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru>
Discussion: https://postgr.es/m/9444dc46-ca47-43ed-9058-89c456316306@postgrespro.ru
2025-01-09 07:39:05 +01:00
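Why bit masking fails in 69ab446514 once the bank count is no longer a power of two can be checked with a small experiment. The helper names below are hypothetical and the code is not the SLRU implementation; it only compares the two selection formulas.

```c
#include <assert.h>

/* Bank choice by bit masking: only correct when nbanks is a power of two. */
static int
bank_by_mask(long pageno, int nbanks)
{
    assert(nbanks > 0);
    return (int) (pageno & (nbanks - 1));
}

/* Bank choice by integer modulo: correct for any nbanks > 0. */
static int
bank_by_modulo(long pageno, int nbanks)
{
    assert(nbanks > 0);
    return (int) (pageno % nbanks);
}

/* Count how many distinct banks are hit over pages 0 .. npages-1. */
static int
banks_used(int (*pick) (long, int), int nbanks, long npages)
{
    int         used[64] = {0};
    int         count = 0;

    assert(nbanks <= 64);
    for (long p = 0; p < npages; p++)
    {
        int         b = pick(p, nbanks);

        if (!used[b])
        {
            used[b] = 1;
            count++;
        }
    }
    return count;
}
```

With 6 banks the mask is binary 101, so only banks 0, 1, 4 and 5 are ever selected, whereas the modulo form reaches all 6. With 8 banks the two forms agree.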
026762dae3 Provide 64-bit ftruncate() and lseek() on Windows.
Change our ftruncate() macro to use the 64-bit variant of chsize(), and
add a new macro to redirect lseek() to _lseeki64().

Back-patch to all supported releases, in preparation for a bug fix.

Tested-by: Davinder Singh <davinder.singh@enterprisedb.com>
Discussion: https://postgr.es/m/CAKZiRmyM4YnokK6Oenw5JKwAQ3rhP0YTz2T-tiw5dAQjGRXE3Q%40mail.gmail.com
2025-01-09 15:00:58 +13:00
229e7793d9 Fix duplicate typedef from commit a2f17f004d.
Reported-by: Thomas Munro
2025-01-08 15:25:05 -08:00
a2f17f004d Control collation behavior with a method table.
Previously, behavior branched based on the provider. A method table is
less error-prone and more flexible.

The ctype behavior will be addressed in an upcoming commit.

Reviewed-by: Andreas Karlsson
Discussion: https://postgr.es/m/2830211e1b6e6a2e26d845780b03e125281ea17b.camel%40j-davis.com
2025-01-08 14:26:46 -08:00
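The method-table approach from a2f17f004d replaces branching on the provider with a struct of function pointers chosen once when the locale is set up. A minimal sketch with hypothetical types and two toy "providers" (byte order and ASCII case-insensitive order), not the actual collation structs:

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Table of operations a provider must supply (one entry in this sketch). */
typedef struct ToyCollateMethods
{
    int         (*compare) (const char *a, const char *b);
} ToyCollateMethods;

typedef struct ToyLocale
{
    const char *name;
    const ToyCollateMethods *collate;
} ToyLocale;

/* Provider 1: plain byte order. */
static int
cmp_bytes(const char *a, const char *b)
{
    return strcmp(a, b);
}

/* Provider 2: ASCII case-insensitive order. */
static int
cmp_ascii_nocase(const char *a, const char *b)
{
    for (;; a++, b++)
    {
        int         ca = tolower((unsigned char) *a);
        int         cb = tolower((unsigned char) *b);

        if (ca != cb)
            return ca - cb;
        if (ca == 0)
            return 0;
    }
}

static const ToyCollateMethods bytes_methods = {cmp_bytes};
static const ToyCollateMethods nocase_methods = {cmp_ascii_nocase};

/* Callers dispatch through the table and never test the provider. */
static int
toy_compare(const ToyLocale *loc, const char *a, const char *b)
{
    assert(loc->collate != NULL && loc->collate->compare != NULL);
    return loc->collate->compare(a, b);
}
```

Adding a provider then means supplying another table, without touching the call sites.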
8a96faedc4 Remove unused TupleHashTableData->entrysize.
Discussion: https://postgr.es/m/7530bd8783b1a78d53a3c70383e38d8da0a5ffe5.camel%40j-davis.com
2025-01-07 14:49:18 -08:00
834c9e807c Add missing typedefs.list entry for AggStatePerGroupData.
Discussion: https://postgr.es/m/7530bd8783b1a78d53a3c70383e38d8da0a5ffe5.camel%40j-davis.com
2025-01-07 14:33:21 -08:00
c758119e5b Allow changing autovacuum_max_workers without restarting.
This commit introduces a new parameter named
autovacuum_worker_slots that controls how many autovacuum worker
slots to reserve during server startup.  Modifying this new
parameter's value does require a server restart, but it should
typically be set to the upper bound of what you might realistically
need to set autovacuum_max_workers.  With that new parameter in
place, autovacuum_max_workers can now be changed with a SIGHUP
(e.g., pg_ctl reload).

If autovacuum_max_workers is set higher than
autovacuum_worker_slots, a WARNING is emitted, and the server will
only start up to autovacuum_worker_slots workers at a given time.
If autovacuum_max_workers is set to a value less than the number of
currently-running autovacuum workers, the existing workers will
continue running, but no new workers will be started until the
number of running autovacuum workers drops below
autovacuum_max_workers.

Reviewed-by: Sami Imseih, Justin Pryzby, Robert Haas, Andres Freund, Yogesh Sharma
Discussion: https://postgr.es/m/20240410212344.GA1824549%40nathanxps13
2025-01-06 15:01:22 -06:00
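The interaction between the two settings in c758119e5b, as described in the message above, amounts to a cap plus a "below the limit" test. The functions below are a hypothetical restatement of that description, not the autovacuum launcher's code.

```c
#include <assert.h>
#include <stdbool.h>

/* The number of workers can never exceed the slots reserved at startup. */
static int
toy_effective_max_workers(int max_workers, int worker_slots)
{
    assert(max_workers >= 0 && worker_slots >= 0);
    return (max_workers < worker_slots) ? max_workers : worker_slots;
}

/*
 * A new worker may start only while the running count is below the
 * effective limit.  Workers already running are not stopped when the
 * limit is lowered; they simply are not replaced until the count drops.
 */
static bool
toy_can_launch_worker(int running, int max_workers, int worker_slots)
{
    return running < toy_effective_max_workers(max_workers, worker_slots);
}
```

For example, with 16 slots, lowering the limit to 2 while 3 workers are running leaves those 3 alone and allows a new launch only once the running count has fallen to 1.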
5e68f61192 Remove duplicate definitions in proc.h
These are also present in procnumber.h

Reported-by: Peter Eisentraut
Discussion: https://www.postgresql.org/message-id/bd04d675-4672-4f87-800a-eb5d470c15fc@eisentraut.org
2025-01-06 11:56:03 +02:00
3e70da2781 Always use the caller-provided context for radix tree leaves
Previously, it would not have worked for a caller to pass a slab
context, since it would have been used for other things which likely
had incompatible size. In an attempt to be helpful and avoid possible
space wastage due to aset's power-of-two rounding, RT_CREATE would
create an additional slab context if the value type was fixed-length
and larger than pointer size. The problem was, we have since added
the bump context type, and the generation context was a possibility as
well, so silently overriding the caller's choice may actually be worse.

Commit e8a6f1f908d arranged so that the caller-provided context is
used only for leaves, so it's safe for the caller to use slab here
if they wish. As demonstration, use slab in one of the radix tree
regression tests.

Reviewed by Masahiko Sawada

Discussion: https://postgr.es/m/CANWCAZZDCo4k5oURg_pPxM6+WZ1oiG=sqgjmQiELuyP0Vtrwig@mail.gmail.com
2025-01-06 13:26:02 +07:00
e8a6f1f908 Get rid of radix tree's general purpose memory context
Previously, this was notionally used only for the entry point of the
tree and as a convenient parent for other contexts.

For shared memory, the creator previously allocated the entry point
in this context, but attaching backends didn't have access to that,
so they just used the caller's context. For the sake of consistency,
allocate every instance of an entry point in the caller's context.

For local memory, allocate the control object in the caller's context
as well. This commit also makes the "leaf context" the notional parent
of the child contexts used for nodes, so it's a bit of a misnomer,
but a future commit will make the node contexts independent rather
than children, so leave it this way for now to avoid code churn.

The memory context parameter for RT_CREATE is now unused in the case
of shared memory, so remove it and adjust callers to match.

In passing, remove unused "context" member from struct TidStore,
which seems to have been an oversight.

Reviewed by Masahiko Sawada

Discussion: https://postgr.es/m/CANWCAZZDCo4k5oURg_pPxM6+WZ1oiG=sqgjmQiELuyP0Vtrwig@mail.gmail.com
2025-01-06 11:21:21 +07:00
960013f2a1 Use caller's memory context for radix tree iteration state
Typically only one iterator is present at any time, so it's overkill
to devote an entire context for this. Get rid of it and use the
caller's context.

This is tidy-up work, so no backpatch in this form. However, a
hypothetical extension to v17 that tried to start iteration from
an attaching backend would result in a crash, so that'll be fixed
separately in a way that doesn't change behavior in core.

Patch by me, reported and reviewed by Masahiko Sawada

Discussion: https://postgr.es/m/CAD21AoBB2U47V=F+wQRB1bERov_of5=BOZGaybjaV8FLQyqG3Q@mail.gmail.com
2025-01-06 09:01:58 +07:00
11012c5037 Fix an assortment of spelling mistakes and typos
Author: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/5812a0b9-b0cf-4151-9a14-d9f00e4f2858@gmail.com
2025-01-02 12:42:01 +13:00
50e6eb731d Update copyright for 2025
Backpatch-through: 13
2025-01-01 11:21:55 -05:00
ebf2ab40e5 Remove redundant wording in pg_statistic.h
Author: Junwang Zhao
Discussion: https://postgr.es/m/CAEG8a3JbMCHna=N5ZSx6huLnTDfW34kw7Pf2n8+3M-9UrrwesA@mail.gmail.com
2024-12-30 12:18:45 +09:00
508a97ee49 Replace PGPROC.isBackgroundWorker with isRegularBackend.
Commit 34486b609 effectively redefined isBackgroundWorker as meaning
"not a regular backend", whereas before it had the narrower
meaning of AmBackgroundWorkerProcess().  For clarity, rename the
field to isRegularBackend and invert its sense.

Discussion: https://postgr.es/m/1808397.1735156190@sss.pgh.pa.us
2024-12-28 16:21:54 -05:00