Commit Graph

11096 Commits

Author SHA1 Message Date
e0b1ee17dc Skip checking of scan keys required for directional scan in B-tree
Currently, B-tree code matches every scan key to every item on the page.
Imagine an ordered B-tree scan for a query like this.

SELECT * FROM tbl WHERE col > 'a' AND col < 'b' ORDER BY col;

The (col > 'a') scan key will always be matched once we find the location to
start the scan.  The (col < 'b') scan key will match every item on the page
as long as it matches the last item on the page.

This patch implements prechecking of the scan keys required for directional
scan at the beginning of the page scan.  If the precheck is successful, we can
skip checking these scan keys for the items on the page.  That could lead to
significant acceleration, especially if the comparison operator is expensive.
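
The precheck idea can be sketched in a few lines of C. This is only an illustration under simplifying assumptions, not the nbtree code: the "page" is a sorted int array, only the upper-bound key (the analogue of col < 'b') is modeled, and the names scan_page, less_than and ncompares are made up for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Counts calls to the (pretend-expensive) comparison operator. */
long ncompares = 0;

/* Hypothetical stand-in for an expensive comparison operator. */
bool
less_than(int a, int b)
{
    ncompares++;
    return a < b;
}

/*
 * Scan one sorted "page" in the forward direction and return how many
 * items satisfy (item < upper).  With use_precheck set, the last item is
 * tested first; if it matches, every earlier item must match too, so the
 * per-item check is skipped.
 */
size_t
scan_page(const int *items, size_t nitems, int upper, bool use_precheck)
{
    bool    skip_check = false;
    size_t  nmatched = 0;

    if (use_precheck && nitems > 0)
        skip_check = less_than(items[nitems - 1], upper);

    for (size_t i = 0; i < nitems; i++)
    {
        if (!skip_check && !less_than(items[i], upper))
            break;              /* sorted input: no later item can match */
        nmatched++;
    }
    return nmatched;
}
```

When the last item on the page satisfies the key, the whole page costs one comparison instead of one per item.  When it does not, the precheck costs one extra comparison and the scan falls back to checking each item.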

Idea from patch by Konstantin Knizhnik.

Discussion: https://postgr.es/m/079c3f8e-3371-abe2-e93c-fc8a0ae3f571%40garret.ru
Reviewed-by: Peter Geoghegan, Pavel Borisov
2023-10-06 10:40:51 +03:00
04e485273b Move BuildDescForRelation() from tupdesc.c to tablecmds.c
BuildDescForRelation()'s main job is to convert ColumnDef lists to
pg_attribute/tuple descriptor arrays, which is really mostly an
internal subroutine of DefineRelation() and some related functions,
which is more the remit of tablecmds.c and doesn't have much to do
with the basic tuple descriptor interfaces in tupdesc.c.  This is also
supported by observing the header includes we can remove in tupdesc.c.
By moving it over, we can also (in the future) make
BuildDescForRelation() use more internals of tablecmds.c that are not
sensible to be exposed in tupdesc.c.

Discussion: https://www.postgresql.org/message-id/flat/52a125e4-ff9a-95f5-9f61-b87cf447e4da@eisentraut.org
2023-10-05 16:20:46 +02:00
e29c464395 Refactor ListenSocket array.
Keep track of the used size of the array. That avoids looping through
the whole array in a few places. It doesn't matter from a performance
point of view since the array is small anyway, but this feels less
surprising and is a little less code. Now that we have an explicit
NumListenSockets variable that is statically initialized to 0, we
don't need the loop to initialize the array.

Allocate the array in PostmasterContext. The array isn't needed in
child processes, so this allows reusing that memory. We could easily
make the array resizable now, but we haven't heard any complaints
about the current 64 sockets limit.

Discussion: https://www.postgresql.org/message-id/7bb7ad65-a018-2419-742f-fa5fd877d338@iki.fi
2023-10-05 15:05:25 +03:00
1c99cde2f3 Improve JsonLexContext's freeability
Previously, the JSON code didn't have to worry too much about freeing
JsonLexContext, because it was never too long-lived.  With new features
being added for SQL/JSON this is no longer the case.  Add a routine
that knows how to free this struct and apply that to a few places, to
prevent this from becoming problematic.

At the same time, we change the API of makeJsonLexContextCstringLen to
make it receive a pointer to JsonLexContext for callers that want it to
be stack-allocated; it can also be passed as NULL to get the original
behavior of a palloc'ed one.

This also causes an ABI break due to the addition of flags to
JsonLexContext, so we can't easily backpatch it.  AFAICS that's not much
of a problem; apparently some leaks might exist in JSON usage of
text-search, for example via json_to_tsvector, but I haven't seen any
complaints about that.

Per Coverity complaint about datum_to_jsonb_internal().

Discussion: https://postgr.es/m/20230808174110.oq3iymllsv6amkih@alvherre.pgsql
2023-10-05 10:59:08 +02:00
5e4282772a Remove RelationGetIndexRawAttOptions()
There was only one caller left, for which this function was overkill.

Also, having it in relcache.c was inappropriate, since it doesn't work
with the relcache at all.

Discussion: https://www.postgresql.org/message-id/flat/f84640e3-00d3-5abd-3f41-e6a19d33c40b@eisentraut.org
2023-10-03 17:51:02 +02:00
7841623571 Remove IndexInfo.ii_OpclassOptions field
It is unnecessary to include this field in IndexInfo.  It is only used
by DDL code, not during execution.  It is really only used to pass
local information around between functions in index.c and indexcmds.c,
for which it is clearer to use local variables, like in similar cases.

Discussion: https://www.postgresql.org/message-id/flat/f84640e3-00d3-5abd-3f41-e6a19d33c40b@eisentraut.org
2023-10-03 17:51:02 +02:00
1ccc1e05ae Remove retry loop in heap_page_prune().
The retry loop is needed because heap_page_prune() calls
HeapTupleSatisfiesVacuum() and then lazy_scan_prune() does the same
thing again, and they might get different answers due to concurrent
clog updates.  But this patch makes heap_page_prune() return the
HeapTupleSatisfiesVacuum() results that it computed back to the
caller, which allows lazy_scan_prune() to avoid needing to recompute
those values in the first place. That's nice both because it eliminates
the need for a retry loop and also because it's cheaper.

Melanie Plageman, reviewed by David Geier, Andres Freund, and me.

Discussion: https://postgr.es/m/CAAKRu_br124qsGJieuYA0nGjywEukhK1dKBfRdby_4yY3E9SXA%40mail.gmail.com
2023-10-02 11:40:07 -04:00
f0bd0b4489 Add rmgrdesc README
In the README, briefly explain what rmgrdesc functions are, and why
they are in a separate directory. Commit c03c2eae0a added some
guidelines on the preferred output format; move that to the README
too.

Reviewed-by: Melanie Plageman, Peter Geoghegan
Discussion: https://www.postgresql.org/message-id/9159daf7-f42d-781b-458f-1b2cf32cb256%40iki.fi
2023-10-02 12:18:57 +03:00
c8ec5e0543 Revert "Add soft error handling to some expression nodes"
This reverts commit 7fbc75b26ed8ec70c729c5e7f8233896c54c900f.

Looks like the LLVM additions may not be totally correct.
2023-10-02 13:48:15 +09:00
7fbc75b26e Add soft error handling to some expression nodes
This adjusts the expression evaluation code for CoerceViaIO and
CoerceToDomain to handle errors softly if needed.

For CoerceViaIO, this means using InputFunctionCallSafe(), which
provides the option to handle errors softly, instead of calling the
type input function directly.

For CoerceToDomain, this simply entails replacing the ereport() in
ExecEvalConstraintCheck() by errsave().

In both cases, the ErrorSaveContext to be used when evaluating the
expression is stored by ExecInitExprRec() in the expression's struct
within its ExprEvalStep.  The ErrorSaveContext is passed by
setting ExprState.escontext to point to it when calling
ExecInitExprRec() on the expression whose errors are to be handled
softly.

Note that no call site of ExecInitExprRec() has been changed in this
commit, so there's no functional change.  This is intended for
implementing new SQL/JSON expression nodes in future commits that
will use it to suppress errors that may occur during type coercions.

Reviewed-by: Álvaro Herrera
Discussion: https://postgr.es/m/CA+HiwqE4XTdfb1nW=Ojoy_tQSRhYt-q_kb6i5d4xcKyrLC1Nbg@mail.gmail.com
2023-10-02 11:52:28 +09:00
e1f95ec8cf Correct assertion and comments about XLogRecordMaxSize.
The largest allocation, of xl_tot_len+8192, is in allocate_recordbuf().

Discussion: https://postgr.es/m/20230812211327.GB2326466@rfd.leadboat.com
2023-10-01 12:20:55 -07:00
f6d4c9cf16 Provide FORCE_NULL * and FORCE_NOT_NULL * options for COPY FROM
These options already exist, but you need to specify a column list for
them, which can be cumbersome. We already have the possibility of all
columns for FORCE QUOTE, so this is simply extending that facility to
FORCE_NULL and FORCE_NOT_NULL.

Author: Zhang Mingli
Reviewed-By: Richard Guo, Kyotaro Horiguchi, Michael Paquier.

Discussion: https://postgr.es/m/CACJufxEnVqzOFtqhexF2+AwOKFrV8zHOY3y=p+gPK6eB14pn_w@mail.gmail.com
2023-09-30 12:34:41 -04:00
1d5caec221 Fix EvalPlanQual rechecking during MERGE.
Under some circumstances, concurrent MERGE operations could lead to
inconsistent results that varied according to the plan chosen. This was
caused by a lack of rowmarks on the source relation, which meant that
EvalPlanQual rechecking was not guaranteed to return the same source
tuples when re-running the join query.

Fix by ensuring that preprocess_rowmarks() sets up PlanRowMarks for
all non-target relations used in MERGE, in the same way that it does
for UPDATE and DELETE.

Per bug #18103. Back-patch to v15, where MERGE was introduced.

Dean Rasheed, reviewed by Richard Guo.

Discussion: https://postgr.es/m/18103-c4386baab8e355e3%40postgresql.org
2023-09-30 10:52:21 +01:00
714780dcdd Fix btmarkpos/btrestrpos array key wraparound bug.
nbtree's mark/restore processing failed to correctly handle an edge case
involving array key advancement and related search-type scan key state.
Scans with ScalarArrayOpExpr quals requiring mark/restore
processing (for a merge join) could incorrectly conclude that an
affected array/scan key must not have advanced during the time between
marking and restoring the scan's position.

As a result of all this, array key handling within btrestrpos could skip
a required call to _bt_preprocess_keys().  This confusion allowed later
primitive index scans to overlook tuples matching the true current array
keys.  The scan's search-type scan keys would still have spurious values
corresponding to the final array element(s) -- not values matching the
first/now-current array element(s).

To fix, remember that "array key wraparound" has taken place during the
ongoing btrescan in a flag variable stored in the scan's state, and use
that information at the point where btrestrpos decides if another call
to _bt_preprocess_keys is required.

Oversight in commit 70bc5833, which taught nbtree to handle array keys
during mark/restore processing, but missed this subtlety.  That commit
was itself a bug fix for an issue in commit 9e8da0f7, which taught
nbtree to handle ScalarArrayOpExpr quals natively.

Author: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAH2-WzkgP3DDRJxw6DgjCxo-cu-DKrvjEv_ArkP2ctBJatDCYg@mail.gmail.com
Backpatch: 11- (all supported branches).
2023-09-28 16:29:37 -07:00
4e9fc3a976 Return data from heap_page_prune via a struct.
Previously, one of the values in the struct was returned as the return
value, and another was returned via an output parameter. In
preparation for returning more stuff, consolidate both values into a
struct returned via an output parameter.

Melanie Plageman, reviewed by Andres Freund and by me.

Discussion: https://postgr.es/m/CAAKRu_br124qsGJieuYA0nGjywEukhK1dKBfRdby_4yY3E9SXA%40mail.gmail.com
2023-09-28 10:36:34 -04:00
11c34b342b Show parameters of CALL as constants in pg_stat_statements
This commit changes the query jumbling of CallStmt so as its IN/OUT
parameters are able to show up as constants with a parameter symbol in
pg_stat_statements, like:
CALL proc1($1, $2);
CALL proc2($1, $2, $3);

The transformed FuncExpr is used in the query ID computation instead of
the FuncCall generated by the parser, so as it is sensitive to the OID
of the procedure and its list of input arguments.  The output arguments
are handled in a separate list in CallStmt, which is also included in
the computation.

Tests are added to pg_stat_statements to show how this affects CALL with
IN/OUT parameters as well as overloaded functions.

Like 638d42a3c520 or 31de7e60da34, this improves the monitoring of
workloads with a lot of CALL statements, preventing unnecessary bloat
when these use different input (or even output) values.

Author: Sami Imseih
Discussion: https://postgr.es/m/B44FA29D-EBD0-4DD9-ABC2-16F1CB087074@amazon.com
2023-09-28 15:17:55 +09:00
d060e921ea Remove obsolete executor cleanup code
This commit removes unnecessary ExecExprFreeContext() calls in
ExecEnd* routines because the actual cleanup is managed by
FreeExecutorState(). With no callers remaining for
ExecExprFreeContext(), this commit also removes the function.

This commit also drops redundant ExecClearTuple() calls, because
ExecResetTupleTable() in ExecEndPlan() already takes care of
resetting and dropping all TupleTableSlots initialized with
ExecInitScanTupleSlot() and ExecInitExtraTupleSlot().

After these modifications, the ExecEnd*() routines for ValuesScan,
NamedTuplestoreScan, and WorkTableScan became redundant. So, this
commit removes them.

Reviewed-by: Robert Haas
Discussion: https://postgr.es/m/CA+HiwqFGkMSge6TgC9KQzde0ohpAycLQuV7ooitEEpbKB0O_mg@mail.gmail.com
2023-09-28 09:44:39 +09:00
9210afd3bc Move tracking of in_streaming to PGOutputData
"in_streaming" is a flag used to track if an instance of pgoutput is
streaming changes.  When pgoutput is started, the flag is always reset; it is
then switched back and forth in the stream start/stop callbacks.

Before this commit, it was a global variable, which is confusing as it
is actually attached to a state of PGOutputData.  Per my analysis, using
a global variable did not lead to an active bug like in 54ccfd65868c,
but it makes the code more consistent.  Note that we cannot backpatch
this change anyway as it requires the addition of a new field to
PGOutputData, exposed in pgoutput.h.

Author: Hou Zhijie
Reviewed-by: Amit Kapila, Michael Paquier, Peter Smith
Discussion: https://postgr.es/m/OS0PR01MB571690EF24F51F51EFFCBB0E94FAA@OS0PR01MB5716.jpnprd01.prod.outlook.com
2023-09-28 09:33:51 +09:00
ebf76f2753 Add TupleDescGetDefault()
This unifies some repetitive code.

Note: I didn't push the "not found" error message into the new
function, even though all existing callers would be able to make use
of it.  Using the existing error handling as-is would probably require
exposing the Relation type via tupdesc.h, which doesn't seem
desirable.  (Or even if we changed it to just report the OID, it would
inject the concept of a relation containing the tuple descriptor into
tupdesc.h, which might be a layering violation.  Perhaps some further
improvements could be considered here separately.)

Discussion: https://www.postgresql.org/message-id/flat/52a125e4-ff9a-95f5-9f61-b87cf447e4da%40eisentraut.org
2023-09-27 18:52:40 +01:00
9dce22033d llvmjit: Use explicit LLVMContextRef for inlining
When performing inlining LLVM unfortunately "leaks" types (the
types survive and are usable, but a new round of inlining will
recreate new structurally equivalent types). This accumulation
will over time amount to a memory leak which for some queries
can be large enough to trigger the OOM process killer.

To avoid accumulation of types, all IR related data is stored
in an LLVMContextRef which is dropped and recreated in order
to release all types.  Dropping and recreating incurs overhead,
so it will be done only after 100 queries. This is a heuristic
which might be revisited, but until we can get the size of the
context from LLVM we are flying a bit blind.
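
The drop-and-recreate heuristic described above can be sketched as follows. This is a minimal illustration and does not use the LLVM API: the Context struct stands in for an LLVMContextRef, and acquire_context, REUSE_LIMIT and the counters are hypothetical names; only the threshold of 100 comes from the text.

```c
#include <stdlib.h>

#define REUSE_LIMIT 100         /* threshold taken from the commit message */

/* Hypothetical stand-in for an LLVMContextRef. */
typedef struct Context
{
    unsigned long generation;   /* how many contexts were created so far */
} Context;

Context *current_ctx = NULL;
unsigned long next_generation = 0;
int uses_since_create = 0;

/*
 * Return the shared context, dropping and recreating it once it has
 * served REUSE_LIMIT callers, so that anything accumulated inside it
 * (leaked types, in the LLVM case) is released.
 */
Context *
acquire_context(void)
{
    if (current_ctx != NULL && uses_since_create >= REUSE_LIMIT)
    {
        free(current_ctx);
        current_ctx = NULL;
    }
    if (current_ctx == NULL)
    {
        current_ctx = calloc(1, sizeof(Context));
        if (current_ctx == NULL)
            return NULL;        /* out of memory */
        current_ctx->generation = ++next_generation;
        uses_since_create = 0;
    }
    uses_since_create++;
    return current_ctx;
}
```

The counter trades memory for speed: recreating on every call would bound memory tightly but pay the setup cost each time, while never recreating is the leak being fixed.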

This issue has been reported several times; there may be more
references to it in the archives on top of the threads linked
below.

Backpatching of this fix will be handled once it has matured
in master for a bit.

Reported-By: Justin Pryzby <pryzby@telsasoft.com>
Reported-By: Kurt Roeckx <kurt@roeckx.be>
Reported-By: Jaime Casanova <jcasanov@systemguards.com.ec>
Reported-By: Lauri Laanmets <pcspets@gmail.com>
Author: Andres Freund and Daniel Gustafsson
Discussion: https://postgr.es/m/7acc8678-df5f-4923-9cf6-e843131ae89d@www.fastmail.com
Discussion: https://postgr.es/m/20201218235607.GC30237@telsasoft.com
Discussion: https://postgr.es/m/CAPH-tTxLf44s3CvUUtQpkDr1D8Hxqc2NGDzGXS1ODsfiJ6WSqA@mail.gmail.com
2023-09-27 13:02:21 +02:00
ef668d8bf5 llvmjit: Make llvm_types_module variable static
Commit b059d2f45685a introduced llvm_types_module and accidentally
exported it. As there is no use case for accessing this variable
externally, this makes it static.

Author: Andres Freund <andres@anarazel.de>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/20221101055132.pjjsvlkeo4stbjkq@awork3.anarazel.de
2023-09-27 13:02:14 +02:00
54ccfd6586 Fix the misuse of origin filter across multiple pg_logical_slot_get_changes() calls.
The pgoutput module uses a global variable (publish_no_origin) to cache
the action for the origin filter, but we didn't reset the flag when
shutting down the output plugin, so subsequent retries may access the
previous publish_no_origin value.

We fix this by storing the flag in the output plugin's private data.
Additionally, the patch removes the currently unused origin string from the
structure.

For the back branch, to avoid changing the exposed structure, we eliminated the
global variable and instead directly used the origin string for change
filtering.

Author: Hou Zhijie
Reviewed-by: Amit Kapila, Michael Paquier
Backpatch-through: 16
Discussion: http://postgr.es/m/OS0PR01MB571690EF24F51F51EFFCBB0E94FAA@OS0PR01MB5716.jpnprd01.prod.outlook.com
2023-09-27 14:32:51 +05:30
b0ae29512c MergeAttributes() and related variable renaming
Mainly, rename "schema" to "columns" and related changes.  The
previous naming has long been confusing.

Discussion: https://www.postgresql.org/message-id/flat/52a125e4-ff9a-95f5-9f61-b87cf447e4da%40eisentraut.org
2023-09-26 16:08:35 +01:00
13aeaf0797 Add worker type to pg_stat_subscription.
Thanks to commit 2a8b40e368, the logical replication worker type is
easily determined.  The worker type could already be deduced via
other columns such as leader_pid and relid, but that is unnecessary
complexity for users.

Bumps catversion.

Author: Peter Smith
Reviewed-by: Michael Paquier, Maxim Orlov, Amit Kapila
Discussion: https://postgr.es/m/CAHut%2BPtmbSMfErSk0S7xxVdZJ9XVE3xVLhqBTmT91kf57BeKDQ%40mail.gmail.com
2023-09-25 14:12:43 -07:00
036297cf1b Pack struct ParsedWord more tightly.
In a 64-bit build there's an awful lot of useless pad space in
ParsedWords.  Since we may allocate large arrays of these,
it's worth some effort to reduce their size.

Here we reduce the alen field from uint32 to uint16, and then re-order
the fields to avoid unnecessary padding.  alen is only used to
remember the allocated size of the apos[] array, which is not allowed
to exceed MAXNUMPOS (256) elements, so uint16 is plenty of space for
it.  That gets us from 40 bytes to 24 on 64-bit builds, and from 20
bytes to 16 on 32-bit builds.
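
The effect of narrowing one field and reordering the rest can be reproduced with a pair of illustrative structs. The field names and order below are assumptions chosen to show the padding effect; they are not copied from the PostgreSQL source.

```c
#include <stdint.h>

/* Hypothetical "before" layout: pointers interleaved with small fields. */
typedef struct WideLayout
{
    uint16_t    len;
    uint16_t    nvariant;
    uint16_t   *apos;       /* pointer alignment forces padding before it */
    uint16_t    flags;
    char       *word;       /* and again before this pointer */
    uint32_t    alen;       /* trailing padding follows on 64-bit builds */
} WideLayout;

/* Hypothetical "after" layout: alen narrowed, small fields grouped first. */
typedef struct TightLayout
{
    uint16_t    len;
    uint16_t    nvariant;
    uint16_t    flags;
    uint16_t    alen;       /* uint16 is enough for a limit of 256 */
    uint16_t   *apos;
    char       *word;
} TightLayout;
```

On a typical 64-bit ABI with 8-byte pointers, sizeof(WideLayout) is 40 and sizeof(TightLayout) is 24, in line with the numbers quoted above; exact sizes depend on the platform's alignment rules.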

Per discussion of bug #18080.  Unfortunately this is an ABI break
so we can't back-patch.

Discussion: https://postgr.es/m/1146921.1695411070@sss.pgh.pa.us
2023-09-25 12:07:32 -04:00
7750fefdb2 Add GUC for temporarily disabling event triggers
In order to troubleshoot misbehaving or buggy event triggers, the
documented advice is to enter single-user mode.  In an attempt to
reduce the number of situations where single-user mode is required
(or even recommended) for non-extraordinary maintenance, this GUC
allows temporarily suspending event triggers.

This was originally extracted from a larger patchset which aimed
at supporting event triggers on login events.

Reviewed-by: Ted Yu <yuzhihong@gmail.com>
Reviewed-by: Mikhail Gribkov <youzhick@gmail.com>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/9140106E-F9BF-4D85-8FC8-F2D3C094A6D9@yesql.se
Discussion: https://postgr.es/m/0d46d29f-4558-3af9-9c85-7774e14a7709@postgrespro.ru
2023-09-25 12:41:49 +02:00
e59fcbd712 Simplify information schema check constraint deparsing
The computation of the column
information_schema.check_constraints.check_clause used
pg_get_constraintdef() plus some string manipulation to get the check
clause back out.  This ended up with an extra pair of parentheses,
which is only an aesthetic problem, but also with suffixes like "NOT
VALID", which don't belong in that column.  We can fix both of these
problems and simplify the code by just using pg_get_expr() instead.

Discussion: https://www.postgresql.org/message-id/799b59ef-3330-f0d2-ee23-8cdfa1740987@eisentraut.org
2023-09-22 07:43:26 +02:00
c103d07381 Add function for removing arbitrary nodes in binaryheap.
This commit introduces binaryheap_remove_node(), which can be used
to remove any node from a binary heap.  The implementation is
straightforward.  The target node is replaced with the last node in
the heap, and then we sift as needed to preserve the heap property.
This new function is intended for use in a follow-up commit that
will improve the performance of pg_restore.
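
The removal technique described above can be sketched on a plain max-heap of ints. This is an illustration only and not the binaryheap API: IntHeap, heap_remove_at and the sift helpers are hypothetical names.

```c
#include <stddef.h>

/* Hypothetical minimal max-heap of ints stored in an array. */
typedef struct IntHeap
{
    int    *nodes;
    size_t  size;
} IntHeap;

void
swap_nodes(int *a, int *b)
{
    int tmp = *a;

    *a = *b;
    *b = tmp;
}

/* Move the node at index i up until its parent is not smaller. */
void
sift_up(IntHeap *h, size_t i)
{
    while (i > 0)
    {
        size_t parent = (i - 1) / 2;

        if (h->nodes[parent] >= h->nodes[i])
            break;
        swap_nodes(&h->nodes[parent], &h->nodes[i]);
        i = parent;
    }
}

/* Move the node at index i down until no child is larger. */
void
sift_down(IntHeap *h, size_t i)
{
    for (;;)
    {
        size_t left = 2 * i + 1;
        size_t right = 2 * i + 2;
        size_t largest = i;

        if (left < h->size && h->nodes[left] > h->nodes[largest])
            largest = left;
        if (right < h->size && h->nodes[right] > h->nodes[largest])
            largest = right;
        if (largest == i)
            break;
        swap_nodes(&h->nodes[i], &h->nodes[largest]);
        i = largest;
    }
}

/* Remove the node at index i; the caller guarantees i < h->size. */
void
heap_remove_at(IntHeap *h, size_t i)
{
    h->size--;
    if (i == h->size)
        return;                 /* the removed node was the last one */
    h->nodes[i] = h->nodes[h->size];    /* replace with the last node */
    sift_up(h, i);              /* at most one of these moves the node */
    sift_down(h, i);
}
```

Both sift directions are needed because the replacement node comes from an unrelated part of the heap and may be larger than its new parent or smaller than its new children.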

Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/3612876.1689443232%40sss.pgh.pa.us
2023-09-18 14:06:08 -07:00
5af0263afd Make binaryheap available to frontend code.
There are a couple of places in frontend code that could make use
of this simple binary heap implementation.  This commit makes
binaryheap usable in frontend code, much like commit 26aaf97b68 did
for StringInfo.  Like StringInfo, the header file is left in lib/
to reduce the likelihood of unnecessary breakage.

The frontend version of binaryheap exposes a void *-based API since
frontend code does not have access to the Datum definitions.  This
seemed like a better approach than switching all existing uses to
void * or making the Datum definitions available to frontend code.

Reviewed-by: Tom Lane, Alvaro Herrera
Discussion: https://postgr.es/m/3612876.1689443232%40sss.pgh.pa.us
2023-09-18 12:18:33 -07:00
a0a5e0feb3 Fix information schema for catalogued not-null constraints
The column check_constraints.check_clause should be like

    col IS NOT NULL

without a surrounding CHECK (...).

Discussion: https://www.postgresql.org/message-id/09489196-0bc1-e796-c43e-63425f7c5910@eisentraut.org
2023-09-18 08:10:51 +02:00
9d17e5f16f Update Unicode data to Unicode 15.1.0
2023-09-18 07:26:34 +02:00
e0b2eed047 Flush logical slots to disk during a shutdown checkpoint if required.
It's entirely possible for a logical slot to have a confirmed_flush LSN
higher than the last value saved on disk while not being marked as dirty.
Currently, it is not a major problem but a later patch adding support for
the upgrade of slots relies on that value being properly flushed to disk.

It can also help avoid processing the same transactions again in some
boundary cases after the clean shutdown and restart.  Say, we process
some transactions for which we didn't send anything downstream (the
changes got filtered) but the confirmed_flush LSN is updated due to
keepalives.  As we don't flush the latest value of confirmed_flush LSN, it
may lead to processing the same changes again without this patch.

The approach taken by this patch has been suggested by Ashutosh Bapat.

Author: Vignesh C, Julien Rouhaud, Kuroda Hayato
Reviewed-by: Amit Kapila, Dilip Kumar, Michael Paquier, Ashutosh Bapat, Peter Smith, Hou Zhijie
Discussion: http://postgr.es/m/CAA4eK1JzJagMmb_E8D4au=GYQkxox0AfNBm1FbP7sy7t4YWXPQ@mail.gmail.com
Discussion: http://postgr.es/m/TYAPR01MB58664C81887B3AF2EB6B16E3F5939@TYAPR01MB5866.jpnprd01.prod.outlook.com
2023-09-14 08:57:05 +05:30
5a3423ad8e Add JIT deform_counter
generation_counter includes time spent on both JIT:ing expressions
and tuple deforming which are configured independently via options
jit_expressions and jit_tuple_deforming.  As they are combined in
the same counter it's not apparent what fraction of time the tuple
deforming takes.

This adds deform_counter dedicated to tuple deforming, which allows
seeing more directly the influence jit_tuple_deforming is having on
the query. The counter is exposed in EXPLAIN and pg_stat_statements,
bumping pg_stat_statements to 1.11.

Author: Dmitry Dolgov <9erthalion6@gmail.com>
Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/20220612091253.eegstkufdsu4kfls@erthalion.local
2023-09-08 15:05:12 +02:00
9f0602539d Remove some more "snapshot too old" vestiges.
Commit f691f5b8 removed the logic, but left behind some now-useless
Snapshot arguments to various AM-internal functions, and missed a couple
of comments.

Reported-by: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAH2-Wznj9qSNXZ1P1uWTUD_FeaTezbUazb416EPwi4Qr_jR_6A%40mail.gmail.com
2023-09-08 17:12:12 +12:00
3af7217942 Update information_schema definition for not-null constraints
Now that we have catalogued not-null constraints, our information_schema
definition can be updated to grab those rather than fabricate synthetic
definitions.

Note that we still don't have catalog rows for not-null constraints on
domains, but we've never had not-null constraints listed in
information_schema, so that's a problem to be solved separately.

Co-authored-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Co-authored-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/81b461c4-edab-5d8c-2f88-203108425340@enterprisedb.com
Discussion: https://postgr.es/m/202309041710.psytrxlsiqex@alvherre.pgsql
2023-09-07 11:33:01 +02:00
0da096d78e Fix recovery conflict SIGUSR1 handling.
We shouldn't be doing non-trivial work in signal handlers in general,
and in this case the handler could reach unsafe code and corrupt state.
It also clobbered its own "reason" code.

Move all recovery conflict decision logic into the next
CHECK_FOR_INTERRUPTS(), and have the signal handler just set flags and
the latch, following the standard pattern.  Since there are several
different "reasons", use a separate flag for each.
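
The "handler only sets flags" pattern can be sketched in standard C as below. This is a simplified illustration, not the PostgreSQL code: the reason names, the two handlers and process_pending_conflicts are hypothetical, the latch is omitted, and the handlers are assumed to be installed with signal() or sigaction() for two separate signals.

```c
#include <signal.h>

/* Hypothetical conflict reasons, for illustration only. */
enum
{
    REASON_LOCK,
    REASON_SNAPSHOT,
    NUM_REASONS
};

/* One flag per reason, so a later signal cannot clobber an earlier one. */
volatile sig_atomic_t conflict_pending[NUM_REASONS];
volatile sig_atomic_t interrupt_pending;

/* Handlers only set flags: no decisions, no allocation, no I/O. */
void
on_lock_conflict(int signo)
{
    (void) signo;
    conflict_pending[REASON_LOCK] = 1;
    interrupt_pending = 1;
}

void
on_snapshot_conflict(int signo)
{
    (void) signo;
    conflict_pending[REASON_SNAPSHOT] = 1;
    interrupt_pending = 1;
}

/*
 * Called from ordinary code at a safe point (the analogue of
 * CHECK_FOR_INTERRUPTS()).  Clears the pending flags and returns a
 * bitmask of the reasons found; real code would decide here whether
 * to raise an error for each one.
 */
unsigned
process_pending_conflicts(void)
{
    unsigned handled = 0;

    if (!interrupt_pending)
        return 0;
    interrupt_pending = 0;

    for (int r = 0; r < NUM_REASONS; r++)
    {
        if (conflict_pending[r])
        {
            conflict_pending[r] = 0;
            handled |= 1u << r;
        }
    }
    return handled;
}
```

Keeping every decision out of the handler means the non-trivial logic only ever runs at points where the program state is known to be consistent.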

With this refactoring, the recovery conflict system no longer
piggy-backs on top of the regular query cancelation mechanism, but
instead raises an error directly if it decides that is necessary.  It
still needs to respect QueryCancelHoldoffCount, because otherwise the
FEBE protocol might get out of sync (see commit 2b3a8b20c2d).

This fixes one class of intermittent failure in the new
031_recovery_conflict.pl test added by commit 9f8a050f, though the buggy
coding is much older.  Failures outside contrived testing seem to be
very rare (or perhaps incorrectly attributed) in the field, based on
lack of reports.

No back-patch for now due to complexity and release schedule.  We have
the option to back-patch into 16 later, as 16 has prerequisite commit
bea3d7e.

Reviewed-by: Andres Freund <andres@anarazel.de> (earlier version)
Reviewed-by: Michael Paquier <michael@paquier.xyz> (earlier version)
Reviewed-by: Robert Haas <robertmhaas@gmail.com> (earlier version)
Tested-by: Christoph Berg <myon@debian.org>
Discussion: https://postgr.es/m/CA%2BhUKGK3PGKwcKqzoosamn36YW-fsuTdOPPF1i_rtEO%3DnEYKSg%40mail.gmail.com
Discussion: https://postgr.es/m/CALj2ACVr8au2J_9D88UfRCi0JdWhyQDDxAcSVav0B0irx9nXEg%40mail.gmail.com
2023-09-07 12:39:24 +12:00
8c16ad3b43 Allow using syncfs() in frontend utilities.
This commit allows specifying a --sync-method in several frontend
utilities that must synchronize many files to disk (initdb,
pg_basebackup, pg_checksums, pg_dump, pg_rewind, and pg_upgrade).
On Linux, users can specify "syncfs" to synchronize the relevant
file systems instead of calling fsync() for every single file.  In
many cases, using syncfs() is much faster.

As with recovery_init_sync_method, this new option comes with some
caveats.  The descriptions of these caveats have been moved to a
new appendix section in the documentation.

Co-authored-by: Justin Pryzby
Reviewed-by: Michael Paquier, Thomas Munro, Robert Haas, Justin Pryzby
Discussion: https://postgr.es/m/20210930004340.GM831%40telsasoft.com
2023-09-06 16:27:16 -07:00
cccc6cdeb3 Add support for syncfs() in frontend support functions.
This commit adds support for using syncfs() in fsync_pgdata() and
fsync_dir_recurse() (which have been renamed to sync_pgdata() and
sync_dir_recurse()).  Like recovery_init_sync_method,
sync_pgdata() calls syncfs() for the data directory, each
tablespace, and pg_wal (if it is a symlink).  For now, all of the
frontend utilities that use these support functions are hard-coded
to use fsync(), but a follow-up commit will allow specifying
syncfs().

Co-authored-by: Justin Pryzby
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/20210930004340.GM831%40telsasoft.com
2023-09-06 16:27:00 -07:00
3ed1956719 Make enum for sync methods available to frontend code.
This commit renames RecoveryInitSyncMethod to DataDirSyncMethod and
moves it to common/file_utils.h.  This is preparatory work for a
follow-up commit that will allow specifying the synchronization
method in frontend utilities such as pg_upgrade and pg_basebackup.

Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/ZN2ZB4afQ2JbR9TA%40paquier.xyz
2023-09-06 16:26:39 -07:00
f39b265808 Move PG_TEMP_FILE* macros to file_utils.h.
Presently, frontend code that needs to use these macros must either
include storage/fd.h, which declares several frontend-unsafe
functions, or duplicate the macros.  This commit moves these macros
to common/file_utils.h, which is safe for both frontend and backend
code.  Consequently, we can also remove the duplicated macros in
pg_checksums and stop including storage/fd.h in pg_rewind.

Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/ZOP5qoUualu5xl2Z%40paquier.xyz
2023-09-05 17:02:06 -07:00
f691f5b80a Remove the "snapshot too old" feature.
Remove the old_snapshot_threshold setting and mechanism for producing
the error "snapshot too old", originally added by commit 848ef42b.
Unfortunately it had a number of known problems in terms of correctness
and performance, mostly reported by Andres in the course of his work on
snapshot scalability.  We agreed to remove it, after a long period
without an active plan to fix it.

This is certainly a desirable feature, and someone might propose a new
or improved implementation in the future.

Reported-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CACG%3DezYV%2BEvO135fLRdVn-ZusfVsTY6cH1OZqWtezuEYH6ciQA%40mail.gmail.com
Discussion: https://postgr.es/m/20200401064008.qob7bfnnbu4w5cw4%40alap3.anarazel.de
Discussion: https://postgr.es/m/CA%2BTgmoY%3Daqf0zjTD%2B3dUWYkgMiNDegDLFjo%2B6ze%3DWtpik%2B3XqA%40mail.gmail.com
2023-09-05 19:53:43 +12:00
226d0a6b98 Restructure DECLARE_INDEX arguments
Separate the table name from the index declaration.  We need that
anyway later for the ALTER TABLE / USING INDEX commands, so we might
as well structure the declarations like that to begin with.

Discussion: https://www.postgresql.org/message-id/flat/75ae5875-3abc-dafc-8aec-73247ed41cde@eisentraut.org
2023-08-31 08:14:57 +02:00
d0fe3046ee Use actual backend IDs in pg_stat_get_backend_subxact().
Unlike the other pg_stat_get_backend* functions,
pg_stat_get_backend_subxact() looks up the backend entry by using
its integer argument as a 1-based index in an internal array.  The
other functions look for the entry with the matching session
backend ID.  These numbers often match, but that isn't reliably
true.

This commit resolves this discrepancy by introducing
pgstat_get_local_beentry_by_backend_id() and using it in
pg_stat_get_backend_subxact().  We cannot use
pgstat_get_beentry_by_backend_id() because it returns a
PgBackendStatus, which lacks the locally computed additions
available in LocalPgBackendStatus that are required by
pg_stat_get_backend_subxact().

Author: Ian Barwick
Reviewed-by: Sami Imseih, Michael Paquier, Robert Haas
Discussion: https://postgr.es/m/CAB8KJ%3Dj-ACb3H4L9a_b3ZG3iCYDW5aEu3WsPAzkm2S7JzS1Few%40mail.gmail.com
Backpatch-through: 16
2023-08-30 14:47:01 -07:00
3d51cb5197 Rename some support functions for pgstat* views.
Presently, pgstat_fetch_stat_beentry() accepts a session's backend
ID as its argument, and pgstat_fetch_stat_local_beentry() accepts a
1-based index in an internal array as its argument.  The former is
typically used wherever a user must provide a backend ID, and the
latter is usually used internally when looping over all entries in
the array.  This difference was first introduced by d7e39d72ca.
Before that commit, both functions accepted a 1-based index to the
internal array.

This commit renames these two functions to make it clear whether
they use the backend ID or the 1-based index to look up the entry.
This is preparatory work for a follow-up change that will introduce
a function for looking up a LocalPgBackendStatus using a backend
ID.

Reviewed-by: Ian Barwick, Sami Imseih, Michael Paquier, Robert Haas
Discussion: https://postgr.es/m/CAB8KJ%3Dj-ACb3H4L9a_b3ZG3iCYDW5aEu3WsPAzkm2S7JzS1Few%40mail.gmail.com
Backpatch-through: 16
2023-08-30 14:46:52 -07:00
9b581c5341 Disallow changing NO INHERIT status of a not-null constraint
It makes no sense to add a NO INHERIT not-null constraint to a child
table that already has one in that column inherited from its parent.
Disallow that, and add tests for the relevant cases.

Per complaint from Kyotaro Horiguchi.  I also used part of his proposed
patch.

Co-authored-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Co-authored-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/20230828.161658.1184657435220765047.horikyota.ntt@gmail.com
2023-08-29 19:19:24 +02:00
63956bed7b Rename logical_replication_mode to debug_logical_replication_streaming
The logical_replication_mode GUC is intended for testing and debugging
purposes, but its current name may be misleading and encourage users to make
unnecessary changes.

To avoid confusion, rename the GUC to debug_logical_replication_streaming,
a name that casual users are less likely to mistake for a setting that
needs to be modified in a regular logical replication setup.

Author: Hou Zhijie <houzj.fnst@cn.fujitsu.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/d672d774-c44b-6fec-f993-793e744f169a%40eisentraut.org
2023-08-29 15:19:56 +02:00
36e4419d1f Make error messages about WAL segment size more consistent
Make the primary messages more compact and make the detail messages
uniform.  In initdb.c and pg_resetwal.c, use the newish
option_parse_int() to simplify some of the option parsing.  For the
backend GUC wal_segment_size, add a GUC check hook to do the
verification instead of coding it in bootstrap.c.  This might be
overkill, but that way the check is in the right place and it becomes
more self-documenting.
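
As an illustration of the kind of validation such a check performs, here
is a minimal Python sketch.  It assumes the rule that a WAL segment size
must be a power of two between 1 MB and 1 GB; the function name and the
constants are hypothetical and do not mirror the C code.

```python
MIN_WAL_SEG_SIZE = 1024 * 1024          # 1 MB (assumed lower bound)
MAX_WAL_SEG_SIZE = 1024 * 1024 * 1024   # 1 GB (assumed upper bound)


def is_valid_wal_segment_size(size: int) -> bool:
    """Return True if size is a power of two within the assumed bounds."""
    # A positive integer is a power of two iff exactly one bit is set.
    is_power_of_two = size > 0 and (size & (size - 1)) == 0
    return is_power_of_two and MIN_WAL_SEG_SIZE <= size <= MAX_WAL_SEG_SIZE
```

Keeping the rule in one predicate is what lets several callers share it
and report a uniform detail message.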

In passing, make pg_controldata use the logging API for warning
messages.

Reviewed-by: Aleksander Alekseev <aleksander@timescale.com>
Discussion: https://www.postgresql.org/message-id/flat/9939aa8a-d7be-da2c-7715-0a0b5535a1f7@eisentraut.org
2023-08-28 15:17:04 +02:00
9a0ddc39c6 Format list of catalog files in makefile vertically
This makes it easier to compare the lists visually with the
corresponding meson lists.

In passing, copy over some relevant comments from the makefiles to
meson.build.

Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://www.postgresql.org/message-id/flat/a306be82-ee71-4554-d499-49a45a654396%40eisentraut.org
2023-08-28 06:20:56 +02:00
bb45156f34 Show names of DEALLOCATE as constants in pg_stat_statements
This commit switches query jumbling so that prepared statement names are
treated as constants in DeallocateStmt.  A boolean field is added to
DeallocateStmt to distinguish DEALLOCATE ALL from named prepared
statements; previously "name" served this purpose, with NULL meaning
DEALLOCATE ALL.

Prior to this commit, DEALLOCATE was not tracked in pg_stat_statements,
for the reason that it was not possible to treat its name parameter as a
constant.  Now that query jumbling applies to all the utility nodes,
this reason does not apply anymore.

Like 638d42a3c520, this can be a huge advantage for monitoring where
prepared statement names are randomly generated, preventing bloat in
pg_stat_statements.  A couple of tests are added to track the new
behavior.
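
The effect on monitoring can be sketched with a simplified normalizer.
This Python snippet is only a model: real query jumbling works on the
parse tree, not on query text, and the normalize_deallocate() helper,
its regular expression, and the "$1" placeholder text are assumptions
made for illustration.

```python
import re

# Simplified pattern for "DEALLOCATE [PREPARE] <name>"; it ignores quoted
# identifiers and comments, which a real parser would handle.
_DEALLOCATE_RE = re.compile(
    r"^\s*DEALLOCATE\s+(?:PREPARE\s+)?(?P<name>\w+)\s*;?\s*$",
    re.IGNORECASE,
)


def normalize_deallocate(query: str) -> str:
    """Collapse named DEALLOCATE statements into a single normalized form."""
    match = _DEALLOCATE_RE.match(query)
    if match is None:
        return query                 # not a DEALLOCATE statement
    if match.group("name").upper() == "ALL":
        return "DEALLOCATE ALL"      # kept distinct from named statements
    return "DEALLOCATE $1"           # statement name treated as a constant
```

With randomly generated statement names, every named DEALLOCATE maps to
the same normalized string in this model, so a tracker keyed on it holds
one entry instead of one per name.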

Author: Dagfinn Ilmari Mannsåker, Michael Paquier
Reviewed-by: Julien Rouhaud
Discussion: https://postgr.es/m/ZMhT9kNtJJsHw6jK@paquier.xyz
2023-08-27 17:27:44 +09:00
b0e96f3119 Catalog not-null constraints
We now create contype='n' pg_constraint rows for not-null constraints.

We propagate these constraints to other tables during operations such as
adding inheritance relationships, creating and attaching partitions and
creating tables LIKE other tables.  We also spawn not-null constraints
for inheritance child tables when their parents have primary keys.
These related constraints mostly follow the well-known rules of
conislocal and coninhcount that we have for CHECK constraints, with some
adaptations: for example, as opposed to CHECK constraints, we don't
match not-null ones by name when descending a hierarchy to alter it,
instead matching by column name that they apply to.  This means we don't
require the constraint names to be identical across a hierarchy.
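
The matching rule described above can be modeled as follows.  This is a
hedged Python sketch, not the catalog code: the Constraint class, its
"kind" labels, and find_child_match() are hypothetical names used only
to contrast matching by constraint name with matching by column.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Constraint:
    name: str
    kind: str                      # "check" or "notnull" (hypothetical labels)
    column: Optional[str] = None   # only meaningful for not-null constraints


def find_child_match(parent: Constraint,
                     child_constraints: List[Constraint]) -> Optional[Constraint]:
    """Find the child constraint corresponding to a parent constraint."""
    for candidate in child_constraints:
        if candidate.kind != parent.kind:
            continue
        if parent.kind == "notnull":
            # Not-null constraints match on the column they apply to.
            if candidate.column == parent.column:
                return candidate
        elif candidate.name == parent.name:
            # CHECK constraints match on the constraint name.
            return candidate
    return None
```

Under this model a child's not-null constraint is found even when it
carries a different name than the parent's, as long as both apply to the
same column.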

For now, we omit them for system catalogs.  Maybe this is worth
reconsidering.  We don't support NOT VALID or DEFERRABLE clauses
either; these can be added as separate features later (this patch is
already large and complicated enough).

psql shows these constraints in \d+.

pg_dump requires some ad-hoc hacks, particularly when dumping a primary
key.  We now create one "throwaway" not-null constraint for each column
in the PK together with the CREATE TABLE command, and once the PK is
created, all those throwaway constraints are removed.  This avoids
having to check each tuple for nullness when the dump restores the
primary key creation.

Running pg_upgrade from an older release requires a somewhat brittle
procedure to create a constraint state that matches what would be
created if the database were being created fresh in Postgres 17.  I have
tested all the scenarios I could think of, and it works correctly as far
as I can tell, but I could have neglected weird cases.

This patch has been very long in the making.  The first patch was
written by Bernd Helmle in 2010 to add a new pg_constraint.contype value
('n'), which I (Álvaro) then hijacked in 2011 and 2012, until that one
was killed by the realization that we ought to use contype='c' instead:
manufactured CHECK constraints.  However, later SQL standard
development, as well as nonobvious emergent properties of that design
(mostly, failure to distinguish them from "normal" CHECK constraints as
well as the performance implication of having to test the CHECK
expression) led us to reconsider this choice, so now the current
implementation uses contype='n' again.  During Postgres 16 this had
already been introduced by commit e056c557aef4, but there were some
problems mainly with the pg_upgrade procedure that couldn't be fixed in
reasonable time, so it was reverted.

In 2016 Vitaly Burovoy also worked on this feature[1] but found no
consensus for his proposed approach, which was claimed to be closer to
the letter of the standard, requiring an additional pg_attribute column
to track the OID of the not-null constraint for that column.
[1] https://postgr.es/m/CAKOSWNkN6HSyatuys8xZxzRCR-KL1OkHS5-b9qd9bf1Rad3PLA@mail.gmail.com

Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Author: Bernd Helmle <mailings@oopsware.de>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
2023-08-25 13:31:24 +02:00