Commit Graph

2031 Commits

Author SHA1 Message Date
f0b3573c3a Introduce "REFRESH SEQUENCES" for subscriptions.
This patch adds support for a new SQL command:
ALTER SUBSCRIPTION ... REFRESH SEQUENCES
This command updates the sequence entries present in the
pg_subscription_rel catalog table with the INIT state to trigger
resynchronization.

In addition to the new command, the following subscription commands have
been enhanced to automatically refresh sequence mappings:
ALTER SUBSCRIPTION ... REFRESH PUBLICATION
ALTER SUBSCRIPTION ... ADD PUBLICATION
ALTER SUBSCRIPTION ... DROP PUBLICATION
ALTER SUBSCRIPTION ... SET PUBLICATION

These commands will perform the following actions:
Add newly published sequences that are not yet part of the subscription.
Remove sequences that are no longer included in the publication.

This ensures that sequence replication remains aligned with the current
state of the publication on the publisher side.

Note that the actual synchronization of sequence data/values will be
handled in a subsequent patch that introduces a dedicated sequence sync
worker.

Author: Vignesh C <vignesh21@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: shveta malik <shveta.malik@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Nisha Moond <nisha.moond412@gmail.com>
Reviewed-by: Shlok Kyal <shlok.kyal.oss@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Hou Zhijie <houzj.fnst@fujitsu.com>
Discussion: https://postgr.es/m/CAA4eK1LC+KJiAkSrpE_NwvNdidw9F2os7GERUeSxSKv71gXysQ@mail.gmail.com
2025-10-23 08:30:27 +00:00
86d118f9a6 Englishify comment wording
Switch to using the English word here rather than using a verbified
function name.

The full word still fits within a single comment line, so it's probably
better just to use that instead of trying to shorten it, which might
cause confusion.

Author: Rafia Sabih <rafia.pghackers@gmail.com>
Discussion: https://postgr.es/m/CA+FpmFe7LnRF2NA_QfARjkSWme4mNt+Udwbh2Yb=zZm35Ji31w@mail.gmail.com
2025-10-18 12:50:14 +13:00
812221b204 Remove partColsUpdated.
This information appears to have been unused since commit
c5b7ba4e67.  We could not find any references in third-party code,
either.

Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/aO_CyFRpbVMtgJWM%40nathan
2025-10-16 11:31:38 -05:00
2436b8c047 Standardize use of REFRESH PUBLICATION in code and messages.
This patch replaces ALTER SUBSCRIPTION REFRESH with
ALTER SUBSCRIPTION REFRESH PUBLICATION in comments and error messages to
improve clarity and support future extensibility. The change aligns with
upcoming addition REFRESH SEQUENCES for sequence synchronization.

Author: vignesh C <vignesh21@gmail.com>
Author: Hou Zhijie <houzj.fnst@fujitsu.com>
Reviewed-by: shveta malik <shveta.malik@gmail.com>
Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/CAA4eK1LC+KJiAkSrpE_NwvNdidw9F2os7GERUeSxSKv71gXysQ@mail.gmail.com
2025-10-15 03:42:27 +00:00
1206df04c2 Rename apply_at to apply_agg_at for clarity
The field name "apply_at" in RelAggInfo was a bit ambiguous.  Rename
it to "apply_agg_at" to improve clarity and make its purpose clearer.

Per complaint from David Rowley, Robert Haas.

Suggested-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CA+TgmoZ0KR2_XCWHy17=HHcQ3p2Mamc9c6Dnnhf1J6wPYFD9ng@mail.gmail.com
2025-10-14 16:35:22 +09:00
96b3784973 Add "ALL SEQUENCES" support to publications.
This patch adds support for the ALL SEQUENCES clause in publications,
enabling synchronization/replication of all sequences that is useful for
upgrades.

Publications can now include all sequences via FOR ALL SEQUENCES.
psql enhancements:
\d shows publications for a given sequence.
\dRp indicates if a publication includes all sequences.

ALL SEQUENCES can be combined with ALL TABLES, but not with other options
like TABLE or TABLES IN SCHEMA. We can extend support for more granular
clauses in future.

The view pg_publication_sequences provides information about the mapping
between publications and sequences.

This patch enables publishing of sequences; subscriber-side support will
be added in upcoming patches.

Author: vignesh C <vignesh21@gmail.com>
Author: Tomas Vondra <tomas@vondra.me>
Reviewed-by: shveta malik <shveta.malik@gmail.com>
Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Nisha Moond <nisha.moond412@gmail.com>
Reviewed-by: Shlok Kyal <shlok.kyal.oss@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/CAA4eK1LC+KJiAkSrpE_NwvNdidw9F2os7GERUeSxSKv71gXysQ@mail.gmail.com
2025-10-09 03:48:54 +00:00
4685977cc5 Add extension_state member to PlannedStmt.
Extensions can stash data computed at plan time into this list using
planner_shutdown_hook (or perhaps other mechanisms) and then access
it from any code that has access to the PlannedStmt (such as explain
hooks), allowing for extensible debugging and instrumentation of
plans.

Reviewed-by: Andrei Lepikhov <lepihov@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: http://postgr.es/m/CA+TgmoYWKHU2hKr62Toyzh-kTDEnMDeLw7gkOOnjL-TnOUq0kQ@mail.gmail.com
2025-10-08 09:07:49 -04:00
8e11859102 Implement Eager Aggregation
Eager aggregation is a query optimization technique that partially
pushes aggregation past a join, and finalizes it once all the
relations are joined.  Eager aggregation may reduce the number of
input rows to the join and thus could result in a better overall plan.

In the current planner architecture, the separation between the
scan/join planning phase and the post-scan/join phase means that
aggregation steps are not visible when constructing the join tree,
limiting the planner's ability to exploit aggregation-aware
optimizations.  To implement eager aggregation, we collect information
about aggregate functions in the targetlist and HAVING clause, along
with grouping expressions from the GROUP BY clause, and store it in
the PlannerInfo node.  During the scan/join planning phase, this
information is used to evaluate each base or join relation to
determine whether eager aggregation can be applied.  If applicable, we
create a separate RelOptInfo, referred to as a grouped relation, to
represent the partially-aggregated version of the relation and
generate grouped paths for it.

Grouped relation paths can be generated in two ways.  The first method
involves adding sorted and hashed partial aggregation paths on top of
the non-grouped paths.  To limit planning time, we only consider the
cheapest or suitably-sorted non-grouped paths in this step.
Alternatively, grouped paths can be generated by joining a grouped
relation with a non-grouped relation.  Joining two grouped relations
is currently not supported.

To further limit planning time, we currently adopt a strategy where
partial aggregation is pushed only to the lowest feasible level in the
join tree where it provides a significant reduction in row count.
This strategy also helps ensure that all grouped paths for the same
grouped relation produce the same set of rows, which is important to
support a fundamental assumption of the planner.

For the partial aggregation that is pushed down to a non-aggregated
relation, we need to consider all expressions from this relation that
are involved in upper join clauses and include them in the grouping
keys, using compatible operators.  This is essential to ensure that an
aggregated row from the partial aggregation matches the other side of
the join if and only if each row in the partial group does.  This
ensures that all rows within the same partial group share the same
"destiny", which is crucial for maintaining correctness.

One restriction is that we cannot push partial aggregation down to a
relation that is in the nullable side of an outer join, because the
NULL-extended rows produced by the outer join would not be available
when we perform the partial aggregation, while with a
non-eager-aggregation plan these rows are available for the top-level
aggregation.  Pushing partial aggregation in this case may result in
the rows being grouped differently than expected, or produce incorrect
values from the aggregate functions.

If we have generated a grouped relation for the topmost join relation,
we finalize its paths at the end.  The final paths will compete in the
usual way with paths built from regular planning.

The patch was originally proposed by Antonin Houska in 2017.  This
commit reworks various important aspects and rewrites most of the
current code.  However, the original patch and reviews were very
useful.

Author: Richard Guo <guofenglinux@gmail.com>
Author: Antonin Houska <ah@cybertec.at> (in an older version)
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Jian He <jian.universality@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Tomas Vondra <tomas@vondra.me> (in an older version)
Reviewed-by: Andy Fan <zhihuifan1213@163.com> (in an older version)
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> (in an older version)
Discussion: https://postgr.es/m/CAMbWs48jzLrPt1J_00ZcPZXWUQKawQOFE8ROc-ADiYqsqrpBNw@mail.gmail.com
2025-10-08 17:04:23 +09:00
64095d1574 Remove PlannerInfo's join_search_private method.
Instead, use the new mechanism that allows planner extensions to store
private state inside a PlannerInfo, treating GEQO as an in-core planner
extension.  This is a useful test of the new facility, and also buys
back a few bytes of storage.

To make this work, we must remove innerrel_is_unique_ext's hack of
testing whether join_search_private is set as a proxy for whether
the join search might be retried. Add a flag that extensions can
use to explicitly signal their intentions instead.

Reviewed-by: Andrei Lepikhov <lepihov@gmail.com>
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: http://postgr.es/m/CA+TgmoYWKHU2hKr62Toyzh-kTDEnMDeLw7gkOOnjL-TnOUq0kQ@mail.gmail.com
2025-10-07 12:43:45 -04:00
0132dddab3 Allow private state in certain planner data structures.
Extension that make extensive use of planner hooks may want to
coordinate their efforts, for example to avoid duplicate computation,
but that's currently difficult because there's no really good way to
pass data between different hooks.

To make that easier, allow for storage of extension-managed private
state in PlannerGlobal, PlannerInfo, and RelOptInfo, along very
similar lines to what we have permitted for ExplainState since commit
c65bc2e1d14a2d4daed7c1921ac518f2c5ac3d17.

Reviewed-by: Andrei Lepikhov <lepihov@gmail.com>
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: http://postgr.es/m/CA+TgmoYWKHU2hKr62Toyzh-kTDEnMDeLw7gkOOnjL-TnOUq0kQ@mail.gmail.com
2025-10-07 12:09:30 -04:00
8c49a484e8 Assign each subquery a unique name prior to planning it.
Previously, subqueries were given names only after they were planned,
which makes it difficult to use information from a previous execution of
the query to guide future planning. If, for example, you knew something
about how you want "InitPlan 2" to be planned, you won't know whether
the subquery you're currently planning will end up being "InitPlan 2"
until after you've finished planning it, by which point it's too late to
use the information that you had.

To fix this, assign each subplan a unique name before we begin planning
it. To improve consistency, use textual names for all subplans, rather
than, as we did previously, a mix of numbers (such as "InitPlan 1") and
names (such as "CTE foo"), and make sure that the same name is never
assigned more than once.

We adopt the somewhat arbitrary convention of using the type of sublink
to set the plan name; for example, a query that previously had two
expression sublinks shown as InitPlan 2 and InitPlan 1 will now end up
named expr_1 and expr_2. Because names are assigned before rather than
after planning, some of the regression test outputs show the numerical
part of the name switching positions: what was previously SubPlan 2 was
actually the first one encountered, but we finished planning it later.

We assign names even to subqueries that aren't shown as such within the
EXPLAIN output. These include subqueries that are a FROM clause item or
a branch of a set operation, rather than something that will be turned
into an InitPlan or SubPlan. The purpose of this is to make sure that,
below the topmost query level, there's always a name for each subquery
that is stable from one planning cycle to the next (assuming no changes
to the query or the database schema).

Author: Robert Haas <rhaas@postgresql.org>
Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Alexandra Wang <alexandra.wang.oss@gmail.com>
Reviewed-by: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Junwang Zhao <zhjwpku@gmail.com>
Discussion: http://postgr.es/m/3641043.1758751399@sss.pgh.pa.us
2025-10-07 09:18:54 -04:00
25a30bbd42 Add IGNORE NULLS/RESPECT NULLS option to Window functions.
Add IGNORE NULLS/RESPECT NULLS option (null treatment clause) to lead,
lag, first_value, last_value and nth_value window functions.  If
unspecified, the default is RESPECT NULLS which includes NULL values
in any result calculation. IGNORE NULLS ignores NULL values.

Built-in window functions are modified to call new API
WinCheckAndInitializeNullTreatment() to indicate whether they accept
IGNORE NULLS/RESPECT NULLS option or not (the API can be called by
user defined window functions as well).  If WinGetFuncArgInPartition's
allowNullTreatment argument is true and IGNORE NULLS option is given,
WinGetFuncArgInPartition() or WinGetFuncArgInFrame() will return
evaluated function's argument expression on specified non NULL row (if
it exists) in the partition or the frame.

When IGNORE NULLS option is given, window functions need to visit and
evaluate same rows over and over again to look for non null rows. To
mitigate the issue, 2-bit not null information array is created while
executing window functions to remember whether the row has been
already evaluated to NULL or NOT NULL. If already evaluated, we could
skip the evaluation work, thus we could get better performance.

Author: Oliver Ford <ojford@gmail.com>
Co-authored-by: Tatsuo Ishii <ishii@postgresql.org>
Reviewed-by: Krasiyan Andreev <krasiyan@gmail.com>
Reviewed-by: Andrew Gierth <andrew@tao11.riddles.org.uk>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: David Fetter <david@fetter.org>
Reviewed-by: Vik Fearing <vik@postgresfriends.org>
Reviewed-by: "David G. Johnston" <david.g.johnston@gmail.com>
Reviewed-by: Chao Li <lic@highgo.com>
Discussion: https://postgr.es/m/flat/CAGMVOdsbtRwE_4+v8zjH1d9xfovDeQAGLkP_B6k69_VoFEgX-A@mail.gmail.com
2025-10-03 09:47:36 +09:00
3bf31dd243 Do a tiny bit of header file maintenance
Stop including utils/relcache.h in access/genam.h, and stop including
htup_details.h in nodes/tidbitmap.h.  Both these files (genam.h and
tidbitmap.h) are widely used in other header files, so it's in our best
interest that they remain as lean as reasonable.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/202509291356.o5t6ny2hoa3q@alvherre.pgsql
2025-09-30 12:28:29 +02:00
ef38a4d975 Add GROUP BY ALL.
GROUP BY ALL is a form of GROUP BY that adds any TargetExpr that does
not contain an aggregate or window function into the groupClause of
the query, making it exactly equivalent to specifying those same
expressions in an explicit GROUP BY list.

This feature is useful for certain kinds of data exploration.  It's
already present in some other DBMSes, and the SQL committee recently
accepted it into the standard, so we can be reasonably confident in
the syntax being stable.  We do have to invent part of the semantics,
as the standard doesn't allow for expressions in GROUP BY, so they
haven't specified what to do with window functions.  We assume that
those should be treated like aggregates, i.e., left out of the
constructed GROUP BY list.

In passing, wordsmith some existing documentation about GROUP BY,
and update some neglected synopsis entries in select_into.sgml.

Author: David Christensen <david@pgguru.net>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAHM0NXjz0kDwtzoe-fnHAqPB1qA8_VJN0XAmCgUZ+iPnvP5LbA@mail.gmail.com
2025-09-29 16:55:17 -04:00
66cdef4425 Remove unused for_all_tables field from AlterPublicationStmt.
No backpatch as AlterPublicationStmt struct is exposed.

Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Discussion: https://postgr.es/m/CAD21AoC6B6AuxWOST-TkxUbDgp8FwX=BLEJZmKLG_VrH-hfxpA@mail.gmail.com
2025-09-26 09:23:00 -07:00
81fc3e28e3 Update some more forward declarations to use typedef
As commit d4d1fc527bdb.

Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/202509191025.22agk3fvpilc@alvherre.pgsql
2025-09-25 14:33:19 +02:00
f2bae51dfd Keep track of what RTIs a Result node is scanning.
Result nodes now include an RTI set, which is only non-NULL when they
have no subplan, and is taken from the relid set of the RelOptInfo that
the Result is generating. ExplainPreScanNode now takes notice of these
RTIs, which means that a few things get schema-qualified in the
regression tests that previously did not. This makes the output more
consistent between cases where some part of the plan tree is replaced by
a Result node and those where this does not happen.

Likewise, pg_overexplain's EXPLAIN (RANGE_TABLE) now displays the RTIs
stored in a Result node just as it already does for other RTI-bearing
node types.

Result nodes also now include a result_reason, which tells us something
about why the Result node was inserted.  Using that information, EXPLAIN
now emits, where relevant, a "Replaces" line describing the origin of
a Result node.

The purpose of these changes is to allow code that inspects a Plan
tree to understand the origin of Result nodes that appear therein.

Discussion: http://postgr.es/m/CA+TgmoYeUZePZWLsSO+1FAN7UPePT_RMEZBKkqYBJVCF1s60=w@mail.gmail.com
Reviewed-by: Alexandra Wang <alexandra.wang.oss@gmail.com>
Reviewed-by: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Junwang Zhao <zhjwpku@gmail.com>
2025-09-23 09:07:55 -04:00
e3a0304eba Fix misleading comment in RangeTblEntry
The comment describing join_using_alias incorrectly referred to the
alias field as being defined "below", when it actually appears earlier
in the RangeTblEntry struct.  This patch fixes that.

Author: Steve Lau <stevelauc@outlook.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/TYWPR01MB10612B020C33FD08F729415CEB613A@TYWPR01MB10612.jpnprd01.prod.outlook.com
2025-09-22 10:04:39 +09:00
d4d1fc527b Update various forward declarations to use typedef
There are a number of forward declarations that use struct but not the
customary typedef, because that could have led to repeat typedefs,
which was not allowed.  This is now allowed in C11, so we can update
these to provide the typedefs as well, so that the later uses of the
types look more consistent.

Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/10d32190-f31b-40a5-b177-11db55597355@eisentraut.org
2025-09-15 11:04:10 +02:00
1e3b5edb8e Remove workarounds against repeat typedefs
This is allowed in C11, so we don't need the workarounds anymore.

Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/10d32190-f31b-40a5-b177-11db55597355@eisentraut.org
2025-09-15 11:04:10 +02:00
bdc6cfcd12 Eliminate duplicative hashtempcxt in nodeSubplan.c.
Instead of building a separate memory context that's used just
for running hash functions, make the hash functions run in the
per-tuple context of the node's innerecontext.  This saves a
little space at runtime, and it avoids needing to reset two
contexts instead of one inside buildSubPlanHash's main loop.

This largely reverts commit 133924e13.  That's safe to do now
because bf6c614a2 decoupled the evaluation context used by
TupleHashTableMatch from that used for hash function evaluation,
so that there's no longer a risk of resetting the innerecontext
too soon.

Per discussion of bug #19040, although this is not directly
a fix for that.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Haiyang Li <mohen.lhy@alibaba-inc.com>
Reviewed-by: Fei Changhong <feichanghong@qq.com>
Discussion: https://postgr.es/m/19040-c9b6073ef814f48c@postgresql.org
2025-09-10 16:15:08 -04:00
24225ad9aa Pathify RHS unique-ification for semijoin planning
There are two implementation techniques for semijoins: one uses the
JOIN_SEMI jointype, where the executor emits at most one matching row
per left-hand side (LHS) row; the other unique-ifies the right-hand
side (RHS) and then performs a plain inner join.

The latter technique currently has some drawbacks related to the
unique-ification step.

* Only the cheapest-total path of the RHS is considered during
unique-ification.  This may cause us to miss some optimization
opportunities; for example, a path with a better sort order might be
overlooked simply because it is not the cheapest in total cost.  Such
a path could help avoid a sort at a higher level, potentially
resulting in a cheaper overall plan.

* We currently rely on heuristics to choose between hash-based and
sort-based unique-ification.  A better approach would be to generate
paths for both methods and allow add_path() to decide which one is
preferable, consistent with how path selection is handled elsewhere in
the planner.

* In the sort-based implementation, we currently pay no attention to
the pathkeys of the input subpath or the resulting output.  This can
result in redundant sort nodes being added to the final plan.

This patch improves semijoin planning by creating a new RelOptInfo for
the RHS rel to represent its unique-ified version.  It then generates
multiple paths that represent elimination of distinct rows from the
RHS, considering both a hash-based implementation using the cheapest
total path of the original RHS rel, and sort-based implementations
that either exploit presorted input paths or explicitly sort the
cheapest total path.  All resulting paths compete in add_path(), and
those deemed worthy of consideration are added to the new RelOptInfo.
Finally, the unique-ified rel is joined with the other side of the
semijoin using a plain inner join.

As a side effect, most of the code related to the JOIN_UNIQUE_OUTER
and JOIN_UNIQUE_INNER jointypes -- used to indicate that the LHS or
RHS path should be made unique -- has been removed.  Besides, the
T_Unique path now has the same meaning for both semijoins and upper
DISTINCT clauses: it represents adjacent-duplicate removal on
presorted input.  This patch unifies their handling by sharing the
same data structures and functions.

This patch also removes the UNIQUE_PATH_NOOP related code along the
way, as it is dead code -- if the RHS rel is provably unique, the
semijoin should have already been simplified to a plain inner join by
analyzejoins.c.

Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Alexandra Wang <alexandra.wang.oss@gmail.com>
Reviewed-by: wenhui qiu <qiuwenhuifx@gmail.com>
Discussion: https://postgr.es/m/CAMbWs4-EBnaRvEs7frTLbsXiweSTUXifsteF-d3rvv01FKO86w@mail.gmail.com
2025-08-19 09:35:40 +09:00
5a8ab650a7 Update obsolete comments in ResultRelInfo struct.
Commit c5b7ba4e6 changed things so that the ri_RootResultRelInfo field
of this struct is set for both partitions and inheritance children and
used for tuple routing and transition capture (before that commit, it
was only set for partitions to route tuples into), but failed to update
these comments.

Author: Etsuro Fujita <etsuro.fujita@gmail.com>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Discussion: https://postgr.es/m/CAPmGK14NF5CcdCmTZpxrvpvBiT0y4EqKikW1r_wAu1CEHeOmUA%40mail.gmail.com
Backpatch-through: 14
2025-08-17 19:40:00 +09:00
2a600a93c7 Make type Datum be 8 bytes wide everywhere.
This patch makes sizeof(Datum) be 8 on all platforms including
32-bit ones.  The objective is to allow USE_FLOAT8_BYVAL to be true
everywhere, and in consequence to remove a lot of code that is
specific to pass-by-reference handling of float8, int8, etc.  The
code for abbreviated sort keys can be simplified similarly.  In this
way we can reduce the maintenance effort involved in supporting 32-bit
platforms, without going so far as to actually desupport them.  Since
Datum is strictly an in-memory concept, this has no impact on on-disk
storage, though an initdb or pg_upgrade will be needed to fix affected
catalog entries.

We have required platforms to support [u]int64 for ages, so this
breaks no supported platform.  We can expect that this change will
make 32-bit builds a bit slower and more memory-hungry, although being
able to use pass-by-value handling of 8-byte types may buy back some
of that.  But we stopped optimizing for 32-bit cases a long time ago,
and this seems like just another step on that path.

This initial patch simply forces the correct type definition and
USE_FLOAT8_BYVAL setting, and cleans up a couple of minor compiler
complaints that ensued.  This is sufficient for testing purposes.
In the wake of a bunch of Datum-conversion cleanups by Peter
Eisentraut, this now compiles cleanly with gcc on a 32-bit platform.
(I'd only tested the previous version with clang, which it turns out
is less picky than gcc about width-changing coercions.)

There is a good deal of now-dead code that I'll remove in separate
follow-up patches.

A catversion bump is required because this affects initial catalog
contents (on 32-bit machines) in two ways: pg_type.typbyval changes
for some built-in types, and Const nodes in stored views/rules will
now have 8 bytes not 4 for pass-by-value types.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/1749799.1752797397@sss.pgh.pa.us
2025-08-13 17:18:22 -04:00
e125e36002 Rename CachedPlanType to PlannedStmtOrigin for PlannedStmt
Commit 719dcf3c42 introduced a field called CachedPlanType in
PlannedStmt to allow extensions to determine whether a cached plan is
generic or custom.

After discussion, the concepts that we want to track are a bit wider
than initially anticipated, as it is closer to knowing from which
"source" or "origin" a PlannedStmt has been generated or retrieved.
Custom and generic cached plans are a subset of that.

Based on the state of HEAD, we have been able to define two more
origins:
- "standard", for the case where PlannedStmt is generated in
standard_planner(), the most common case.
- "internal", for the fake PlannedStmt generated internally by some
query patterns.

This could be tuned in the future depending on what is needed.  This
looks like a good starting point, at least.  The default value is called
"UNKNOWN", provided as fallback value.  This value is not used in the
core code, the idea is to let extensions building their own PlannedStmts
know about this new field.

Author: Michael Paquier <michael@paquier.xyz>
Co-authored-by: Sami Imseih <samimseih@gmail.com>
Discussion: https://postgr.es/m/aILaHupXbIGgF2wJ@paquier.xyz
2025-07-31 10:06:34 +09:00
4bc62b8684 Display Memoize planner estimates in EXPLAIN
There've been a few complaints that it can be overly difficult to figure
out why the planner picked a Memoize plan.  To help address that, here we
adjust the EXPLAIN output to display the following additional details:

1) The estimated number of cache entries that can be stored at once
2) The estimated number of unique lookup keys that we expect to see
3) The number of lookups we expect
4) The estimated hit ratio

Technically #4 can be calculated using #1, #2 and #3, but it's not a
particularly obvious calculation, so we opt to display it explicitly.
The original patch by Lukas Fittl only displayed the hit ratio, but
there was a fear that might lead to more questions about how that was
calculated.  The idea with displaying all 4 is to be transparent which
may allow queries to be tuned more easily.  For example, if #2 isn't
correct then maybe extended statistics or a manual n_distinct estimate can
be used to help fix poor plan choices.

Author: Ilia Evdokimov <ilya.evdokimov@tantorlabs.com>
Author: Lukas Fittl <lukas@fittl.com>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Andrei Lepikhov <lepihov@gmail.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/CAP53Pky29GWAVVk3oBgKBDqhND0BRBN6yTPeguV_qSivFL5N_g%40mail.gmail.com
2025-07-29 15:18:01 +12:00
719dcf3c42 Introduce field tracking cached plan type in PlannedStmt
PlannedStmt gains a new field, called CachedPlanType, able to track if a
given plan tree originates from the cache and if we are dealing with a
generic or custom cached plan.

This field can be used for monitoring or statistical purposes, in the
executor hooks, for example, based on the planned statement attached to
a QueryDesc.  A patch is under discussion for pg_stat_statements to
provide an equivalent of the counters in pg_prepared_statements for
custom and generic plans, to provide a more global view of such data, as
this data is now restricted to the current session.

The concept introduced in this commit is useful on its own, and has been
extracted from a larger patch by the same author.

Author: Sami Imseih <samimseih@gmail.com>
Reviewed-by: Andrei Lepikhov <lepihov@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CAA5RZ0uFw8Y9GCFvafhC=OA8NnMqVZyzXPfv_EePOt+iv1T-qQ@mail.gmail.com
2025-07-24 15:41:18 +09:00
e2debb6438 Reduce "Var IS [NOT] NULL" quals during constant folding
In commit b262ad440, we introduced an optimization that reduces an IS
[NOT] NULL qual on a NOT NULL column to constant true or constant
false, provided we can prove that the input expression of the NullTest
is not nullable by any outer joins or grouping sets.  This deduction
happens quite late in the planner, during the distribution of quals to
rels in query_planner.  However, this approach has some drawbacks: we
can't perform any further folding with the constant, and it turns out
to be prone to bugs.

Ideally, this deduction should happen during constant folding.
However, the per-relation information about which columns are defined
as NOT NULL is not available at that point.  This information is
currently collected from catalogs when building RelOptInfos for base
or "other" relations.

This patch moves the collection of NOT NULL attribute information for
relations before pull_up_sublinks, storing it in a hash table keyed by
relation OID.  It then uses this information to perform the NullTest
deduction for Vars during constant folding.  This also makes it
possible to leverage this information to pull up NOT IN subqueries.

Note that this patch does not get rid of restriction_is_always_true
and restriction_is_always_false.  Removing them would prevent us from
reducing some IS [NOT] NULL quals that we were previously able to
reduce, because (a) the self-join elimination may introduce new IS NOT
NULL quals after constant folding, and (b) if some outer joins are
converted to inner joins, previously irreducible NullTest quals may
become reducible.

Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAMbWs4-bFJ1At4btk5wqbezdu8PLtQ3zv-aiaY3ry9Ymm=jgFQ@mail.gmail.com
2025-07-22 11:21:36 +09:00
a4f126516e Add option list to CHECKPOINT command.
This commit adds the boilerplate code for supporting a list of
options in CHECKPOINT commands.  No actual options are supported
yet, but follow-up commits will add support for MODE and
FLUSH_UNLOGGED.  While at it, this commit refactors the code for
executing CHECKPOINT commands to its own function since it's about
to become significantly larger.

Author: Christoph Berg <myon@debian.org>
Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com>
Discussion: https://postgr.es/m/aDnaKTEf-0dLiEfz%40msg.df7cb.de
2025-07-11 11:51:25 -05:00
8ec04c8577 Refactor subtype field of AlterDomainStmt
AlterDomainStmt.subtype used characters for its subtypes of commands,
SET|DROP DEFAULT|NOT NULL and ADD|DROP|VALIDATE CONSTRAINT, which were
hardcoded in a couple of places of the code.  The code is improved by
using an enum instead, with the same character values as the original
code.

Note that the field was documented in parsenodes.h and that it forgot to
mention 'V' (VALIDATE CONSTRAINT).

Author: Quan Zongliang <quanzongliang@yeah.net>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: wenhui qiu <qiuwenhuifx@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Discussion: https://postgr.es/m/41ff310b-16bd-44b9-a3ef-97e20f14b709@yeah.net
2025-07-03 16:34:28 +09:00
f039c22441 meson: Increase minimum version to 0.57.2
The previous minimum was to maintain support for Python 3.5, but we
now require Python 3.6 anyway (commit 45363fca637), so that reason is
obsolete.  A small raise to Meson 0.57 allows getting rid of a fair
amount of version conditionals and silences some future-deprecated
warnings.

With the version bump, the following deprecation warnings appeared and
are fixed:

WARNING: Project targets '>=0.57' but uses feature deprecated since '0.55.0': ExternalProgram.path. use ExternalProgram.full_path() instead
WARNING: Project targets '>=0.57' but uses feature deprecated since '0.56.0': meson.build_root. use meson.project_build_root() or meson.global_build_root() instead.

It turns out that meson 0.57.0 and 0.57.1 are buggy for our use, so
the minimum is actually set to 0.57.2.  This is specific to this
version series; in the future we won't necessarily need to be this
precise.

Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/flat/42e13eb0-862a-441e-8d84-4f0fd5f6def0%40eisentraut.org
2025-07-02 11:14:53 +02:00
de5aa15209 Reformat some node comments
Use per-field comments for IndexInfo, instead of one big header
comment listing all the fields.  This makes the relevant comments
easier to find, and it will also make it less likely that comments are
not updated when fields are added or removed, as has happened in the
past.

Author: Japin Li <japinli@hotmail.com>
Discussion: https://www.postgresql.org/message-id/flat/ME0P300MB04453E6C7EA635F0ECF41BFCB6832%40ME0P300MB0445.AUSP300.PROD.OUTLOOK.COM
2025-07-02 09:50:51 +02:00
bee23ea4dd Show sizes of FETCH queries as constants in pg_stat_statements
Prior to this patch, every FETCH call would generate a unique queryId
with a different size specified.  Depending on the workloads, this could
lead to a significant bloat in pg_stat_statements, as repeatedly calling
a specific cursor would result in a new queryId each time.  For example,
FETCH 1 c1; and FETCH 2 c1; would produce different queryIds.

This patch improves the situation by normalizing the fetch size, so as
semantically similar statements generate the same queryId.  As a result,
statements like the below, which differ syntactically but have the same
effect, will now share a single queryId:
FETCH FROM c1
FETCH NEXT c1
FETCH 1 c1

In order to do a normalization based on the keyword used in FETCH,
FetchStmt is tweaked with a new FetchDirectionKeywords.  This matters
for "howMany", which could be set to a negative value depending on the
direction, and we want to normalize the queries with enough information
about the direction keywords provided, including RELATIVE, ABSOLUTE or
all the ALL variants.

Author: Sami Imseih <samimseih@gmail.com>
Discussion: https://postgr.es/m/CAA5RZ0tA6LbHCg2qSS+KuM850BZC_+ZgHV7Ug6BXw22TNyF+MA@mail.gmail.com
2025-07-02 08:39:25 +09:00
184595836b Update comment for IndexInfo.ii_NullsNotDistinct
Commit 7a7b3e11e61 added the ii_NullsNotDistinct field, but the
comment was not updated.

Author: Japin Li <japinli@hotmail.com>
Reviewed-by: Richard Guo <guofenglinux@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/ME0P300MB04453E6C7EA635F0ECF41BFCB6832%40ME0P300MB0445.AUSP300.PROD.OUTLOOK.COM
2025-07-01 23:12:24 +02:00
7a7b3e11e6 Update comment for IndexInfo.ii_WithoutOverlaps
Commit fc0438b4e80 added the ii_WithoutOverlaps field, but the comment
was not updated.

Author: Japin Li <japinli@hotmail.com>
Reviewed-by: Richard Guo <guofenglinux@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/ME0P300MB04453E6C7EA635F0ECF41BFCB6832%40ME0P300MB0445.AUSP300.PROD.OUTLOOK.COM
2025-07-01 20:37:24 +02:00
9e5fee8228 Fix outdated comment for IndexInfo
Commit 78416235713 removed the ii_OpclassOptions field, but the
comment was not updated.

Author: Japin Li <japinli@hotmail.com>
Reviewed-by: Richard Guo <guofenglinux@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/ME0P300MB04453E6C7EA635F0ECF41BFCB6832%40ME0P300MB0445.AUSP300.PROD.OUTLOOK.COM
2025-07-01 20:12:36 +02:00
c2da1a5d63 Make query jumbling also squash PARAM_EXTERN params
Commit 62d712ecfd94 made query jumbling squash lists of Consts as a
single element, but there's no reason not to treat PARAM_EXTERN
parameters the same.  For these purposes, these values are indeed
constants for any particular execution of a query.

In particular, this should make list squashing more useful for
applications using extended query protocol, which would use parameters
extensively.

A complication arises: if a query has both external parameters and
squashable lists, then the parameter number used as placeholder for the
squashed list might be inconsistent with regards to the parameter
numbers used by the query literal.  To reduce the surprise factor, all
parameters are renumbered starting from 1 in that case.

Author: Sami Imseih <samimseih@gmail.com>
Author: Dmitry Dolgov <9erthalion6@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CAA5RZ0tRXoPG2y6bMgBCWNDt0Tn=unRerbzYM=oW0syi1=C1OA@mail.gmail.com
2025-06-24 19:36:32 +02:00
0f65f3eec4 Fix squashing algorithm for query texts
The algorithm to squash lists of constants added by commit 62d712ecfd94
was a bit too simplistic; we wanted to avoid adding unnecessary
complexity, but cases like direct function calls of typecasting
functions (and others) were missed, and bogus SQL syntax was being shown
in pg_stat_statements normalized query text field.  To fix normalization
for those cases, we need the parser to transmit information about were
each list of constant values starts and ends, so add that to a couple of
nodes.  Also add a few more test cases to make sure we're doing the
right thing.

The patch initially submitted by Sami added a new private struct in
gram.y to carry the start/end information for A_Expr, but I (Álvaro)
decided that a better fix was to remove the parser indirection via the
in_expr production, and instead create separate components in the a_expr
rule.  I'm surprised that this works and doesn't require more changes,
but I assume (without checking) that the grammar used to be more complex
and got simplified at some point.

Bump catversion.

Author: Sami Imseih <samimseih@gmail.com>
Author: Dmitry Dolgov <9erthalion6@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CAA5RZ0tRXoPG2y6bMgBCWNDt0Tn=unRerbzYM=oW0syi1=C1OA@mail.gmail.com
2025-06-12 14:21:21 +02:00
f85f6ab051 Revert support for improved tracking of nested queries
This commit reverts the two following commits:
- 499edb09741b, track more precisely query locations for nested
statements.
- 06450c7b8c70, a follow-up fix of 499edb09741b with query locations.
The test introduced in this commit is not reverted.  This is proving
useful to track a problem that only pgaudit was able to detect.

These prove to have issues with the tracking of SELECT statements, when
these use multiple parenthesis which is something supported by the
grammar.  Incorrect location and lengths are causing pg_stat_statements
to become confused, failing its job in query normalization with
potential out-of-bound writes because the location and the length may
not match with what can be handled.  A lot of the query patterns
discussed when this issue was reported have no test coverage in the main
regression test suite, or the recovery test 027_stream_regress.pl would
have caught the problems as pg_stat_statements is loaded by the node
running the regression tests.  A first step would be to improve the test
coverage to stress more the query normalization logic.

A different portion of this work was done in 45e0ba30fc40, with the
addition of tests for nested queries.  These can be left in the tree.
They are useful to track the way inner queries are currently tracked by
PGSS with non-top-level entries, and will be useful when reconsidering
in the future the work reverted here.

Reported-by: Alexander Kozhemyakin <a.kozhemyakin@postgrespro.ru>
Discussion: https://postgr.es/m/18947-cdd2668beffe02bf@postgresql.org
2025-06-12 10:08:55 +09:00
e050af2868 Change internal plan ID type from uint64 to int64
uint64 was chosen to be consistent with the type used by the query ID,
but the conclusion of a recent discussion for the query ID is that int64
is a better fit as the signed form is shown to the user, for PGSS or
EXPLAIN outputs.

This commit changes the plan ID to use int64, following c3eda50b0648
that has done the same for the query ID.

The plan ID is new to v18, introduced in 2a0cd38da5cc.

Author: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Sami Imseih <samimseih@gmail.com>
Discussion: https://postgr.es/m/aCvzJNwetyEI3Sgo@paquier.xyz
2025-05-31 09:40:45 +09:00
c3eda50b06 Change internal queryid type from uint64 to int64
uint64 was perhaps chosen in cff440d36 as the type was uint32 prior to
that widening work.

Having this as uint64 doesn't make much sense and just adds the overhead of
having to remember that we always output this in its signed form.  Let's
remove that overhead.

The signed form output is seemingly required since we have no way to
represent the full range of uint64 in an SQL type.  We use BIGINT in places
like pg_stat_statements, which maps directly to int64.

The release notes "Source Code" section may want to mention this
adjustment as some extensions may wish to adjust their code.

Author: David Rowley <dgrowleyml@gmail.com>
Suggested-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Sami Imseih <samimseih@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/50cb0c8b-994b-48f9-a1c4-13039eb3536b@eisentraut.org
2025-05-30 22:59:39 +12:00
1722d5eb05 Revert "Don't lock partitions pruned by initial pruning"
As pointed out by Tom Lane, the patch introduced fragile and invasive
design around plan invalidation handling when locking of prunable
partitions was deferred from plancache.c to the executor. In
particular, it violated assumptions about CachedPlan immutability and
altered executor APIs in ways that are difficult to justify given the
added complexity and overhead.

This also removes the firstResultRels field added to PlannedStmt in
commit 28317de72, which was intended to support deferred locking of
certain ModifyTable result relations.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/605328.1747710381@sss.pgh.pa.us
2025-05-22 17:02:35 +09:00
c06e909c26 Track the number of presorted outer pathkeys in MergePath
When creating an explicit Sort node for the outer path of a mergejoin,
we need to determine the number of presorted keys of the outer path to
decide whether explicit incremental sort can be applied.  Currently,
this is done by repeatedly calling pathkeys_count_contained_in.

This patch caches the number of presorted outer pathkeys in MergePath,
allowing us to save several calls to pathkeys_count_contained_in.  It
can be considered a complement to the changes in commit 828e94c9d.

Reported-by: David Rowley <dgrowleyml@gmail.com>
Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Discussion: https://postgr.es/m/CAApHDvqvBireB_w6x8BN5txdvBEHxVgZBt=rUnpf5ww5P_E_ww@mail.gmail.com
2025-05-08 18:21:32 +09:00
88e947136b Fix typos and grammar in the code
The large majority of these have been introduced by recent commits done
in the v18 development cycle.

Author: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/9a7763ab-5252-429d-a943-b28941e0e28b@gmail.com
2025-04-19 19:17:42 +09:00
d69d45a5a9 Speedup child EquivalenceMember lookup in planner
When planning queries to partitioned tables, we clone all
EquivalenceMembers belonging to the partitioned table into em_is_child
EquivalenceMembers for each non-pruned partition.  For partitioned tables
with large numbers of partitions, this meant the ec_members list could
become large and code searching that list would become slow.  Effectively,
the more partitions which were present, the more searches needed to be
performed for operations such as find_ec_member_matching_expr() during
create_plan() and the more partitions present, the longer these searches
would take, i.e., a quadratic slowdown.

To fix this, here we adjust how we store EquivalenceMembers for
em_is_child members.  Instead of storing these directly in ec_members,
these are now stored in a new array of Lists in the EquivalenceClass,
which is indexed by the relid.  When we want to find EquivalenceMembers
belonging to a certain child relation, we can narrow the search to the
array element for that relation.

To make EquivalenceMember lookup easier and to reduce the amount of code
change, this commit provides a pair of functions to allow iteration over
the EquivalenceMembers of an EC which also handles finding the child
members, if required.  Callers that never need to look at child members
can remain using the foreach loop over ec_members, which will now often
be faster due to only parent-level members being stored there.

The actual performance increases here are highly dependent on the number
of partitions and the query being planned.  Performance increases can be
visible with as few as 8 partitions, but the speedup is marginal for
such low numbers of partitions.  The speedups become much more visible
with a few dozen to hundreds of partitions.  With some tested queries
using 56 partitions, the planner was around 3x faster than before.  For
use cases with thousands of partitions, these are likely to become
significantly faster.  Some testing has shown planner speedups of 60x or
more with 8192 partitions.

Author: Yuya Watari <watari.yuya@gmail.com>
Co-authored-by: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Andrey Lepikhov <a.lepikhov@postgrespro.ru>
Reviewed-by: Alena Rybakina <lena.ribackina@yandex.ru>
Reviewed-by: Dmitry Dolgov <9erthalion6@gmail.com>
Reviewed-by: Amit Langote <amitlangote09@gmail.com>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Tested-by: Thom Brown <thom@linux.com>
Tested-by: newtglobal postgresql_contributors <postgresql_contributors@newtglobalcorp.com>
Discussion: https://postgr.es/m/CAJ2pMkZNCgoUKSE%2B_5LthD%2BKbXKvq6h2hQN8Esxpxd%2Bcxmgomg%40mail.gmail.com
2025-04-08 18:09:57 +12:00
8123e91f5a Convert PathKey to use CompareType
Change the PathKey struct to use CompareType to record the sort
direction instead of hardcoding btree strategy numbers.  The
CompareType is then converted to the index-type-specific strategy when
the plan is created.

This reduces the number of places btree strategy numbers are
hardcoded, and it's a self-contained subset of a larger effort to
allow non-btree indexes to behave like btrees.

Author: Mark Dilger <mark.dilger@enterprisedb.com>
Co-authored-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com
2025-04-04 11:22:20 +02:00
88f55bc976 Make derived clause lookup in EquivalenceClass more efficient
Derived clauses are stored in ec_derives, a List of RestrictInfos.
These clauses are later looked up by matching the left and right
EquivalenceMembers along with the clause's parent EC.

This linear search becomes expensive in queries with many joins or
partitions, where ec_derives may contain thousands of entries. In
particular, create_join_clause() can spend significant time scanning
this list.

To improve performance, introduce a hash table (ec_derives_hash) that
is built when the list reaches 32 entries -- the same threshold used
for join_rel_hash. The original list is retained alongside the hash
table to support EC merging and serialization
(_outEquivalenceClass()).

Each clause is stored in the hash table using a canonicalized key: the
EquivalenceMember with the lower memory address is placed in the key
before the one with the higher memory address. This avoids storing or
searching for both permutations of the same clause. For clauses
involving a constant EM, the key places NULL in the first slot and the
non-constant EM in the second.

The hash table is initialized using list_length(ec_derives_list) as
the size hint. simplehash internally adjusts this to the next power of
two after dividing by the fillfactor, so this typically results in at
least 64 buckets near the threshold -- avoiding immediate resizing
while adapting to the actual number of entries.

The lookup logic for derived clauses is now centralized in
ec_search_derived_clause_for_ems(), which consults the hash table when
available and falls back to the list otherwise.

The new ec_clear_derived_clauses() always frees ec_derives_list, even
though some of the original code paths that cleared the old
ec_derives field did not. This ensures consistent cleanup and avoids
leaking memory when large lists are discarded.

An assertion originally placed in find_derived_clause_for_ec_member()
is moved into ec_search_derived_clause_for_ems() so that it is
enforced consistently, regardless of whether the hash table or list is
used for lookup.

This design incorporates suggestions by David Rowley, who proposed
both the key canonicalization and the initial sizing approach to
balance memory usage and CPU efficiency.

Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Reviewed-by: Amit Langote <amitlangote09@gmail.com>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Tested-by: Dmitry Dolgov <9erthalion6@gmail.com>
Tested-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Tested-by: Amit Langote <amitlangote09@gmail.com>
Tested-by: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/CAExHW5vZiQtWU6moszLP5iZ8gLX_ZAUbgEX0DxGLx9PGWCtqUg@mail.gmail.com
2025-04-04 10:45:05 +09:00
748e98d05b Fix code comment
The changes made in commit d2b4b4c2259 contained incorrect comments:
They said that certain forward declarations were necessary to "avoid
including pathnodes.h here", but the file is itself pathnodes.h!  So
change the comment to just say it's a forward declaration in one case,
and in the other case we don't need the declaration at all because it
already appeared earlier in the file.
2025-04-02 14:46:47 +02:00
eec0040c4b Add support for NOT ENFORCED in foreign key constraints
This expands the NOT ENFORCED constraint flag, previously only
supported for CHECK constraints (commit ca87c415e2f), to foreign key
constraints.

Normally, when a foreign key constraint is created on a table, action
and check triggers are added to maintain data integrity.  With this
patch, if a constraint is marked as NOT ENFORCED, integrity checks are
no longer required, making these triggers unnecessary.  Consequently,
when creating a NOT ENFORCED foreign key constraint, triggers will not
be created, and the constraint will be marked as NOT VALID.
Similarly, if an existing foreign key constraint is changed to NOT
ENFORCED, the associated triggers will be dropped, and the constraint
will also be marked as NOT VALID.  Conversely, if a NOT ENFORCED
foreign key constraint is changed to ENFORCED, the necessary triggers
will be created, and the will be changed to VALID by performing
necessary validation.

Since not-enforced foreign key constraints have no triggers, the
shortcut used for example in psql and pg_dump to skip looking for
foreign keys if the relation is known not to have triggers no longer
applies.  (It already didn't work for partitioned tables.)

Author: Amul Sul <sulamul@gmail.com>
Reviewed-by: Joel Jacobson <joel@compiler.org>
Reviewed-by: Andrew Dunstan <andrew@dunslane.net>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: jian he <jian.universality@gmail.com>
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Reviewed-by: Isaac Morland <isaac.morland@gmail.com>
Reviewed-by: Alexandra Wang <alexandra.wang.oss@gmail.com>
Tested-by: Triveni N <triveni.n@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/CAAJ_b962c5AcYW9KUt_R_ER5qs3fUGbe4az-SP-vuwPS-w-AGA@mail.gmail.com
2025-04-02 13:36:44 +02:00
cdc168ad4b Add support for not-null constraints on virtual generated columns
This was left out of the original patch for virtual generated columns
(commit 83ea6c54025).

This just involves a bit of extra work in the executor to expand the
generation expressions and run a "IS NOT NULL" test against them.

There is also a bit of work to make sure that not-null constraints are
checked during a table rewrite.

Author: jian he <jian.universality@gmail.com>
Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com>
Reviewed-by: Navneet Kumar <thanit3111@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/CACJufxHArQysbDkWFmvK+D1TPHQWWTxWN15cMuUaTYX3xhQXgg@mail.gmail.com
2025-03-28 13:53:37 +01:00