Commit Graph

1045 Commits

Author SHA1 Message Date
bb437f995d Add TID Range Scans to support efficient scanning ranges of TIDs
This adds a new executor node named TID Range Scan.  The query planner
will generate paths for TID Range scans when quals are discovered on base
relations which search for ranges on the table's ctid column.  These
ranges may be open at either end. For example, WHERE ctid >= '(10,0)';
will return all tuples on page 10 and over.

To support this, two new optional callback functions have been added to
table AM.  scan_set_tidrange is used to set the scan range to just the
given range of TIDs.  scan_getnextslot_tidrange fetches the next tuple
in the given range.

For AMs were scanning ranges of TIDs would not make sense, these functions
can be set to NULL in the TableAmRoutine.  The query planner won't
generate TID Range Scan Paths in that case.

Author: Edmund Horner, David Rowley
Reviewed-by: David Rowley, Tomas Vondra, Tom Lane, Andres Freund, Zhihong Yu
Discussion: https://postgr.es/m/CAMyN-kB-nFTkF=VA_JPwFNo08S0d-Yk0F741S2B7LDmYAi8eyA@mail.gmail.com
2021-02-27 22:59:36 +13:00
f003a7522b Remove [Merge]AppendPath.partitioned_rels.
It turns out that the calculation of [Merge]AppendPath.partitioned_rels
in allpaths.c is faulty and sometimes omits relevant non-leaf partitions,
allowing an assertion added by commit a929e17e5a8 to trigger.  Rather
than fix that, it seems better to get rid of those fields altogether.
We don't really need the info until create_plan time, and calculating
it once for the selected plan should be cheaper than calculating it
for each append path we consider.

The preceding two commits did away with all use of the partitioned_rels
values; this commit just mechanically removes the fields and the code
that calculated them.

Discussion: https://postgr.es/m/87sg8tqhsl.fsf@aurora.ydns.eu
Discussion: https://postgr.es/m/CAJKUy5gCXDSmFs2c=R+VGgn7FiYcLCsEFEuDNNLGfoha=pBE_g@mail.gmail.com
2021-02-01 14:43:54 -05:00
5076f88bc9 Remove incidental dependencies on partitioned_rels lists.
It turns out that the calculation of [Merge]AppendPath.partitioned_rels
in allpaths.c is faulty and sometimes omits relevant non-leaf partitions,
allowing an assertion added by commit a929e17e5a8 to trigger.  Rather
than fix that, it seems better to get rid of those fields altogether.
We don't really need the info until create_plan time, and calculating
it once for the selected plan should be cheaper than calculating it
for each append path we consider.

This patch undoes a couple of very minor uses of the partitioned_rels
values.

createplan.c was testing for nil-ness to optimize away the preparatory
work for make_partition_pruneinfo().  That is worth doing if the check
is nigh free, but it's not worth going to any great lengths to avoid.

create_append_path() was testing for nil-ness as part of deciding how
to set up ParamPathInfo for an AppendPath.  I replaced that with a
check for the appendrel's parent rel being partitioned.  That's not
quite the same thing but should cover most cases.  If we note any
interesting loss of optimizations, we can dumb this down to just
always use the more expensive method when the parent is a baserel.

Discussion: https://postgr.es/m/87sg8tqhsl.fsf@aurora.ydns.eu
Discussion: https://postgr.es/m/CAJKUy5gCXDSmFs2c=R+VGgn7FiYcLCsEFEuDNNLGfoha=pBE_g@mail.gmail.com
2021-02-01 14:34:59 -05:00
360bd2321b Fix error with CREATE PUBLICATION, wal_level=minimal, and new tables.
CREATE PUBLICATION has failed spuriously when applied to a permanent
relation created or rewritten in the current transaction.  Make the same
change to another site having the same semantic intent; the second
instance has no user-visible consequences.  Back-patch to v13, where
commit c6b92041d38512a4176ed76ad06f713d2e6c01a8 broke this.

Kyotaro Horiguchi

Discussion: https://postgr.es/m/20210113.160705.2225256954956139776.horikyota.ntt@gmail.com
2021-01-30 00:11:38 -08:00
1046dbedde Silence another gcc 11 warning.
Per buildfarm and local experimentation, bleeding-edge gcc isn't
convinced that the MemSet in reorder_function_arguments() is safe.
Shut it up by adding an explicit check that pronargs isn't negative,
and by changing MemSet to memset.  (It appears that either change is
enough to quiet the warning at -O2, but let's do both to be sure.)
2021-01-28 17:19:16 -05:00
55dc86eca7 Fix pull_varnos' miscomputation of relids set for a PlaceHolderVar.
Previously, pull_varnos() took the relids of a PlaceHolderVar as being
equal to the relids in its contents, but that fails to account for the
possibility that we have to postpone evaluation of the PHV due to outer
joins.  This could result in a malformed plan.  The known cases end up
triggering the "failed to assign all NestLoopParams to plan nodes"
sanity check in createplan.c, but other symptoms may be possible.

The right value to use is the join level we actually intend to evaluate
the PHV at.  We can get that from the ph_eval_at field of the associated
PlaceHolderInfo.  However, there are some places that call pull_varnos()
before the PlaceHolderInfos have been created; in that case, fall back
to the conservative assumption that the PHV will be evaluated at its
syntactic level.  (In principle this might result in missing some legal
optimization, but I'm not aware of any cases where it's an issue in
practice.)  Things are also a bit ticklish for calls occurring during
deconstruct_jointree(), but AFAICS the ph_eval_at fields should have
reached their final values by the time we need them.

The main problem in making this work is that pull_varnos() has no
way to get at the PlaceHolderInfos.  We can fix that easily, if a
bit tediously, in HEAD by passing it the planner "root" pointer.
In the back branches that'd cause an unacceptable API/ABI break for
extensions, so leave the existing entry points alone and add new ones
with the additional parameter.  (If an old entry point is called and
encounters a PHV, it'll fall back to using the syntactic level,
again possibly missing some valid optimization.)

Back-patch to v12.  The computation is surely also wrong before that,
but it appears that we cannot reach a bad plan thanks to join order
restrictions imposed on the subquery that the PlaceHolderVar came from.
The error only became reachable when commit 4be058fe9 allowed trivial
subqueries to be collapsed out completely, eliminating their join order
restrictions.

Per report from Stephan Springl.

Discussion: https://postgr.es/m/171041.1610849523@sss.pgh.pa.us
2021-01-21 15:37:23 -05:00
ca3b37487b Update copyright for 2021
Backpatch-through: 9.5
2021-01-02 13:06:25 -05:00
fac1b470a9 Disallow SRFs when considering sorts below Gather Merge
While we do allow SRFs in ORDER BY, scan/join processing should not
consider such cases - such sorts should only happen via final Sort atop
a ProjectSet. So make sure we don't try adding such sorts below Gather
Merge, just like we do for expressions that are volatile and/or not
parallel safe.

Backpatch to PostgreSQL 13, where this code was introduced as part of
the Incremental Sort patch.

Author: James Coleman
Reviewed-by: Tomas Vondra
Backpatch-through: 13
Discussion: https://postgr.es/m/CAAaqYe8cK3g5CfLC4w7bs=hC0mSksZC=H5M8LSchj5e5OxpTAg@mail.gmail.com
Discussion: https://postgr.es/m/295524.1606246314%40sss.pgh.pa.us
2020-12-21 19:36:22 +01:00
b3817f5f77 Improve hash_create()'s API for some added robustness.
Invent a new flag bit HASH_STRINGS to specify C-string hashing, which
was formerly the default; and add assertions insisting that exactly
one of the bits HASH_STRINGS, HASH_BLOBS, and HASH_FUNCTION be set.
This is in hopes of preventing recurrences of the type of oversight
fixed in commit a1b8aa1e4 (i.e., mistakenly omitting HASH_BLOBS).

Also, when HASH_STRINGS is specified, insist that the keysize be
more than 8 bytes.  This is a heuristic, but it should catch
accidental use of HASH_STRINGS for integer or pointer keys.
(Nearly all existing use-cases set the keysize to NAMEDATALEN or
more, so there's little reason to think this restriction should
be problematic.)

Tweak hash_create() to insist that the HASH_ELEM flag be set, and
remove the defaults it had for keysize and entrysize.  Since those
defaults were undocumented and basically useless, no callers
omitted HASH_ELEM anyway.

Also, remove memset's zeroing the HASHCTL parameter struct from
those callers that had one.  This has never been really necessary,
and while it wasn't a bad coding convention it was confusing that
some callers did it and some did not.  We might as well save a few
cycles by standardizing on "not".

Also improve the documentation for hash_create().

In passing, improve reinit.c's usage of a hash table by storing
the key as a binary Oid rather than a string; and, since that's
a temporary hash table, allocate it in CurrentMemoryContext for
neatness.

Discussion: https://postgr.es/m/590625.1607878171@sss.pgh.pa.us
2020-12-15 11:38:53 -05:00
653aa603f5 Provide an error cursor for "can't subscript" error messages.
Commit c7aba7c14 didn't add this, but after more fooling with the
feature I feel that it'd be useful.  To make this possible, refactor
getSubscriptingRoutines() so that the caller is responsible for
throwing any error.  (In clauses.c, I just chose to make the
most conservative assumption rather than throwing an error.  We don't
expect failures there anyway really, so the code space for an error
message would be a poor investment.)
2020-12-11 18:58:21 -05:00
c7aba7c14e Support subscripting of arbitrary types, not only arrays.
This patch generalizes the subscripting infrastructure so that any
data type can be subscripted, if it provides a handler function to
define what that means.  Traditional variable-length (varlena) arrays
all use array_subscript_handler(), while the existing fixed-length
types that support subscripting use raw_array_subscript_handler().
It's expected that other types that want to use subscripting notation
will define their own handlers.  (This patch provides no such new
features, though; it only lays the foundation for them.)

To do this, move the parser's semantic processing of subscripts
(including coercion to whatever data type is required) into a
method callback supplied by the handler.  On the execution side,
replace the ExecEvalSubscriptingRef* layer of functions with direct
calls to callback-supplied execution routines.  (Thus, essentially
no new run-time overhead should be caused by this patch.  Indeed,
there is room to remove some overhead by supplying specialized
execution routines.  This patch does a little bit in that line,
but more could be done.)

Additional work is required here and there to remove formerly
hard-wired assumptions about the result type, collation, etc
of a SubscriptingRef expression node; and to remove assumptions
that the subscript values must be integers.

One useful side-effect of this is that we now have a less squishy
mechanism for identifying whether a data type is a "true" array:
instead of wiring in weird rules about typlen, we can look to see
if pg_type.typsubscript == F_ARRAY_SUBSCRIPT_HANDLER.  For this
to be bulletproof, we have to forbid user-defined types from using
that handler directly; but there seems no good reason for them to
do so.

This patch also removes assumptions that the number of subscripts
is limited to MAXDIM (6), or indeed has any hard-wired limit.
That limit still applies to types handled by array_subscript_handler
or raw_array_subscript_handler, but to discourage other dependencies
on this constant, I've moved it from c.h to utils/array.h.

Dmitry Dolgov, reviewed at various times by Tom Lane, Arthur Zakirov,
Peter Eisentraut, Pavel Stehule

Discussion: https://postgr.es/m/CA+q6zcVDuGBv=M0FqBYX8DPebS3F_0KQ6OVFobGJPM507_SZ_w@mail.gmail.com
Discussion: https://postgr.es/m/CA+q6zcVovR+XY4mfk-7oNk-rF91gH0PebnNfuUjuuDsyHjOcVA@mail.gmail.com
2020-12-09 12:40:37 -05:00
8b069ef5dc Change get_constraint_index() to use pg_constraint.conindid
It was still using a scan of pg_depend instead of using the conindid
column that has been added since.

Since it is now just a catalog lookup wrapper and not related to
pg_depend, move from pg_depend.c to lsyscache.c.

Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://www.postgresql.org/message-id/flat/4688d55c-9a2e-9a5a-d166-5f24fe0bf8db%40enterprisedb.com
2020-12-09 15:41:45 +01:00
62ee703313 Teach contain_leaked_vars that assignment SubscriptingRefs are leaky.
array_get_element and array_get_slice qualify as leakproof, since
they will silently return NULL for bogus subscripts.  But
array_set_element and array_set_slice throw errors for such cases,
making them clearly not leakproof.  contain_leaked_vars was evidently
written with only the former case in mind, as it gave the wrong answer
for assignment SubscriptingRefs (nee ArrayRefs).

This would be a live security bug, were it not that assignment
SubscriptingRefs can only occur in INSERT and UPDATE target lists,
while we only care about leakproofness for qual expressions; so the
wrong answer can't occur in practice.  Still, that's a rather shaky
answer for a security-related question; and maybe in future somebody
will want to ask about leakproofness of a tlist.  So it seems wise to
fix and even back-patch this correction.

(We would need some change here anyway for the upcoming
generic-subscripting patch, since extensions might make different
tradeoffs about whether to throw errors.  Commit 558d77f20 attempted
to lay groundwork for that by asking check_functions_in_node whether a
SubscriptingRef contains leaky functions; but that idea fails now that
the implementation methods of a SubscriptingRef are not SQL-visible
functions that could be marked leakproof or not.)

Back-patch to 9.6.  While 9.5 has the same issue, the code's a bit
different.  It seems quite unlikely that we'd introduce any actual bug
in the short time 9.5 has left to live, so the work/risk/reward balance
isn't attractive for changing 9.5.

Discussion: https://postgr.es/m/3143742.1607368115@sss.pgh.pa.us
2020-12-08 17:50:54 -05:00
8286223f3d Fix missing outfuncs.c support for IncrementalSortPath.
For debugging purposes, Path nodes are supposed to have outfuncs
support, but this was overlooked in the original incremental sort patch.

While at it, clean up a couple other minor oversights, as well as
bizarre choice of return type for create_incremental_sort_path().
(All the existing callers just cast it to "Path *" immediately, so
they don't care, but some future caller might care.)

outfuncs.c fix by Zhijie Hou, the rest by me

Discussion: https://postgr.es/m/324c4d81d8134117972a5b1f6cdf9560@G08CNEXMBPEKD05.g08.fujitsu.local
2020-11-30 16:33:09 -05:00
b1738ff6ab Fix miscomputation of direct_lateral_relids for join relations.
If a PlaceHolderVar is to be evaluated at a join relation, but
its value is only needed there and not at higher levels, we neglected
to update the joinrel's direct_lateral_relids to include the PHV's
source rel.  This causes problems because join_is_legal() then won't
allow joining the joinrel to the PHV's source rel at all, leading
to "failed to build any N-way joins" planner failures.

Per report from Andreas Seltenreich.  Back-patch to 9.5
where the problem originated.

Discussion: https://postgr.es/m/87blfgqa4t.fsf@aurora.ydns.eu
2020-11-30 12:22:43 -05:00
660b89928d Properly check index mark/restore in ExecSupportsMarkRestore.
Previously this code assumed that all IndexScan nodes supported
mark/restore, which is not true since it depends on optional index AM
support functions. This could lead to errors about missing support
functions in rare edge cases of mergejoins with no sort keys, where an
unordered non-btree index scan was placed on the inner path without a
protecting Materialize node. (Normally, the fact that merge join
requires ordered input would avoid this error.)

Backpatch all the way since this bug is ancient.

Per report from Eugen Konkov on irc.

Discussion: https://postgr.es/m/87o8jn50be.fsf@news-spur.riddles.org.uk
2020-11-24 21:58:32 +00:00
0a2bc5d61e Move per-agg and per-trans duplicate finding to the planner.
This has the advantage that the cost estimates for aggregates can count
the number of calls to transition and final functions correctly.

Bump catalog version, because views can contain Aggrefs.

Reviewed-by: Andres Freund
Discussion: https://www.postgresql.org/message-id/b2e3536b-1dbc-8303-c97e-89cb0b4a9a48%40iki.fi
2020-11-24 10:45:00 +02:00
8e1f37c07a Rethink the generation rule for fmgroids.h macros.
Traditionally, the names of fmgroids.h macros for pg_proc OIDs
have been constructed from the prosrc field.  But sometimes the
same C function underlies multiple pg_proc entries, forcing us
to make an arbitrary choice of which OID to reference; the other
entries are then not namable via fmgroids.h.  Moreover, we could
not have macros at all for pg_proc entries that aren't for
C-coded functions.

Instead, use the proname field, and append the proargtypes field
(replacing inter-argument spaces with underscores) if proname is
not unique.  Special-casing unique entries such as F_OIDEQ removes
the need to change a lot of code.  Indeed, I can only find two
places in the tree that need to be adjusted; while this changes
quite a few existing entries in fmgroids.h, few of them are
referenced from C code.

With this patch, all entries in pg_proc.dat have macros in fmgroids.h.

Discussion: https://postgr.es/m/472274.1604258384@sss.pgh.pa.us
2020-11-02 11:57:28 -05:00
257836a755 Track collation versions for indexes.
Record the current version of dependent collations in pg_depend when
creating or rebuilding an index.  When accessing the index later, warn
that the index may be corrupted if the current version doesn't match.

Thanks to Douglas Doole, Peter Eisentraut, Christoph Berg, Laurenz Albe,
Michael Paquier, Robert Haas, Tom Lane and others for very helpful
discussion.

Author: Thomas Munro <thomas.munro@gmail.com>
Author: Julien Rouhaud <rjuju123@gmail.com>
Reviewed-by: Peter Eisentraut <peter.eisentraut@2ndquadrant.com> (earlier versions)
Discussion: https://postgr.es/m/CAEepm%3D0uEQCpfq_%2BLYFBdArCe4Ot98t1aR4eYiYTe%3DyavQygiQ%40mail.gmail.com
2020-11-03 01:19:50 +13:00
a929e17e5a Allow run-time pruning on nested Append/MergeAppend nodes
Previously we only tagged on the required information to allow the
executor to perform run-time partition pruning for Append/MergeAppend
nodes belonging to base relations.  It was thought that nested
Append/MergeAppend nodes were just about always pulled up into the
top-level Append/MergeAppend and that making the run-time pruning info for
any sub Append/MergeAppend nodes was a waste of time.  However, that was
likely badly thought through.

Some examples of cases we're unable to pullup nested Append/MergeAppends
are: 1) Parallel Append nodes with a mix of parallel and non-parallel
paths into a Parallel Append.  2) When planning an ordered Append scan a
sub-partition which is unordered may require a nested MergeAppend path to
ensure sub-partitions don't mix up the order of tuples being fed into the
top-level Append.

Unfortunately, it was not just as simple as removing the lines in
createplan.c which were purposefully not building the run-time pruning
info for anything but RELOPT_BASEREL relations.  The code in
add_paths_to_append_rel() was far too sloppy about which partitioned_rels
it included for the Append/MergeAppend paths.  The original code there
would always assume accumulate_append_subpath() would pull each sub-Append
and sub-MergeAppend path into the top-level path.  While it does not
appear that there were any actual bugs caused by having the additional
partitioned table RT indexes recorded, what it did mean is that later in
planning, when we built the run-time pruning info that we wasted effort
and built PartitionedRelPruneInfos for partitioned tables that we had no
subpaths for the executor to run-time prune.

Here we tighten that up so that partitioned_rels only ever contains the RT
index for partitioned tables which actually have subpaths in the given
Append/MergeAppend.  We can now Assert that every PartitionedRelPruneInfo
has a non-empty present_parts.  That should allow us to catch any weird
corner cases that have been missed.

In passing, it seems there is no longer a good reason to have the
AppendPath and MergeAppendPath's partitioned_rel fields a List of IntList.
We can simply have a List of Relids instead.  This is more compact in
memory and faster to add new members to.  We still know which is the root
level partition as these always have a lower relid than their children.
Previously this field was used for more things, but run-time partition
pruning now remains the only user of it and it has no need for a List of
IntLists.

Here we also get rid of the RelOptInfo partitioned_child_rels field. This
is what was previously used to (sometimes incorrectly) set the
Append/MergeAppend path's partitioned_rels field.  That was the only usage
of that field, so we can happily just remove it.

I also couldn't resist changing some nearby code to make use of the newly
added for_each_from macro so we can skip the first element in the list
without checking if the current item was the first one on each
iteration.

A bug report from Andreas Kretschmer prompted all this work, however,
after some consideration, I'm not personally classing this as a bug fix.
So no backpatch.  In Andreas' test case, it just wasn't that clear that
there was a nested Append since the top-level Append just had a single
sub-path which was pulled up a level, per 8edd0e794.

Author: David Rowley
Reviewed-by: Amit Langote
Discussion: https://postgr.es/m/flat/CAApHDvqSchs%2BubdybcfFaSPB%2B%2BEA7kqMaoqajtP0GtZvzOOR3g%40mail.gmail.com
2020-11-02 13:46:56 +13:00
ad1c36b070 Fix foreign-key selectivity estimation in the presence of constants.
get_foreign_key_join_selectivity() looks for join clauses that equate
the two sides of the FK constraint.  However, if we have a query like
"WHERE fktab.a = pktab.a and fktab.a = 1", it won't find any such join
clause, because equivclass.c replaces the given clauses with "fktab.a
= 1 and pktab.a = 1", which can be enforced at the scan level, leaving
nothing to be done for column "a" at the join level.

We can fix that expectation without much trouble, but then a new problem
arises: applying the foreign-key-based selectivity rule produces a
rowcount underestimate, because we're effectively double-counting the
selectivity of the "fktab.a = 1" clause.  So we have to cancel that
selectivity out of the estimate.

To fix, refactor process_implied_equality() so that it can pass back the
new RestrictInfo to its callers in equivclass.c, allowing the generated
"fktab.a = 1" clause to be saved in the EquivalenceClass's ec_derives
list.  Then it's not much trouble to dig out the relevant RestrictInfo
when we need to adjust an FK selectivity estimate.  (While at it, we
can also remove the expensive use of initialize_mergeclause_eclasses()
to set up the new RestrictInfo's left_ec and right_ec pointers.
The equivclass.c code can set those basically for free.)

This seems like clearly a bug fix, but I'm hesitant to back-patch it,
first because there's some API/ABI risk for extensions and second because
we're usually loath to destabilize plan choices in stable branches.

Per report from Sigrid Ehrenreich.

Discussion: https://postgr.es/m/1019549.1603770457@sss.pgh.pa.us
Discussion: https://postgr.es/m/AM6PR02MB5287A0ADD936C1FA80973E72AB190@AM6PR02MB5287.eurprd02.prod.outlook.com
2020-10-28 11:15:47 -04:00
c8ab970179 Fix list-munging bug that broke SQL function result coercions.
Since commit 913bbd88d, check_sql_fn_retval() can either insert type
coercion steps in-line in the Query that produces the SQL function's
results, or generate a new top-level Query to perform the coercions,
if modifying the Query's output in-place wouldn't be safe.  However,
it appears that the latter case has never actually worked, because
the code tried to inject the new Query back into the query list it was
passed ... which is not the list that will be used for later processing
when we execute the SQL function "normally" (without inlining it).
So we ended up with no coercion happening at run-time, leading to
wrong results or crashes depending on the datatypes involved.

While the regression tests look like they cover this area well enough,
through a huge bit of bad luck all the test cases that exercise the
separate-Query path were checking either inline-able cases (which
accidentally didn't have the bug) or cases that are no-ops at runtime
(e.g., varchar to text), so that the failure to perform the coercion
wasn't obvious.  The fact that the cases that don't work weren't
allowed at all before v13 probably contributed to not noticing the
problem sooner, too.

To fix, get rid of the separate "flat" list of Query nodes and instead
pass the real two-level list that is going to be used later.  I chose
to make the same change in check_sql_fn_statements(), although that has
no actual bug, just so that we don't need that data structure at all.

This is an API change, as evidenced by the adjustments needed to
callers outside functions.c.  That's a bit scary to be doing in a
released branch, but so far as I can tell from a quick search,
there are no outside callers of these functions (and they are
sufficiently specific to our semantics for SQL-language functions that
it's not apparent why any extension would need to call them).  In any
case, v13 already changed the API of check_sql_fn_retval() compared to
prior branches.

Per report from pinker.  Back-patch to v13 where this code came in.

Discussion: https://postgr.es/m/1603050466566-0.post@n3.nabble.com
2020-10-19 14:33:09 -04:00
f0f13a3a08 Fix estimates for ModifyTable paths without RETURNING.
In the past, we always estimated that a ModifyTable node would emit the
same number of rows as its subpaths.  Without a RETURNING clause, the
correct estimate is zero.  Fix, in preparation for a proposed parallel
write patch that is sensitive to that number.

A remaining problem is that for RETURNING queries, the estimated width
is based on subpath output rather than the RETURNING tlist.

Reviewed-by: Greg Nancarrow <gregn4422@gmail.com>
Discussion: https://postgr.es/m/CAJcOf-cXnB5cnMKqWEp2E2z7Mvcd04iLVmV%3DqpFJrR3AcrTS3g%40mail.gmail.com
2020-10-13 00:26:49 +13:00
3d351d916b Redefine pg_class.reltuples to be -1 before the first VACUUM or ANALYZE.
Historically, we've considered the state with relpages and reltuples
both zero as indicating that we do not know the table's tuple density.
This is problematic because it's impossible to distinguish "never yet
vacuumed" from "vacuumed and seen to be empty".  In particular, a user
cannot use VACUUM or ANALYZE to override the planner's normal heuristic
that an empty table should not be believed to be empty because it is
probably about to get populated.  That heuristic is a good safety
measure, so I don't care to abandon it, but there should be a way to
override it if the table is indeed intended to stay empty.

Hence, represent the initial state of ignorance by setting reltuples
to -1 (relpages is still set to zero), and apply the minimum-ten-pages
heuristic only when reltuples is still -1.  If the table is empty,
VACUUM or ANALYZE (but not CREATE INDEX) will override that to
reltuples = relpages = 0, and then we'll plan on that basis.

This requires a bunch of fiddly little changes, but we can get rid of
some ugly kluges that were formerly needed to maintain the old definition.

One notable point is that FDWs' GetForeignRelSize methods will see
baserel->tuples = -1 when no ANALYZE has been done on the foreign table.
That seems like a net improvement, since those methods were formerly
also in the dark about what baserel->tuples = 0 really meant.  Still,
it is an API change.

I bumped catversion because code predating this change would get confused
by seeing reltuples = -1.

Discussion: https://postgr.es/m/F02298E0-6EF4-49A1-BCB6-C484794D9ACC@thebuild.com
2020-08-30 12:21:51 -04:00
2072932407 Suppress unnecessary RelabelType nodes in yet more cases.
Commit a477bfc1d fixed eval_const_expressions() to ensure that it
didn't generate unnecessary RelabelType nodes, but I failed to notice
that some other places in the planner had the same issue.  Really
noplace in the planner should be using plain makeRelabelType(), for
fear of generating expressions that should be equal() to semantically
equivalent trees, but aren't.

An example is that because canonicalize_ec_expression() failed
to be careful about this, we could end up with an equivalence class
containing both a plain Const, and a Const-with-RelabelType
representing exactly the same value.  So far as I can tell this led to
no visible misbehavior, but we did waste a bunch of cycles generating
and evaluating "Const = Const-with-RelabelType" to prove such entries
are redundant.

Hence, move the support function added by a477bfc1d to where it can
be more generally useful, and use it in the places where planner code
previously used makeRelabelType.

Back-patch to v12, like the previous patch.  While I have no concrete
evidence of any real misbehavior here, it's certainly possible that
I overlooked a case where equivalent expressions that aren't equal()
could cause a user-visible problem.  In any case carrying extra
RelabelType nodes through planning to execution isn't very desirable.

Discussion: https://postgr.es/m/1311836.1597781384@sss.pgh.pa.us
2020-08-19 14:07:49 -04:00
1e7629d2c9 Be more careful about the shape of hashable subplan clauses.
nodeSubplan.c expects that the testexpr for a hashable ANY SubPlan
has the form of one or more OpExprs whose LHS is an expression of the
outer query's, while the RHS is an expression over Params representing
output columns of the subquery.  However, the planner only went as far
as verifying that the clauses were all binary OpExprs.  This works
99.99% of the time, because the clauses have the right shape when
emitted by the parser --- but it's possible for function inlining to
break that, as reported by PegoraroF10.  To fix, teach the planner
to check that the LHS and RHS contain the right things, or more
accurately don't contain the wrong things.  Given that this has been
broken for years without anyone noticing, it seems sufficient to just
give up hashing when it happens, rather than go to the trouble of
commuting the clauses back again (which wouldn't necessarily work
anyway).

While poking at that, I also noticed that nodeSubplan.c had a baked-in
assumption that the number of hash clauses is identical to the number
of subquery output columns.  Again, that's fine as far as parser output
goes, but it's not hard to break it via function inlining.  There seems
little reason for that assumption though --- AFAICS, the only thing
it's buying us is not having to store the number of hash clauses
explicitly.  Adding code to the planner to reject such cases would take
more code than getting nodeSubplan.c to cope, so I fixed it that way.

This has been broken for as long as we've had hashable SubPlans,
so back-patch to all supported branches.

Discussion: https://postgr.es/m/1549209182255-0.post@n3.nabble.com
2020-08-14 22:14:03 -04:00
d6c08e29e7 Add hash_mem_multiplier GUC.
Add a GUC that acts as a multiplier on work_mem.  It gets applied when
sizing executor node hash tables that were previously size constrained
using work_mem alone.

The new GUC can be used to preferentially give hash-based nodes more
memory than the generic work_mem limit.  It is intended to enable admin
tuning of the executor's memory usage.  Overall system throughput and
system responsiveness can be improved by giving hash-based executor
nodes more memory (especially over sort-based alternatives, which are
often much less sensitive to being memory constrained).

The default value for hash_mem_multiplier is 1.0, which is also the
minimum valid value.  This means that hash-based nodes continue to apply
work_mem in the traditional way by default.

hash_mem_multiplier is generally useful.  However, it is being added now
due to concerns about hash aggregate performance stability for users
that upgrade to Postgres 13 (which added disk-based hash aggregation in
commit 1f39bce0).  While the old hash aggregate behavior risked
out-of-memory errors, it is nevertheless likely that many users actually
benefited.  Hash agg's previous indifference to work_mem during query
execution was not just faster; it also accidentally made aggregation
resilient to grouping estimate problems (at least in cases where this
didn't create destabilizing memory pressure).

hash_mem_multiplier can provide a certain kind of continuity with the
behavior of Postgres 12 hash aggregates in cases where the planner
incorrectly estimates that all groups (plus related allocations) will
fit in work_mem/hash_mem.  This seems necessary because hash-based
aggregation is usually much slower when only a small fraction of all
groups can fit.  Even when it isn't possible to totally avoid hash
aggregates that spill, giving hash aggregation more memory will reliably
improve performance (the same cannot be said for external sort
operations, which appear to be almost unaffected by memory availability
provided it's at least possible to get a single merge pass).

The PostgreSQL 13 release notes should advise users that increasing
hash_mem_multiplier can help with performance regressions associated
with hash aggregation.  That can be taken care of by a later commit.

Author: Peter Geoghegan
Reviewed-By: Álvaro Herrera, Jeff Davis
Discussion: https://postgr.es/m/20200625203629.7m6yvut7eqblgmfo@alap3.anarazel.de
Discussion: https://postgr.es/m/CAH2-WzmD%2Bi1pG6rc1%2BCjc4V6EaFJ_qSuKCCHVnH%3DoruqD-zqow%40mail.gmail.com
Backpatch: 13-, where disk-based hash aggregation was introduced.
2020-07-29 14:14:58 -07:00
689696c711 Fix bitmap AND/OR scans on the inside of a nestloop partition-wise join.
reparameterize_path_by_child() failed to reparameterize BitmapAnd
and BitmapOr paths.  This matters only if such a path is chosen as
the inside of a nestloop partition-wise join, where we have to pass
in parameters from the outside of the nestloop.  If that did happen,
we generated a bad plan that would likely lead to crashes at execution.

This is not entirely reparameterize_path_by_child()'s fault though;
it's the victim of an ancient decision (my ancient decision, I think)
to not bother filling in param_info in BitmapAnd/Or path nodes.  That
caused the function to believe that such nodes and their children
contain no parameter references and so need not be processed.

In hindsight that decision looks pretty penny-wise and pound-foolish:
while it saves a few cycles during path node setup, we do commonly
need the information later.  In particular, by reversing the decision
and requiring valid param_info data in all nodes of a bitmap path
tree, we can get rid of indxpath.c's get_bitmap_tree_required_outer()
function, which computed the data on-demand.  It's not unlikely that
that nets out as a savings of cycles in many scenarios.  A couple
of other things in indxpath.c can be simplified as well.

While here, get rid of some cases in reparameterize_path_by_child()
that are visibly dead or useless, given that we only care about
reparameterizing paths that can be on the inside of a parameterized
nestloop.  This case reminds one of the maxim that untested code
probably does not work, so I'm unwilling to leave unreachable code
in this function.  (I did leave the T_Gather case in place even
though it's not reached in the regression tests.  It's not very
clear to me when the planner might prefer to put Gather below
rather than above a nestloop, but at least in principle the case
might be interesting.)

Per bug #16536, originally from Arne Roland but with a test case
by Andrew Gierth.  Back-patch to v11 where this code came in.

Discussion: https://postgr.es/m/16536-2213ee0b3aad41fd@postgresql.org
2020-07-14 18:56:56 -04:00
e07633646a code: replace 'master' with 'leader' where appropriate.
Leader already is the more widely used terminology, but a few places
didn't get the message.

Author: Andres Freund
Reviewed-By: David Steele
Discussion: https://postgr.es/m/20200615182235.x7lch5n6kcjq4aue@alap3.anarazel.de
2020-07-08 12:58:32 -07:00
fa27dd40d5 Run pgindent with new pg_bsd_indent version 2.1.1.
Thomas Munro fixed a longstanding annoyance in pg_bsd_indent, that
it would misformat lines containing IsA() macros on the assumption
that the IsA() call should be treated like a cast.  This improves
some other cases involving field/variable names that match typedefs,
too.  The only places that get worse are a couple of uses of the
OpenSSL macro STACK_OF(); we'll gladly take that trade-off.

Discussion: https://postgr.es/m/20200114221814.GA19630@alvherre.pgsql
2020-05-16 11:54:51 -04:00
17cc133f01 Dial back -Wimplicit-fallthrough to level 3
The additional pain from level 4 is excessive for the gain.

Also revert all the source annotation changes to their original
wordings, to avoid back-patching pain.

Discussion: https://postgr.es/m/31166.1589378554@sss.pgh.pa.us
2020-05-13 15:31:14 -04:00
3e9744465d Add -Wimplicit-fallthrough to CFLAGS and CXXFLAGS
Use it at level 4, a bit more restrictive than the default level, and
tweak our commanding comments to FALLTHROUGH.

(However, leave zic.c alone, since it's external code; to avoid the
warnings that would appear there, change CFLAGS for that file in the
Makefile.)

Author: Julien Rouhaud <rjuju123@gmail.com>
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/20200412081825.qyo5vwwco3fv4gdo@nol
Discussion: https://postgr.es/m/flat/E1fDenm-0000C8-IJ@gemulon.postgresql.org
2020-05-12 16:07:30 -04:00
2793bbe75e Remove unnecessary #include.
My oversight in commit c8434d64c.
2020-05-12 19:55:55 +09:00
0da06d9faf Get rid of trailing semicolons in C macro definitions.
Writing a trailing semicolon in a macro is almost never the right thing,
because you almost always want to write a semicolon after each macro
call instead.  (Even if there was some reason to prefer not to, pgindent
would probably make a hash of code formatted that way; so within PG the
rule should basically be "don't do it".)  Thus, if we have a semi inside
the macro, the compiler sees "something;;".  Much of the time the extra
empty statement is harmless, but it could lead to mysterious syntax
errors at call sites.  In perhaps an overabundance of neatnik-ism, let's
run around and get rid of the excess semicolons whereever possible.

The only thing worse than a mysterious syntax error is a mysterious
syntax error that only happens in the back branches; therefore,
backpatch these changes where relevant, which is most of them because
most of these mistakes are old.  (The lack of reported problems shows
that this is largely a hypothetical issue, but still, it could bite
us in some future patch.)

John Naylor and Tom Lane

Discussion: https://postgr.es/m/CACPNZCs0qWTqJ2QUSGJ07B7uvAvzMb-KbG2q+oo+J3tsWN5cqw@mail.gmail.com
2020-05-01 17:28:00 -04:00
981643dcdb Allow partitionwise join to handle nested FULL JOIN USING cases.
This case didn't work because columns merged by FULL JOIN USING are
represented in the parse tree by COALESCE expressions, and the logic
for recognizing a partitionable join failed to match upper-level join
clauses to such expressions.  To fix, synthesize suitable COALESCE
expressions and add them to the nullable_partexprs lists.  This is
pretty ugly and brute-force, but it gets the job done.  (I have
ambitions of rethinking the way outer-join output Vars are
represented, so maybe that will provide a cleaner solution someday.
For now, do this.)

Amit Langote, reviewed by Justin Pryzby, Richard Guo, and myself

Discussion: https://postgr.es/m/CA+HiwqG2WVUGmLJqtR0tPFhniO=H=9qQ+Z3L_ZC+Y3-EVQHFGg@mail.gmail.com
2020-04-07 22:12:14 -04:00
c8434d64ce Allow partitionwise joins in more cases.
Previously, the partitionwise join technique only allowed partitionwise
join when input partitioned tables had exactly the same partition
bounds.  This commit extends the technique to some cases when the tables
have different partition bounds, by using an advanced partition-matching
algorithm introduced by this commit.  For both the input partitioned
tables, the algorithm checks whether every partition of one input
partitioned table only matches one partition of the other input
partitioned table at most, and vice versa.  In such a case the join
between the tables can be broken down into joins between the matching
partitions, so the algorithm produces the pairs of the matching
partitions, plus the partition bounds for the join relation, to allow
partitionwise join for computing the join.  Currently, the algorithm
works for list-partitioned and range-partitioned tables, but not
hash-partitioned tables.  See comments in partition_bounds_merge().

Ashutosh Bapat and Etsuro Fujita, most of regression tests by Rajkumar
Raghuwanshi, some of the tests by Mark Dilger and Amul Sul, reviewed by
Dmitry Dolgov and Amul Sul, with additional review at various points by
Ashutosh Bapat, Mark Dilger, Robert Haas, Antonin Houska, Amit Langote,
Justin Pryzby, and Tomas Vondra

Discussion: https://postgr.es/m/CAFjFpRdjQvaUEV5DJX3TW6pU5eq54NCkadtxHX2JiJG_GvbrCA@mail.gmail.com
2020-04-08 10:25:00 +09:00
357889eb17 Support FETCH FIRST WITH TIES
WITH TIES is an option to the FETCH FIRST N ROWS clause (the SQL
standard's spelling of LIMIT), where you additionally get rows that
compare equal to the last of those N rows by the columns in the
mandatory ORDER BY clause.

There was a proposal by Andrew Gierth to implement this functionality in
a more powerful way that would yield more features, but the other patch
had not been finished at this time, so we decided to use this one for
now in the spirit of incremental development.

Author: Surafel Temesgen <surafel3000@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Tomas Vondra <tomas.vondra@2ndquadrant.com>
Discussion: https://postgr.es/m/CALAY4q9ky7rD_A4vf=FVQvCGngm3LOes-ky0J6euMrg=_Se+ag@mail.gmail.com
Discussion: https://postgr.es/m/87o8wvz253.fsf@news-spur.riddles.org.uk
2020-04-07 16:22:13 -04:00
d2d8a229bc Implement Incremental Sort
Incremental Sort is an optimized variant of multikey sort for cases when
the input is already sorted by a prefix of the requested sort keys. For
example when the relation is already sorted by (key1, key2) and we need
to sort it by (key1, key2, key3) we can simply split the input rows into
groups having equal values in (key1, key2), and only sort/compare the
remaining column key3.

This has a number of benefits:

- Reduced memory consumption, because only a single group (determined by
  values in the sorted prefix) needs to be kept in memory. This may also
  eliminate the need to spill to disk.

- Lower startup cost, because Incremental Sort produce results after each
  prefix group, which is beneficial for plans where startup cost matters
  (like for example queries with LIMIT clause).

We consider both Sort and Incremental Sort, and decide based on costing.

The implemented algorithm operates in two different modes:

- Fetching a minimum number of tuples without check of equality on the
  prefix keys, and sorting on all columns when safe.

- Fetching all tuples for a single prefix group and then sorting by
  comparing only the remaining (non-prefix) keys.

We always start in the first mode, and employ a heuristic to switch into
the second mode if we believe it's beneficial - the goal is to minimize
the number of unnecessary comparions while keeping memory consumption
below work_mem.

This is a very old patch series. The idea was originally proposed by
Alexander Korotkov back in 2013, and then revived in 2017. In 2018 the
patch was taken over by James Coleman, who wrote and rewrote most of the
current code.

There were many reviewers/contributors since 2013 - I've done my best to
pick the most active ones, and listed them in this commit message.

Author: James Coleman, Alexander Korotkov
Reviewed-by: Tomas Vondra, Andreas Karlsson, Marti Raudsepp, Peter Geoghegan, Robert Haas, Thomas Munro, Antonin Houska, Andres Freund, Alexander Kuzmenkov
Discussion: https://postgr.es/m/CAPpHfdscOX5an71nHd8WSUH6GNOCf=V7wgDaTXdDd9=goN-gfA@mail.gmail.com
Discussion: https://postgr.es/m/CAPpHfds1waRZ=NOmueYq0sx1ZSCnt+5QJvizT8ndT2=etZEeAQ@mail.gmail.com
2020-04-06 21:35:10 +02:00
0568e7a2a4 Cosmetic improvements for code related to partitionwise join.
Move have_partkey_equi_join and match_expr_to_partition_keys to
relnode.c, since they're used only there.  Refactor
build_joinrel_partition_info to split out the code that fills the
joinrel's partition key lists; this doesn't have any non-cosmetic
impact, but it seems like a useful separation of concerns.
Improve assorted nearby comments.

Amit Langote, with a little further editorialization by me

Discussion: https://postgr.es/m/CA+HiwqG2WVUGmLJqtR0tPFhniO=H=9qQ+Z3L_ZC+Y3-EVQHFGg@mail.gmail.com
2020-04-03 17:00:35 -04:00
911e702077 Implement operator class parameters
PostgreSQL provides set of template index access methods, where opclasses have
much freedom in the semantics of indexing.  These index AMs are GiST, GIN,
SP-GiST and BRIN.  There opclasses define representation of keys, operations on
them and supported search strategies.  So, it's natural that opclasses may be
faced some tradeoffs, which require user-side decision.  This commit implements
opclass parameters allowing users to set some values, which tell opclass how to
index the particular dataset.

This commit doesn't introduce new storage in system catalog.  Instead it uses
pg_attribute.attoptions, which is used for table column storage options but
unused for index attributes.

In order to evade changing signature of each opclass support function, we
implement unified way to pass options to opclass support functions.  Options
are set to fn_expr as the constant bytea expression.  It's possible due to the
fact that opclass support functions are executed outside of expressions, so
fn_expr is unused for them.

This commit comes with some examples of opclass options usage.  We parametrize
signature length in GiST.  That applies to multiple opclasses: tsvector_ops,
gist__intbig_ops, gist_ltree_ops, gist__ltree_ops, gist_trgm_ops and
gist_hstore_ops.  Also we parametrize maximum number of integer ranges for
gist__int_ops.  However, the main future usage of this feature is expected
to be json, where users would be able to specify which way to index particular
json parts.

Catversion is bumped.

Discussion: https://postgr.es/m/d22c3a18-31c7-1879-fc11-4c1ce2f5e5af%40postgrespro.ru
Author: Nikita Glukhov, revised by me
Reviwed-by: Nikolay Shaplov, Robert Haas, Tom Lane, Tomas Vondra, Alvaro Herrera
2020-03-30 19:17:23 +03:00
1f39bce021 Disk-based Hash Aggregation.
While performing hash aggregation, track memory usage when adding new
groups to a hash table. If the memory usage exceeds work_mem, enter
"spill mode".

In spill mode, new groups are not created in the hash table(s), but
existing groups continue to be advanced if input tuples match. Tuples
that would cause a new group to be created are instead spilled to a
logical tape to be processed later.

The tuples are spilled in a partitioned fashion. When all tuples from
the outer plan are processed (either by advancing the group or
spilling the tuple), finalize and emit the groups from the hash
table. Then, create new batches of work from the spilled partitions,
and select one of the saved batches and process it (possibly spilling
recursively).

Author: Jeff Davis
Reviewed-by: Tomas Vondra, Adam Lee, Justin Pryzby, Taylor Vesely, Melanie Plageman
Discussion: https://postgr.es/m/507ac540ec7c20136364b5272acbcd4574aa76ef.camel@j-davis.com
2020-03-18 15:42:02 -07:00
c11cb17dc5 Save calculated transitionSpace in Agg node.
This will be useful in the upcoming Hash Aggregation work to improve
estimates for hash table sizing.

Discussion: https://postgr.es/m/37091115219dd522fd9ed67333ee8ed1b7e09443.camel%40j-davis.com
2020-02-27 11:20:56 -08:00
a477bfc1df Suppress unnecessary RelabelType nodes in more cases.
eval_const_expressions sometimes produced RelabelType nodes that
were useless because they just relabeled an expression to the same
exposed type it already had.  This is worth avoiding because it can
cause two equivalent expressions to not be equal(), preventing
recognition of useful optimizations.  In the test case added here,
an unpatched planner fails to notice that the "sqli = constant" clause
renders a sort step unnecessary, because one code path produces an
extra RelabelType and another doesn't.

Fix by ensuring that eval_const_expressions_mutator's T_RelabelType
case will not add in an unnecessary RelabelType.  Also save some
code by sharing a subroutine with the effectively-equivalent cases
for CollateExpr and CoerceToDomain.  (CollateExpr had no bug, and
I think that the case couldn't arise with CoerceToDomain, but
it seems prudent to do the same check for all three cases.)

Back-patch to v12.  In principle this has been wrong all along,
but I haven't seen a case where it causes visible misbehavior
before v12, so refrain from changing stable branches unnecessarily.

Per investigation of a report from Eric Gillum.

Discussion: https://postgr.es/m/CAMmjdmvAZsUEskHYj=KT9sTukVVCiCSoe_PBKOXsncFeAUDPCQ@mail.gmail.com
2020-02-26 18:14:12 -05:00
9ce77d75c5 Reconsider the representation of join alias Vars.
The core idea of this patch is to make the parser generate join alias
Vars (that is, ones with varno pointing to a JOIN RTE) only when the
alias Var is actually different from any raw join input, that is a type
coercion and/or COALESCE is necessary to generate the join output value.
Otherwise just generate varno/varattno pointing to the relevant join
input column.

In effect, this means that the planner's flatten_join_alias_vars()
transformation is already done in the parser, for all cases except
(a) columns that are merged by JOIN USING and are transformed in the
process, and (b) whole-row join Vars.  In principle that would allow
us to skip doing flatten_join_alias_vars() in many more queries than
we do now, but we don't have quite enough infrastructure to know that
we can do so --- in particular there's no cheap way to know whether
there are any whole-row join Vars.  I'm not sure if it's worth the
trouble to add a Query-level flag for that, and in any case it seems
like fit material for a separate patch.  But even without skipping the
work entirely, this should make flatten_join_alias_vars() faster,
particularly where there are nested joins that it previously had to
flatten recursively.

An essential part of this change is to replace Var nodes'
varnoold/varoattno fields with varnosyn/varattnosyn, which have
considerably more tightly-defined meanings than the old fields: when
they differ from varno/varattno, they identify the Var's position in
an aliased JOIN RTE, and the join alias is what ruleutils.c should
print for the Var.  This is necessary because the varno change
destroyed ruleutils.c's ability to find the JOIN RTE from the Var's
varno.

Another way in which this change broke ruleutils.c is that it's no
longer feasible to determine, from a JOIN RTE's joinaliasvars list,
which join columns correspond to which columns of the join's immediate
input relations.  (If those are sub-joins, the joinaliasvars entries
may point to columns of their base relations, not the sub-joins.)
But that was a horrid mess requiring a lot of fragile assumptions
already, so let's just bite the bullet and add some more JOIN RTE
fields to make it more straightforward to figure that out.  I added
two integer-List fields containing the relevant column numbers from
the left and right input rels, plus a count of how many merged columns
there are.

This patch depends on the ParseNamespaceColumn infrastructure that
I added in commit 5815696bc.  The biggest bit of code change is
restructuring transformFromClauseItem's handling of JOINs so that
the ParseNamespaceColumn data is propagated upward correctly.

Other than that and the ruleutils fixes, everything pretty much
just works, though some processing is now inessential.  I grabbed
two pieces of low-hanging fruit in that line:

1. In find_expr_references, we don't need to recurse into join alias
Vars anymore.  There aren't any except for references to merged USING
columns, which are more properly handled when we scan the join's RTE.
This change actually fixes an edge-case issue: we will now record a
dependency on any type-coercion function present in a USING column's
joinaliasvar, even if that join column has no references in the query
text.  The odds of the missing dependency causing a problem seem quite
small: you'd have to posit somebody dropping an implicit cast between
two data types, without removing the types themselves, and then having
a stored rule containing a whole-row Var for a join whose USING merge
depends on that cast.  So I don't feel a great need to change this in
the back branches.  But in theory this way is more correct.

2. markRTEForSelectPriv and markTargetListOrigin don't need to recurse
into join alias Vars either, because the cases they care about don't
apply to alias Vars for USING columns that are semantically distinct
from the underlying columns.  This removes the only case in which
markVarForSelectPriv could be called with NULL for the RTE, so adjust
the comments to describe that hack as being strictly internal to
markRTEForSelectPriv.

catversion bump required due to changes in stored rules.

Discussion: https://postgr.es/m/7115.1577986646@sss.pgh.pa.us
2020-01-09 11:56:59 -05:00
913bbd88dc Improve the handling of result type coercions in SQL functions.
Use the parser's standard type coercion machinery to convert the
output column(s) of a SQL function's final SELECT or RETURNING
to the type(s) they should have according to the function's declared
result type.  We'll allow any case where an assignment-level
coercion is available.  Previously, we failed unless the required
coercion was a binary-compatible one (and the documentation ignored
this, falsely claiming that the types must match exactly).

Notably, the coercion now accounts for typmods, so that cases where
a SQL function is declared to return a composite type whose columns
are typmod-constrained now behave as one would expect.  Arguably
this aspect is a bug fix, but the overall behavioral change here
seems too large to consider back-patching.

A nice side-effect is that functions can now be inlined in a
few cases where we previously failed to do so because of type
mismatches.

Discussion: https://postgr.es/m/18929.1574895430@sss.pgh.pa.us
2020-01-08 11:07:59 -05:00
7559d8ebfa Update copyrights for 2020
Backpatch-through: update all files in master, backpatch legal files through 9.4
2020-01-01 12:21:45 -05:00
7854e07f25 Revert "Rename files and headers related to index AM"
This follows multiple complains from Peter Geoghegan, Andres Freund and
Alvaro Herrera that this issue ought to be dug more before actually
happening, if it happens.

Discussion: https://postgr.es/m/20191226144606.GA5659@alvherre.pgsql
2019-12-27 08:09:00 +09:00
8ce3aa9b59 Rename files and headers related to index AM
The following renaming is done so as source files related to index
access methods are more consistent with table access methods (the
original names used for index AMs ware too generic, and could be
confused as including features related to table AMs):
- amapi.h -> indexam.h.
- amapi.c -> indexamapi.c.  Here we have an equivalent with
backend/access/table/tableamapi.c.
- amvalidate.c -> indexamvalidate.c.
- amvalidate.h -> indexamvalidate.h.
- genam.c -> indexgenam.c.
- genam.h -> indexgenam.h.

This has been discussed during the development of v12 when table AM was
worked on, but the renaming never happened.

Author: Michael Paquier
Reviewed-by: Fabien Coelho, Julien Rouhaud
Discussion: https://postgr.es/m/20191223053434.GF34339@paquier.xyz
2019-12-25 10:23:39 +09:00
55a1954da1 Fix EXPLAIN's column alias output for mismatched child tables.
If an inheritance/partitioning parent table is assigned some column
alias names in the query, EXPLAIN mapped those aliases onto the
child tables' columns by physical position, resulting in bogus output
if a child table's columns aren't one-for-one with the parent's.

To fix, make expand_single_inheritance_child() generate a correctly
re-mapped column alias list, rather than just copying the parent
RTE's alias node.  (We have to fill the alias field, not just
adjust the eref field, because ruleutils.c will ignore eref in
favor of looking at the real column names.)

This means that child tables will now always have alias fields in
plan rtables, where before they might not have.  That results in
a rather substantial set of regression test output changes:
EXPLAIN will now always show child tables with aliases that match
the parent table (usually with "_N" appended for uniqueness).
But that seems like a net positive for understandability, since
the parent alias corresponds to something that actually appeared
in the original query, while the child table names didn't.
(Note that this does not change anything for cases where an explicit
table alias was written in the query for the parent table; it
just makes cases without such aliases behave similarly to that.)
Hence, while we could avoid these subsidiary changes if we made
inherit.c more complicated, we choose not to.

Discussion: https://postgr.es/m/12424.1575168015@sss.pgh.pa.us
2019-12-02 19:08:10 -05:00
ce76c0ba53 Add a reverse-translation column number array to struct AppendRelInfo.
This provides for cheaper mapping of child columns back to parent
columns.  The one existing use-case in examine_simple_variable()
would hardly justify this by itself; but an upcoming bug fix will
make use of this array in a mainstream code path, and it seems
likely that we'll find other uses for it as we continue to build
out the partitioning infrastructure.

Discussion: https://postgr.es/m/12424.1575168015@sss.pgh.pa.us
2019-12-02 18:05:29 -05:00