Commit Graph

5245 Commits

Author SHA1 Message Date
ccaa3569f5 Recognize some OR clauses as compatible with functional dependencies
Since commit 8f321bd16c functional dependencies can handle IN clauses,
which however introduced a possible (and surprising) inconsistency,
because IN clauses may be expressed as an OR clause, which are still
considered incompatible. For example

  a IN (1, 2, 3)

may be rewritten as

  (a = 1 OR a = 2 OR a = 3)

The IN clause will work fine with functional dependencies, but the OR
clause will force the estimation to fall back to plain per-column
estimates, possibly introducing significant estimation errors.

This commit recognizes OR clauses equivalent to an IN clause (when all
arugments are compatible and reference the same attribute) as a special
case, compatible with functional dependencies. This allows applying
functional dependencies, just like for IN clauses.

This does not eliminate the difference in estimating the clause itself,
i.e. IN clause and OR clause still use different formulas. It would be
possible to change that (for these special OR clauses), but that's not
really about extended statistics - it was always like this. Moreover the
errors are usually much smaller compared to ignoring dependencies.

Author: Tomas Vondra
Reviewed-by: Dean Rasheed
Discussion: https://www.postgresql.org/message-id/flat/13902317.Eha0YfKkKy%40pierred-pdoc
2020-03-18 16:41:49 +01:00
e6c178b5b7 Refactor our checks for valid function and aggregate signatures.
pg_proc.c and pg_aggregate.c had near-duplicate copies of the logic
to decide whether a function or aggregate's signature is legal.
This seems like a bad thing even without the problem that the
upcoming "anycompatible" patch would roughly double the complexity
of that logic.  Hence, refactor so that the rules are localized
in new subroutines supplied by parse_coerce.c.  (One could quibble
about just where to add that code, but putting it beside
enforce_generic_type_consistency seems not totally unreasonable.)

The fact that the rules are about to change would mandate some
changes in the wording of the associated error messages in any case.
I ended up spelling things out in a fairly literal fashion in the
errdetail messages, eg "A result of type anyelement requires at
least one input of type anyelement, anyarray, anynonarray, anyenum,
or anyrange."  Perhaps this is overkill, but once there's more than
one subgroup of polymorphic types, people might get confused by
more-abstract messages.

Discussion: https://postgr.es/m/24137.1584139352@sss.pgh.pa.us
2020-03-17 19:36:41 -04:00
77ec5affbc Adjust handling of an ANYARRAY actual input for an ANYARRAY argument.
Ordinarily it's impossible for an actual input of a function to have
declared type ANYARRAY, since we'd resolve that to a concrete array
type before doing argument type resolution for the function.  But an
exception arises for functions applied to certain columns of pg_statistic
or pg_stats, since we abuse the "anyarray" pseudotype by using it to
declare those columns.  So parse_coerce.c has to deal with the case.

Previously we allowed an ANYARRAY actual input to match an ANYARRAY
polymorphic argument, but only if no other argument or result was
declared ANYELEMENT.  When that logic was written, those were the only
two polymorphic types, and I fear nobody thought carefully about how it
ought to extend to the other ones that came along later.  But actually
it was wrong even then, because if a function has two ANYARRAY
arguments, it should be able to expect that they have identical element
types, and we'd not be able to ensure that.

The correct generalization is that we can match an ANYARRAY actual input
to an ANYARRAY polymorphic argument only if no other argument or result
is of any polymorphic type, so that no promises are being made about
element type compatibility.  check_generic_type_consistency can't test
that condition, but it seems better anyway to accept such matches there
and then throw an error if needed in enforce_generic_type_consistency.
That way we can produce a specific error message rather than an
unintuitive "function does not exist" complaint.  (There's some risk
perhaps of getting new ambiguous-function complaints, but I think that
any set of functions for which that could happen would be ambiguous
against ordinary array columns as well.)  While we're at it, we can
improve the message that's produced in cases that the code did already
object to, as shown in the regression test changes.

Also, remove a similar test that got cargo-culted in for ANYRANGE;
there are no catalog columns of type ANYRANGE, and I hope we never
create any, so that's not needed.  (It was incomplete anyway.)

While here, update some comments and rearrange the code a bit in
preparation for upcoming additions of more polymorphic types.

In practical situations I believe this is just exchanging one error
message for another, hopefully better, one.  So it doesn't seem
needful to back-patch, even though the mistake is ancient.

Discussion: https://postgr.es/m/21569.1584314271@sss.pgh.pa.us
2020-03-17 18:29:14 -04:00
9d9784c840 Remove bogus assertion about polymorphic SQL function result.
It is possible to reach check_sql_fn_retval() with an unresolved
polymorphic rettype, resulting in an assertion failure as demonstrated
by one of the added test cases.  However, the code following that
throws what seems an acceptable error message, so just remove the
Assert and adjust commentary.

While here, I thought it'd be a good idea to provide some parallel
tests of SQL-function and PL/pgSQL-function polymorphism behavior.
Some of these cases are perhaps duplicative of tests elsewhere,
but we hadn't any organized coverage of the topic AFAICS.

Although that assertion's been wrong all along, it won't have any
effect in production builds, so I'm not bothering to back-patch.

Discussion: https://postgr.es/m/21569.1584314271@sss.pgh.pa.us
2020-03-17 14:54:46 -04:00
b4570d33aa Avoid holding a directory FD open across assorted SRF calls.
This extends the fixes made in commit 085b6b667 to other SRFs with the
same bug, namely pg_logdir_ls(), pgrowlocks(), pg_timezone_names(),
pg_ls_dir(), and pg_tablespace_databases().

Also adjust various comments and documentation to warn against
expecting to clean up resources during a ValuePerCall SRF's final
call.

Back-patch to all supported branches, since these functions were
all born broken.

Justin Pryzby, with cosmetic tweaks by me

Discussion: https://postgr.es/m/20200308173103.GC1357@telsasoft.com
2020-03-16 21:05:52 -04:00
70a7b4776b Add backend type to csvlog and optionally log_line_prefix
The backend type, which corresponds to what
pg_stat_activity.backend_type shows, is added as a column to the
csvlog and can optionally be added to log_line_prefix using the new %b
placeholder.

Reviewed-by: Julien Rouhaud <rjuju123@gmail.com>
Reviewed-by: Kuntal Ghosh <kuntalghosh.2007@gmail.com>
Reviewed-by: Alvaro Herrera <alvherre@2ndquadrant.com>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Discussion: https://www.postgresql.org/message-id/flat/c65e5196-4f04-4ead-9353-6088c19615a3@2ndquadrant.com
2020-03-15 11:20:21 +01:00
d8cfa82d51 Improve test coverage for multi-column MCV lists
The regression tests for extended statistics were not testing a couple
of important cases for the MCV lists:

  * IS NOT NULL clauses - We did have queries with IS NULL clauses, but
    not the negative case.

  * clauses with variable on the right - All the clauses had the Var on
    the left, i.e. (Var op Const), so this adds (Const op Var) too.

  * columns with fixed-length types passed by reference - All columns
    were using either by-value or varlena types, so add a test with
    UUID columns too. This matters for (de)serialization.

  * NULL-only dimension - When one of the columns contains only NULL
    values, we treat it a a special case during (de)serialization.

  * arrays containing NULL - When the constant parameter contains NULL
    value, we need to handle it correctly during estimation, for all
    IN, ANY and ALL clauses.

Discussion: https://www.postgresql.org/message-id/flat/20200113230008.g67iyk4cs3xbnjju@development
Author: Tomas Vondra
2020-03-14 23:09:40 +01:00
f9696782c7 Improve test coverage for functional dependencies
The regression tests for functional dependencies were only using clauses
of the form (Var op Const), i.e. with Var on the left side. This adds
a couple of queries with Var on the right, to test other code paths.

It also prints one of the functional dependencies, to test the data type
output function. The functional dependencies are "perfect" with degree
of 1.0 so this should be stable.

Discussion: https://www.postgresql.org/message-id/flat/20200113230008.g67iyk4cs3xbnjju@development
Author: Tomas Vondra
2020-03-14 23:06:23 +01:00
4dbcb3f844 Restructure polymorphic-type resolution in funcapi.c.
resolve_polymorphic_tupdesc() and resolve_polymorphic_argtypes() failed to
cover the case of having to resolve anyarray given only an anyrange input.
The bug was masked if anyelement was also used (as either input or
output), which probably helps account for our not having noticed.

While looking at this I noticed that resolve_generic_type() would produce
the wrong answer if asked to make that same resolution.  ISTM that
resolve_generic_type() is confusingly defined and overly complex, so
rather than fix it, let's just make funcapi.c do the specific lookups
it requires for itself.

With this change, resolve_generic_type() is not used anywhere, so remove
it in HEAD.  In the back branches, leave it alone (complete with bug)
just in case any external code is using it.

While we're here, make some other refactoring adjustments in funcapi.c
with an eye to upcoming future expansion of the set of polymorphic types:

* Simplify quick-exit tests by adding an overall have_polymorphic_result
flag.  This is about a wash now but will be a win when there are more
flags.

* Reduce duplication of code between resolve_polymorphic_tupdesc() and
resolve_polymorphic_argtypes().

* Don't bother to validate correct matching of anynonarray or anyenum;
the parser should have done that, and even if it didn't, just doing
"return false" here would lead to a very confusing, off-point error
message.  (Really, "return false" in these two functions should only
occur if the call_expr isn't supplied or we can't obtain data type
info from it.)

* For the same reason, throw an elog rather than "return false" if
we fail to resolve a polymorphic type.

The bug's been there since we added anyrange, so back-patch to
all supported branches.

Discussion: https://postgr.es/m/6093.1584202130@sss.pgh.pa.us
2020-03-14 14:42:22 -04:00
e83daa7e33 Use multi-variate MCV lists to estimate ScalarArrayOpExpr
Commit 8f321bd16c added support for estimating ScalarArrayOpExpr clauses
(IN/ANY) clauses using functional dependencies. There's no good reason
not to support estimation of these clauses using multi-variate MCV lists
too, so this commits implements that. That makes the behavior consistent
and MCV lists can estimate all variants (ANY/ALL, inequalities, ...).

Author: Tomas Vondra
Review: Dean Rasheed
Discussion: https://www.postgresql.org/message-id/flat/13902317.Eha0YfKkKy%40pierred-pdoc
2020-03-14 16:13:00 +01:00
8f321bd16c Use functional dependencies to estimate ScalarArrayOpExpr
Until now functional dependencies supported only simple equality clauses
and clauses that can be trivially translated to equalities. This commit
allows estimation of some ScalarArrayOpExpr (IN/ANY) clauses.

For IN clauses we can do this thanks to using operator with equality
semantics, which means an IN clause

    WHERE c IN (1, 2, ..., N)

can be translated to

    WHERE (c = 1 OR c = 2 OR ... OR c = N)

IN clauses are now considered compatible with functional dependencies,
and rely on the same assumption of consistency of queries with data
(which is an assumption we already used for simple equality clauses).
This applies also to ALL clauses with an equality operator, which can be
considered equivalent to IN clause.

ALL clauses are still considered incompatible, although there's some
discussion about maybe relaxing this in the future.

Author: Pierre Ducroquet
Reviewed-by: Tomas Vondra, Dean Rasheed
Discussion: https://www.postgresql.org/message-id/flat/13902317.Eha0YfKkKy%40pierred-pdoc
2020-03-14 16:12:41 +01:00
1cc9c2412c Preserve replica identity index across ALTER TABLE rewrite
If an index was explicitly set as replica identity index, this setting
was lost when a table was rewritten by ALTER TABLE.  Because this
setting is part of pg_index but actually controlled by ALTER
TABLE (not part of CREATE INDEX, say), we have to do some extra work
to restore it.

Based-on-patch-by: Quan Zongliang <quanzongliang@gmail.com>
Reviewed-by: Euler Taveira <euler.taveira@2ndquadrant.com>
Discussion: https://www.postgresql.org/message-id/flat/c70fcab2-4866-0d9f-1d01-e75e189db342@gmail.com
2020-03-13 11:57:06 +01:00
a029a0641c Fix test case instability introduced in 085b6b667.
I forgot that the WAL directory might hold other files besides WAL
segments, notably including new segments still being filled.
That means a blind test for the first file's size being 16MB can
fail.  Restrict based on file name length to make it more robust.

Per buildfarm.
2020-03-11 18:24:11 -04:00
b08dee24a5 Add pg_dump support for ALTER obj DEPENDS ON EXTENSION
pg_dump is oblivious to this kind of dependency, so they're lost on
dump/restores (and pg_upgrade).  Have pg_dump emit ALTER lines so that
they're preserved.  Add some pg_dump tests for the whole thing, also.

Reviewed-by: Tom Lane (offlist)
Reviewed-by: Ibrar Ahmed
Reviewed-by: Ahsan Hadi (who also reviewed commit 899a04f5ed61)
Discussion: https://postgr.es/m/20200217225333.GA30974@alvherre.pgsql
2020-03-11 16:54:54 -03:00
085b6b6679 Avoid holding a directory FD open across pg_ls_dir_files() calls.
This coding technique is undesirable because (a) it leaks the FD for
the rest of the transaction if the SRF is not run to completion, and
(b) allocated FDs are a scarce resource, but multiple interleaved
uses of the relevant functions could eat many such FDs.

In v11 and later, a query such as "SELECT pg_ls_waldir() LIMIT 1"
yields a warning about the leaked FD, and the only reason there's
no warning in earlier branches is that fd.c didn't whine about such
leaks before commit 9cb7db3f0.  Even disregarding the warning, it
wouldn't be too hard to run a backend out of FDs with careless use
of these SQL functions.

Hence, rewrite the function so that it reads the directory within
a single call, returning the results as a tuplestore rather than
via value-per-call mode.

There are half a dozen other built-in SRFs with similar problems,
but let's fix this one to start with, just to see if the buildfarm
finds anything wrong with the code.

In passing, fix bogus error report for stat() failure: it was
whining about the directory when it should be fingering the
individual file.  Doubtless a copy-and-paste error.

Back-patch to v10 where this function was added.

Justin Pryzby, with cosmetic tweaks and test cases by me

Discussion: https://postgr.es/m/20200308173103.GC1357@telsasoft.com
2020-03-11 15:27:59 -04:00
899a04f5ed Avoid duplicates in ALTER ... DEPENDS ON EXTENSION
If the command is attempted for an extension that the object already
depends on, silently do nothing.

In particular, this means that if a database containing multiple such
entries is dumped, the restore will silently do the right thing and
record just the first one.  (At least, in a world where pg_dump does
dump such entries -- which it doesn't currently, but it will.)

Backpatch to 9.6, where this kind of dependency was introduced.

Reviewed-by: Ibrar Ahmed, Tom Lane (offlist)
Discussion: https://postgr.es/m/20200217225333.GA30974@alvherre.pgsql
2020-03-11 11:04:59 -03:00
cacef17223 Ensure that CREATE TABLE LIKE copies any NO INHERIT constraint property.
Since the documentation about LIKE doesn't say that a copied constraint
has properties different from the original, it seems that ignoring
a NO INHERIT property doesn't meet the principle of least surprise.
So make it copy that.

(Note, however, that we still don't copy a NOT VALID property;
CREATE TABLE offers no way to do that, plus it seems pointless.)

Arguably this is a bug fix; but no back-patch, as it seems barely
possible somebody is depending on the current behavior.

Ildar Musin and Chris Travers; reviewed by Amit Langote and myself

Discussion: https://postgr.es/m/CAONYFtMC6C+3AWCVp7Yd8H87Zn0GxG1_iQG6_bQKbaqYZY0=-g@mail.gmail.com
2020-03-10 14:54:00 -04:00
d01f03a495 Preserve integer and float values accurately in (de)serialize_deflist.
Previously, this code just smashed all types of DefElem values to
strings, cavalierly reasoning that nobody would care.  But in point of
fact, most of the defGetFoo functions do distinguish among different
input syntaxes; for instance defGetBoolean will accept 1 as an integer
but not "1" as a string.  This led to CREATE/ALTER TEXT SEARCH
DICTIONARY accepting 0 and 1 as values for boolean dictionary
properties, only to have the dictionary fail at runtime.

We can upgrade this behavior by teaching serialize_deflist that it
does not need to quote T_Integer or T_Float nodes' values on output,
and then teaching deserialize_deflist to restore unquoted integer or
float values as the appropriate node type.  This should not break
anything using pg_ts_dict.dictinitoption, since that field is just
defined as being something valid to include in CREATE TEXT SEARCH
DICTIONARY.

deserialize_deflist is also used to parse the options arguments
for the ts_headline family of functions, but so far as I can see
this won't cause any problems there either: the only consumer of
that output is prsd_headline which always uses defGetString.
(Really that's a bad idea, but I won't risk changing it here.)

This is surely a bug fix, but given the lack of field complaints
I don't think it's necessary to back-patch.

Discussion: https://postgr.es/m/CAMkU=1xRcs_BUPzR0+V3WndaCAv0E_m3h6aUEJ8NF-sY1nnHsw@mail.gmail.com
2020-03-10 12:30:02 -04:00
17b9e7f9fe Support adding partitioned tables to publication
When a partitioned table is added to a publication, changes of all of
its partitions (current or future) are published via that publication.

This change only affects which tables a publication considers as its
members.  The receiving side still sees the data coming from the
individual leaf partitions.  So existing restrictions that partition
hierarchies can only be replicated one-to-one are not changed by this.

Author: Amit Langote <amitlangote09@gmail.com>
Reviewed-by: Rafia Sabih <rafia.pghackers@gmail.com>
Reviewed-by: Peter Eisentraut <peter.eisentraut@2ndquadrant.com>
Discussion: https://www.postgresql.org/message-id/flat/CA+HiwqH=Y85vRK3mOdjEkqFK+E=ST=eQiHdpj43L=_eJMOOznQ@mail.gmail.com
2020-03-10 09:09:32 +01:00
b0b5e20cd8 Show opclass and opfamily related information in psql
This commit provides psql commands for listing operator classes, operator
families and its contents in psql.  New commands will be useful for exploring
capabilities of both builtin opclasses/opfamilies as well as
opclasses/opfamilies defined in extensions.

Discussion: https://postgr.es/m/1529675324.14193.5.camel%40postgrespro.ru
Author: Sergey Cherkashin, Nikita Glukhov, Alexander Korotkov
Reviewed-by: Michael Paquier, Alvaro Herrera, Arthur Zakirov
Reviewed-by: Kyotaro Horiguchi, Andres Freund
2020-03-08 13:33:16 +03:00
a6525588b7 Allow Unicode escapes in any server encoding, not only UTF-8.
SQL includes provisions for numeric Unicode escapes in string
literals and identifiers.  Previously we only accepted those
if they represented ASCII characters or the server encoding
was UTF-8, making the conversion to internal form trivial.
This patch adjusts things so that we'll call the appropriate
encoding conversion function in less-trivial cases, allowing
the escape sequence to be accepted so long as it corresponds
to some character available in the server encoding.

This also applies to processing of Unicode escapes in JSONB.
However, the old restriction still applies to client-side
JSON processing, since that hasn't got access to the server's
encoding conversion infrastructure.

This patch includes some lexer infrastructure that simplifies
throwing errors with error cursors pointing into the middle of
a string (or other complex token).  For the moment I only used
it for errors relating to Unicode escapes, but we might later
expand the usage to some other cases.

Patch by me, reviewed by John Naylor.

Discussion: https://postgr.es/m/2393.1578958316@sss.pgh.pa.us
2020-03-06 14:17:43 -05:00
fe30e7ebfa Allow ALTER TYPE to change some properties of a base type.
Specifically, this patch allows ALTER TYPE to:
* Change the default TOAST strategy for a toastable base type;
* Promote a non-toastable type to toastable;
* Add/remove binary I/O functions for a type;
* Add/remove typmod I/O functions for a type;
* Add/remove a custom ANALYZE statistics functions for a type.

The first of these can be done by the type's owner; all the others
require superuser privilege since misuse could cause problems.

The main motivation for this patch is to allow extensions to
upgrade the feature sets of their data types, so the set of
alterable properties is biased towards that use-case.  However
it's also true that changing some other properties would be
a lot harder, as they get baked into physical storage and/or
stored expressions that depend on the type.

Along the way, refactor GenerateTypeDependencies() to make it easier
to call, refactor DefineType's volatility checks so they can be shared
by AlterType, and teach typcache.c that it might have to reload data
from the type's pg_type row, a scenario it never handled before.
Also rearrange alter_type.sgml a bit for clarity (put the
composite-type operations together).

Tomas Vondra and Tom Lane

Discussion: https://postgr.es/m/20200228004440.b23ein4qvmxnlpht@development
2020-03-06 12:19:29 -05:00
bb03010b9f Remove the "opaque" pseudo-type and associated compatibility hacks.
A long time ago, it was necessary to declare datatype I/O functions,
triggers, and language handler support functions in a very type-unsafe
way involving a single pseudo-type "opaque".  We got rid of those
conventions in 7.3, but there was still support in various places to
automatically convert such functions to the modern declaration style,
to be able to transparently re-load dumps from pre-7.3 servers.
It seems unnecessary to continue to support that anymore, so take out
the hacks; whereupon the "opaque" pseudo-type itself is no longer
needed and can be dropped.

This is part of a group of patches removing various server-side kluges
for transparently upgrading pre-8.0 dump files.  Since we've had few
complaints about dropping pg_dump's support for dumping from pre-8.0
servers (commit 64f3524e2), it seems okay to now remove these kluges.

Discussion: https://postgr.es/m/4110.1583255415@sss.pgh.pa.us
2020-03-05 15:48:56 -05:00
3ed2005ff5 Introduce macros for typalign and typstorage constants.
Our usual practice for "poor man's enum" catalog columns is to define
macros for the possible values and use those, not literal constants,
in C code.  But for some reason lost in the mists of time, this was
never done for typalign/attalign or typstorage/attstorage.  It's never
too late to make it better though, so let's do that.

The reason I got interested in this right now is the need to duplicate
some uses of the TYPSTORAGE constants in an upcoming ALTER TYPE patch.
But in general, this sort of change aids greppability and readability,
so it's a good idea even without any specific motivation.

I may have missed a few places that could be converted, and it's even
more likely that pending patches will re-introduce some hard-coded
references.  But that's not fatal --- there's no expectation that
we'd actually change any of these values.  We can clean up stragglers
over time.

Discussion: https://postgr.es/m/16457.1583189537@sss.pgh.pa.us
2020-03-04 10:34:25 -05:00
d677550493 Allow to_date/to_timestamp to recognize non-English month/day names.
to_char() has long allowed the TM (translation mode) prefix to
specify output of translated month or day names; but that prefix
had no effect in input format strings.  Now it does.  to_date()
and to_timestamp() will now recognize the same month or day names
that to_char() would output for the same format code.  Matching is
case-insensitive (per the active collation's notion of what that
means), just as it has always been for English month/day names
without the TM prefix.

(As per the discussion thread, there are lots of cases that this
feature will not handle, such as alternate day names.  But being
able to accept what to_char() will output seems useful enough.)

In passing, fix some shaky English and violations of message
style guidelines in jsonpath errors for the .datetime() method,
which depends on this code.

Juan José Santamaría Flecha, reviewed and modified by me,
with other commentary from Alvaro Herrera, Tomas Vondra,
Arthur Zakirov, Peter Eisentraut, Mark Dilger.

Discussion: https://postgr.es/m/CAC+AXB3u1jTngJcoC1nAHBf=M3v-jrEfo86UFtCqCjzbWS9QhA@mail.gmail.com
2020-03-03 11:06:47 -05:00
0b48f1335d Fix assertion failure with ALTER TABLE ATTACH PARTITION and indexes
Using ALTER TABLE ATTACH PARTITION causes an assertion failure when
attempting to work on a partitioned index, because partitioned indexes
cannot have partition bounds.

The grammar of ALTER TABLE ATTACH PARTITION requires partition bounds,
but not ALTER INDEX, so mixing ALTER TABLE with partitioned indexes is
confusing.  Hence, on HEAD, prevent ALTER TABLE to attach a partition if
the relation involved is a partitioned index.  On back-branches, as
applications may rely on the existing behavior, just remove the
culprit assertion.

Reported-by: Alexander Lakhin
Author: Amit Langote, Michael Paquier
Discussion: https://postgr.es/m/16276-5cd1dcc8fb8be7b5@postgresql.org
Backpatch-through: 11
2020-03-03 13:55:41 +09:00
e65497df8f Report progress of streaming base backup.
This commit adds pg_stat_progress_basebackup view that reports
the progress while an application like pg_basebackup is taking
a base backup. This uses the progress reporting infrastructure
added by c16dc1aca5e0, adding support for streaming base backup.

Bump catversion.

Author: Fujii Masao
Reviewed-by: Kyotaro Horiguchi, Amit Langote, Sergei Kornilov
Discussion: https://postgr.es/m/9ed8b801-8215-1f3d-62d7-65bff53f6e94@oss.nttdata.com
2020-03-03 12:03:43 +09:00
d79fb88ac7 Preserve pg_index.indisclustered across REINDEX CONCURRENTLY
If the flag value is lost, a CLUSTER query following REINDEX
CONCURRENTLY could fail.  Non-concurrent REINDEX is already handling
this case consistently.

Author: Justin Pryzby
Discussion: https://postgr.es/m/20200229024202.GH29456@telsasoft.com
Backpatch-through: 12
2020-03-03 10:12:28 +09:00
2f9661311b Represent command completion tags as structs
The backend was using strings to represent command tags and doing string
comparisons in multiple places, but that's slow and unhelpful.  Create a
new command list with a supporting structure to use instead; this is
stored in a tag-list-file that can be tailored to specific purposes with
a caller-definable C macro, similar to what we do for WAL resource
managers.  The first first such uses are a new CommandTag enum and a
CommandTagBehavior struct.

Replace numerous occurrences of char *completionTag with a
QueryCompletion struct so that the code no longer stores information
about completed queries in a cstring.  Only at the last moment, in
EndCommand(), does this get converted to a string.

EventTriggerCacheItem no longer holds an array of palloc’d tag strings
in sorted order, but rather just a Bitmapset over the CommandTags.

Author: Mark Dilger, with unsolicited help from Álvaro Herrera
Reviewed-by: John Naylor, Tom Lane
Discussion: https://postgr.es/m/981A9DB4-3F0C-4DA5-88AD-CB9CFF4D6CAD@enterprisedb.com
2020-03-02 18:19:51 -03:00
43a899f41f Fix corner-case loss of precision in numeric ln().
When deciding on the local rscale to use for the Taylor series
expansion, ln_var() neglected to account for the fact that the result
is subsequently multiplied by a factor of 2^(nsqrt+1), where nsqrt is
the number of square root operations performed in the range reduction
step, which can be as high as 22 for very large inputs. This could
result in a loss of precision, particularly when combined with large
rscale values, for which a large number of Taylor series terms is
required (up to around 400).

Fix by computing a few extra digits in the Taylor series, based on the
weight of the multiplicative factor log10(2^(nsqrt+1)). It remains to
be proven whether or not the other 8 extra digits used for the Taylor
series is appropriate, but this at least deals with the obvious
oversight of failing to account for the effects of the final
multiplication.

Per report from Justin AnyhowStep. Reviewed by Tom Lane.

Discussion: https://postgr.es/m/16280-279f299d9c06e56f@postgresql.org
2020-03-01 14:49:25 +00:00
58c47ccfff Correctly re-use hash tables in buildSubPlanHash().
Commit 356687bd8 omitted to remove leftover code for destroying
a hashed subplan's hash tables, with the result that the tables
were always rebuilt not reused; this leads to severe memory
leakage if a hashed subplan is re-executed enough times.
Moreover, the code for reusing the hashnulls table had a typo
that would have made it do the wrong thing if it were reached.

Looking at the code coverage report shows severe under-coverage
of the potential callers of ResetTupleHashTable, so add some test
cases that exercise them.

Andreas Karlsson and Tom Lane, per reports from Ranier Vilela
and Justin Pryzby.

Backpatch to v11, as the faulty commit was.

Discussion: https://postgr.es/m/edb62547-c453-c35b-3ed6-a069e4d6b937@proxel.se
Discussion: https://postgr.es/m/CAEudQAo=DCebm1RXtig9OH+QivpS97sMkikt0A9qHmMUs+g6ZA@mail.gmail.com
Discussion: https://postgr.es/m/20200210032547.GA1412@telsasoft.com
2020-02-29 13:48:09 -05:00
1933ae629e Add PostgreSQL home page to --help output
Per emerging standard in GNU programs and elsewhere.  Autoconf already
has support for specifying a home page, so we can just that.

Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://www.postgresql.org/message-id/flat/8d389c5f-7fb5-8e48-9a4a-68cec44786fa%402ndquadrant.com
2020-02-28 13:12:21 +01:00
864934131e Refer to bug report address by symbol rather than hardcoding
Use the PACKAGE_BUGREPORT macro that is created by Autoconf for
referring to the bug reporting address rather than hardcoding it
everywhere.  This makes it easier to change the address and it reduces
translation work.

Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://www.postgresql.org/message-id/flat/8d389c5f-7fb5-8e48-9a4a-68cec44786fa%402ndquadrant.com
2020-02-28 13:12:21 +01:00
b9b408c487 Record parents of triggers
This let us get rid of a recently introduced ugly hack (commit
1fa846f1c9af).

Author: Álvaro Herrera
Reviewed-by: Amit Langote, Tom Lane
Discussion: https://postgr.es/m/20200217215641.GA29784@alvherre.pgsql
2020-02-27 13:23:33 -03:00
a477bfc1df Suppress unnecessary RelabelType nodes in more cases.
eval_const_expressions sometimes produced RelabelType nodes that
were useless because they just relabeled an expression to the same
exposed type it already had.  This is worth avoiding because it can
cause two equivalent expressions to not be equal(), preventing
recognition of useful optimizations.  In the test case added here,
an unpatched planner fails to notice that the "sqli = constant" clause
renders a sort step unnecessary, because one code path produces an
extra RelabelType and another doesn't.

Fix by ensuring that eval_const_expressions_mutator's T_RelabelType
case will not add in an unnecessary RelabelType.  Also save some
code by sharing a subroutine with the effectively-equivalent cases
for CollateExpr and CoerceToDomain.  (CollateExpr had no bug, and
I think that the case couldn't arise with CoerceToDomain, but
it seems prudent to do the same check for all three cases.)

Back-patch to v12.  In principle this has been wrong all along,
but I haven't seen a case where it causes visible misbehavior
before v12, so refrain from changing stable branches unnecessarily.

Per investigation of a report from Eric Gillum.

Discussion: https://postgr.es/m/CAMmjdmvAZsUEskHYj=KT9sTukVVCiCSoe_PBKOXsncFeAUDPCQ@mail.gmail.com
2020-02-26 18:14:12 -05:00
0d861bbb70 Add deduplication to nbtree.
Deduplication reduces the storage overhead of duplicates in indexes that
use the standard nbtree index access method.  The deduplication process
is applied lazily, after the point where opportunistic deletion of
LP_DEAD-marked index tuples occurs.  Deduplication is only applied at
the point where a leaf page split would otherwise be required.  New
posting list tuples are formed by merging together existing duplicate
tuples.  The physical representation of the items on an nbtree leaf page
is made more space efficient by deduplication, but the logical contents
of the page are not changed.  Even unique indexes make use of
deduplication as a way of controlling bloat from duplicates whose TIDs
point to different versions of the same logical table row.

The lazy approach taken by nbtree has significant advantages over a GIN
style eager approach.  Most individual inserts of index tuples have
exactly the same overhead as before.  The extra overhead of
deduplication is amortized across insertions, just like the overhead of
page splits.  The key space of indexes works in the same way as it has
since commit dd299df8 (the commit that made heap TID a tiebreaker
column).

Testing has shown that nbtree deduplication can generally make indexes
with about 10 or 15 tuples for each distinct key value about 2.5X - 4X
smaller, even with single column integer indexes (e.g., an index on a
referencing column that accompanies a foreign key).  The final size of
single column nbtree indexes comes close to the final size of a similar
contrib/btree_gin index, at least in cases where GIN's posting list
compression isn't very effective.  This can significantly improve
transaction throughput, and significantly reduce the cost of vacuuming
indexes.

A new index storage parameter (deduplicate_items) controls the use of
deduplication.  The default setting is 'on', so all new B-Tree indexes
automatically use deduplication where possible.  This decision will be
reviewed at the end of the Postgres 13 beta period.

There is a regression of approximately 2% of transaction throughput with
synthetic workloads that consist of append-only inserts into a table
with several non-unique indexes, where all indexes have few or no
repeated values.  The underlying issue is that cycles are wasted on
unsuccessful attempts at deduplicating items in non-unique indexes.
There doesn't seem to be a way around it short of disabling
deduplication entirely.  Note that deduplication of items in unique
indexes is fairly well targeted in general, which avoids the problem
there (we can use a special heuristic to trigger deduplication passes in
unique indexes, since we're specifically targeting "version bloat").

Bump XLOG_PAGE_MAGIC because xl_btree_vacuum changed.

No bump in BTREE_VERSION, since the representation of posting list
tuples works in a way that's backwards compatible with version 4 indexes
(i.e. indexes built on PostgreSQL 12).  However, users must still
REINDEX a pg_upgrade'd index to use deduplication, regardless of the
Postgres version they've upgraded from.  This is the only way to set the
new nbtree metapage flag indicating that deduplication is generally
safe.

Author: Anastasia Lubennikova, Peter Geoghegan
Reviewed-By: Peter Geoghegan, Heikki Linnakangas
Discussion:
    https://postgr.es/m/55E4051B.7020209@postgrespro.ru
    https://postgr.es/m/4ab6e2db-bcee-f4cf-0916-3a06e6ccbb55@postgrespro.ru
2020-02-26 13:05:30 -08:00
612a1ab767 Add equalimage B-Tree support functions.
Invent the concept of a B-Tree equalimage ("equality implies image
equality") support function, registered as support function 4.  This
indicates whether it is safe (or not safe) to apply optimizations that
assume that any two datums considered equal by an operator class's order
method must be interchangeable without any loss of semantic information.
This is static information about an operator class and a collation.

Register an equalimage routine for almost all of the existing B-Tree
opclasses.  We only need two trivial routines for all of the opclasses
that are included with the core distribution.  There is one routine for
opclasses that index non-collatable types (which returns 'true'
unconditionally), plus another routine for collatable types (which
returns 'true' when the collation is a deterministic collation).

This patch is infrastructure for an upcoming patch that adds B-Tree
deduplication.

Author: Peter Geoghegan, Anastasia Lubennikova
Discussion: https://postgr.es/m/CAH2-Wzn3Ee49Gmxb7V1VJ3-AC8fWn-Fr8pfWQebHe8rYRxt5OQ@mail.gmail.com
2020-02-26 11:28:25 -08:00
b2304a7174 Simplify FK-to-partitioned regression test query
Avoid a join between relations having the FK to detect FK violation.
The planner might optimize this considering the PK must exist on the
referenced side at some point, effectively masking a bug this test
tries to detect.

Tom Lane and Jehan-Guillaume de Rorthais
Discussion: https://postgr.es/m/467.1581270529@sss.pgh.pa.us
2020-02-20 14:14:20 -03:00
70a7732007 Remove support for upgrading extensions from "unpackaged" state.
Andres Freund pointed out that allowing non-superusers to run
"CREATE EXTENSION ... FROM unpackaged" has security risks, since
the unpackaged-to-1.0 scripts don't try to verify that the existing
objects they're modifying are what they expect.  Just attaching such
objects to an extension doesn't seem too dangerous, but some of them
do more than that.

We could have resolved this, perhaps, by still requiring superuser
privilege to use the FROM option.  However, it's fair to ask just what
we're accomplishing by continuing to lug the unpackaged-to-1.0 scripts
forward.  None of them have received any real testing since 9.1 days,
so they may not even work anymore (even assuming that one could still
load the previous "loose" object definitions into a v13 database).
And an installation that's trying to go from pre-9.1 to v13 or later
in one jump is going to have worse compatibility problems than whether
there's a trivial way to convert their contrib modules into extension
style.

Hence, let's just drop both those scripts and the core-code support
for "CREATE EXTENSION ... FROM".

Discussion: https://postgr.es/m/20200213233015.r6rnubcvl4egdh5r@alap3.anarazel.de
2020-02-19 16:59:14 -05:00
761a5688b1 Fix confusion about event trigger vs. plain function in plpgsql.
The function hash table keys made by compute_function_hashkey() failed
to distinguish event-trigger call context from regular call context.
This meant that once we'd successfully made a hash entry for an event
trigger (either by validation, or by normal use as an event trigger),
an attempt to call the trigger function as a plain function would
find this hash entry and thereby bypass the you-can't-do-that check in
do_compile().  Thus we'd attempt to execute the function, leading to
strange errors or even crashes, depending on function contents and
server version.

To fix, add an isEventTrigger field to PLpgSQL_func_hashkey,
paralleling the longstanding infrastructure for regular triggers.
This fits into what had been pad space, so there's no risk of an ABI
break, even assuming that any third-party code is looking at these
hash keys.  (I considered replacing isTrigger with a PLpgSQL_trigtype
enum field, but felt that that carried some API/ABI risk.  Maybe we
should change it in HEAD though.)

Per bug #16266 from Alexander Lakhin.  This has been broken since
event triggers were invented, so back-patch to all supported branches.

Discussion: https://postgr.es/m/16266-fcd7f838e97ba5d4@postgresql.org
2020-02-19 14:45:17 -05:00
b7e51b350c Make inherited LOCK TABLE perform access permission checks on parent table only.
Previously, LOCK TABLE command through a parent table checked
the permissions on not only the parent table but also the children
tables inherited from it. This was a bug and inherited queries should
perform access permission checks on the parent table only. This
commit fixes LOCK TABLE so that it does not check the permission
on children tables.

This patch is applied only in the master branch. We decided not to
back-patch because it's not hard to imagine that there are some
applications expecting the old behavior and the change breaks their
security.

Author: Amit Langote
Reviewed-by: Fujii Masao
Discussion: https://postgr.es/m/CAHGQGwE+GauyG7POtRfRwwthAGwTjPQYdFHR97+LzA4RHGnJxA@mail.gmail.com
2020-02-18 13:13:15 +09:00
751c63cea0 Remove pg_regress' --load-language option.
We haven't used this option since inventing extensions.  As of commit
50fc694e4 it's actually formally equivalent to --load-extension, so
let's just drop it.

Discussion: https://postgr.es/m/6853.1581627393@sss.pgh.pa.us
2020-02-14 11:20:07 -05:00
997563dfcb Try to harden insert-conflict-specconflict against autovacuum.
Looks like guaibasaurus had a autovacuum running during the
controller_print_speculative_locks step (just added in
43e08419708). Which does indeed seem quite possible.

Avoid the problem by only looking for the backends participating in
the test.
2020-02-11 21:18:14 -08:00
43e0841970 Test additional speculative conflict scenarios.
Previously, the speculative insert tests did not cover the case when a
tuple t is inserted into a table with a unique index on a column but
before it can insert into the index, a concurrent transaction has
inserted a conflicting value into the index and the insertion of tuple t
must be aborted.

The basic permutation is one session successfully inserts into the table
and an associated unique index while a concurrent session successfully
inserts into the table but discovers a conflict before inserting into
the index and must abort the insertion.

Several variants on this include:
- swap which session is successful
- first session insert transaction does not commit, so second session
must wait on a transaction lock
- first session insert does not "complete", so second session must wait
on a speculative insertion lock

Also, refactor the existing TOAST table upsert test to be in the same
spec and reuse the steps.

Author: Melanie Plageman, Ashwin Agrawal, Andres Freund
Reviewed-by: Andres Freund, Taylor Vesely
Discussion: https://postgr.es/m/CAAKRu_ZRmxy_OEryfY3G8Zp01ouhgw59_-_Cm8n7LzRH5BAvng@mail.gmail.com
2020-02-11 16:32:11 -08:00
55173d2e66 Fix failure to create FKs correctly in partitions
On a multi-level partioned table, when adding a partition not directly
connected to the root table, foreign key constraints referencing the
root were not cloned to the new partition, leading to the FK being
possibly inadvertently violated later on.

This was caused by fuzzy thinking in CloneFkReferenced (commit
f56f8f8da6af): it was skipping constraints marked as having parents on
the theory that cloning those would create duplicates; but that's only
correct for the top level of the partitioning hierarchy.  For levels
below that one, such constraints must still be considered and only
skipped if later on we see that we'd create duplicates.  Apparently, I
(Álvaro) wrote the comments right but the code implemented something
slightly different.

Author: Jehan-Guillaume de Rorthais
Discussion: https://postgr.es/m/20200206004948.238352db@firost
2020-02-07 18:27:18 -03:00
9710d3d4a8 Fix TRUNCATE .. CASCADE on partitions
When running TRUNCATE CASCADE on a child of a partitioned table
referenced by another partitioned table, the truncate was not applied to
partitions of the referencing table; this could leave rows violating the
constraint in the referencing partitioned table.  Repair by walking the
pg_constraint chain all the way up to the topmost referencing table.

Note: any partitioned tables containing FKs that reference other
partitioned tables should be checked for possible violating rows, if
TRUNCATE has occurred in partitions of the referenced table.

Reported-by: Christophe Courtois
Author: Jehan-Guillaume de Rorthais
Discussion: https://postgr.es/m/20200204183906.115f693e@firost
2020-02-07 17:09:36 -03:00
cb5b28613d Fix bug in Tid scan.
Commit 147e3722f7 changed Tid scan so that it calls table_beginscan()
and uses the scan option for seq scan. This change caused two issues.

(1) The change caused Tid scan to take a predicate lock on the entire
       relation in serializable transaction even when relation-level
       lock is not necessary. This could lead to an unexpected
       serialization error.

(2) The change caused Tid scan to increment the number of seq_scan
       in pg_stat_*_tables views even though it's not seq scan. This
       could confuse the users.

This commit adds the scan option for Tid scan and makes Tid scan
use it, to avoid those issues.

Back-patch to v12, where the bug was introduced.

Author: Tatsuhito Kasahara
Reviewed-by: Kyotaro Horiguchi, Masahiko Sawada, Fujii Masao
Discussion: https://postgr.es/m/CAP0=ZVKy+gTbFmB6X_UW0pP3WaeJ-fkUWHoD-pExS=at3CY76g@mail.gmail.com
2020-02-07 22:06:31 +09:00
414c2fd1e1 Revert "Add GUC checks for ssl_min_protocol_version and ssl_max_protocol_version"
This reverts commit 41aadee, as the GUC checks could run on older values
with the new values used, and result in incorrect errors if both
parameters are changed at the same time.

Per complaint from Tom Lane.

Discussion: https://postgr.es/m/27574.1581015893@sss.pgh.pa.us
Backpatch-through: 12
2020-02-07 08:10:40 +09:00
b025f32e0b Add leader_pid to pg_stat_activity
This new field tracks the PID of the group leader used with parallel
query.  For parallel workers and the leader, the value is set to the
PID of the group leader.  So, for the group leader, the value is the
same as its own PID.  Note that this reflects what PGPROC stores in
shared memory, so as leader_pid is NULL if a backend has never been
involved in parallel query.  If the backend is using parallel query or
has used it at least once, the value is set until the backend exits.

Author: Julien Rouhaud
Reviewed-by: Sergei Kornilov, Guillaume Lelarge, Michael Paquier, Tomas
Vondra
Discussion: https://postgr.es/m/CAOBaU_Yy5bt0vTPZ2_LUM6cUcGeqmYNoJ8-Rgto+c2+w3defYA@mail.gmail.com
2020-02-06 09:18:06 +09:00
bf6cc19e34 Force tuple conversion when the source has missing attributes.
Tuple conversion incorrectly concluded that no conversion was needed
as long as all the attributes lined up. But if the source tuple has a
missing attribute (from addition of a column with default), then the
destination tupdesc might not reflect the same default. The typical
symptom was that the affected columns would be unexpectedly NULL.

Repair by always forcing conversion if the source has missing
attributes, which will be filled in by the deform operation. (In
theory we could optimize for when the destination has the same
default, but that seemed overkill.)

Backpatch to 11 where missing attributes were added.

Per bug #16242.

Vik Fearing (discovery, code, testing) and me (analysis, testcase).

Discussion: https://postgr.es/m/16242-d1c9fca28445966b@postgresql.org
2020-02-05 20:21:20 +00:00