Commit Graph

7547 Commits

Author SHA1 Message Date
5262f7a4fc Add optimizer and executor support for parallel index scans.
In combination with 569174f1be92be93f5366212cc46960d28a5c5cd, which
taught the btree AM how to perform parallel index scans, this allows
parallel index scan plans on btree indexes.  This infrastructure
should be general enough to support parallel index scans for other
index AMs as well, if someone updates them to support parallel
scans.

Amit Kapila, reviewed and tested by Anastasia Lubennikova, Tushar
Ahuja, and Haribabu Kommi, and me.
2017-02-15 13:53:24 -05:00
51ee6f3160 Replace min_parallel_relation_size with two new GUCs.
When min_parallel_relation_size was added, the only supported type
of parallel scan was a parallel sequential scan, but there are
pending patches for parallel index scan, parallel index-only scan,
and parallel bitmap heap scan.  Those patches introduce two new
types of complications: first, what's relevant is not really the
total size of the relation but the portion of it that we will scan;
and second, index pages and heap pages shouldn't necessarily be
treated in exactly the same way.  Typically, the number of index
pages will be quite small, but that doesn't necessarily mean that
a parallel index scan can't pay off.

Therefore, we introduce min_parallel_table_scan_size, which works
out a degree of parallelism for scans based on the number of table
pages that will be scanned (and which is therefore equivalent to
min_parallel_relation_size for parallel sequential scans) and also
min_parallel_index_scan_size which can be used to work out a degree
of parallelism based on the number of index pages that will be
scanned.

Amit Kapila and Robert Haas

Discussion: http://postgr.es/m/CAA4eK1KowGSYYVpd2qPpaPPA5R90r++QwDFbrRECTE9H_HvpOg@mail.gmail.com
Discussion: http://postgr.es/m/CAA4eK1+TnM4pXQbvn7OXqam+k_HZqb0ROZUMxOiL6DWJYCyYow@mail.gmail.com
2017-02-15 13:37:24 -05:00
5d40286985 Fix wrong articles in pg_proc descriptions.
This technically should involve a catversion bump, but that seems
pedantic, so I skipped it.

Report and patch by David Christensen.
2017-02-15 12:13:38 -05:00
6d16ecc646 Add CREATE COLLATION IF NOT EXISTS clause
The core of the functionality was already implemented when
pg_import_system_collations was added.  This just exposes it as an
option in the SQL command.
2017-02-15 10:01:28 -05:00
569174f1be btree: Support parallel index scans.
This isn't exposed to the optimizer or the executor yet; we'll add
support for those things in a separate patch.  But this puts the
basic mechanism in place: several processes can attach to a parallel
btree index scan, and each one will get a subset of the tuples that
would have been produced by a non-parallel scan.  Each index page
becomes the responsibility of a single worker, which then returns
all of the TIDs on that page.

Rahila Syed, Amit Kapila, Robert Haas, reviewed and tested by
Anastasia Lubennikova, Tushar Ahuja, and Haribabu Kommi.
2017-02-15 07:41:14 -05:00
5e6d8d2bbb Allow parallel workers to execute subplans.
This doesn't do anything to make Param nodes anything other than
parallel-restricted, so this only helps with uncorrelated subplans,
and it's not necessarily very cheap because each worker will run the
subplan separately (just as a Hash Join will build a separate copy of
the hash table in each participating process), but it's a first step
toward supporting cases that are more likely to help in practice, and
is occasionally useful on its own.

Amit Kapila, reviewed and tested by Rafia Sabih, Dilip Kumar, and
me.

Discussion: http://postgr.es/m/CAA4eK1+e8Z45D2n+rnDMDYsVEb5iW7jqaCH_tvPMYau=1Rru9w@mail.gmail.com
2017-02-14 18:16:03 -05:00
8da9a22636 Split index xlog headers from other private index headers.
The xlog-specific headers need to be included in both frontend code -
specifically, pg_waldump - and the backend, but the remainder of the
private headers for each index are only needed by the backend.  By
splitting the xlog stuff out into separate headers, pg_waldump pulls
in fewer backend headers, which is a good thing.

Patch by me, reviewed by Michael Paquier and Andres Freund, per a
complaint from Dilip Kumar.

Discussion: http://postgr.es/m/CA+TgmoZ=F=GkxV0YEv-A8tb+AEGy_Qa7GSiJ8deBKFATnzfEug@mail.gmail.com
2017-02-14 15:37:59 -05:00
fb47544d0c Minor fixes for WAL consistency checking.
Michael Paquier, reviewed and slightly revised by me.

Discussion: http://postgr.es/m/CAB7nPqRzCQb=vdfHvMtP0HMLBHU6z1aGdo4GJsUP-HP8jx+Pkw@mail.gmail.com
2017-02-14 12:41:01 -05:00
2ea5b06c7a Add CREATE SEQUENCE AS <data type> clause
This stores a data type, required to be an integer type, with the
sequence.  The sequences min and max values default to the range
supported by the type, and they cannot be set to values exceeding that
range.  The internal implementation of the sequence is not affected.

Change the serial types to create sequences of the appropriate type.
This makes sure that the min and max values of the sequence for a serial
column match the range of values supported by the table column.  So the
sequence can no longer overflow the table column.

This also makes monitoring for sequence exhaustion/wraparound easier,
which currently requires various contortions to cross-reference the
sequences with the table columns they are used with.

This commit also effectively reverts the pg_sequence column reordering
in f3b421da5f4addc95812b9db05a24972b8fd9739, because the new seqtypid
column allows us to fill the hole in the struct and create a more
natural overall column ordering.

Reviewed-by: Steve Singer <steve@ssinger.info>
Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
2017-02-10 15:34:35 -05:00
85c11324ca Rename user-facing tools with "xlog" in the name to say "wal".
This means pg_receivexlog because pg_receivewal, pg_resetxlog
becomes pg_resetwal, and pg_xlogdump becomes pg_waldump.
2017-02-09 16:23:46 -05:00
806091c96f Remove all references to "xlog" from SQL-callable functions in pg_proc.
Commit f82ec32ac30ae7e3ec7c84067192535b2ff8ec0e renamed the pg_xlog
directory to pg_wal.  To make things consistent, and because "xlog" is
terrible terminology for either "transaction log" or "write-ahead log"
rename all SQL-callable functions that contain "xlog" in the name to
instead contain "wal".  (Note that this may pose an upgrade hazard for
some users.)

Similarly, rename the xlog_position argument of the functions that
create slots to be called wal_position.

Discussion: https://www.postgresql.org/message-id/CA+Tgmob=YmA=H3DbW1YuOXnFVgBheRmyDkWcD9M8f=5bGWYEoQ@mail.gmail.com
2017-02-09 15:10:09 -05:00
72257f9578 simplehash: Additional tweaks to make specifying an allocator work.
Even if we don't emit definitions for SH_ALLOCATE and SH_FREE, we
still need prototypes.  The user can't define them before including
simplehash.h because SH_TYPE isn't available yet.

For the allocator to be able to access private_data, it needs to
become an argument to SH_CREATE.  Previously we relied on callers
to set that after returning from SH_CREATE, but SH_CREATE calls
SH_ALLOCATE before returning.

Dilip Kumar, reviewed by me.
2017-02-09 14:59:57 -05:00
86d911ec0f Allow index AMs to cache data across aminsert calls within a SQL command.
It's always been possible for index AMs to cache data across successive
amgettuple calls within a single SQL command: the IndexScanDesc.opaque
field is meant for precisely that.  However, no comparable facility
exists for amortizing setup work across successive aminsert calls.
This patch adds such a feature and teaches GIN, GIST, and BRIN to use it
to amortize catalog lookups they'd previously been doing on every call.
(The other standard index AMs keep everything they need in the relcache,
so there's little to improve there.)

For GIN, the overall improvement in a statement that inserts many rows
can be as much as 10%, though it seems a bit less for the other two.
In addition, this makes a really significant difference in runtime
for CLOBBER_CACHE_ALWAYS tests, since in those builds the repeated
catalog lookups are vastly more expensive.

The reason this has been hard up to now is that the aminsert function is
not passed any useful place to cache per-statement data.  What I chose to
do is to add suitable fields to struct IndexInfo and pass that to aminsert.
That's not widening the index AM API very much because IndexInfo is already
within the ken of ambuild; in fact, by passing the same info to aminsert
as to ambuild, this is really removing an inconsistency in the AM API.

Discussion: https://postgr.es/m/27568.1486508680@sss.pgh.pa.us
2017-02-09 11:52:12 -05:00
a507b86900 Add WAL consistency checking facility.
When the new GUC wal_consistency_checking is set to a non-empty value,
it triggers recording of additional full-page images, which are
compared on the standby against the results of applying the WAL record
(without regard to those full-page images).  Allowable differences
such as hints are masked out, and the resulting pages are compared;
any difference results in a FATAL error on the standby.

Kuntal Ghosh, based on earlier patches by Michael Paquier and Heikki
Linnakangas.  Extensively reviewed and revised by Michael Paquier and
by me, with additional reviews and comments from Amit Kapila, Álvaro
Herrera, Simon Riggs, and Peter Eisentraut.
2017-02-08 15:45:30 -05:00
c3c4f6e174 Revise the way the element allocator for a simplehash is specified.
This method is more elegant and more efficient.

Per a suggestion from Andres Freund, who also briefly reviewed
the patch.
2017-02-07 17:10:08 -05:00
ac8eb972f2 Avoid redefining simplehash_allocate/simplehash_free.
There's no generic guard against multiple inclusion in this file,
for good reason.  But these typedefs need one, as per a report
from Jeff Janes.
2017-02-07 16:20:05 -05:00
565903af47 Allow the element allocator for a simplehash to be specified.
This is infrastructure for a pending patch to allow parallel bitmap
heap scans.

Dilip Kumar, reviewed (in earlier versions) by Andres Freund and
(more recently) by me.  Some further renaming by me, also.
2017-02-07 16:01:44 -05:00
293e24e507 Cache hash index's metapage in rel->rd_amcache.
This avoids a very significant amount of buffer manager traffic and
contention when scanning hash indexes, because it's no longer
necessary to lock and pin the metapage for every scan.  We do need
some way of figuring out when the cache is too stale to use any more,
so that when we lock the primary bucket page to which the cached
metapage points us, we can tell whether a split has occurred since we
cached the metapage data.  To do that, we use the hash_prevblkno field
in the primary bucket page, which would otherwise always be set to
InvalidBuffer.

This patch contains code so that it will continue working (although
less efficiently) with hash indexes built before this change, but
perhaps we should consider bumping the hash version and ripping out
the compatibility code.  That decision can be made later, though.

Mithun Cy, reviewed by Jesper Pedersen, Amit Kapila, and by me.
Before committing, I made a number of cosmetic changes to the last
posted version of the patch, adjusted _hash_getcachedmetap to be more
careful about order of operation, and made some necessary updates to
the pageinspect documentation and regression tests.
2017-02-07 12:35:45 -05:00
181bdb90ba Fix typos in comments.
Backpatch to all supported versions, where applicable, to make backpatching
of future fixes go more smoothly.

Josh Soref

Discussion: https://www.postgresql.org/message-id/CACZqfqCf+5qRztLPgmmosr-B0Ye4srWzzw_mo4c_8_B_mtjmJQ@mail.gmail.com
2017-02-06 11:33:58 +02:00
08bf6e5295 pageinspect: Support hash indexes.
Patch by Jesper Pedersen and Ashutosh Sharma, with some error handling
improvements by me.  Tests from Peter Eisentraut.  Reviewed by Álvaro
Herrera, Michael Paquier, Jesper Pedersen, Jeff Janes, Peter
Eisentraut, Amit Kapila, Mithun Cy, and me.

Discussion: http://postgr.es/m/e2ac6c58-b93f-9dd9-f4e6-d6d30add7fdf@redhat.com
2017-02-02 14:19:32 -05:00
f1169ab501 Don't count background workers against a user's connection limit.
Doing so doesn't seem to be within the purpose of the per user
connection limits, and has particularly unfortunate effects in
conjunction with parallel queries.

Backpatch to 9.6 where parallel queries were introduced.

David Rowley, reviewed by Robert Haas and Albe Laurenz.
2017-02-01 18:02:43 -05:00
aedd554f84 Fix CatalogTupleInsert/Update abstraction for case of shared indstate.
Add CatalogTupleInsertWithInfo and CatalogTupleUpdateWithInfo to let
callers use the CatalogTupleXXX abstraction layer even in cases where
we want to share the results of CatalogOpenIndexes across multiple
inserts/updates for efficiency.  This finishes the job begun in commit
2f5c9d9c9, by allowing some remaining simple_heap_insert/update
calls to be replaced.  The abstraction layer is now complete enough
that we don't have to export CatalogIndexInsert at all anymore.

Also, this fixes several places in which 2f5c9d9c9 introduced performance
regressions by using retail CatalogTupleInsert or CatalogTupleUpdate even
though the previous coding had been able to amortize CatalogOpenIndexes
work across multiple tuples.

A possible future improvement is to arrange for the indexing.c functions
to cache the CatalogIndexState somewhere, maybe in the relcache, in which
case we could get rid of CatalogTupleInsertWithInfo and
CatalogTupleUpdateWithInfo again.  But that's a task for another day.

Discussion: https://postgr.es/m/27502.1485981379@sss.pgh.pa.us
2017-02-01 17:18:36 -05:00
ab02896510 Provide CatalogTupleDelete() as a wrapper around simple_heap_delete().
This extends the work done in commit 2f5c9d9c9 to provide a more nearly
complete abstraction layer hiding the details of index updating for catalog
changes.  That commit only invented abstractions for catalog inserts and
updates, leaving nearby code for catalog deletes still calling the
heap-level routines directly.  That seems rather ugly from here, and it
does little to help if we ever want to shift to a storage system in which
indexing work is needed at delete time.

Hence, create a wrapper function CatalogTupleDelete(), and replace calls
of simple_heap_delete() on catalog tuples with it.  There are now very
few direct calls of [simple_]heap_delete remaining in the tree.

Discussion: https://postgr.es/m/462.1485902736@sss.pgh.pa.us
2017-02-01 16:13:30 -05:00
dbd69118c0 Replace isMD5() with a more future-proof way to check if pw is encrypted.
The rule is that if pg_authid.rolpassword begins with "md5" and has the
right length, it's an MD5 hash, otherwise it's a plaintext password. The
idiom has been to use isMD5() to check for that, but that gets awkward,
when we add new kinds of verifiers, like the verifiers for SCRAM
authentication in the pending SCRAM patch set. Replace isMD5() with a new
get_password_type() function, so that when new verifier types are added, we
don't need to remember to modify every place that currently calls isMD5(),
to also recognize the new kinds of verifiers.

Also, use the new plain_crypt_verify function in passwordcheck, so that it
doesn't need to know about MD5, or in the future, about other kinds of
hashes or password verifiers.

Reviewed by Michael Paquier and Peter Eisentraut.

Discussion: https://www.postgresql.org/message-id/2d07165c-1793-e243-a2a9-e45b624c7580@iki.fi
2017-02-01 13:11:37 +02:00
2f5c9d9c9c Tweak catalog indexing abstraction for upcoming WARM
Split the existing CatalogUpdateIndexes into two different routines,
CatalogTupleInsert and CatalogTupleUpdate, which do both the heap
insert/update plus the index update.  This removes over 300 lines of
boilerplate code all over src/backend/catalog/ and src/backend/commands.
The resulting code is much more pleasing to the eye.

Also, by encapsulating what happens in detail during an UPDATE, this
facilitates the upcoming WARM patch, which is going to add a few more
lines to the update case making the boilerplate even more boring.

The original CatalogUpdateIndexes is removed; there was only one use
left, and since it's just three lines, we can as well expand it in place
there.  We could keep it, but WARM is going to break all the UPDATE
out-of-core callsites anyway, so there seems to be no benefit in doing
so.

Author: Pavan Deolasee
Discussion: https://www.postgr.es/m/CABOikdOcFYSZ4vA2gYfs=M2cdXzXX4qGHeEiW3fu9PCfkHLa2A@mail.gmail.com
2017-01-31 18:42:24 -03:00
de16ab7238 Invent pg_hba_file_rules view to show the content of pg_hba.conf.
This view is designed along the same lines as pg_file_settings, to wit
it shows what is currently in the file, not what the postmaster has
loaded as the active settings.  That allows it to be used to pre-vet
edits before issuing SIGHUP.  As with the earlier view, go out of our
way to allow errors in the file to be reflected in the view, to assist
that use-case.

(We might at some point invent a view to show the current active settings,
but this is not that patch; and it's not trivial to do.)

Haribabu Kommi, reviewed by Ashutosh Bapat, Michael Paquier, Simon Riggs,
and myself

Discussion: https://postgr.es/m/CAJrrPGerH4jiwpcXT1-46QXUDmNp2QDrG9+-Tek_xC8APHShYw@mail.gmail.com
2017-01-30 18:00:26 -05:00
e54f75722c Handle ALTER EXTENSION ADD/DROP with pg_init_privs
In commit 6c268df, pg_init_privs was added to track the initial
privileges of catalog objects and extensions.  Unfortunately, that
commit didn't include understanding of ALTER EXTENSION ADD/DROP, which
allows the objects associated with an extension to be changed after the
initial CREATE EXTENSION script has been run.

The result of this meant that ACLs for objects added through
ALTER EXTENSION ADD were not recorded into pg_init_privs and we would
end up including those ACLs in pg_dump when we shouldn't have.

This commit corrects that by making sure to have pg_init_privs updated
when ALTER EXTENSION ADD/DROP is run, recording the permissions as they
are at ALTER EXTENSION ADD time, and removing any if/when ALTER
EXTENSION DROP is called.

This issue was pointed out by Moshe Jacobson as commentary on bug #14456
(which was actually a bug about versions prior to 9.6 not handling
custom ACLs on extensions correctly, an issue now addressed with
pg_init_privs in 9.6).

Back-patch to 9.6 where pg_init_privs was introduced.
2017-01-29 23:05:07 -05:00
da08a65989 Refactor bitmap heap scan estimation of heap pages fetched.
Currently, we only need this logic in order to cost a Bitmap Heap
Scan.  But a pending patch for Parallel Bitmap Heap Scan also uses
it to help figure out how many workers to use for the scan, which
has to be determined prior to costing.  So, move the logic to
a separate function to make that easier.

Dilip Kumar.  The patch series of which this is a part has been
reviewed by Andres Freund, Amit Khendekar, Tushar Ahuja, Rafia
Sabih, Haribabu Kommi, and me; it is not clear from the email
discussion which of those people have looked specifically at this
part.

Discussion: http://postgr.es/m/CAFiTN-v3QYNJEZnnmKCeATuLbN-h9tMVfeEF0+BrouYDqjXgwg@mail.gmail.com
2017-01-27 16:28:47 -05:00
fefb86b147 Orthography fixes for new castNode() macro.
Clean up hastily-composed comment.  Normalize whitespace.

Erik Rijkers and myself
2017-01-27 08:33:58 -05:00
5bcab11142 Add castNode(type, ptr) for safe casting between NodeTag based types.
The new function allows to cast from one NodeTag based type to
another, while asserting that the conversion is valid.  This replaces
the common pattern of doing a cast and a Assert(IsA(ptr, type))
close-by.

As this seems likely to be used pervasively, we decided to backpatch
this change the addition of this macro. Otherwise backpatched fixes
are more likely not to work on back-branches.

On branches before 9.6, where we do not yet rely on inline functions
being available, the type assertion is only performed if PG_USE_INLINE
support is detected. The cast obviously is performed regardless.

For the benefit of verifying the macro compiles in the back-branches,
this commit contains a single use of the new macro. On master, a
somewhat larger conversion will be committed separately.

Author: Peter Eisentraut and Andres Freund
Reviewed-By: Tom Lane
Discussion: https://postgr.es/m/c5d387d9-3440-f5e0-f9d4-71d53b9fbe52@2ndquadrant.com
Backpatch: 9.2-
2017-01-26 16:47:03 -08:00
bdadf36eb4 Fix typo in description for pg_replication_origin_advance function. 2017-01-27 00:42:33 +09:00
f7c6246240 Introduce convenience macros to hide JsonbContainer header accesses better.
This improves readability a bit and may make future improvements easier.

In passing, make sure that the JB_ROOT_IS_XXX macros deliver boolean (0/1)
results; the previous coding was a bug hazard, though no actual bugs are
known.

Nikita Glukhov, extended a bit by me

Discussion: https://postgr.es/m/9e21a39c-c1d7-b9b5-44a0-c5345a5029f6@postgrespro.ru
2017-01-25 13:28:38 -05:00
3d9e73ea5f Update copyright years in some recently added files 2017-01-25 12:32:05 -05:00
aebeb4790c Remove vestigial resolveUnknown arguments from transformSortClause etc.
There's really no situation where we don't want these unknown-to-text
conversions to happen.  The alternative is failure anyway, and the one
caller that was passing "false" did so only because it expected the
case could not arise.  Might as well simplify the code.

Discussion: https://postgr.es/m/CAH2L28uwwbL9HUM-WR=hromW1Cvamkn7O-g8fPY2m=_7muJ0oA@mail.gmail.com
2017-01-25 09:35:00 -05:00
d8d32d9a56 Make UNKNOWN into an actual pseudo-type.
Previously, type "unknown" was labeled as a base type in pg_type, which
perhaps had some sense to it because you were allowed to create tables with
unknown-type columns.  But now that we don't allow that, it makes more
sense to label it a pseudo-type.  This has the additional effects of
forbidding use of "unknown" as a domain base type, cast source or target
type, PL function argument or result type, or plpgsql local variable type;
all of which seem like good holes to plug.

Discussion: https://postgr.es/m/CAH2L28uwwbL9HUM-WR=hromW1Cvamkn7O-g8fPY2m=_7muJ0oA@mail.gmail.com
2017-01-25 09:27:09 -05:00
1e7c4bb004 Change unknown-type literals to type text in SELECT and RETURNING lists.
Previously, we left such literals alone if the query or subquery had
no properties forcing a type decision to be made (such as an ORDER BY or
DISTINCT clause using that output column).  This meant that "unknown" could
be an exposed output column type, which has never been a great idea because
it could result in strange failures later on.  For example, an outer query
that tried to do any operations on an unknown-type subquery output would
generally fail with some weird error like "failed to find conversion
function from unknown to text" or "could not determine which collation to
use for string comparison".  Also, if the case occurred in a CREATE VIEW's
query then the view would have an unknown-type column, causing similar
failures in queries trying to use the view.

To fix, at the tail end of parse analysis of a query, forcibly convert any
remaining "unknown" literals in its SELECT or RETURNING list to type text.
However, provide a switch to suppress that, and use it in the cases of
SELECT inside a set operation or INSERT command.  In those cases we already
had type resolution rules that make use of context information from outside
the subquery proper, and we don't want to change that behavior.

Also, change creation of an unknown-type column in a relation from a
warning to a hard error.  The error should be unreachable now in CREATE
VIEW or CREATE MATVIEW, but it's still possible to explicitly say "unknown"
in CREATE TABLE or CREATE (composite) TYPE.  We want to forbid that because
it's nothing but a foot-gun.

This change creates a pg_upgrade failure case: a matview that contains an
unknown-type column can't be pg_upgraded, because reparsing the matview's
defining query will now decide that the column is of type text, which
doesn't match the cstring-like storage that the old materialized column
would actually have.  Add a checking pass to detect that.  While at it,
we can detect tables or composite types that would fail, essentially
for free.  Those would fail safely anyway later on, but we might as
well fail earlier.

This patch is by me, but it owes something to previous investigations
by Rahila Syed.  Also thanks to Ashutosh Bapat and Michael Paquier for
review.

Discussion: https://postgr.es/m/CAH2L28uwwbL9HUM-WR=hromW1Cvamkn7O-g8fPY2m=_7muJ0oA@mail.gmail.com
2017-01-25 09:17:24 -05:00
d1ecd53947 Add a SHOW command to the replication command language.
This is useful infrastructure for an upcoming proposed patch to
allow the WAL segment size to be changed at initdb time; tools like
pg_basebackup need the ability to interrogate the server setting.
But it also doesn't seem like a bad thing to have independently of
that; it may find other uses in the future.

Robert Haas and Beena Emerson.  (The original patch here was by
Beena, but I rewrote it to such a degree that most of the code
being committed here is mine.)

Discussion: http://postgr.es/m/CA+TgmobNo4qz06wHEmy9DszAre3dYx-WNhHSCbU9SAwf+9Ft6g@mail.gmail.com
2017-01-24 17:04:12 -05:00
a84069d935 Add a new DestReceiver for printing tuples without catalog access.
If you create a DestReciver of type DestRemote and try to use it from
a replication connection that is not bound to a specific daabase, or
any other hypothetical type of backend that is not bound to a specific
database, it will fail because it doesn't have a pg_proc catalog to
look up properties of the types being printed.  In general, that's
an unavoidable problem, but we can hardwire the properties of a few
builtin types in order to support utility commands.  This new
DestReceiver of type DestRemoteSimple does just that.

Patch by me, reviewed by Michael Paquier.

Discussion: http://postgr.es/m/CA+TgmobNo4qz06wHEmy9DszAre3dYx-WNhHSCbU9SAwf+9Ft6g@mail.gmail.com
2017-01-24 16:53:56 -05:00
7b4ac19982 Extend index AM API for parallel index scans.
This patch doesn't actually make any index AM parallel-aware, but it
provides the necessary functions at the AM layer to do so.

Rahila Syed, Amit Kapila, Robert Haas
2017-01-24 16:42:58 -05:00
27cdb3414b Reindent table partitioning code.
We've accumulated quite a bit of stuff with which pgindent is not
quite happy in this code; clean it up to provide a less-annoying base
for future pgindent runs.
2017-01-24 10:20:02 -05:00
96e0ccc2b5 Fix incorrect comment: pgtime's tm_mon is 1-based, not 0-based.
The comments in formatting.c already said that tm_mon was 1-based not
0-based, but the comments here disagreed.

Dmitry Fedin
2017-01-24 09:36:17 -05:00
b1ecb9b3fc Fix interaction of partitioned tables with BulkInsertState.
When copying into a partitioned table, the target heap may change from
one tuple to next.  We must ask ReadBufferBI() to get a new buffer
every time such change occurs.  To do that, use new function
ReleaseBulkInsertStatePin().  This fixes the bug that tuples ended up
being inserted into the wrong partition, which occurred exactly
because the wrong buffer was used.

Amit Langote, per a suggestion from Robert Haas.  Some cosmetic
adjustments by me.

Reports by 高增琦 (Gao Zengqi), Venkata B Nagothi, and
Ragnar Ouchterlony.

Discussion: http://postgr.es/m/CAFmBtr32FDOqofo8yG-4mjzL1HnYHxXK5S9OGFJ%3D%3DcJpgEW4vA%40mail.gmail.com
Discussion: http://postgr.es/m/CAEyp7J9WiX0L3DoiNcRrY-9iyw%3DqP%2Bj%3DDLsAnNFF1xT2J1ggfQ%40mail.gmail.com
Discussion: http://postgr.es/m/16d73804-c9cd-14c5-463e-5caad563ff77%40agama.tv
Discussion: http://postgr.es/m/CA+TgmoaiZpDVUUN8LZ4jv1qFE_QyR+H9ec+79f5vNczYarg5Zg@mail.gmail.com
2017-01-24 08:50:16 -05:00
0bc1207aeb Fix default minimum value for descending sequences
For some reason that is lost in history, a descending sequence would
default its minimum value to -2^63+1 (-PG_INT64_MAX) instead of
-2^63 (PG_INT64_MIN), even though explicitly specifying a minimum value
of -2^63 would work.  Fix this inconsistency by using the full range by
default.

Reported-by: Daniel Verite <daniel@manitou-mail.org>
Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
2017-01-23 14:00:58 -05:00
b480086760 Add more includes so header files are self-contained 2017-01-21 15:49:53 -05:00
f21a563d25 Move some things from builtins.h to new header files
This avoids that builtins.h has to include additional header files.
2017-01-20 20:29:53 -05:00
e4c27f5bef Bump catversion 2017-01-20 09:07:13 -05:00
665d1fad99 Logical replication
- Add PUBLICATION catalogs and DDL
- Add SUBSCRIPTION catalog and DDL
- Define logical replication protocol and output plugin
- Add logical replication workers

From: Petr Jelinek <petr@2ndquadrant.com>
Reviewed-by: Steve Singer <steve@ssinger.info>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Erik Rijkers <er@xs4all.nl>
Reviewed-by: Peter Eisentraut <peter.eisentraut@2ndquadrant.com>
2017-01-20 09:04:49 -05:00
ea15e18677 Remove obsoleted code relating to targetlist SRF evaluation.
Since 69f4b9c plain expression evaluation (and thus normal projection)
can't return sets of tuples anymore. Thus remove code dealing with
that possibility.

This will require adjustments in external code using
ExecEvalExpr()/ExecProject() - that should neither be hard nor very
common.

Author: Andres Freund and Tom Lane
Discussion: https://postgr.es/m/20160822214023.aaxz5l4igypowyri@alap3.anarazel.de
2017-01-19 14:40:41 -08:00
8eace46d34 Fix race condition in reading commit timestamps
If a user requests the commit timestamp for a transaction old enough
that its data is concurrently being truncated away by vacuum at just the
right time, they would receive an ugly internal file-not-found error
message from slru.c rather than the expected NULL return value.

In a primary server, the window for the race is very small: the lookup
has to occur exactly between the two calls by vacuum, and there's not a
lot that happens between them (mostly just a multixact truncate).  In a
standby server, however, the window is larger because the truncation is
executed as soon as the WAL record for it is replayed, but the advance
of the oldest-Xid is not executed until the next checkpoint record.

To fix in the primary, simply reverse the order of operations in
vac_truncate_clog.  To fix in the standby, augment the WAL truncation
record so that the standby is aware of the new oldest-XID value and can
apply the update immediately.  WAL version bumped because of this.

No backpatch, because of the low importance of the bug and its rarity.

Author: Craig Ringer
Reviewed-By: Petr Jelínek, Peter Eisentraut
Discussion: https://postgr.es/m/CAMsr+YFhVtRQT1VAwC+WGbbxZZRzNou=N9Ed-FrCqkwQ8H8oJQ@mail.gmail.com
2017-01-19 18:24:17 -03:00
05bd889904 Fix RETURNING to work correctly with partition tuple routing.
In ExecInsert(), do not switch back to the root partitioned table
ResultRelInfo until after we finish ExecProcessReturning(), so that
RETURNING projection is done using the partition's descriptor.  For
the projection to work correctly, we must initialize the same for each
leaf partition during ModifyTableState initialization.

Amit Langote
2017-01-19 13:20:11 -05:00