Commit Graph

28507 Commits

Author SHA1 Message Date
ebf7b2e01b Clean up properly error_context_stack in autovacuum worker on exception
Any callback set would have no meaning in the context of an exception.
As an autovacuum worker exits quickly in this context, this could be
only an issue within EmitErrorReport(), where the elog hook is for
example called.  That's unlikely to going to be a problem, but let's be
clean and consistent with other code paths handling exceptions.  This is
present since 2909419, which introduced autovacuum.

Author: Ashwin Agrawal
Reviewed-by: Tom Lane, Michael Paquier
Discussion: https://postgr.es/m/CALfoeisM+_+dgmAdAOHAu0k-ZpEHHqSSG=GRf3pKJGm8OqWX0w@mail.gmail.com
Backpatch-through: 9.4
2019-10-23 10:26:23 +09:00
e3267407e2 Deal with yet another issue related to "Norwegian (Bokmål)" locale.
It emerges that recent versions of Windows (at least 2016 Standard)
spell this locale name as "Norwegian Bokmål_Norway.1252", defeating
our mapping code that translates "Norwegian (Bokmål)_Norway" to
something that's all-ASCII (cf commits db29620d4 and aa1d2fc5e).
Add another mapping entry to handle this spelling.

Per bug #16068 from Robert Ford.  Like the previous patches,
back-patch to all supported branches.

Discussion: https://postgr.es/m/16068-4cb6eeaa7eb46d93@postgresql.org
2019-10-21 14:18:55 -04:00
11330c311a Select CFLAGS_SL at configure time, not in platform-specific Makefiles.
Move the platform-dependent logic that sets CFLAGS_SL from
src/makefiles/Makefile.foo to src/template/foo, so that the value
is determined at configure time and thus is available while running
configure's tests.

On a couple of platforms this might save a few microseconds of build
time by eliminating a test that make otherwise has to do over and over.
Otherwise it's pretty much a wash for build purposes; in particular,
this makes no difference to anyone who might be overriding CFLAGS_SL
via a make option.

This patch in itself does nothing with the value and thus should not
change any behavior, though you'll probably have to re-run configure
to get a correctly updated Makefile.global.  We'll use the new
configure variable in a follow-on patch.

Per gripe from Kyotaro Horiguchi.  Back-patch to all supported branches,
because the follow-on patch is a portability bug fix.

Discussion: https://postgr.es/m/20191010.144533.263180400.horikyota.ntt@gmail.com
2019-10-21 12:32:36 -04:00
62e881946c For PowerPC instruction "addi", use constraint "b".
Without "b", a variant of the tas() code miscompiles on macOS 10.4.
This may also fix a compilation failure involving macOS 10.1.  Today's
compilers have been allocating acceptable registers with or without this
change, but this future-proofs the code by precisely conveying the
acceptable registers.  Back-patch to 9.4 (all supported versions).

Reviewed by Tom Lane.

Discussion: https://postgr.es/m/20191009063900.GA4066266@rfd.leadboat.com
2019-10-18 20:20:32 -07:00
1b2ba8874a Fix failure of archive recovery with recovery_min_apply_delay enabled.
recovery_min_apply_delay parameter is intended for use with streaming
replication deployments. However, the document clearly explains that
the parameter will be honored in all cases if it's specified. So it should
take effect even if in archive recovery. But, previously, archive recovery
with recovery_min_apply_delay enabled always failed, and caused assertion
failure if --enable-caasert is enabled.

The cause of this problem is that; the ownership of recoveryWakeupLatch
that recovery_min_apply_delay uses was taken only when standby mode
is requested. So unowned latch could be used in archive recovery, and
which caused the failure.

This commit changes recovery code so that the ownership of
recoveryWakeupLatch is taken even in archive recovery. Which prevents
archive recovery with recovery_min_apply_delay from failing.

Back-patch to v9.4 where recovery_min_apply_delay was added.

Author: Fujii Masao
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/CAHGQGwEyD6HdZLfdWc+95g=VQFPR4zQL4n+yHxQgGEGjaSVheQ@mail.gmail.com
2019-10-18 22:35:41 +09:00
b2ab06e025 Fix minor bug in logical-replication walsender shutdown
Logical walsender should exit when it catches up with sending WAL during
shutdown; but there was a rare corner case when it failed to because of
a race condition that puts it back to wait for more WAL instead -- but
since there wasn't any, it'd not shut down immediately.  It would only
continue the shutdown when wal_sender_timeout terminates the sleep,
which causes annoying waits during shutdown procedure.  Restructure the
code so that we no longer forget to set WalSndCaughtUp in that case.

This was an oversight in commit c6c333436.

Backpatch all the way down to 9.4.

Author: Craig Ringer, Álvaro Herrera
Discussion: https://postgr.es/m/CAMsr+YEuz4XwZX_QmnX_-2530XhyAmnK=zCmicEnq1vLr0aZ-g@mail.gmail.com
2019-10-17 15:06:05 +02:00
cdbb392130 When restoring GUCs in parallel workers, show an error context.
Otherwise it can be hard to see where an error is coming from, when
the parallel worker sets all the GUCs that it received from the
leader.  Bug #15726.  Back-patch to 9.5, where RestoreGUCState()
appeared.

Reported-by: Tiago Anastacio
Reviewed-by: Daniel Gustafsson, Tom Lane
Discussion: https://postgr.es/m/15726-6d67e4fa14f027b3%40postgresql.org
2019-10-17 13:50:59 +13:00
c1443eebe5 Fix bug that could try to freeze running multixacts.
Commits 801c2dc7 and 801c2dc7 made it possible for vacuum to
try to freeze a multixact that is still running.  That was
prevented by a check, but raised an error.  Repair.

Back-patch all the way.

Author: Nathan Bossart, Jeremy Schneider
Reported-by: Jeremy Schneider
Reviewed-by: Jim Nasby, Thomas Munro
Discussion: https://postgr.es/m/DAFB8AFF-2F05-4E33-AD7F-FF8B0F760C17%40amazon.com
2019-10-17 10:28:28 +13:00
984aa0ede1 Add missing include to pg_upgrade/version.c
Commit 8d48e6a724 uses RELKIND_ constants when building the query, but
did not include the header defining them. On 10+ this header is already
included, but on 9.6 and earlier it was missing. It compiles just fine,
but then fails during execution

  ERROR:  column "relkind_relation" does not exist

Fix by adding the necessary header file, and backpatch to 9.4-.

Backpatch-to: 9.4-
Discussion: https://postgr.es/m/16045-673e8fa6b5ace196%40postgresql.org
2019-10-16 16:23:09 +02:00
f57b01dd75 Improve the check for pg_catalog.line data type in pg_upgrade
The pg_upgrade check for pg_catalog.line data type when upgrading from
9.3 had a couple of issues with domains and composite types. Firstly, it
triggered false positives for composite types unused in objects with
storage. This was enough to trigger an unnecessary pg_upgrade failure:

  CREATE TYPE line_composite AS (l pg_catalog.line)

On the other hand, this only happened with composite types directly on
the pg_catalog.line data type, but not with a domain. So this was not
detected

  CREATE DOMAIN line_domain AS pg_catalog.line;
  CREATE TYPE line_composite_2 AS (l line_domain);

unlike the first example. These false positives and inconsistencies are
unfortunate, but what's worse we've failed to detected objects using the
pg_catalog.line data type through a domain. So we missed cases like this

  CREATE TABLE t (l line_composite_2);

The consequence is clusters broken after a pg_upgrade.

This fixes these false positives and false negatives by using the same
recursive CTE introduced by eaf900e842 for sql_identifier. 9.3 did not
support domains on composite types, but we can still have multi-level
composite types.

Backpatch all the way to 9.4, where the format for pg_catalog.line data
type changed.

Author: Tomas Vondra
Backpatch-to: 9.4-
Discussion: https://postgr.es/m/16045-673e8fa6b5ace196%40postgresql.org
2019-10-16 13:27:51 +02:00
e40eb31c0d AIX: Stop adding option -qsrcmsg.
With xlc v16.1.0, it causes internal compiler errors.  With xlc versions
not exhibiting that bug, removing -qsrcmsg merely changes the compiler
error reporting format.  Back-patch to 9.4 (all supported versions).

Discussion: https://postgr.es/m/20191003064105.GA3955242@rfd.leadboat.com
2019-10-12 00:21:51 -07:00
c50f95272e Flush logical mapping files with fd opened for read/write at checkpoint
The file descriptor was opened with read-only to fsync a regular file,
which would cause EBADFD errors on some platforms.

This is similar to the recent fix done by a586cc4b (which was broken by
me with 82a5649), except that I noticed this issue while monitoring the
backend code for similar mistakes.  Backpatch to 9.4, as this has been
introduced since logical decoding exists as of b89e151.

Author: Michael Paquier
Reviewed-by: Andres Freund
Discussion: https://postgr.es/m/20191006045548.GA14532@paquier.xyz
Backpatch-through: 9.4
2019-10-09 13:31:33 +09:00
8c2910ce5e Check for too many postmaster children before spawning a bgworker.
The postmaster's code path for spawning a bgworker neglected to check
whether we already have the max number of live child processes.  That's
a bit hard to hit, since it would necessarily be a transient condition;
but if we do, AssignPostmasterChildSlot() fails causing a postmaster
crash, as seen in a report from Bhargav Kamineni.

To fix, invoke canAcceptConnections() in the bgworker code path, as we
do in the other code paths that spawn children.  Since we don't want
the same pmState tests in this case, add a child-process-type parameter
to canAcceptConnections() so that it can know what to do.

Back-patch to 9.5.  In principle the same hazard exists in 9.4, but the
code is enough different that this patch wouldn't quite fix it there.
Given the tiny usage of bgworkers in that branch it doesn't seem worth
creating a variant patch for it.

Discussion: https://postgr.es/m/18733.1570382257@sss.pgh.pa.us
2019-10-07 12:39:10 -04:00
57f0060891 Report test_atomic_ops() failures consistently, via macros.
This prints the unexpected value in more failure cases, and it removes
forty-eight hand-maintained error messages.  Back-patch to 9.5, which
introduced these tests.

Reviewed (in an earlier version) by Andres Freund.

Discussion: https://postgr.es/m/20190915160021.GA24376@alvherre.pgsql
2019-10-05 10:06:40 -07:00
6ca51b1552 Handle spaces in OpenSSL install location for MSVC
First, make sure that the .exe name is quoted when trying to get the
version number. Also, don't quote the lib name for using in the project
files if it's already been quoted. This second change applies to all
libraries, not just OpenSSL.

This has clearly been broken forever, so backpatch to all live branches.
2019-10-04 15:38:48 -04:00
8b77f783b7 Fix bitshiftright()'s zero-padding some more.
Commit 5ac0d9360 failed to entirely fix bitshiftright's habit of
leaving one-bits in the pad space that should be all zeroes,
because in a moment of sheer brain fade I'd concluded that only
the code path used for not-a-multiple-of-8 shift distances needed
to be fixed.  Of course, a multiple-of-8 shift distance can also
cause the problem, so we need to forcibly zero the extra bits
in both cases.

Per bug #16037 from Alexander Lakhin.  As before, back-patch to all
supported branches.

Discussion: https://postgr.es/m/16037-1d1ebca564db54f4@postgresql.org
2019-10-04 10:34:21 -04:00
54d641da06 Avoid unnecessary out-of-memory errors during encoding conversion.
Encoding conversion uses the very simplistic rule that the output
can't be more than 4X longer than the input, and palloc's a buffer
of that size.  This results in failure to convert any string longer
than 1/4 GB, which is becoming an annoying limitation.

As a band-aid to improve matters, allow the allocated output buffer
size to exceed 1GB.  We still insist that the final result fit into
MaxAllocSize (1GB), though.  Perhaps it'd be safe to relax that
restriction, but it'd require close analysis of all callers, which
is daunting (not least because external modules might call these
functions).  For the moment, this should allow a 2X to 4X improvement
in the longest string we can convert, which is a useful gain in
return for quite a simple patch.

Also, once we have successfully converted a long string, repalloc
the output down to the actual string length, returning the excess
to the malloc pool.  This seems worth doing since we can usually
expect to give back several MB if we take this path at all.

This still leaves much to be desired, most notably that the assumption
that MAX_CONVERSION_GROWTH == 4 is very fragile, and yet we have no
guard code verifying that the output buffer isn't overrun.  Fixing
that would require significant changes in the encoding conversion
APIs, so it'll have to wait for some other day.

The present patch seems safely back-patchable, so patch all supported
branches.

Alvaro Herrera and Tom Lane

Discussion: https://postgr.es/m/20190816181418.GA898@alvherre.pgsql
Discussion: https://postgr.es/m/3614.1569359690@sss.pgh.pa.us
2019-10-03 17:34:26 -04:00
1534531fe7 Allow repalloc() to give back space when a large chunk is downsized.
Up to now, if you resized a large (>8K) palloc chunk down to a smaller
size, aset.c made no attempt to return any space to the malloc pool.
That's unpleasant if a really large allocation is resized to a
significantly smaller size.  I think no such cases existed when this
code was designed, and I'm not sure whether they're common even yet,
but an upcoming fix to encoding conversion will certainly create such
cases.  Therefore, fix AllocSetRealloc so that it gives realloc()
a chance to do something with the block.  This doesn't noticeably
increase complexity, we mostly just have to change the order in which
the cases are considered.

Back-patch to all supported branches.

Discussion: https://postgr.es/m/20190816181418.GA898@alvherre.pgsql
Discussion: https://postgr.es/m/3614.1569359690@sss.pgh.pa.us
2019-10-03 13:56:26 -04:00
d2427f11b2 Selectively include window frames in expression walks/mutates.
query_tree_walker and query_tree_mutator were skipping the
windowClause of the query, without regard for the fact that the
startOffset and endOffset in a WindowClause node are expression trees
that need to be processed. This was an oversight in commit ec4be2ee6
from 2010 which added the expression fields; the main symptom is that
function parameters in window frame clauses don't work in inlined
functions.

Fix (as conservatively as possible since this needs to not break
existing out-of-tree callers) and add tests.

Backpatch all the way, since this has been broken since 9.0.

Per report from Alastair McKinley; fix by me with kibitzing and review
from Tom Lane.

Discussion: https://postgr.es/m/DB6PR0202MB2904E7FDDA9D81504D1E8C68E3800@DB6PR0202MB2904.eurprd02.prod.outlook.com
2019-10-03 11:18:15 +01:00
ae205dfe6c Remove temporary WAL and history files at the end of archive recovery
cbc55da has reworked the order of some actions at the end of archive
recovery.  Unfortunately this overlooked the fact that the startup
process needs to remove RECOVERYXLOG (for temporary WAL segment newly
recovered from archives) and RECOVERYHISTORY (for temporary history
file) at this step, leaving the files around even after recovery ended.

Backpatch to 9.5, like the previous commit.

Author: Sawada Masahiko
Reviewed-by: Fujii Masao, Michael Paquier
Discussion: https://postgr.es/m/CAD21AoBO_eDQub6zojFnWtnmutRBWvYf7=cW4Hsqj+U_R26w3Q@mail.gmail.com
Backpatch-through: 9.5
2019-10-02 15:54:16 +09:00
8364ef925c Fix oversight in commit 4429f6a9e3e12bb4af6e3677fbc78cd80f160252.
The test name and the following test cases suggest the index created
should be hash index, but it forgot to add 'using hash' in the test case.
This in itself won't improve code coverage as there were some other tests
which were covering the corresponding code.  However, it is better if the
added tests serve their actual purpose.

Reported-by: Paul A Jungwirth
Author: Paul A Jungwirth
Reviewed-by: Mahendra Singh
Backpatch-through: 9.4
Discussion: https://postgr.es/m/CA+renyV=Us-5XfMC25bNp-uWSj39XgHHmGE9Rh2cQKMegSj52g@mail.gmail.com
2019-09-27 08:44:28 +05:30
35eb132700 Fix failure to zero-pad the result of bitshiftright().
If the bitstring length is not a multiple of 8, we'd shift the
rightmost bits into the pad space, which must be zeroes --- bit_cmp,
for one, depends on that.  This'd lead to the result failing to
compare equal to what it should compare equal to, as reported in
bug #16013 from Daryl Waycott.

This is, if memory serves, not the first such bug in the bitstring
functions.  In hopes of making it the last one, do a bit more work
than minimally necessary to fix the bug:

* Add assertion checks to bit_out() and varbit_out() to complain if
they are given incorrectly-padded input.  This will improve the
odds that manual testing of any new patch finds problems.

* Encapsulate the padding-related logic in macros to make it
easier to use.

Also, remove unnecessary padding logic from bit_or() and bitxor().
Somebody had already noted that we need not re-pad the result of
bit_and() since the inputs are required to be the same length,
but failed to extrapolate that to the other two.

Also, move a comment block that once was near the head of varbit.c
(but people kept putting other stuff in front of it), to put it in
the header block.

Note for the release notes: if anyone has inconsistent data as a
result of saving the output of bitshiftright() in a table, it's
possible to fix it with something like
UPDATE mytab SET bitcol = ~(~bitcol) WHERE bitcol != ~(~bitcol);

This has been broken since day one, so back-patch to all supported
branches.

Discussion: https://postgr.es/m/16013-c2765b6996aacae9@postgresql.org
2019-09-22 17:46:00 -04:00
128abdef78 Update time zone data files to tzdata release 2019c.
DST law changes in Fiji and Norfolk Island.  Historical corrections
for Alberta, Austria, Belgium, British Columbia, Cambodia, Hong Kong,
Indiana (Perry County), Kaliningrad, Kentucky, Michigan, Norfolk
Island, South Korea, and Turkey.
2019-09-20 19:54:29 -04:00
388939748a Fix oversight in backpatch of 6cae9d2c10
During backpatch of 6cae9d2c10 Float8GetDatum() was accidentally removed.  This
commit turns it back.

Reported-by: Erik Rijkers
Discussion: https://postgr.es/m/6d51305e1159241cabee132f7efc7eff%40xs4all.nl
Author: Tom Lane
Backpatch-through: from 11 to 9.5
2019-09-19 23:39:35 +03:00
ad458d0cd9 Improve handling of NULLs in KNN-GiST and KNN-SP-GiST
This commit improves subject in two ways:

 * It removes ugliness of 02f90879e7, which stores distance values and null
   flags in two separate arrays after GISTSearchItem struct.  Instead we pack
   both distance value and null flag in IndexOrderByDistance struct.  Alignment
   overhead should be negligible, because we typically deal with at most few
   "col op const" expressions in ORDER BY clause.
 * It fixes handling of "col op NULL" expression in KNN-SP-GiST.  Now, these
   expression are not passed to support functions, which can't deal with them.
   Instead, NULL result is implicitly assumed.  It future we may decide to
   teach support functions to deal with NULL arguments, but current solution is
   bugfix suitable for backpatch.

Reported-by: Nikita Glukhov
Discussion: https://postgr.es/m/826f57ee-afc7-8977-c44c-6111d18b02ec%40postgrespro.ru
Author: Nikita Glukhov
Reviewed-by: Alexander Korotkov
Backpatch-through: 9.4
2019-09-19 22:09:51 +03:00
50261e1520 pg_upgrade/test.sh: Quote sed(1) argument
Lack of quotes results in failure to run the test under older Solaris.

Author: Marina Polyakova, Victor Wagner
Discussion: https://postgr.es/m/feba89f89e8925b3535cb7d72b9e05e1@postgrespro.ru
2019-09-18 11:25:32 -03:00
75941f257a Replace xlc __fetch_and_add() with inline asm.
PostgreSQL has been unusable when built with xlc 13 and newer, which are
incompatible with our use of __fetch_and_add().  Back-patch to 9.5,
which introduced pg_atomic_fetch_add_u32().

Reviewed by Tom Lane.

Discussion: https://postgr.es/m/20190831071157.GA3251746@rfd.leadboat.com
2019-09-13 19:34:19 -07:00
4737d3a75b Test pg_atomic_fetch_add_ with variable addend and 16-bit edge cases.
Back-patch to 9.5, which introduced these functions.

Reviewed by Tom Lane.

Discussion: https://postgr.es/m/20190831071157.GA3251746@rfd.leadboat.com
2019-09-13 19:33:47 -07:00
7110f5c377 logical decoding: process ASSIGNMENT during snapshot build
Most WAL records are ignored in early SnapBuild snapshot build phases.
But it's critical to process some of them, so that later messages have
the correct transaction state after the snapshot is completely built; in
particular, XLOG_XACT_ASSIGNMENT messages are critical in order for
sub-transactions to be correctly assigned to their parent transactions,
or at least one assert misbehaves, as reported by Ildar Musin.

Diagnosed-by: Masahiko Sawada
Author: Masahiko Sawada
Discussion: https://postgr.es/m/CAONYFtOv+Er1p3WAuwUsy1zsCFrSYvpHLhapC_fMD-zNaRWxYg@mail.gmail.com
2019-09-13 16:36:28 -03:00
83c2c97114 Fix under-parenthesized macro definitions
Lack of parens in the definitions could cause a statement using these
macros to have unexpected semantics.  In current code no bug is
apparent, but best to fix the definitions to avoid problems down the
line.

Reported-by: Tom Lane
Discussion: https://postgr.es/m/19795.1568400476@sss.pgh.pa.us
2019-09-13 16:26:55 -03:00
9064cc139d Fix nbtree page split rmgr desc routine.
Include newitemoff in rmgr desc output for nbtree page split records.
In passing, correct an obsolete comment that claimed that newitemoff is
only logged for _L variant nbtree page split WAL records.

Both issues were oversights in commit 2c03216d831, which revamped the
WAL format.

Author: Peter Geoghegan
Backpatch: 9.5-, where the WAL format was revamped.
2019-09-12 15:44:59 -07:00
aee5736f15 Fix usage of whole-row variables in WCO and RLS policy expressions.
Since WITH CHECK OPTION was introduced, ExecInitModifyTable has
initialized WCO expressions with the wrong plan node as parent -- that is,
it passed its input subplan not the ModifyTable node itself.  Up to now
we thought this was harmless, but bug #16006 from Vinay Banakar shows it's
not: if the input node is a SubqueryScan then ExecInitWholeRowVar can get
confused into doing the wrong thing.  (The fact that ExecInitWholeRowVar
contains such logic is certainly a horrid kluge that doesn't deserve to
live, but figuring out another way to do that is a task for some other day.)

Andres had already noticed the wrong-parent mistake and fixed it in commit
148e632c0, but not being aware of any user-visible consequences, he quite
reasonably didn't back-patch.  This patch is simply a back-patch of
148e632c0, plus addition of a test case based on bug #16006.  I also added
the test case to v12/HEAD, even though the bug is already fixed there.

Back-patch to all supported branches.  9.4 lacks RLS policies so the
new test case doesn't work there, but I'm pretty sure a test could be
devised based on using a whole-row Var in a plain WITH CHECK OPTION
condition.  (I lack the cycles to do so myself, though.)

Andres Freund and Tom Lane

Discussion: https://postgr.es/m/16006-99290d2e4642cbd5@postgresql.org
Discussion: https://postgr.es/m/20181205225213.hiwa3kgoxeybqcqv@alap3.anarazel.de
2019-09-12 18:29:18 -04:00
bfd6db8560 Expand properly list of TAP tests used for prove in vcregress.pl
Depending on the system used, t/*.pl may not be expanded into a list of
tests which can be consumed by prove when attempting to run TAP tests on
a given path.  Fix that by using glob() directly in the script, to make
sure that a complete list of tests is provided.  This has not proved to
be an issue with MSVC as the list was properly expanded, but it is on
Linux with perl's system().

This is extracted from a larger patch.

Author: Tom Lane
Discussion: https://postgr.es/m/6628.1567958876@sss.pgh.pa.us
Backpatch-through: 9.4
2019-09-11 11:07:44 +09:00
decff80ec8 Prevent msys2 conversion of "cmd /c" switch to a file path
Modern versions of msys2 have changed the treatment of "cmd /c" so that
the runtime will try to convert the switch to a native file path. This
patch adds a setting to inhibit that behaviour.

Discussion: https://postgr.es/m/3227042f-cfcc-745a-57dd-fb8c471f8ddf@2ndQuadrant.com

Backpatch to all live branches.
2019-09-09 09:03:32 -04:00
87ee1587c9 Fix RelationIdGetRelation calls that weren't bothering with error checks.
Some of these are quite old, but that doesn't make them not bugs.
We'd rather report a failure via elog than SIGSEGV.

While at it, uniformly spell the error check as !RelationIsValid(rel)
rather than a bare rel == NULL test.  The machine code is the same
but it seems better to be consistent.

Coverity complained about this today, not sure why, because the
mistake is in fact old.
2019-09-08 17:00:57 -04:00
3c155bafa5 Fix handling of NULL distances in KNN-GiST
In order to implement NULL LAST semantic GiST previously assumed distance to
the NULL value to be Inf.  However, our distance functions can return Inf and
NaN for non-null values.  In such cases, NULL LAST semantic appears to be
broken.  This commit fixes that by introducing separate array of null flags for
distances.

Backpatch to all supported versions.

Discussion: https://postgr.es/m/CAPpHfdsNvNdA0DBS%2BwMpFrgwT6C3-q50sFVGLSiuWnV3FqOJuQ%40mail.gmail.com
Author: Alexander Korotkov
Backpatch-through: 9.4
2019-09-08 21:49:16 +03:00
986319d467 Fix handling Inf and Nan values in GiST pairing heap comparator
Previously plain float comparison was used in GiST pairing heap.  Such
comparison doesn't provide proper ordering for value sets containing Inf and Nan
values.  This commit fixes that by usage of float8_cmp_internal().  Note, there
is remaining problem with NULL distances, which are represented as Inf in
pairing heap.  It would be fixes in subsequent commit.

Backpatch to all supported versions.

Reported-by: Andrey Borodin
Discussion: https://postgr.es/m/CAPpHfdsNvNdA0DBS%2BwMpFrgwT6C3-q50sFVGLSiuWnV3FqOJuQ%40mail.gmail.com
Author: Alexander Korotkov
Reviewed-by: Heikki Linnakangas
Backpatch-through: 9.4
2019-09-08 21:49:15 +03:00
f697c6396d When performing a base backup, check for read errors.
The old code didn't differentiate between a read error and a
concurrent truncation. fread reports both of these by returning 0;
you have to use feof() or ferror() to distinguish between them,
which this code did not do.

It might be a better idea to use read() rather than fread() here,
so that we can display a less-generic error message, but I'm not
sure that would qualify as a back-patchable bug fix, so just do
this much for now.

Jeevan Chalke, reviewed by Jeevan Ladhe and by me.

Discussion: http://postgr.es/m/CA+TgmobG4ywMzL5oQq2a8YKp8x2p3p1LOMMcGqpS7aekT9+ETA@mail.gmail.com
2019-09-06 09:05:12 -04:00
62724bd952 Handle corner cases correctly in psql's reconnection logic.
After an unexpected connection loss and successful reconnection,
psql neglected to resynchronize its internal state about the server,
such as server version.  Ordinarily we'd be reconnecting to the same
server and so this isn't really necessary, but there are scenarios
where we do need to update --- one example is where we have a list
of possible connection targets and they're not all alike.

Define "resynchronize" as including connection_warnings(), so that
this case acts the same as \connect.  This seems useful; for example,
if the server version did change, the user might wish to know that.
An attuned user might also notice that the new connection isn't
SSL-encrypted, for example, though this approach isn't especially
in-your-face about such changes.  Although this part is a behavioral
change, it only affects interactive sessions, so it should not break
any applications.

Also, in do_connect, make sure that we desynchronize correctly when
abandoning an old connection in non-interactive mode.

These problems evidently are the result of people patching only one
of the two places where psql deals with connection changes, so insert
some cross-referencing comments in hopes of forestalling future bugs
of the same ilk.

Lastly, in Windows builds, issue codepage mismatch warnings only at
startup, not during reconnections.  psql's codepage can't change
during a reconnect, so complaining about it again seems like useless
noise.

Peter Billen and Tom Lane.  Back-patch to all supported branches.

Discussion: https://postgr.es/m/CAMTXbE8e6U=EBQfNSe01Ej17CBStGiudMAGSOPaw-ALxM-5jXg@mail.gmail.com
2019-09-02 14:02:46 -04:00
cf00ca5228 Fix overflow check and comment in GIN posting list encoding.
The comment did not match what the code actually did for integers with
the 43rd bit set. You get an integer like that, if you have a posting
list with two adjacent TIDs that are more than 2^31 blocks apart.
According to the comment, we would store that in 6 bytes, with no
continuation bit on the 6th byte, but in reality, the code encodes it
using 7 bytes, with a continuation bit on the 6th byte as normal.

The decoding routine also handled these 7-byte integers correctly, except
for an overflow check that assumed that one integer needs at most 6 bytes.
Fix the overflow check, and fix the comment to match what the code
actually does. Also fix the comment that claimed that there are 17 unused
bits in the 64-bit representation of an item pointer. In reality, there
are 64-32-11=21.

Fitting any item pointer into max 6 bytes was an important property when
this was written, because in the old pre-9.4 format, item pointers were
stored as plain arrays, with 6 bytes for every item pointer. The maximum
of 6 bytes per integer in the new format guaranteed that we could convert
any page from the old format to the new format after upgrade, so that the
new format was never larger than the old format. But we hardly need to
worry about that anymore, and running into that problem during upgrade,
where an item pointer is expanded from 6 to 7 bytes such that the data
doesn't fit on a page anymore, is implausible in practice anyway.

Backpatch to all supported versions.

This also includes a little test module to test these large distances
between item pointers, without requiring a 16 TB table. It is not
backpatched, I'm including it more for the benefit of future development
of new posting list formats.

Discussion: https://www.postgresql.org/message-id/33bfc20a-5c86-f50c-f5a5-58e9925d05ff%40iki.fi
Reviewed-by: Masahiko Sawada, Alexander Korotkov
2019-08-28 12:58:10 +03:00
e9dcbc9c3f Disable timeouts when running pg_rewind with online source cluster
In this case, the transfer uses a libpq connection, which is subject to
the timeout parameters set at system level, and this can make the rewind
operation suddenly canceled which is not good for automation.  One
workaround to such issues would be to use PGOPTIONS to enforce the
wanted timeout parameters, but that's annoying, and for example pg_dump,
which can run potentially long-running queries disables all types of
timeouts.

lock_timeout and statement_timeout are the ones which can cause problems
now.  Note that pg_rewind does not use transactions, so disabling
idle_in_transaction_session_timeout is optional, but it feels safer to
do so for the future.

This is back-patched down to 9.5.  idle_in_transaction_session_timeout
is only present since 9.6.

Author: Alexander Kukushkin
Discussion: https://postgr.es/m/CAFh8B=krcVXksxiwVQh1SoY+ziJ-JC=6FcuoBL3yce_40Es5_g@mail.gmail.com
Backpatch-through: 9.5
2019-08-28 11:48:33 +09:00
ef47d284db Reject empty names and recursion in config-file include directives.
An empty file name or subdirectory name leads join_path_components() to
just produce the parent directory name, which leads to weird failures or
recursive inclusions.  Let's throw a specific error for that.  It takes
only slightly more code to detect all-blank names, so do so.

Also, detect direct recursion, ie a file calling itself.  As coded
this will also detect recursion via "include_dir '.'", which is
perhaps more likely than explicitly including the file itself.

Detecting indirect recursion would require API changes for guc-file.l
functions, which seems not worth it since extensions might call them.
The nesting depth limit will catch such cases eventually, just not
with such an on-point error message.

In passing, adjust the example usages in postgresql.conf.sample
to perhaps eliminate the problem at the source: there's no reason
for the examples to suggest that an empty value is valid.

Per a trouble report from Brent Bates.  Back-patch to 9.5; the
issue is old, but the code in 9.4 is enough different that the
patch doesn't apply easily, and it doesn't seem worth the trouble
to fix there.

Ian Barwick and Tom Lane

Discussion: https://postgr.es/m/8c8bcbca-3bd9-dc6e-8986-04a5abdef142@2ndquadrant.com
2019-08-27 14:44:26 -04:00
4ed3bda49d Fix failure of --jobs with vacuumdb on Windows
FD_SETSIZE needs to be declared before winsock2.h, or it is possible to
run into buffer overflow issues when using --jobs.  This is similar to
pgbench's solution done in a23c641.

This has been introduced by 71d84ef, and older versions have been using
the default value of FD_SETSIZE, defined at 64.  While on it, add a
missing newline to the previously-added error message.

Per buildfarm member jacana, but this impacts all Windows animals
running the TAP tests.  I have reproduced the failure locally to check
the patch.

Author: Michael Paquier
Reviewed-by: Andrew Dunstan
Discussion: https://postgr.es/m/20190826054000.GE7005@paquier.xyz
Backpatch-through: 9.5
2019-08-27 09:12:14 +09:00
33066b0142 Treat MINGW and MSYS the same in pg_upgrade test script
On msys2, 'uname -s' reports a string starting MSYS instead on MINGW
as happens on msys1. Treat these both the same way. This reverts
608a710195a4b in favor of a more general solution.

Backpatch to all live branches.
2019-08-26 07:47:41 -04:00
a21ec1a959 Fix error handling of vacuumdb when running out of fds
When trying to use a high number of jobs, vacuumdb has only checked for
a maximum number of jobs used, causing confusing failures when running
out of file descriptors when the jobs open connections to Postgres.
This commit changes the error handling so as we do not check anymore for
a maximum number of allowed jobs when parsing the option value with
FD_SETSIZE, but check instead if a file descriptor is within the
supported range when opening the connections for the jobs so as this is
detected at the earliest time possible.

Also, improve the error message to give a hint about the number of jobs
recommended, using a wording given by the reviewers of the patch.

Reported-by: Andres Freund
Author: Michael Paquier
Reviewed-by: Andres Freund, Álvaro Herrera, Tom Lane
Discussion: https://postgr.es/m/20190818001858.ho3ev4z57fqhs7a5@alap3.anarazel.de
Backpatch-through: 9.5
2019-08-26 11:14:44 +09:00
65b1cad5a0 Avoid platform-specific null pointer dereference in psql.
POSIX permits getopt() to advance optind beyond argc when the last
argv entry is an option that requires an argument and hasn't got one.
It seems that no major platforms actually do that, but musl does,
so that something like "psql -f" would crash with that libc.
Add a check that optind is in range before trying to look at the
possibly-bogus option.

Report and fix by Quentin Rameau.  Back-patch to all supported
branches.

Discussion: https://postgr.es/m/20190825100617.GA6087@fifth.space
2019-08-25 15:04:04 -04:00
aad9bf336c Fix typo
In early development patches, "replication origins" were called "identifiers";
almost everything was renamed, but these references to the old terminology
went unnoticed.

Reported-by: Craig Ringer
2019-08-21 11:12:44 -04:00
5a32fcdd7c Fix bogus comment
Author: Alexander Lakhin
Discussion: https://postgr.es/m/20190819072244.GE18166@paquier.xyz
2019-08-20 16:04:09 -04:00
c511d367aa Disallow changing an inherited column's type if not all parents changed.
If a table inherits from multiple unrelated parents, we must disallow
changing the type of a column inherited from multiple such parents, else
it would be out of step with the other parents.  However, it's possible
for the column to ultimately be inherited from just one common ancestor,
in which case a change starting from that ancestor should still be
allowed.  (I would not be excited about preserving that option, were
it not that we have regression test cases exercising it already ...)

It's slightly annoying that this patch looks different from the logic
with the same end goal in renameatt(), and more annoying that it
requires an extra syscache lookup to make the test.  However, the
recursion logic is quite different in the two functions, and a
back-patched bug fix is no place to be trying to unify them.

Per report from Manuel Rigger.  Back-patch to 9.5.  The bug exists in
9.4 too (and doubtless much further back); but the way the recursion
is done in 9.4 is a good bit different, so that substantial refactoring
would be needed to fix it in 9.4.  I'm disinclined to do that, or risk
introducing new bugs, for a bug that has escaped notice for this long.

Discussion: https://postgr.es/m/CA+u7OA4qogDv9rz1HAb-ADxttXYPqQdUdPY_yd4kCzywNxRQXA@mail.gmail.com
2019-08-18 17:11:58 -04:00
cb0c79ae62 Prevent possible double-free when update trigger returns old tuple.
This is a variant of the problem fixed in commit 25b692568, which
unfortunately we failed to detect at the time.  If an update trigger
returns the "old" tuple, as it's entitled to do, then a subsequent
iteration of the loop in ExecBRUpdateTriggers would have "oldtuple"
equal to "trigtuple" and would fail to notice that it shouldn't
free that.

In addition to fixing the code, extend the test case added by
25b692568 so that it covers multiple-trigger-iterations cases.

This problem does not manifest in v12/HEAD, as a result of the
relevant code having been largely rewritten for slotification.
However, include the test case into v12/HEAD anyway, since this
is clearly an area that someone could break again in future.

Per report from Piotr Gabriel Kosinski.  Back-patch into all
supported branches, since the bug seems quite old.

Diagnosis and code fix by Thomas Munro, test case by me.

Discussion: https://postgr.es/m/CAFMLSdP0rd7LqC3j-H6Fh51FYSt5A10DDh-3=W4PPc4LLUQ8YQ@mail.gmail.com
2019-08-15 20:04:19 -04:00