Commit Graph

6867 Commits

Author SHA1 Message Date
d73d14c271 Fix incorrect order of lock file removal and failure to close() sockets.
Commit c9b0cbe98bd783e24a8c4d8d8ac472a494b81292 accidentally broke the
order of operations during postmaster shutdown: it resulted in removing
the per-socket lockfiles after, not before, postmaster.pid.  This creates
a race-condition hazard for a new postmaster that's started immediately
after observing that postmaster.pid has disappeared; if it sees the
socket lockfile still present, it will quite properly refuse to start.
This error appears to be the explanation for at least some of the
intermittent buildfarm failures we've seen in the pg_upgrade test.

Another problem, which has been there all along, is that the postmaster
has never bothered to close() its listen sockets, but has just allowed them
to close at process death.  This creates a different race condition for an
incoming postmaster: it might be unable to bind to the desired listen
address because the old postmaster is still incumbent.  This might explain
some odd failures we've seen in the past, too.  (Note: this is not related
to the fact that individual backends don't close their client communication
sockets.  That behavior is intentional and is not changed by this patch.)

Fix by adding an on_proc_exit function that closes the postmaster's ports
explicitly, and (in 9.3 and up) reshuffling the responsibility for where
to unlink the Unix socket files.  Lock file unlinking can stay where it
is, but teach it to unlink the lock files in reverse order of creation.
2015-08-02 14:55:03 -04:00
7039760114 Fix issues around the "variable" support in the lwlock infrastructure.
The lwlock scalability work introduced two race conditions into the
lwlock variable support provided for xlog.c. First, and harmlessly on
most platforms, it set/read the variable without the spinlock in some
places. Secondly, due to the removal of the spinlock, it was possible
that a backend missed changes to the variable's state if it changed in
the wrong moment because checking the lock's state, the variable's state
and the queuing are not protected by a single spinlock acquisition
anymore.

To fix first move resetting the variable's from LWLockAcquireWithVar to
WALInsertLockRelease, via a new function LWLockReleaseClearVar. That
prevents issues around waiting for a variable's value to change when a
new locker has acquired the lock, but not yet set the value. Secondly
re-check that the variable hasn't changed after enqueing, that prevents
the issue that the lock has been released and already re-acquired by the
time the woken up backend checks for the lock's state.

Reported-By: Jeff Janes
Analyzed-By: Heikki Linnakangas
Reviewed-By: Heikki Linnakangas
Discussion: 5592DB35.2060401@iki.fi
Backpatch: 9.5, where the lwlock scalability went in
2015-08-02 18:41:23 +02:00
e8e86fbc8b Fix volatility marking of commit timestamp functions
They are marked stable, but since they act on instantaneous state and it
is possible to consult state of transactions as they commit, the results
could change mid-query.  They need to be marked volatile, and this
commit does so.

There would normally be a catversion bump here, but this is so much a
niche feature and I don't believe there's real damage from the incorrect
marking, that I refrained.

Backpatch to 9.5, where commit timestamps where introduced.

Per note from Fujii Masao.
2015-07-30 15:19:49 -03:00
632cd9f892 Create new ParseExprKind for use by policy expressions.
Policy USING and WITH CHECK expressions were using EXPR_KIND_WHERE for
parse analysis, which results in inappropriate ERROR messages when
the expression contains unsupported constructs such as aggregates.
Create a new ParseExprKind called EXPR_KIND_POLICY and tailor the
related messages to fit.

Reported by Noah Misch. Reviewed by Dean Rasheed, Alvaro Herrera,
and Robert Haas. Back-patch to 9.5 where RLS was introduced.
2015-07-29 15:40:24 -07:00
d824e2800f Disallow converting a table to a view if row security is present.
When DefineQueryRewrite() is about to convert a table to a view, it checks
the table for features unavailable to views.  For example, it rejects tables
having triggers.  It omits to reject tables having relrowsecurity or a
pg_policy record. Fix that. To faciliate the repair, invent
relation_has_policies() which indicates the presence of policies on a
relation even when row security is disabled for that relation.

Reported by Noah Misch. Patch by me, review by Stephen Frost. Back-patch
to 9.5 where RLS was introduced.
2015-07-28 16:24:01 -07:00
f781a0f1d8 Create a pg_shdepend entry for each role in TO clause of policies.
CreatePolicy() and AlterPolicy() omit to create a pg_shdepend entry for
each role in the TO clause. Fix this by creating a new shared dependency
type called SHARED_DEPENDENCY_POLICY and assigning it to each role.

Reported by Noah Misch. Patch by me, reviewed by Alvaro Herrera.
Back-patch to 9.5 where RLS was introduced.
2015-07-28 16:01:53 -07:00
1e2bd43b31 Bump catversion so that HEAD is beyond 9.5
As pointed out by Tom, since HEAD has progressed beyond 9.5 in terms of
its catalog, we need to be sure catversion of HEAD is advanced beyond
that of 9.5. Corrects my mistake in the pg_stats view commit cfa928ff.
2015-07-28 13:59:23 -07:00
7b4bfc87d5 Plug RLS related information leak in pg_stats view.
The pg_stats view is supposed to be restricted to only show rows
about tables the user can read. However, it sometimes can leak
information which could not otherwise be seen when row level security
is enabled. Fix that by not showing pg_stats rows to users that would
be subject to RLS on the table the row is related to. This is done
by creating/using the newly introduced SQL visible function,
row_security_active().

Along the way, clean up three call sites of check_enable_rls(). The second
argument of that function should only be specified as other than
InvalidOid when we are checking as a different user than the current one,
as in when querying through a view. These sites were passing GetUserId()
instead of InvalidOid, which can cause the function to return incorrect
results if the current user has the BYPASSRLS privilege and row_security
has been set to OFF.

Additionally fix a bug causing RI Trigger error messages to unintentionally
leak information when RLS is enabled, and other minor cleanup and
improvements. Also add WITH (security_barrier) to the definition of pg_stats.

Bumped CATVERSION due to new SQL functions and pg_stats view definition.

Back-patch to 9.5 where RLS was introduced. Reported by Yaroslav.
Patch by Joe Conway and Dean Rasheed with review and input by
Michael Paquier and Stephen Frost.
2015-07-28 13:21:22 -07:00
426746b930 Remove ssl renegotiation support.
While postgres' use of SSL renegotiation is a good idea in theory, it
turned out to not work well in practice. The specification and openssl's
implementation of it have lead to several security issues. Postgres' use
of renegotiation also had its share of bugs.

Additionally OpenSSL has a bunch of bugs around renegotiation, reported
and open for years, that regularly lead to connections breaking with
obscure error messages. We tried increasingly complex workarounds to get
around these bugs, but we didn't find anything complete.

Since these connection breakages often lead to hard to debug problems,
e.g. spuriously failing base backups and significant latency spikes when
synchronous replication is used, we have decided to change the default
setting for ssl renegotiation to 0 (disabled) in the released
backbranches and remove it entirely in 9.5 and master.

Author: Andres Freund
Discussion: 20150624144148.GQ4797@alap3.anarazel.de
Backpatch: 9.5 and master, 9.0-9.4 get a different patch
2015-07-28 22:06:31 +02:00
6f2871f12e Centralize decision-making about where to get a backend's PGPROC.
This code was originally written as part of parallel query effort, but
it seems to have independent value, because if we make one decision
about where to get a PGPROC when we allocate and then put it back on a
different list at backend-exit time, bad things happen.  This isn't
just a theoretical risk; we fixed an actual problem of this type in
commit e280c630a87e1b8325770c6073097d109d79a00f.
2015-07-28 14:51:57 -04:00
5533a272dd Don't assume that 'char' is signed.
On some platforms, notably ARM and PowerPC, 'char' is unsigned by
default. This fixes an assertion failure at WAL replay on such platforms.

Reported by Noah Misch. Backpatch to 9.5, where this was broken.
2015-07-27 21:51:25 +03:00
023430abf7 Fix handling of all-zero pages in SP-GiST vacuum.
SP-GiST initialized an all-zeros page at vacuum, but that was not
WAL-logged, which is not safe. You might get a torn page write, when it gets
flushed to disk, and end-up with a half-initialized index page. To fix,
leave it in the all-zeros state, and add it to the FSM. It will be
initialized when reused. Also don't set the page-deleted flag when recycling
an empty page. That was also not WAL-logged, and a torn write of that would
cause the page to have an invalid checksum.

Backpatch to 9.2, where SP-GiST indexes were added.
2015-07-27 12:28:21 +03:00
dd7a8f66ed Redesign tablesample method API, and do extensive code review.
The original implementation of TABLESAMPLE modeled the tablesample method
API on index access methods, which wasn't a good choice because, without
specialized DDL commands, there's no way to build an extension that can
implement a TSM.  (Raw inserts into system catalogs are not an acceptable
thing to do, because we can't undo them during DROP EXTENSION, nor will
pg_upgrade behave sanely.)  Instead adopt an API more like procedural
language handlers or foreign data wrappers, wherein the only SQL-level
support object needed is a single handler function identified by having
a special return type.  This lets us get rid of the supporting catalog
altogether, so that no custom DDL support is needed for the feature.

Adjust the API so that it can support non-constant tablesample arguments
(the original coding assumed we could evaluate the argument expressions at
ExecInitSampleScan time, which is undesirable even if it weren't outright
unsafe), and discourage sampling methods from looking at invisible tuples.
Make sure that the BERNOULLI and SYSTEM methods are genuinely repeatable
within and across queries, as required by the SQL standard, and deal more
honestly with methods that can't support that requirement.

Make a full code-review pass over the tablesample additions, and fix
assorted bugs, omissions, infelicities, and cosmetic issues (such as
failure to put the added code stanzas in a consistent ordering).
Improve EXPLAIN's output of tablesample plans, too.

Back-patch to 9.5 so that we don't have to support the original API
in production.
2015-07-25 14:39:00 -04:00
c1ca3a19df Fix bug around assignment expressions containing indirections.
Handling of assigned-to expressions with indirection (e.g. set f1[1] =
3) was broken for ON CONFLICT DO UPDATE.  The problem was that
ParseState was consulted to determine if an INSERT-appropriate or
UPDATE-appropriate behavior should be used when transforming expressions
with indirections. When the wrong path was taken the old row was
substituted with NULL, leading to wrong results..

To fix remove p_is_update and only use p_is_insert to decide how to
transform the assignment expression, and uset p_is_insert while parsing
the on conflict statement. This isn't particularly pretty, but it's not
any worse than before.

Author: Peter Geoghegan, slightly edited by me
Discussion: CAM3SWZS8RPvA=KFxADZWw3wAHnnbxMxDzkEC6fNaFc7zSm411w@mail.gmail.com
Backpatch: 9.5, where the feature was introduced
2015-07-24 11:52:07 +02:00
434873806a Fix some oversights in BRIN patch.
Remove HeapScanDescData.rs_initblock, which wasn't being used for anything
in the final version of the patch.

Fix IndexBuildHeapScan so that it supports syncscan again; the patch
broke synchronous scanning for index builds by forcing rs_startblk
to zero even when the caller did not care about that and had asked
for syncscan.

Add some commentary and usage defenses to heap_setscanlimits().

Fix heapam so that asking for rs_numblocks == 0 does what you would
reasonably expect.  As coded it amounted to requesting a whole-table
scan, because those "--x <= 0" tests on an unsigned variable would
behave surprisingly.
2015-07-21 13:38:24 -04:00
149b1dd840 Fix omission of OCLASS_TRANSFORM in object_classes[]
This was forgotten in cac76582053e (and its fixup ad89a5d115).  Since it
seems way too easy to miss this, this commit also introduces a mechanism
to enforce that the array is consistent with the enum.

Problem reported independently by Robert Haas and Jaimin Pan.
Patches proposed by Jaimin Pan, Jim Nasby, Michael Paquier and myself,
though I didn't use any of these and instead went with a cleaner
approach suggested by Tom Lane.

Backpatch to 9.5.

Discussion:
https://www.postgresql.org/message-id/CA+Tgmoa6SgDaxW_n_7SEhwBAc=mniYga+obUj5fmw4rU9_mLvA@mail.gmail.com
https://www.postgresql.org/message-id/29788.1437411581@sss.pgh.pa.us
2015-07-21 13:20:53 +02:00
13f2db2ffb Handle AT_ReAddComment in test_ddl_deparse, and add a catch-all default.
In the passing, also move AT_ReAddComment to more logical position in the
enum, after all the Constraint-related subcommands.

This fixes a compiler warning, added by commit e42375fc. Backpatch to 9.5,
like that patch.
2015-07-20 10:25:26 +03:00
e02d44b8a7 Support JSON negative array subscripts everywhere
Previously, there was an inconsistency across json/jsonb operators that
operate on datums containing JSON arrays -- only some operators
supported negative array count-from-the-end subscripting.  Specifically,
only a new-to-9.5 jsonb deletion operator had support (the new "jsonb -
integer" operator).  This inconsistency seemed likely to be
counter-intuitive to users.  To fix, allow all places where the user can
supply an integer subscript to accept a negative subscript value,
including path-orientated operators and functions, as well as other
extraction operators.  This will need to be called out as an
incompatibility in the 9.5 release notes, since it's possible that users
are relying on certain established extraction operators changed here
yielding NULL in the event of a negative subscript.

For the json type, this requires adding a way of cheaply getting the
total JSON array element count ahead of time when parsing arrays with a
negative subscript involved, necessitating an ad-hoc lex and parse.
This is followed by a "conversion" from a negative subscript to its
equivalent positive-wise value using the count.  From there on, it's as
if a positive-wise value was originally provided.

Note that there is still a minor inconsistency here across jsonb
deletion operators.  Unlike the aforementioned new "-" deletion operator
that accepts an integer on its right hand side, the new "#-" path
orientated deletion variant does not throw an error when it appears like
an array subscript (input that could be recognized by as an integer
literal) is being used on an object, which is wrong-headed.  The reason
for not being stricter is that it could be the case that an object pair
happens to have a key value that looks like an integer; in general,
these two possibilities are impossible to differentiate with rhs path
text[] argument elements.  However, we still don't allow the "#-"
path-orientated deletion operator to perform array-style subscripting.
Rather, we just return the original left operand value in the event of a
negative subscript (which seems analogous to how the established
"jsonb/json #> text[]" path-orientated operator may yield NULL in the
event of an invalid subscript).

In passing, make SetArrayPath() stricter about not accepting cases where
there is trailing non-numeric garbage bytes rather than a clean NUL
byte.  This means, for example, that strings like "10e10" are now not
accepted as an array subscript of 10 by some new-to-9.5 path-orientated
jsonb operators (e.g. the new #- operator).  Finally, remove dead code
for jsonb subscript deletion; arguably, this should have been done in
commit b81c7b409.

Peter Geoghegan and Andrew Dunstan
2015-07-17 21:13:47 -04:00
a04bb65f70 Add new function pg_notification_queue_usage.
This tells you what fraction of NOTIFY's queue is currently filled.

Brendan Jurd, reviewed by Merlin Moncure and Gurjeet Singh.  A few
further tweaks by me.
2015-07-17 09:12:03 -04:00
321eed5f0f Add ALTER OPERATOR command, for changing selectivity estimator functions.
Other options cannot be changed, as it's not totally clear if cached plans
would need to be invalidated if one of the other options change. Selectivity
estimator functions only change plan costs, not correctness of plans, so
those should be safe.

Original patch by Uriy Zhuravlev, heavily edited by me.
2015-07-14 18:17:55 +03:00
e42375fc81 Retain comments on indexes and constraints at ALTER TABLE ... TYPE ...
When a column's datatype is changed, ATExecAlterColumnType() rebuilds all
the affected indexes and constraints, and the comments from the old
indexes/constraints were not carried over.

To fix, create a synthetic COMMENT ON command in the work queue, to re-add
any comments on constraints. For indexes, there's a comment field in
IndexStmt that is used.

This fixes bug #13126, reported by Kirill Simonov. Original patch by
Michael Paquier, reviewed by Petr Jelinek and me. This bug is present in
all versions, but only backpatch to 9.5. Given how minor the issue is, it
doesn't seem worth the work and risk to backpatch further than that.
2015-07-14 11:40:22 +03:00
6ba365aa46 Fix obsolete comment regarding NOTICE message level.
By default NOTICE message is not sent to server log because
the default value of log_min_messages is WARNING since 8.4.

Pavel Stehule
2015-07-09 22:52:36 +09:00
1e700e0fa0 Given a gcc-compatible xlc compiler, prefer xlc-style atomics.
This evades a ppc64le "IBM XL C/C++ for Linux" compiler bug.  Back-patch
to 9.5, where the atomics facility was introduced.
2015-07-08 20:44:21 -04:00
0d32d2e693 Finish generic-xlc.h draft atomics implementation.
Back-patch to 9.5, where commit b64d92f1a5602c55ee8b27a7ac474f03b7aee340
introduced this file.
2015-07-08 20:44:21 -04:00
be8b06c364 Revoke support for strxfrm() that write past the specified array length.
This formalizes a decision implicit in commit
4ea51cdfe85ceef8afabceb03c446574daa0ac23 and adds clean detection of
affected systems.  Vendor updates are available for each such known bug.
Back-patch to 9.5, where the aforementioned commit first appeared.
2015-07-08 20:44:21 -04:00
10fb48d66d Add an optional missing_ok argument to SQL function current_setting().
This allows convenient checking for existence of a GUC from SQL, which is
particularly useful when dealing with custom variables.

David Christensen, reviewed by Jeevan Chalke
2015-07-02 16:41:07 -04:00
7261172430 Remove obsolete heap_formtuple/modifytuple/deformtuple functions.
These variants used the old-style 'n'/' ' NULL indicators. The new-style
functions have been available since version 8.1. That should be long enough
that if there is still any old external code using these functions, they
can just switch to the new functions without worrying about backwards
compatibility

Peter Geoghegan
2015-07-02 21:21:23 +03:00
7931622d1d Fix name of argument to pg_stat_file.
It's called "missing_ok" in the docs and in the C code.

I refrained from doing a catversion bump for this, because the name of an
input argument is just documentation, it has no effect on any callers.

Michael Paquier
2015-07-02 12:15:13 +03:00
fb174687f7 Make use of xlog_internal.h's macros in WAL-related utilities.
Commit 179cdd09 added macros to check if a filename is a WAL segment
or other such file. However there were still some instances of the
strlen + strspn combination to check for that in WAL-related utilities
like pg_archivecleanup. Those checks can be replaced with the macros.

This patch makes use of the macros in those utilities and
which would make the code a bit easier to read.

Back-patch to 9.5.

Michael Paquier
2015-07-02 10:35:38 +09:00
cf8d65de10 Stamp HEAD as 9.6devel.
Let the hacking begin ...
2015-06-30 14:01:15 -04:00
302ac7f271 Add assertion to check the special size is sane before dereferencing it.
This seems useful to catch errors of the sort I just fixed, where
PageGetSpecialPointer is called before initializing the page.
2015-06-30 13:44:04 +03:00
f78329d594 Stamp 9.5alpha1. 2015-06-29 15:42:18 -04:00
cbc8d65639 Code + docs review for escaping of option values (commit 11a020eb6).
Avoid memory leak from incorrect choice of how to free a StringInfo
(resetStringInfo doesn't do it).  Now that pg_split_opts doesn't scribble
on the optstr, mark that as "const" for clarity.  Attach the commentary in
protocol.sgml to the right place, and add documentation about the
user-visible effects of this change on postgres' -o option and libpq's
PGOPTIONS option.
2015-06-29 12:42:52 -04:00
07cb8b02ab Replace ia64 S_UNLOCK compiler barrier with a full memory barrier.
_Asm_sched_fence() is just a compiler barrier, not a memory barrier. But
spinlock release on IA64 needs, at the very least, release
semantics. Use a full barrier instead.

This might be the cause for the occasional failures on buildfarm member
anole.

Discussion: 20150629101108.GB17640@alap3.anarazel.de
2015-06-29 14:53:32 +02:00
62d16c7fc5 Improve design and implementation of pg_file_settings view.
As first committed, this view reported on the file contents as they were
at the last SIGHUP event.  That's not as useful as reporting on the current
contents, and what's more, it didn't work right on Windows unless the
current session had serviced at least one SIGHUP.  Therefore, arrange to
re-read the files when pg_show_all_settings() is called.  This requires
only minor refactoring so that we can pass changeVal = false to
set_config_option() so that it won't actually apply any changes locally.

In addition, add error reporting so that errors that would prevent the
configuration files from being loaded, or would prevent individual settings
from being applied, are visible directly in the view.  This makes the view
usable for pre-testing whether edits made in the config files will have the
desired effect, before one actually issues a SIGHUP.

I also added an "applied" column so that it's easy to identify entries that
are superseded by later entries; this was the main use-case for the original
design, but it seemed unnecessarily hard to use for that.

Also fix a 9.4.1 regression that allowed multiple entries for a
PGC_POSTMASTER variable to cause bogus complaints in the postmaster log.
(The issue here was that commit bf007a27acd7b2fb unintentionally reverted
3e3f65973a3c94a6, which suppressed any duplicate entries within
ParseConfigFp.  However, since the original coding of the pg_file_settings
view depended on such suppression *not* happening, we couldn't have fixed
this issue now without first doing something with pg_file_settings.
Now we suppress duplicates by marking them "ignored" within
ProcessConfigFileInternal, which doesn't hide them in the view.)

Lesser changes include:

Drive the view directly off the ConfigVariable list, instead of making a
basically-equivalent second copy of the data.  There's no longer any need
to hang onto the data permanently, anyway.

Convert show_all_file_settings() to do its work in one call and return a
tuplestore; this avoids risks associated with assuming that the GUC state
will hold still over the course of query execution.  (I think there were
probably latent bugs here, though you might need something like a cursor
on the view to expose them.)

Arrange to run SIGHUP processing in a short-lived memory context, to
forestall process-lifespan memory leaks.  (There is one known leak in this
code, in ProcessConfigDirectory; it seems minor enough to not be worth
back-patching a specific fix for.)

Remove mistaken assignment to ConfigFileLineno that caused line counting
after an include_dir directive to be completely wrong.

Add missed failure check in AlterSystemSetConfigFile().  We don't really
expect ParseConfigFp() to fail, but that's not an excuse for not checking.
2015-06-28 18:06:14 -04:00
cb2acb1081 Add missing_ok option to the SQL functions for reading files.
This makes it possible to use the functions without getting errors, if there
is a chance that the file might be removed or renamed concurrently.
pg_rewind needs to do just that, although this could be useful for other
purposes too. (The changes to pg_rewind to use these functions will come in
a separate commit.)

The read_binary_file() function isn't very well-suited for extensions.c's
purposes anymore, if it ever was. So bite the bullet and make a copy of it
in extension.c, tailored for that use case. This seems better than the
accidental code reuse, even if it's a some more lines of code.

Michael Paquier, with plenty of kibitzing by me.
2015-06-28 21:35:46 +03:00
604e99396d Add opaque declaration of HTAB to tqual.h.
Commit b89e151054a05f0f6d356ca52e3b725dd0505e53 added the
ResolveCminCmaxDuringDecoding declaration to tqual.h, which uses an
HTAB parameter, without declaring HTAB.  It accidentally fails to
fail to build with current sources because a declaration happens to
be included, directly or indirectly, in all source files that
currently use tqual.h before tqual.h is first included, but we
shouldn't count on that.  Since an opaque declaration is enough
here, just use that, as was done in snapmgr.h.

Backpatch to 9.4, where the HTAB reference was added to tqual.h.
2015-06-27 09:55:06 -05:00
7d60b2af34 Fix DDL command collection for TRANSFORM
Commit b488c580ae, which added the DDL command collection feature,
neglected to update the code that commit cac76582053e had previously
added two weeks earlier for the TRANSFORM feature.

Reported by Michael Paquier.
2015-06-26 18:17:54 -03:00
8f15f74a44 Be more conservative about removing tablespace "symlinks".
Don't apply rmtree(), which will gleefully remove an entire subtree,
and don't even apply unlink() unless it's symlink or a directory,
the only things that we expect to find.

Amit Kapila, with minor tweaks by me, per extensive discussions
involving Andrew Dunstan, Fujii Masao, and Heikki Linnakangas,
at least some of whom also reviewed the code.
2015-06-26 15:53:13 -04:00
5ca611841b Improve handling of CustomPath/CustomPlan(State) children.
Allow CustomPath to have a list of paths, CustomPlan a list of plans,
and CustomPlanState a list of planstates known to the core system, so
that custom path/plan providers can more reasonably use this
infrastructure for nodes with multiple children.

KaiGai Kohei, per a design suggestion from Tom Lane, with some
further kibitzing by me.
2015-06-26 09:40:47 -04:00
5d1ff6bd55 Fix the logic for putting relations into the relcache init file.
Commit f3b5565dd4e59576be4c772da364704863e6a835 was a couple of bricks shy
of a load; specifically, it missed putting pg_trigger_tgrelid_tgname_index
into the relcache init file, because that index is not used by any
syscache.  However, we have historically nailed that index into cache for
performance reasons.  The upshot was that load_relcache_init_file always
decided that the init file was busted and silently ignored it, resulting
in a significant hit to backend startup speed.

To fix, reinstantiate RelationIdIsInInitFile() as a wrapper around
RelationSupportsSysCache(), which can know about additional relations
that should be in the init file despite being unknown to syscache.c.

Also install some guards against future mistakes of this type: make
write_relcache_init_file Assert that all nailed relations get written to
the init file, and make load_relcache_init_file emit a WARNING if it takes
the "wrong number of nailed relations" exit path.  Now that we remove the
init files during postmaster startup, that case should never occur in the
field, even if we are starting a minor-version update that added or removed
rels from the nailed set.  So the warning shouldn't ever be seen by end
users, but it will show up in the regression tests if somebody breaks this
logic.

Back-patch to all supported branches, like the previous commit.
2015-06-25 14:39:05 -04:00
41d798a139 Fix comment in fmgr.h to refer to actual function used.
FunctionLookup() is long gone if it ever existed, and fmgr_info() is
what's now used, so the comments now reflect that.
2015-06-15 23:21:03 -04:00
b5fe62038f Make postmaster restart archiver soon after it dies, even during recovery.
After the archiver dies, postmaster tries to start a new one immediately.
But previously this could happen only while server was running normally
even though archiving was enabled always (i.e., archive_mode was set to
always). So the archiver running during recovery could not restart soon
after it died. This is an oversight in commit ffd3774.

This commit changes reaper(), postmaster's signal handler to cleanup
after a child process dies, so that it tries to a new archiver even during
recovery if necessary.

Patch by me. Review by Alvaro Herrera.
2015-06-12 23:11:51 +09:00
908e234733 Rename jsonb - text[] operator to #- to avoid ambiguity.
Following recent discussion  on -hackers. The underlying function is
also renamed to jsonb_delete_path. The regression tests now don't need
ugly type casts to avoid the ambiguity, so they are also removed.

Catalog version bumped.
2015-06-11 10:06:58 -04:00
870681017a Fix typo in comment.
Backpatch to 9.4 to minimize possible conflicts.
2015-06-10 17:03:56 -05:00
ea9c4c1e4a Fix typo in comment.
David Rowley
2015-06-10 15:26:02 +09:00
f3b5565dd4 Use a safer method for determining whether relcache init file is stale.
When we invalidate the relcache entry for a system catalog or index, we
must also delete the relcache "init file" if the init file contains a copy
of that rel's entry.  The old way of doing this relied on a specially
maintained list of the OIDs of relations present in the init file: we made
the list either when reading the file in, or when writing the file out.
The problem is that when writing the file out, we included only rels
present in our local relcache, which might have already suffered some
deletions due to relcache inval events.  In such cases we correctly decided
not to overwrite the real init file with incomplete data --- but we still
used the incomplete initFileRelationIds list for the rest of the current
session.  This could result in wrong decisions about whether the session's
own actions require deletion of the init file, potentially allowing an init
file created by some other concurrent session to be left around even though
it's been made stale.

Since we don't support changing the schema of a system catalog at runtime,
the only likely scenario in which this would cause a problem in the field
involves a "vacuum full" on a catalog concurrently with other activity, and
even then it's far from easy to provoke.  Remarkably, this has been broken
since 2002 (in commit 786340441706ac1957a031f11ad1c2e5b6e18314), but we had
never seen a reproducible test case until recently.  If it did happen in
the field, the symptoms would probably involve unexpected "cache lookup
failed" errors to begin with, then "could not open file" failures after the
next checkpoint, as all accesses to the affected catalog stopped working.
Recovery would require manually removing the stale "pg_internal.init" file.

To fix, get rid of the initFileRelationIds list, and instead consult
syscache.c's list of relations used in catalog caches to decide whether a
relation is included in the init file.  This should be a tad more efficient
anyway, since we're replacing linear search of a list with ~100 entries
with a binary search.  It's a bit ugly that the init file contents are now
so directly tied to the catalog caches, but in practice that won't make
much difference.

Back-patch to all supported branches.
2015-06-07 15:32:09 -04:00
3f59be836c Fix planner's cost estimation for SEMI/ANTI joins with inner indexscans.
When the inner side of a nestloop SEMI or ANTI join is an indexscan that
uses all the join clauses as indexquals, it can be presumed that both
matched and unmatched outer rows will be processed very quickly: for
matched rows, we'll stop after fetching one row from the indexscan, while
for unmatched rows we'll have an indexscan that finds no matching index
entries, which should also be quick.  The planner already knew about this,
but it was nonetheless charging for at least one full run of the inner
indexscan, as a consequence of concerns about the behavior of materialized
inner scans --- but those concerns don't apply in the fast case.  If the
inner side has low cardinality (many matching rows) this could make an
indexscan plan look far more expensive than it actually is.  To fix,
rearrange the work in initial_cost_nestloop/final_cost_nestloop so that we
don't add the inner scan cost until we've inspected the indexquals, and
then we can add either the full-run cost or just the first tuple's cost as
appropriate.

Experimentation with this fix uncovered another problem: add_path and
friends were coded to disregard cheap startup cost when considering
parameterized paths.  That's usually okay (and desirable, because it thins
the path herd faster); but in this fast case for SEMI/ANTI joins, it could
result in throwing away the desired plain indexscan path in favor of a
bitmap scan path before we ever get to the join costing logic.  In the
many-matching-rows cases of interest here, a bitmap scan will do a lot more
work than required, so this is a problem.  To fix, add a per-relation flag
consider_param_startup that works like the existing consider_startup flag,
but applies to parameterized paths, and set it for relations that are the
inside of a SEMI or ANTI join.

To make this patch reasonably safe to back-patch, care has been taken to
avoid changing the planner's behavior except in the very narrow case of
SEMI/ANTI joins with inner indexscans.  There are places in
compare_path_costs_fuzzily and add_path_precheck that are not terribly
consistent with the new approach, but changing them will affect planner
decisions at the margins in other cases, so we'll leave that for a
HEAD-only fix.

Back-patch to 9.3; before that, the consider_startup flag didn't exist,
meaning that the second aspect of the patch would be too invasive.

Per a complaint from Peter Holzer and analysis by Tomas Vondra.
2015-06-03 11:59:10 -04:00
37def42245 Rename jsonb_replace to jsonb_set and allow it to add new values
The function is given a fourth parameter, which defaults to true. When
this parameter is true, if the last element of the path is missing
in the original json, jsonb_set creates it in the result and assigns it
the new value. If it is false then the function does nothing unless all
elements of the path are present, including the last.

Based on some original code from Dmitry Dolgov, heavily modified by me.

Catalog version bumped.
2015-05-31 20:34:10 -04:00
1c8c656b3c Check that all aliases of a built-in function have same leakproof property.
opr_sanity.sql has a test checking that relevant properties of built-in
functions match when the same C function is referenced by multiple pg_proc
entries.  The test neglected to check proleakproof, though, and when
I added that condition it exposed that xideqint4 hadn't been updated to
match xideq.  So fix that as well, and in consequence bump catversion.

This isn't very critical, so no need to worry about fixing back branches.
2015-05-29 13:26:21 -04:00