This enables external code to create access methods. That is useful because
extensions can then add their own access methods, with their dependencies
formally tracked so that DROP operates correctly. Having explicit support
also makes pg_dump work correctly.
Currently only index AMs are supported, but we expect different types to
be added in the future.
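For example, an extension that provides an index AM handler could register
and later drop it roughly like this (the "bloom" name and blhandler()
function are illustrative):

    CREATE ACCESS METHOD bloom TYPE INDEX HANDLER blhandler;
    DROP ACCESS METHOD bloom;  -- dependencies are tracked, so this behaves correctly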
Authors: Alexander Korotkov, Petr Jelínek
Reviewed-By: Teodor Sigaev, Petr Jelínek, Jim Nasby
Commitfest-URL: https://commitfest.postgresql.org/9/353/
Discussion: https://www.postgresql.org/message-id/CAPpHfdsXwZmojm6Dx+TJnpYk27kT4o7Ri6X_4OSWcByu1Rm+VA@mail.gmail.com
Previously, we included <xlocale.h> only if necessary to get the definition
of type locale_t. According to notes in PGAC_TYPE_LOCALE_T, this is
important because on some versions of glibc that file supplies an
incompatible declaration of locale_t. (This info may be obsolete, because
on my RHEL6 box that seems to be the *only* definition of locale_t; but
there may still be glibcs in the wild for which it's a live concern.)
It turns out though that on FreeBSD and maybe other BSDen, you can get
locale_t from stdlib.h or locale.h but mbstowcs_l() and friends only from
<xlocale.h>. This was leaving us compiling calls to mbstowcs_l() and
friends with no visible prototype, which causes a warning and could
possibly cause actual trouble, since the implicit declaration assumes an
int result, which is not what these functions actually return.
Hence, adjust the configure checks so that we'll include <xlocale.h>
either if it's necessary to get type locale_t or if it's necessary to
get a declaration of mbstowcs_l().
Report and patch by Aleksander Alekseev, somewhat whacked around by me.
Back-patch to all supported branches, since we have been using
mbstowcs_l() since 9.1.
This patch widens SPI_processed, EState's es_processed field, PortalData's
portalPos field, FuncCallContext's call_cntr and max_calls fields,
ExecutorRun's count argument, PortalRunFetch's result, and the max number
of rows in a SPITupleTable to uint64, and deals with (I hope) all the
ensuing fallout. Some of these values were declared uint32 before, and
others "long".
I also removed PortalData's posOverflow field, since that logic seems
pretty useless given that portalPos is now always 64 bits.
The user-visible results are that command tags for SELECT etc will
correctly report tuple counts larger than 4G, as will plpgsql's
GET DIAGNOSTICS ... ROW_COUNT command. Queries processing more tuples
than that are still not exactly the norm, but they're becoming more
common.
Most values associated with FETCH/MOVE distances, such as PortalRun's count
argument and the count argument of most SPI functions that have one, remain
declared as "long". It's not clear whether it would be worth promoting
those to int64; but it would definitely be a large dollop of additional
API churn on top of this, and it would only help 32-bit platforms which
seem relatively less likely to see any benefit.
Andreas Scherbaum, reviewed by Christian Ullrich, additional hacking by me
emit_log_hook could only see the translated text, making it harder to identify
which message was being sent. Pass the original text as well, so that the exact
message can be identified whichever language is used for logging.
Discussion: 20160216.184755.59721141.horiguchi.kyotaro@lab.ntt.co.jp
Author: Kyotaro Horiguchi
An earlier version committed in 9.0 caused spurious waits in some cases.
The new infrastructure for lock waits added in 9.3 is used here to correct
and improve on that.
Jeff Janes based upon a proposal by Simon Riggs, who also reviewed
Additional review comments from Amit Kapila
Add four new SQL accessible functions: pg_control_system(),
pg_control_checkpoint(), pg_control_recovery(), and pg_control_init()
which expose a subset of the control file data.
Along the way move the code to read and validate the control file to
src/common, where it can be shared by the new backend functions
and the original pg_controldata frontend program.
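Illustrative usage; each function returns a single row of control-file fields:

    SELECT * FROM pg_control_system();
    SELECT * FROM pg_control_checkpoint();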
Patch by me, significant input, testing, and review by Michael Paquier.
This patch introduces "pg_blocking_pids(int) returns int[]", which returns
the PIDs of any sessions that are blocking the session with the given PID.
Historically people have obtained such information using a self-join on
the pg_locks view, but it's unreasonably tedious to do it that way with any
modicum of correctness, and the addition of parallel queries has pretty
much broken that approach altogether. (Given some more columns in the view
than there are today, you could imagine handling parallel-query cases with
a 4-way join; but ugh.)
The new function has the following behaviors that are painful or impossible
to get right via pg_locks:
1. Correctly understands which lock modes block which other ones.
2. In soft-block situations (two processes both waiting for conflicting lock
modes), only the one that's in front in the wait queue is reported to
block the other.
3. In parallel-query cases, reports all sessions blocking any member of
the given PID's lock group, and reports a session by naming its leader
process's PID, which will be the pg_backend_pid() value visible to
clients.
The motivation for doing this right now is mostly to fix the isolation
tests. Commit 38f8bdcac4982215beb9f65a19debecaf22fd470 lobotomized
isolationtester's is-it-waiting query by removing its ability to recognize
nonconflicting lock modes, as a crude workaround for the inability to
handle soft-block situations properly. But even without the lock mode
tests, the old query was excessively slow, particularly in
CLOBBER_CACHE_ALWAYS builds; some of our buildfarm animals fail the new
deadlock-hard test because the deadlock timeout elapses before they can
probe the waiting status of all eight sessions. Replacing the pg_locks
self-join with use of pg_blocking_pids() is not only much more correct, but
a lot faster: I measure it at about 9X faster in a typical dev build with
Asserts, and 3X faster in CLOBBER_CACHE_ALWAYS builds. That should provide
enough headroom for the slower CLOBBER_CACHE_ALWAYS animals to pass the
test, without having to lengthen deadlock_timeout yet more and thus slow
down the test for everyone else.
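As an illustration of intended usage, the pg_locks self-join can typically be
replaced with a query along these lines (a sketch, not the exact
isolationtester query):

    SELECT pid, pg_blocking_pids(pid) AS blocked_by
    FROM pg_stat_activity
    WHERE cardinality(pg_blocking_pids(pid)) > 0;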
This will parse strings in the format produced by pg_size_pretty() and
return sizes in bytes. This allows queries to be written with clauses
like "pg_total_relation_size(oid) > pg_size_bytes('10 GB')".
Author: Pavel Stehule with various improvements by Vitaly Burovoy
Discussion: http://www.postgresql.org/message-id/CAFj8pRD-tGoDKnxdYgECzA4On01_uRqPrwF-8LdkSE-6bDHp0w@mail.gmail.com
Reviewed-by: Vitaly Burovoy, Oleksandr Shulgin, Kyotaro Horiguchi,
Michael Paquier and Robert Haas
Move and refactor the underlying code for the pg_config client
application to src/common in support of sharing it with a new
system information SRF called pg_config(), which makes the same
information available via SQL. Additionally, wrap the SRF in a
new system view, also called pg_config.
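Illustrative usage, assuming the view exposes the usual name/setting pairs:

    SELECT name, setting FROM pg_config WHERE name = 'BINDIR';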
Patch by me with extensive input and review by Michael Paquier
and additional review by Alvaro Herrera.
When processing ordered aggregates following a sort that could make use
of the abbreviated key optimization, avoid calling the equality operator to
compare successive pairs of tuples when their abbreviated keys are not
equal; unequal abbreviated keys already prove the values are distinct.
Peter Geoghegan, reviewed by Andreas Karlsson and by me.
It seems that sprintf(), at least in glibc's version, is unreasonably slow
compared to hand-rolled code for printing integers. Replacing most uses of
sprintf() in the datetime.c output functions with special-purpose code
turns out to give more than a 2X speedup in COPY of a table with a single
timestamp column; which is pretty impressive considering all the other
logic in that code path.
David Rowley and Andres Freund, reviewed by Peter Geoghegan and myself
An example use-case is "CHECK(num_nonnulls(a,b,c) = 1)" to assert that
exactly one of a,b,c isn't NULL. The functions are variadic, so they
can also be pressed into service to count the number of null or nonnull
elements in an array.
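A sketch of both uses (table and column names are hypothetical):

    ALTER TABLE contacts ADD CONSTRAINT one_contact_method
      CHECK (num_nonnulls(email, phone, fax) = 1);

    SELECT num_nulls(VARIADIC ARRAY[1, NULL, 2, NULL]);   -- 2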
Marko Tiikkaja, reviewed by Pavel Stehule
Have varlena.c expose an interface that allows the char(n), bytea, and
bpchar types to piggyback on a now-generalized SortSupport for text.
This pushes a little more knowledge of the bpchar/char(n) type into
varlena.c than might be preferred, but that seems like the approach
that creates least friction. Also speed things up for index builds
that use text_pattern_ops or varchar_pattern_ops.
This patch does quite a bit of renaming, but it seems likely to be
worth it, so as to avoid future confusion about the fact that this code
is now more generally used than the old names might have suggested.
Peter Geoghegan, reviewed by Álvaro Herrera and Andreas Karlsson,
with small tweaks by me.
Previously, the foreign join pushdown infrastructure left the question
of security entirely up to individual FDWs, but it would be easy for
a foreign data wrapper to inadvertently open up subtle security holes
that way. So, make it the core code's job to determine which user
mapping OID is relevant, and don't attempt join pushdown unless it's
the same for all relevant relations.
Per a suggestion from Tom Lane. Shigeru Hanada and Ashutosh Bapat,
reviewed by Etsuro Fujita and KaiGai Kohei, with some further
changes by me.
The implementations go to some lengths to deliver exact results for values
where an exact result can be expected, such as sind(30) = 0.5 exactly.
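For example (function names beyond sind() follow the same degree-based
pattern):

    SELECT sind(30), cosd(60), tand(45);
    --      0.5       0.5       1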
Dean Rasheed, reviewed by Michael Paquier
Commit e529cd4ffa605c6f introduced an Assert requiring NAMEDATALEN to be
less than MAX_LEVENSHTEIN_STRLEN, which has been 255 for a long time.
Since up to that instant we had always allowed NAMEDATALEN to be
substantially more than that, this was ill-advised.
It's debatable whether we need MAX_LEVENSHTEIN_STRLEN at all (versus
putting a CHECK_FOR_INTERRUPTS into the loop), or whether it has to be
so tight; but this patch takes the narrower approach of just not applying
the MAX_LEVENSHTEIN_STRLEN limit to calls from the parser.
Trusting the parser for this seems reasonable, first because the strings
are limited to NAMEDATALEN which is unlikely to be hugely more than 256,
and second because the maximum distance is tightly constrained by
MAX_FUZZY_DISTANCE (though we'd forgotten to make use of that limit in one
place). That means the cost is not really O(mn) but more like O(max(m,n)).
Relaxing the limit for user-supplied calls is left for future research;
given the lack of complaints to date, it doesn't seem very high priority.
In passing, fix confusion between lengths-in-bytes and lengths-in-chars
in comments and error messages.
Per gripe from Kevin Day; solution suggested by Robert Haas. Back-patch
to 9.5 where the unwanted restriction was introduced.
The amvalidate functions added in commit 65c5fcd353a859da were on the
crude side. Improve them in a few ways:
* Perform signature checking for operators and support functions.
* Apply more thorough checks for missing operators and functions,
where possible.
* Instead of reporting problems as ERRORs, report most problems as INFO
messages and make the amvalidate function return FALSE. This allows
more than one problem to be discovered per run.
* Report object names rather than OIDs, and work a bit harder on making
the messages understandable.
Also, remove a few more opr_sanity regression test queries that are
now superseded by the amvalidate checks.
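For reference, the checks can be exercised from SQL roughly like this (a
sketch; problems are reported as INFO messages and the function returns
false):

    SELECT opcname FROM pg_opclass WHERE NOT amvalidate(oid);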
The conventions specified by the GiST SGML documentation were widely
ignored. For example, the strategy-number argument for "consistent" and
"distance" functions is specified to be a smallint, but most of the
built-in support functions declared it as an integer, and for that matter
the core code passed it using Int32GetDatum not Int16GetDatum. None of
that makes any real difference at runtime, but it's quite confusing for
newcomers to the code, and it makes it very hard to write an amvalidate()
function that checks support function signatures. So let's try to instill
some consistency here.
Another similar issue is that the "query" argument is not of a single
well-defined type, but could have different types depending on the strategy
(corresponding to search operators with different righthand-side argument
types). Some of the functions threw up their hands and declared the query
argument as being of "internal" type, which surely isn't right ("any" would
have been more appropriate); but the majority position seemed to be to
declare it as being of the indexed data type, corresponding to a search
operator with both input types the same. So I've specified a convention
that this is what should always be done.
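Under these conventions, a GiST "consistent" support function for a
hypothetical indexed type my_type would be declared along these lines (an
extension-script-style sketch):

    CREATE FUNCTION my_consistent(internal, my_type, smallint, oid, internal)
    RETURNS bool
    AS 'MODULE_PATHNAME' LANGUAGE C STRICT;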
Also, the result of the "union" support function actually must be of the
index's storage type, but the documentation suggested declaring it to
return "internal", and some of the functions followed that. Standardize
on telling the truth, instead.
Similarly, standardize on declaring the "same" function's inputs as
being of the storage type, not "internal".
Also, somebody had forgotten to add the "recheck" argument to both
the documentation of the "distance" support function and all of their
SQL declarations, even though the C code was happily using that argument.
Clean that up too.
Fix up some other omissions in the docs too, such as documenting that
union's second input argument is vestigial.
So far as the errors in core function declarations go, we can just fix
pg_proc.h and bump catversion. Adjusting the erroneous declarations in
contrib modules is more debatable: in principle any change in those
scripts should involve an extension version bump, which is a pain.
However, since these changes are purely cosmetic and make no functional
difference, I think we can get away without doing that.
This patch reduces pg_am to just two columns, a name and a handler
function. All the data formerly obtained from pg_am is now provided
in a C struct returned by the handler function. This is similar to
the designs we've adopted for FDWs and tablesample methods. There
are multiple advantages. For one, the index AM's support functions
are now simple C functions, making them faster to call and much less
error-prone, since the C compiler can now check function signatures.
For another, this will make it far more practical to define index access
methods in installable extensions.
A disadvantage is that SQL-level code can no longer see attributes
of index AMs; in particular, some of the crosschecks in the opr_sanity
regression test are no longer possible from SQL. We've addressed that
by adding a facility for the index AM to perform such checks instead.
(Much more could be done in that line, but for now we're content if the
amvalidate functions more or less replace what opr_sanity used to do.)
We might also want to expose some sort of reporting functionality, but
this patch doesn't do that.
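The slimmed-down catalog can be inspected directly, for example:

    SELECT amname, amhandler FROM pg_am;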
Alexander Korotkov, reviewed by Petr Jelínek, and rather heavily
editorialized on by me.
Omitted boundaries represent the upper or lower limit of the corresponding
array subscript. This allows simpler specification of many common
use-cases.
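Illustrative examples:

    SELECT ('{1,2,3,4,5}'::int[])[2:];   -- {2,3,4,5}
    SELECT ('{1,2,3,4,5}'::int[])[:3];   -- {1,2,3}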
(Revised version of commit 9246af6799819847faa33baf441251003acbb8fe)
YUriy Zhuravlev
In commit 8abb3cda0ddc00a0ab98977a1633a95b97068d4e I attempted to cache
the expression state trees constructed for domain CHECK constraints for
the life of the backend (assuming the domain's constraints don't get
redefined). However, this turns out not to work very well, because
execQual.c will run those state trees with ecxt_per_query_memory pointing
to a query-lifespan context, and in some situations we'll end up with
pointers into that context getting stored into the state trees. This
happens in particular with SQL-language functions, as reported by
Emre Hasegeli, but there are many other cases.
To fix, keep only the expression plan trees for domain CHECK constraints
in the typcache's data structure, and revert to performing ExecInitExpr
(at least) once per query to set up expression state trees in the query's
context.
Eventually it'd be nice to undo this, but that will require some careful
thought about memory management for expression state trees, and it seems
far too late for any such redesign in 9.5. This way is still much more
efficient than what happened before 8abb3cda0.
The previous way of reconstructing check constraints was to do a separate
"ALTER TABLE ONLY tab ADD CONSTRAINT" for each table in an inheritance
hierarchy. However, that way has no hope of reconstructing the check
constraints' own inheritance properties correctly, as pointed out in
bug #13779 from Jan Dirk Zijlstra. What we should do instead is to do
a regular "ALTER TABLE", allowing recursion, at the topmost table that
has a particular constraint, and then suppress the work queue entries
for inherited instances of the constraint.
Annoyingly, we'd tried to fix this behavior before, in commit 5ed6546cf,
but we failed to notice that it wasn't reconstructing the pg_constraint
field values correctly.
As long as I'm touching pg_get_constraintdef_worker anyway, tweak it to
always schema-qualify the target table name; this seems like useful backup
to the protections installed by commit 5f173040.
In HEAD/9.5, get rid of get_constraint_relation_oids, which is now unused.
(I could alternatively have modified it to also return conislocal, but that
seemed like a pretty single-purpose API, so let's not pretend it has some
other use.) It's unused in the back branches as well, but I left it in
place just in case some third-party code has decided to use it.
In HEAD/9.5, also rename pg_get_constraintdef_string to
pg_get_constraintdef_command, as the previous name did nothing to explain
what that entry point did differently from others (and its comment was
equally useless). Again, that change doesn't seem like material for
back-patching.
I did a bit of re-pgindenting in tablecmds.c in HEAD/9.5, as well.
Otherwise, back-patch to all supported branches.
Add a new flag, consider_parallel, to each RelOptInfo, indicating
whether a plan for that relation could conceivably be run inside of
a parallel worker. Right now, we're pretty conservative: for example,
it might be possible to defer applying a parallel-restricted qual
in a worker, and later do it in the leader, but right now we just
don't try to parallelize access to that relation. That's probably
the right decision in most cases, anyway.
Using the new flag, generate parallel sequential scan plans for plain
baserels, meaning that we now have parallel sequential scan in
PostgreSQL. The logic here is pretty unsophisticated right now: the
costing model probably isn't right in detail, and we can't push joins
beneath Gather nodes, so the number of plans that can actually benefit
from this is pretty limited right now. Lots more work is needed.
Nevertheless, it seems time to enable this functionality so that all
this code can actually be tested easily by users and developers.
Note that, if you wish to test this functionality, it will be
necessary to set max_parallel_degree to a value greater than the
default of 0. Once a few more loose ends have been tidied up here, we
might want to consider changing the default value of this GUC, but
I'm leaving it alone for now.
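For instance, to experiment (the table name is hypothetical):

    SET max_parallel_degree = 2;
    EXPLAIN SELECT count(*) FROM big_table;
    -- may now show a Gather node above a Parallel Seq Scan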
Along the way, fix a bug in cost_gather: the previous coding thought
that a Gather node's transfer overhead should be costed on the basis of
the relation size rather than the number of tuples that actually need
to be passed off to the leader.
Patch by me, reviewed in earlier versions by Amit Kapila.
This introduces a simple encoding scheme to produce abbreviated keys:
pack as many bytes of each UUID as will fit into a Datum. On
little-endian machines, a byteswap is also performed; the abbreviated
comparator can therefore just consist of a simple 3-way unsigned integer
comparison.
The purpose of this change is to speed up sorting data on a column
of type UUID.
Peter Geoghegan
The user can whitelist specified extension(s) in the foreign server's
options, whereupon we will treat immutable functions and operators of those
extensions as candidates to be sent for remote execution.
Whitelisting an extension in this way basically promises that the extension
exists on the remote server and behaves compatibly with the local instance.
We have no way to prove that formally, so we have to rely on the user to
get it right. But this seems like something that people can usually get
right in practice.
We might in future allow functions and operators to be whitelisted
individually, but extension granularity is a very convenient special case,
so it got done first.
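A sketch of how such a whitelist might be declared, assuming a postgres_fdw
foreign server and its "extensions" option (server and extension names are
illustrative):

    ALTER SERVER remote_pg OPTIONS (ADD extensions 'cube,seg');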
The patch as-committed lacks any regression tests, which is unfortunate,
but introducing dependencies on other extensions for testing purposes
would break "make installcheck" scenarios, which is worse. I have some
ideas about klugy ways around that, but it seems like material for a
separate patch. For the moment, leave the problem open.
Paul Ramsey, hacked up a bit more by me
Commit bda76c1c8cfb1d11751ba6be88f0242850481733 caused both plus and
minus infinity to be rendered as "infinity", which is not only wrong
but inconsistent with the pre-9.4 behavior of to_json(). Fix that by
following the coding in date_out/timestamp_out/timestamptz_out more
closely. Per bug #13687 from Stepan Perlov. Back-patch to 9.4, like
the previous commit.
In passing, also re-pgindent json.c, since it had gotten a bit messed up by
recent patches (and I was already annoyed by indentation-related problems
in back-patching this fix ...)
This code provides infrastructure for a parallel leader to start up
parallel workers to execute subtrees of the plan tree being executed
in the master. User-supplied parameters from ParamListInfo are passed
down, but PARAM_EXEC parameters are not. Various other constructs,
such as initplans, subplans, and CTEs, are also not currently shared.
Nevertheless, there's enough here to support a basic implementation of
parallel query, and we can lift some of the current restrictions as
needed.
Amit Kapila and Robert Haas
Previously, a function call appearing at the top level of WHERE had a
hard-wired selectivity estimate of 0.3333333, a kludge conveniently dated
in the source code itself to July 1992. The expectation at the time was
that somebody would soon implement estimator support functions analogous
to those for operators; but no such code has appeared, nor does it seem
likely to in the near future. We do have an alternative solution though,
at least for immutable functions on single relations: creating an
expression index on the function call will allow ANALYZE to gather stats
about the function's selectivity. But the code in clause_selectivity()
failed to make use of such data even if it exists.
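For example (names hypothetical), given an immutable boolean function f()
used at the top level of WHERE:

    CREATE INDEX ON orders ((f(status)));
    ANALYZE orders;
    -- clause_selectivity() can now consult the gathered statistics
    -- for clauses like: WHERE f(status)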
Refactor so that that will happen. I chose to make it try this technique
for any clause type for which clause_selectivity() doesn't have a special
case, not just functions. To avoid adding unnecessary overhead in the
common case where we don't learn anything new, make selfuncs.c provide an
API that hooks directly to examine_variable() and then var_eq_const(),
rather than the previous coding which laboriously constructed an OpExpr
only so that it could be expensively deconstructed again.
I preserved the behavior that the default estimate for a function call
is 0.3333333. (For any other expression node type, it's 0.5, as before.)
I had originally thought to make the default be 0.5 across the board, but
changing a default estimate that's survived for twenty-three years seems
like something not to do without a lot more testing than I care to put
into it right now.
Per a complaint from Jehan-Guillaume de Rorthais. Back-patch into 9.5,
but not further, at least for the moment.
This commit's parent made superfluous the bit's sole usage. Referential
integrity checks have long run as the subject table's owner, and that
now implies RLS bypass. Safe use of the bit was tricky, requiring
strict control over the SQL expressions evaluating therein. Back-patch
to 9.5, where the bit was introduced.
Based on a patch by Stephen Frost.
Every query of a single ENABLE ROW SECURITY table has two meanings, with
the row_security GUC selecting between them. With row_security=force
available, every function author would have been advised to either set
the GUC locally or test both meanings. Non-compliance would have
threatened reliability and, for SECURITY DEFINER functions, security.
Authors already face an obligation to account for search_path, and we
should not mimic that example. With this change, only BYPASSRLS roles
need exercise the aforementioned care. Back-patch to 9.5, where the
row_security GUC was introduced.
Since this narrows the domain of pg_db_role_setting.setconfig and
pg_proc.proconfig, one might bump catversion. A row_security=force
setting in one of those columns will elicit a clear message, so don't.
Commit 924bcf4f16d54c55310b28f77686608684734f42 introduced a framework
for parallel computation in PostgreSQL that makes most but not all
built-in functions safe to execute in parallel mode. In order to have
parallel query, we'll need to be able to determine whether that query
contains functions (either built-in or user-defined) that cannot be
safely executed in parallel mode. This requires those functions to be
labeled, so this patch introduces an infrastructure for that. Some
functions currently labeled as safe may need to be revised depending on
how pending issues related to heavyweight locking under parallelism
are resolved.
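With the labels in place, a function's parallel safety can be declared at
the SQL level, for example (function names hypothetical):

    ALTER FUNCTION my_func(integer) PARALLEL RESTRICTED;

    CREATE FUNCTION my_safe_func(integer) RETURNS integer
      LANGUAGE sql PARALLEL SAFE
      AS 'SELECT $1 + 1';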
Parallel plans can't be used except for the case where the query will
run to completion. If portal execution were suspended, the parallel
mode restrictions would need to remain in effect during that time, but
that might make other queries fail. Therefore, this patch introduces
a framework that enables consideration of parallel plans only when it
is known that the plan will be run to completion. This probably needs
some refinement; for example, at bind time, we do not know whether a
query run via the extended protocol will be executed to completion or
run with a limited fetch count. Having the client indicate its
intentions at bind time would constitute a wire protocol break. Some
contexts in which parallel mode would be safe are not adjusted by this
patch; the default is not to try parallel plans except from call sites
that have been updated to say that such plans are OK.
This commit doesn't introduce any parallel paths or plans; it just
provides a way to determine whether they could potentially be used.
I'm committing it on the theory that the remaining parallel sequential
scan patches will also get committed to this release, hopefully in the
not-too-distant future.
Robert Haas and Amit Kapila. Reviewed (in earlier versions) by Noah
Misch.
Per discussion, nowadays it is possible to have tablespaces that have
wildly different I/O characteristics from others. Setting different
effective_io_concurrency parameters for those has been measured to
improve performance.
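For example (tablespace name is hypothetical):

    ALTER TABLESPACE fast_ssd SET (effective_io_concurrency = 128);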
Author: Julien Rouhaud
Reviewed by: Andres Freund
Formerly, we treated only portals created in the current subtransaction as
having failed during subtransaction abort. However, if the error occurred
while running a portal created in an outer subtransaction (ie, a cursor
declared before the last savepoint), that has to be considered broken too.
To allow reliable detection of which ones those are, add a bookkeeping
field to struct Portal that tracks the innermost subtransaction in which
each portal has actually been executed. (Without this, we'd end up
failing portals containing functions that had called the subtransaction,
thereby breaking plpgsql exception blocks completely.)
In addition, when we fail an outer-subtransaction Portal, transfer its
resources into the subtransaction's resource owner, so that they're
released early in cleanup of the subxact. This fixes a problem reported by
Jim Nasby in which a function executed in an outer-subtransaction cursor
could cause an Assert failure or crash by referencing a relation created
within the inner subtransaction.
The proximate cause of the Assert failure is that AtEOSubXact_RelationCache
assumed it could blow away a relcache entry without first checking that the
entry had zero refcount. That was a bad idea on its own terms, so add such
a check there, and to the similar coding in AtEOXact_RelationCache. This
provides an independent safety measure in case there are still ways to
provoke the situation despite the Portal-level changes.
This has been broken since subtransactions were invented, so back-patch
to all supported branches.
Tom Lane and Michael Paquier
We had a report from Stefan Kaltenbrunner of a case in which postmaster
log files overran available disk space because multiple backends spewed
enormous context stats dumps upon hitting an out-of-memory condition.
Given the lack of similar reports, this isn't a common problem, but it
still seems worth doing something about. However, we don't want to just
blindly truncate the output, because that might prevent diagnosis of OOM
problems. What seems like a workable compromise is to limit the dump to
100 child contexts per parent, and summarize the space used within any
additional child contexts. That should help because practical cases where
the dump gets long will typically be huge numbers of siblings under the
same parent context; while the additional debugging value from seeing
details about individual siblings beyond 100 will not be large, we hope.
Anyway it doesn't take much code or memory space to do this, so let's try
it like this and see how things go.
Since the summarization mechanism requires passing totals back up anyway,
I took the opportunity to add a "grand total" line to the end of the
printout.
So far we have worked around the fact that some very old compilers do
not support 'inline' functions by only using inline functions
conditionally (or not at all). Since such compilers are very rare by
now, we have decided to rely on inline functions from 9.6 onwards.
To avoid breaking these old compilers, inline is defined away when not
supported. That'll cause "function x defined but not used" type warnings,
but since nobody develops on such compilers anymore, that's ok.
This change in policy will allow us to more easily employ inline
functions.
I chose to remove code previously conditional on PG_USE_INLINE as it
seemed confusing to have code dependent on a define that's always
defined.
Blacklisting of compilers, like in c53f73879f, now has to be done
differently. A platform template can define PG_FORCE_DISABLE_INLINE to
force inline to be defined empty.
Discussion: 20150701161447.GB30708@awork2.anarazel.de
The tuplesort/tuplestore memory management logic assumed that the chunk
allocation overhead for its memtuples array could not increase when
increasing the array size. This is and always was true for tuplesort,
but we (I, I think) blindly copied that logic into tuplestore.c without
noticing that the assumption failed to hold for the much smaller array
elements used by tuplestore. Given rather small work_mem, this could
result in an improper complaint about "unexpected out-of-memory situation",
as reported by Brent DeSpain in bug #13530.
The easiest way to fix this is just to increase tuplestore's initial
array size so that the assumption holds. Rather than relying on magic
constants, though, let's export a #define from aset.c that represents
the safe allocation threshold, and make tuplestore's calculation depend
on that.
Do the same in tuplesort.c to keep the logic looking parallel, even though
tuplesort.c isn't actually at risk at present. This will keep us from
breaking it if we ever muck with the allocation parameters in aset.c.
Back-patch to all supported versions. The error message doesn't occur
pre-9.3, not so much because the problem can't happen as because the
pre-9.3 tuplestore code neglected to check for it. (The chance of
trouble is a great deal larger as of 9.3, though, due to changes in the
array-size-increasing strategy.) However, allowing LACKMEM() to become
true unexpectedly could still result in less-than-desirable behavior,
so let's patch it all the way back.
The pg_stats view is supposed to be restricted to only show rows
about tables the user can read. However, it sometimes can leak
information which could not otherwise be seen when row level security
is enabled. Fix that by not showing pg_stats rows to users that would
be subject to RLS on the table the row is related to. This is done
by creating/using the newly introduced SQL visible function,
row_security_active().
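For example, a sketch of checking the new function directly (table name
hypothetical):

    SELECT row_security_active('some_table'::regclass);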
Along the way, clean up three call sites of check_enable_rls(). The second
argument of that function should only be specified as other than
InvalidOid when we are checking as a different user than the current one,
as in when querying through a view. These sites were passing GetUserId()
instead of InvalidOid, which can cause the function to return incorrect
results if the current user has the BYPASSRLS privilege and row_security
has been set to OFF.
Additionally fix a bug causing RI Trigger error messages to unintentionally
leak information when RLS is enabled, and other minor cleanup and
improvements. Also add WITH (security_barrier) to the definition of pg_stats.
Bumped CATVERSION due to new SQL functions and pg_stats view definition.
Back-patch to 9.5 where RLS was introduced. Reported by Yaroslav.
Patch by Joe Conway and Dean Rasheed with review and input by
Michael Paquier and Stephen Frost.
The original implementation of TABLESAMPLE modeled the tablesample method
API on index access methods, which wasn't a good choice because, without
specialized DDL commands, there's no way to build an extension that can
implement a TSM. (Raw inserts into system catalogs are not an acceptable
thing to do, because we can't undo them during DROP EXTENSION, nor will
pg_upgrade behave sanely.) Instead adopt an API more like procedural
language handlers or foreign data wrappers, wherein the only SQL-level
support object needed is a single handler function identified by having
a special return type. This lets us get rid of the supporting catalog
altogether, so that no custom DDL support is needed for the feature.
Adjust the API so that it can support non-constant tablesample arguments
(the original coding assumed we could evaluate the argument expressions at
ExecInitSampleScan time, which is undesirable even if it weren't outright
unsafe), and discourage sampling methods from looking at invisible tuples.
Make sure that the BERNOULLI and SYSTEM methods are genuinely repeatable
within and across queries, as required by the SQL standard, and deal more
honestly with methods that can't support that requirement.
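For instance, the standard-mandated repeatability can be exercised like
this (table name hypothetical):

    SELECT count(*) FROM big_table TABLESAMPLE BERNOULLI (10) REPEATABLE (42);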
Make a full code-review pass over the tablesample additions, and fix
assorted bugs, omissions, infelicities, and cosmetic issues (such as
failure to put the added code stanzas in a consistent ordering).
Improve EXPLAIN's output of tablesample plans, too.
Back-patch to 9.5 so that we don't have to support the original API
in production.
Previously, there was an inconsistency across json/jsonb operators that
operate on datums containing JSON arrays -- only some operators
supported negative array count-from-the-end subscripting. Specifically,
only a new-to-9.5 jsonb deletion operator had support (the new "jsonb -
integer" operator). This inconsistency seemed likely to be
counter-intuitive to users. To fix, allow all places where the user can
supply an integer subscript to accept a negative subscript value,
including path-orientated operators and functions, as well as other
extraction operators. This will need to be called out as an
incompatibility in the 9.5 release notes, since it's possible that users
are relying on certain established extraction operators changed here
yielding NULL in the event of a negative subscript.
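For example, the following now honor a negative, count-from-the-end
subscript instead of yielding NULL:

    SELECT '["a","b","c"]'::jsonb -> -1;       -- "c"
    SELECT '["a","b","c"]'::json #> '{-2}';    -- "b"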
For the json type, this requires adding a way of cheaply getting the
total JSON array element count ahead of time when parsing arrays with a
negative subscript involved, necessitating an ad-hoc lex and parse.
This is followed by a "conversion" from a negative subscript to its
equivalent positive-wise value using the count. From there on, it's as
if a positive-wise value was originally provided.
Note that there is still a minor inconsistency here across jsonb
deletion operators. Unlike the aforementioned new "-" deletion operator
that accepts an integer on its right-hand side, the new "#-" path-orientated
deletion variant does not throw an error when what looks like an array
subscript (input that could be recognized as an integer literal) is used
on an object, which is wrong-headed. The reason
for not being stricter is that it could be the case that an object pair
happens to have a key value that looks like an integer; in general,
these two possibilities are impossible to differentiate with rhs path
text[] argument elements. However, we still don't allow the "#-"
path-orientated deletion operator to perform array-style subscripting.
Rather, we just return the original left operand value in the event of a
negative subscript (which seems analogous to how the established
"jsonb/json #> text[]" path-orientated operator may yield NULL in the
event of an invalid subscript).
In passing, make SetArrayPath() stricter about not accepting cases where
there are trailing non-numeric garbage bytes rather than a clean NUL
byte. This means, for example, that strings like "10e10" are now not
accepted as an array subscript of 10 by some new-to-9.5 path-orientated
jsonb operators (e.g. the new #- operator). Finally, remove dead code
for jsonb subscript deletion; arguably, this should have been done in
commit b81c7b409.
Peter Geoghegan and Andrew Dunstan