Commit Graph

1647 Commits

Author SHA1 Message Date
fc5cf383d9 Don't allow data_directory to be set in postgresql.auto.conf by ALTER SYSTEM.
data_directory could be set both in postgresql.conf and postgresql.auto.conf so far.
This could cause some problematic situations like circular definition. To avoid such
situations, this commit forbids a user to set data_directory in postgresql.auto.conf.

Backpatch this to 9.4 where ALTER SYSTEM command was introduced.

Amit Kapila, reviewed by Abhijit Menon-Sen, with minor adjustments by me.
2014-06-19 20:32:55 +09:00
4c8ab1b91d Add btree and hash opclasses for pg_lsn.
This is needed to allow ORDER BY, DISTINCT, etc to work as expected for
pg_lsn values.

We had previously decided to put this off for 9.5, but in view of commit
eeca4cd35e284c72b2ea1b4494e64e7738896e81 there's no reason to avoid a
catversion bump for 9.4beta2, and this does make a pretty significant
usability difference for pg_lsn.

Michael Paquier, with fixes from Andres Freund and Tom Lane
2014-06-04 20:45:56 -04:00
06db9cce22 Fix typo in comment.
Erik Rijkers
2014-05-22 16:31:55 +09:00
b23b0f5588 Code review for recent changes in relcache.c.
rd_replidindex should be managed the same as rd_oidindex, and rd_keyattr
and rd_idattr should be managed like rd_indexattr.  Omissions in this area
meant that the bitmapsets computed for rd_keyattr and rd_idattr would be
leaked during any relcache flush, resulting in a slow but permanent leak in
CacheMemoryContext.  There was also a tiny probability of relcache entry
corruption if we ran out of memory at just the wrong point in
RelationGetIndexAttrBitmap.  Otherwise, the fields were not zeroed where
expected, which would not bother the code any AFAICS but could greatly
confuse anyone examining the relcache entry while debugging.

Also, create an API function RelationGetReplicaIndex rather than letting
non-relcache code be intimate with the mechanisms underlying caching of
that value (we won't even mention the memory leak there).

Also, fix a relcache flush hazard identified by Andres Freund:
RelationGetIndexAttrBitmap must not assume that rd_replidindex stays valid
across index_open.

The aspects of this involving rd_keyattr date back to 9.3, so back-patch
those changes.
2014-05-14 14:56:08 -04:00
12e611d43e Rename jsonb_hash_ops to jsonb_path_ops.
There's no longer much pressure to switch the default GIN opclass for
jsonb, but there was still some unhappiness with the name "jsonb_hash_ops",
since hashing is no longer a distinguishing property of that opclass,
and anyway it seems like a relatively minor detail.  At the suggestion of
Heikki Linnakangas, we'll use "jsonb_path_ops" instead; that captures the
important characteristic that each index entry depends on the entire path
from the document root to the indexed value.

Also add a user-facing explanation of the implementation properties of
these two opclasses.
2014-05-11 12:06:04 -04:00
d9daff0e0c More jsonb cleanup.
Fix JSONB_MAX_ELEMS and JSONB_MAX_PAIRS macros to use CB_MASK in the
calculation. JENTRY_POSMASK happens to have the same value at the moment,
but that's just coincidental.

Refactor jsonb iterator functions, for readability.

Get rid of the JENTRY_ISFIRST flag. Whenever we handle JEntrys, we have
access to the whole array and have enough context information to know
which entry is the first. This frees up one bit in the JEntry header for
future use. While we're at it, shuffle the JEntry bits so that boolean
true and false go together, for aesthetic reasons.

Bump catalog version as this changes the on-disk format slightly.
2014-05-09 15:55:56 +03:00
46dddf7673 Improve key representation for GIN jsonb_ops, and fix existence-search bug.
Change the key representation so that values that would exceed 127 bytes
are hashed into short strings, and so that the original JSON datatype of
each value is recorded in the index.  The hashing rule eliminates the major
objection to having this opclass be the default for jsonb, namely that it
could fail for plausible input data (due to GIN's restrictions on maximum
key length).  Preserving datatype information doesn't really buy us much
right now, but it requires no extra space compared to the previous way,
and it might be useful later.

Also, change the consistency-checking functions to request recheck for
exists (jsonb ? text) and related operators.  The original analysis that
this is an exactly checkable query was incorrect, since the index does
not preserve information about whether a key appears at top level in
the indexed JSON object.  Add a test case demonstrating the problem.

Make some other, mostly cosmetic improvements to the code in jsonb_gin.c
as well.

catversion bump due to on-disk data format change in jsonb_ops indexes.
2014-05-09 08:41:26 -04:00
d3c72e23df Avoid some pnstrdup()s when constructing jsonb
This speeds up text to jsonb parsing and hstore to jsonb conversions
somewhat.
2014-05-09 12:46:21 +03:00
a16d421ca4 Revert "Auto-tune effective_cache size to be 4x shared buffers"
This reverts commit ee1e5662d8d8330726eaef7d3110cb7add24d058, as well as
a remarkably large number of followup commits, which were mostly concerned
with the fact that the implementation didn't work terribly well.  It still
doesn't: we probably need some rather basic work in the GUC infrastructure
if we want to fully support GUCs whose default varies depending on the
value of another GUC.  Meanwhile, it also emerged that there wasn't really
consensus in favor of the definition the patch tried to implement (ie,
effective_cache_size should default to 4 times shared_buffers).  So whack
it all back to where it was.  In a followup commit, I'll do what was
recently agreed to, which is to simply change the default to a higher
value.
2014-05-08 20:49:38 -04:00
364ddc3e5c Clean up jsonb code.
The main target of this cleanup is the convertJsonb() function, but I also
touched a lot of other things that I spotted into in the process.

The new convertToJsonb() function uses an output buffer that's resized on
demand, so the code to estimate of the size of JsonbValue is removed.

The on-disk format was not changed, even though I refactored the structs
used to handle it. The term "superheader" is replaced with "container".

The jsonb_exists_any and jsonb_exists_all functions no longer sort the input
array. That was a premature optimization, the idea being that if there are
duplicates in the input array, you only need to check them once. Also,
sorting the array saves some effort in the binary search used to find a key
within an object. But there were drawbacks too: the sorting and
deduplicating obviously isn't free, and in the typical case there are no
duplicates to remove, and the gain in the binary search was minimal. Remove
all that, which makes the code simpler too.

This includes a bug-fix; the total length of the elements in a jsonb array
or object mustn't exceed 2^28. That is now checked.
2014-05-07 23:16:19 +03:00
0a78320057 pgindent run for 9.4
This includes removing tabs after periods in C comments, which was
applied to back branches, so this change should not effect backpatching.
2014-05-06 12:12:18 -04:00
a9baeb361d Can't completely get rid of #ifndef FRONTEND in palloc.h :-(
pg_controldata includes postgres.h not postgres_fe.h, so utils/palloc.h
must be able to compile in a "#define FRONTEND" context.  It appears that
Solaris Studio is smart enough to persuade us to define PG_USE_INLINE,
but not smart enough to not make a copy of unreferenced static functions;
which leads to an unsatisfied reference to CurrentMemoryContext.  So we
need an #ifndef FRONTEND around that declaration.  Per buildfarm.
2014-04-27 21:24:19 -04:00
528c454b2a Don't #include utils/palloc.h in common/fe_memutils.h.
This breaks the principle that common/ ought not depend on anything in the
server, not only code-wise but in the headers.  The only arguable advantage
is avoidance of duplication of half a dozen extern declarations, and even
that is rather dubious, considering that the previous coding was wrong
about which declarations to duplicate: it exposed pnstrdup() to frontend
code even though no such function is provided in fe_memutils.c.

On the same principle, don't #include utils/memutils.h in the frontend
build of psprintf.c.  This requires duplicating the definition of
MaxAllocSize, but that seems fine to me: there's no a-priori reason why
frontend code should use the same size limit as the backend anyway.

In passing, clean up some rather odd layout and ordering choices that
were imposed on palloc.h to reduce the number of #ifdefs required by
the previous approach.

Per gripe from Christoph Berg.  There's still more work to do to make
include/common/ clean, but this part seems reasonably noncontroversial.
2014-04-26 14:14:28 -04:00
dfc0219f64 Add to_regprocedure() and to_regoperator().
These are natural complements to the functions added by commit
0886fc6a5c75b294544263ea979b9cf6195407d9, but they weren't included
in the original patch for some reason.  Add them.

Patch by me, per a complaint by Tom Lane.  Review by Tatsuo
Ishii.
2014-04-16 12:21:43 -04:00
e0c91a7ff0 Improve some O(N^2) behavior in window function evaluation.
Repositioning the tuplestore seek pointer in window_gettupleslot() turns
out to be a very significant expense when the window frame is sizable and
the frame end can move.  To fix, introduce a tuplestore function for
skipping an arbitrary number of tuples in one call, parallel to the one we
introduced for tuplesort objects in commit 8d65da1f.  This reduces the cost
of window_gettupleslot() to O(1) if the tuplestore has not spilled to disk.
As in the previous commit, I didn't try to do any real optimization of
tuplestore_skiptuples for the case where the tuplestore has spilled to
disk.  There is probably no practical way to get the cost to less than O(N)
anyway, but perhaps someone can think of something later.

Also fix PersistHoldablePortal() to make use of this API now that we have
it.

Based on a suggestion by Dean Rasheed, though this turns out not to look
much like his patch.
2014-04-13 13:59:17 -04:00
d95425c8b9 Provide moving-aggregate support for boolean aggregates.
David Rowley and Florian Pflug, reviewed by Dean Rasheed
2014-04-13 00:01:46 -04:00
9d229f399e Provide moving-aggregate support for a bunch of numerical aggregates.
First installment of the promised moving-aggregate support in built-in
aggregates: count(), sum(), avg(), stddev() and variance() for
assorted datatypes, though not for float4/float8.

In passing, remove a 2001-vintage kluge in interval_accum(): interval
array elements have been properly aligned since around 2003, but
nobody remembered to take out this workaround.  Also, fix a thinko
in the opr_sanity tests for moving-aggregate catalog entries.

David Rowley and Florian Pflug, reviewed by Dean Rasheed
2014-04-12 20:33:09 -04:00
f23a5630eb Add an in-core GiST index opclass for inet/cidr types.
This operator class can accelerate subnet/supernet tests as well as
btree-equivalent ordered comparisons.  It also handles a new network
operator inet && inet (overlaps, a/k/a "is supernet or subnet of"),
which is expected to be useful in exclusion constraints.

Ideally this opclass would be the default for GiST with inet/cidr data,
but we can't mark it that way until we figure out how to do a more or
less graceful transition from the current situation, in which the
really-completely-bogus inet/cidr opclasses in contrib/btree_gist are
marked as default.  Having the opclass in core and not default is better
than not having it at all, though.

While at it, add new documentation sections to allow us to officially
document GiST/GIN/SP-GiST opclasses, something there was never a clear
place to do before.  I filled these in with some simple tables listing
the existing opclasses and the operators they support, but there's
certainly scope to put more information there.

Emre Hasegeli, reviewed by Andreas Karlsson, further hacking by me
2014-04-08 15:46:43 -04:00
0886fc6a5c Add new to_reg* functions for error-free OID lookups.
These functions won't throw an error if the object doesn't exist,
or if (for functions and operators) there's more than one matching
object.

Yugo Nagata and Nozomi Anzai, reviewed by Amit Khandekar, Marti
Raudsepp, Amit Kapila, and me.
2014-04-08 10:27:56 -04:00
59202fae04 Fix some compiler warnings that clang emits with -pedantic.
Andres Freund
2014-04-04 11:29:50 -04:00
f33a71a786 De-anonymize the union in JsonbValue.
Needed for strict C89 compliance.
2014-04-02 14:30:08 -04:00
f9c6d72cbf Cleanup around json_to_record/json_to_recordset
Set function parameter names and defaults. Add jsonb versions (which the
code already provided for so the actual new code is trivial). Add jsonb
regression tests and docs.

Bump catalog version (which I apparently forgot to do when jsonb was
committed).
2014-03-26 10:18:24 -04:00
d9134d0a35 Introduce jsonb, a structured format for storing json.
The new format accepts exactly the same data as the json type. However, it is
stored in a format that does not require reparsing the orgiginal text in order
to process it, making it much more suitable for indexing and other operations.
Insignificant whitespace is discarded, and the order of object keys is not
preserved. Neither are duplicate object keys kept - the later value for a given
key is the only one stored.

The new type has all the functions and operators that the json type has,
with the exception of the json generation functions (to_json, json_agg etc.)
and with identical semantics. In addition, there are operator classes for
hash and btree indexing, and two classes for GIN indexing, that have no
equivalent in the json type.

This feature grew out of previous work by Oleg Bartunov and Teodor Sigaev, which
was intended to provide similar facilities to a nested hstore type, but which
in the end proved to have some significant compatibility issues.

Authors: Oleg Bartunov,  Teodor Sigaev, Peter Geoghegan and Andrew Dunstan.
Review: Andres Freund
2014-03-23 16:40:19 -04:00
588fb50715 Show PIDs of lock holders and waiters in log_lock_waits log message.
Christian Kruse, reviewed by Kumar Rajeev Rastogi.
2014-03-13 03:26:47 +09:00
c5608ea26a Allow opclasses to provide tri-valued GIN consistent functions.
With the GIN "fast scan" feature, GIN can skip items without fetching all
the keys for them, if it can prove that they don't match regardless of
those keys. So far, it has done the proving by calling the boolean
consistent function with all combinations of TRUE/FALSE for the unfetched
keys, but since that's O(n^2), it becomes unfeasible with more than a few
keys. We can avoid calling consistent with all the combinations, if we can
tell the operator class implementation directly which keys are unknown.

This commit includes a triConsistent function for the built-in array and
tsvector opclasses.

Alexander Korotkov, with some changes by me.
2014-03-12 17:51:30 +02:00
84df54b22e Constructors for interval, timestamp, timestamptz
Author: Pavel Stěhule, editorialized somewhat by Álvaro Herrera
Reviewed-by: Tomáš Vondra, Marko Tiikkaja
With input from Fabrízio de Royes Mello, Jim Nasby
2014-03-04 15:09:43 -03:00
7e8db2dc42 Minor corrections to logical decoding patch. 2014-03-04 11:07:54 -05:00
b89e151054 Introduce logical decoding.
This feature, building on previous commits, allows the write-ahead log
stream to be decoded into a series of logical changes; that is,
inserts, updates, and deletes and the transactions which contain them.
It is capable of handling decoding even across changes to the schema
of the effected tables.  The output format is controlled by a
so-called "output plugin"; an example is included.  To make use of
this in a real replication system, the output plugin will need to be
modified to produce output in the format appropriate to that system,
and to perform filtering.

Currently, information can be extracted from the logical decoding
system only via SQL; future commits will add the ability to stream
changes via walsender.

Andres Freund, with review and other contributions from many other
people, including Álvaro Herrera, Abhijit Menon-Sen, Peter Gheogegan,
Kevin Grittner, Robert Haas, Heikki Linnakangas, Fujii Masao, Abhijit
Menon-Sen, Michael Paquier, Simon Riggs, Craig Ringer, and Steve
Singer.
2014-03-03 16:32:18 -05:00
694e3d139a Further code review for pg_lsn data type.
Change input function error messages to be more consistent with what is
done elsewhere.  Remove a bunch of redundant type casts, so that the
compiler will warn us if we screw up.  Don't pass LSNs by value on
platforms where a Datum is only 32 bytes, per buildfarm.  Move macros
for packing and unpacking LSNs to pg_lsn.h so that we can include
access/xlogdefs.h, to avoid an unsatisfied dependency on XLogRecPtr.
2014-02-19 10:06:59 -05:00
7d03a83f4d Add a pg_lsn data type, to represent an LSN.
Robert Haas and Michael Paquier
2014-02-19 08:35:23 -05:00
31400a6733 Predict integer overflow to avoid buffer overruns.
Several functions, mostly type input functions, calculated an allocation
size such that the calculation wrapped to a small positive value when
arguments implied a sufficiently-large requirement.  Writes past the end
of the inadvertent small allocation followed shortly thereafter.
Coverity identified the path_in() vulnerability; code inspection led to
the rest.  In passing, add check_stack_depth() to prevent stack overflow
in related functions.

Back-patch to 8.4 (all supported versions).  The non-comment hstore
changes touch code that did not exist in 8.4, so that part stops at 9.0.

Noah Misch and Heikki Linnakangas, reviewed by Tom Lane.

Security: CVE-2014-0064
2014-02-17 09:33:31 -05:00
4318daecc9 Fix handling of wide datetime input/output.
Many server functions use the MAXDATELEN constant to size a buffer for
parsing or displaying a datetime value.  It was much too small for the
longest possible interval output and slightly too small for certain
valid timestamp input, particularly input with a long timezone name.
The long input was rejected needlessly; the long output caused
interval_out() to overrun its buffer.  ECPG's pgtypes library has a copy
of the vulnerable functions, which bore the same vulnerabilities along
with some of its own.  In contrast to the server, certain long inputs
caused stack overflow rather than failing cleanly.  Back-patch to 8.4
(all supported versions).

Reported by Daniel Schüssler, reviewed by Tom Lane.

Security: CVE-2014-0063
2014-02-17 09:33:31 -05:00
801c2dc72c Separate multixact freezing parameters from xid's
Previously we were piggybacking on transaction ID parameters to freeze
multixacts; but since there isn't necessarily any relationship between
rates of Xid and multixact consumption, this turns out not to be a good
idea.

Therefore, we now have multixact-specific freezing parameters:

vacuum_multixact_freeze_min_age: when to remove multis as we come across
them in vacuum (default to 5 million, i.e. early in comparison to Xid's
default of 50 million)

vacuum_multixact_freeze_table_age: when to force whole-table scans
instead of scanning only the pages marked as not all visible in
visibility map (default to 150 million, same as for Xids).  Whichever of
both which reaches the 150 million mark earlier will cause a whole-table
scan.

autovacuum_multixact_freeze_max_age: when for cause emergency,
uninterruptible whole-table scans (default to 400 million, double as
that for Xids).  This means there shouldn't be more frequent emergency
vacuuming than previously, unless multixacts are being used very
rapidly.

Backpatch to 9.3 where multixacts were made to persist enough to require
freezing.  To avoid an ABI break in 9.3, VacuumStmt has a couple of
fields in an unnatural place, and StdRdOptions is split in two so that
the newly added fields can go at the end.

Patch by me, reviewed by Robert Haas, with additional input from Andres
Freund and Tom Lane.
2014-02-13 19:36:31 -03:00
5264d91541 Add json_array_elements_text function.
This was a notable omission from the json functions added in 9.3 and
there have been numerous complaints about its absence.

Laurence Rowe.
2014-01-29 15:39:01 -05:00
105639900b New json functions.
json_build_array() and json_build_object allow for the construction of
arbitrarily complex json trees. json_object() turns a one or two
dimensional array, or two separate arrays, into a json_object of
name/value pairs, similarly to the hstore() function.
json_object_agg() aggregates its two arguments into a single json object
as name value pairs.

Catalog version bumped.

Andrew Dunstan, reviewed by Marko Tiikkaja.
2014-01-28 17:48:21 -05:00
2850896961 Code review for auto-tuned effective_cache_size.
Fix integer overflow issue noted by Magnus Hagander, as well as a bunch
of other infelicities in commit ee1e5662d8d8330726eaef7d3110cb7add24d058
and its unreasonably large number of followups.
2014-01-27 00:05:56 -05:00
01f7808b3e Add a cardinality function for arrays.
Unlike our other array functions, this considers the total number of
elements across all dimensions, and returns 0 rather than NULL when the
array has no elements.  But it seems that both of those behaviors are
almost universally disliked, so hopefully that's OK.

Marko Tiikkaja, reviewed by Dean Rasheed and Pavel Stehule
2014-01-21 12:38:53 -05:00
0d79c0a8cc Make various variables const (read-only).
These changes should generally improve correctness/maintainability.
A nice side benefit is that several kilobytes move from initialized
data to text segment, allowing them to be shared across processes and
probably reducing copy-on-write overhead while forking a new backend.
Unfortunately this doesn't seem to help libpq in the same way (at least
not when it's compiled with -fpic on x86_64), but we can hope the linker
at least collects all nominally-const data together even if it's not
actually part of the text segment.

Also, make pg_encname_tbl[] static in encnames.c, since there seems
no very good reason for any other code to use it; per a suggestion
from Wim Lewis, who independently submitted a patch that was mostly
a subset of this one.

Oskari Saarenmaa, with some editorialization by me
2014-01-18 16:04:32 -05:00
7e04792a1c Update copyright for 2014
Update all files in head, and files COPYRIGHT and legal.sgml in all back
branches.
2014-01-07 16:05:30 -05:00
8d65da1f01 Support ordered-set (WITHIN GROUP) aggregates.
This patch introduces generic support for ordered-set and hypothetical-set
aggregate functions, as well as implementations of the instances defined in
SQL:2008 (percentile_cont(), percentile_disc(), rank(), dense_rank(),
percent_rank(), cume_dist()).  We also added mode() though it is not in the
spec, as well as versions of percentile_cont() and percentile_disc() that
can compute multiple percentile values in one pass over the data.

Unlike the original submission, this patch puts full control of the sorting
process in the hands of the aggregate's support functions.  To allow the
support functions to find out how they're supposed to sort, a new API
function AggGetAggref() is added to nodeAgg.c.  This allows retrieval of
the aggregate call's Aggref node, which may have other uses beyond the
immediate need.  There is also support for ordered-set aggregates to
install cleanup callback functions, so that they can be sure that
infrastructure such as tuplesort objects gets cleaned up.

In passing, make some fixes in the recently-added support for variadic
aggregates, and make some editorial adjustments in the recent FILTER
additions for aggregates.  Also, simplify use of IsBinaryCoercible() by
allowing it to succeed whenever the target type is ANY or ANYELEMENT.
It was inconsistent that it dealt with other polymorphic target types
but not these.

Atri Sharma and Andrew Gierth; reviewed by Pavel Stehule and Vik Fearing,
and rather heavily editorialized upon by Tom Lane
2013-12-23 16:11:35 -05:00
65d6e4cb5c Add ALTER SYSTEM command to edit the server configuration file.
Patch contributed by Amit Kapila. Reviewed by Hari Babu, Masao Fujii,
Boszormenyi Zoltan, Andres Freund, Greg Smith and others.
2013-12-18 23:42:44 +09:00
66abc2608c Add a new reloption, user_catalog_table.
When this reloption is set and wal_level=logical is configured,
we'll record the CIDs stamped by inserts, updates, and deletes to
the table just as we would for an actual catalog table.  This will
allow logical decoding to use historical MVCC snapshots to access
such tables just as they access ordinary catalog tables.

Replication solutions built around the logical decoding machinery
will likely need to set this operation for their configuration
tables; it might also be needed by extensions which perform table
access in their output functions.

Andres Freund, reviewed by myself and others.
2013-12-10 19:17:34 -05:00
e55704d8b2 Add new wal_level, logical, sufficient for logical decoding.
When wal_level=logical, we'll log columns from the old tuple as
configured by the REPLICA IDENTITY facility added in commit
07cacba983ef79be4a84fcd0e0ca3b5fcb85dd65.  This makes it possible
a properly-configured logical replication solution to correctly
follow table updates even if they change the chosen key columns,
or, with REPLICA IDENTITY FULL, even if the table has no key at
all.  Note that updates which do not modify the replica identity
column won't log anything extra, making the choice of a good key
(i.e. one that will rarely be changed) important to performance
when wal_level=logical is configured.

Each insert, update, or delete to a catalog table will also log
the CMIN and/or CMAX values of stamped by the current transaction.
This is necessary because logical decoding will require access to
historical snapshots of the catalog in order to decode some data
types, and the CMIN/CMAX values that we may need in order to judge
row visibility may have been overwritten by the time we need them.

Andres Freund, reviewed in various versions by myself, Heikki
Linnakangas, KONDO Mitsumasa, and many others.
2013-12-10 19:01:40 -05:00
16e1b7a1b7 Fix assorted race conditions in the new timeout infrastructure.
Prevent handle_sig_alarm from losing control partway through due to a query
cancel (either an asynchronous SIGINT, or a cancel triggered by one of the
timeout handler functions).  That would at least result in failure to
schedule any required future interrupt, and might result in actual
corruption of timeout.c's data structures, if the interrupt happened while
we were updating those.

We could still lose control if an asynchronous SIGINT arrives just as the
function is entered.  This wouldn't break any data structures, but it would
have the same effect as if the SIGALRM interrupt had been silently lost:
we'd not fire any currently-due handlers, nor schedule any new interrupt.
To forestall that scenario, forcibly reschedule any pending timer interrupt
during AbortTransaction and AbortSubTransaction.  We can avoid any extra
kernel call in most cases by not doing that until we've allowed
LockErrorCleanup to kill the DEADLOCK_TIMEOUT and LOCK_TIMEOUT events.

Another hazard is that some platforms (at least Linux and *BSD) block a
signal before calling its handler and then unblock it on return.  When we
longjmp out of the handler, the unblock doesn't happen, and the signal is
left blocked indefinitely.  Again, we can fix that by forcibly unblocking
signals during AbortTransaction and AbortSubTransaction.

These latter two problems do not manifest when the longjmp reaches
postgres.c, because the error recovery code there kills all pending timeout
events anyway, and it uses sigsetjmp(..., 1) so that the appropriate signal
mask is restored.  So errors thrown outside any transaction should be OK
already, and cleaning up in AbortTransaction and AbortSubTransaction should
be enough to fix these issues.  (We're assuming that any code that catches
a query cancel error and doesn't re-throw it will do at least a
subtransaction abort to clean up; but that was pretty much required already
by other subsystems.)

Lastly, ProcSleep should not clear the LOCK_TIMEOUT indicator flag when
disabling that event: if a lock timeout interrupt happened after the lock
was granted, the ensuing query cancel is still going to happen at the next
CHECK_FOR_INTERRUPTS, and we want to report it as a lock timeout not a user
cancel.

Per reports from Dan Wood.

Back-patch to 9.3 where the new timeout handling infrastructure was
introduced.  We may at some point decide to back-patch the signal
unblocking changes further, but I'll desist from that until we hear
actual field complaints about it.
2013-11-29 16:41:00 -05:00
85ed91ee7d Implement information_schema.parameters.parameter_default column
Reviewed-by: Ali Dar <ali.munir.dar@gmail.com>
Reviewed-by: Amit Khandekar <amit.khandekar@enterprisedb.com>
Reviewed-by: Rodolfo Campero <rodolfo.campero@anachronics.com>
2013-11-26 23:21:35 -05:00
f901bb50e3 Add make_date() and make_time() functions.
Pavel Stehule, reviewed by Jeevan Chalke and Atri Sharma
2013-11-17 15:06:50 -05:00
69c8fbac20 Improve performance of numeric sum(), avg(), stddev(), variance(), etc.
This patch improves performance of most built-in aggregates that formerly
used a NUMERIC or NUMERIC array as their transition type; this includes
not only aggregates on numeric inputs, but some aggregates on integer
inputs where overflow of an int8 value is a possibility.  The code now
uses a special-purpose data structure to avoid array construction and
deconstruction overhead, as well as packing and unpacking overhead for
numeric values.

These aggregates' transition type is now declared as INTERNAL, since
it doesn't correspond to any SQL data type.  To keep the planner from
thinking that that means a lot of storage will be used, we make use
of the just-added pg_aggregate.aggtransspace feature.  The space estimate
is set to 128 bytes, which is at least in the right ballpark.

Hadi Moshayedi, reviewed by Pavel Stehule and Tomas Vondra
2013-11-16 18:46:34 -05:00
07cacba983 Add the notion of REPLICA IDENTITY for a table.
Pending patches for logical replication will use this to determine
which columns of a tuple ought to be considered as its candidate key.

Andres Freund, with minor, mostly cosmetic adjustments by me
2013-11-08 12:30:43 -05:00
3147acd63e Use improved vsnprintf calling logic in more places.
When we are using a C99-compliant vsnprintf implementation (which should be
most places, these days) it is worth the trouble to make use of its report
of how large the buffer needs to be to succeed.  This patch adjusts
stringinfo.c and some miscellaneous usages in pg_dump to do that, relying
on the logic recently added in libpgcommon's psprintf.c.  Since these
places want to know the number of bytes written once we succeed, modify the
API of pvsnprintf() to report that.

There remains near-duplicate logic in pqexpbuffer.c, but since that code
is in libpq, psprintf.c's approach of exit()-on-error isn't appropriate
for use there.  Also note that I didn't bother touching the multitude
of places that call (v)snprintf without any attempt to provide a resizable
buffer.

Release-note-worthy incompatibility: the API of appendStringInfoVA()
changed.  If there's any third-party code that's calling that directly,
it will need tweaking along the same lines as in this patch.

David Rowley and Tom Lane
2013-10-24 21:43:57 -04:00
09a89cb5fc Get rid of use of asprintf() in favor of a more portable implementation.
asprintf(), aside from not being particularly portable, has a fundamentally
badly-designed API; the psprintf() function that was added in passing in
the previous patch has a much better API choice.  Moreover, the NetBSD
implementation that was borrowed for the previous patch doesn't work with
non-C99-compliant vsnprintf, which is something we still have to cope with
on some platforms; and it depends on va_copy which isn't all that portable
either.  Get rid of that code in favor of an implementation similar to what
we've used for many years in stringinfo.c.  Also, move it into libpgcommon
since it's not really libpgport material.

I think this patch will be enough to turn the buildfarm green again, but
there's still cosmetic work left to do, namely get rid of pg_asprintf()
in favor of using psprintf().  That will come in a followon patch.
2013-10-22 18:42:13 -04:00