initial values and runtime changes in selected parameters. This gets
rid of the need for an initial 'select pg_client_encoding()' query in
libpq, bringing us back to one message transmitted in each direction
for a standard connection startup. To allow server version to be sent
using the same GUC mechanism that handles other parameters, invent the
concept of a never-settable GUC parameter: you can 'show server_version'
but it's not settable by any GUC input source. Create 'lc_collate' and
'lc_ctype' never-settable parameters so that people can find out these
settings without need for pg_controldata. (These side ideas were all
discussed some time ago in pgsql-hackers, but not yet implemented.)
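For illustration, the never-settable behavior looks like this (output elided):

    SHOW server_version;      -- reports the value as usual
    SHOW lc_collate;          -- likewise, no pg_controldata needed
    SET lc_collate TO 'C';    -- rejected: not settable from any GUC source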
rewritten and the protocol is changed, but most elog calls are still
elog calls. Also, we need to contemplate mechanisms for controlling
all this functionality --- e.g., how much stuff should appear in the
postmaster log? And what API should libpq expose for it?
expressions, ARRAY(sub-SELECT) expressions, some array functions.
Polymorphic functions using ANYARRAY/ANYELEMENT argument and return
types. Some regression tests in place, documentation is lacking.
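A minimal sketch of what such a polymorphic function looks like (array_head
and the sample calls are illustrative, not from the patch):

    CREATE FUNCTION array_head(anyarray) RETURNS anyelement
        AS 'SELECT $1[1]' LANGUAGE sql;
    SELECT array_head(ARRAY[3, 1, 4]);           -- 3, resolved as integer
    SELECT array_head(ARRAY['x', 'y']::text[]);  -- 'x', resolved as text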
Joe Conway, with some kibitzing from Tom Lane.
(materialization into a tuple store) discussed on pgsql-hackers earlier.
I've updated the documentation and the regression tests.
Notes on the implementation:
- I needed to change the tuple store API slightly -- it assumes that it
won't be used to hold data across transaction boundaries, so the temp
files that it uses for on-disk storage are automatically reclaimed at
end-of-transaction. I added a flag to tuplestore_begin_heap() to control
this behavior (see the sketch after this list). Is changing the tuple
store API in this fashion OK?
- in order to store executor results in a tuple store, I added a new
CommandDest. This works well for the most part, with one exception: the
current DestFunction API doesn't provide enough information to allow the
Executor to store results into an arbitrary tuple store (where the
particular tuple store to use is chosen by the call site of
ExecutorRun). To work around this, I've temporarily hacked up a solution
that works but is not ideal: since the receiveTuple DestFunction is
passed the portal name, we can use that to look up the Portal data
structure for the cursor and then use that to get at the tuple store the
Portal is using. This unnecessarily ties the Portal code to the
tupleReceiver code, but it works...
The proper fix for this is probably to change the DestFunction API --
Tom suggested passing the full QueryDesc to the receiveTuple function.
In that case, callers of ExecutorRun could "subclass" QueryDesc to add
any additional fields that their particular CommandDest needed to get
access to. This approach would work, but I'd like to think about it for
a little bit longer before deciding which route to go. In the meantime,
the code works fine, so I don't think a fix is urgent.
- (semi-related) I added a NO SCROLL keyword to DECLARE CURSOR, and
adjusted the behavior of SCROLL in accordance with the discussion on
-hackers.
- (unrelated) Cleaned up some SGML markup in sql.sgml, copy.sgml
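For context, a sketch of the user-visible behavior that the
cross-transaction tuplestore appears to enable (the WITH HOLD syntax and
the table name are my assumptions, not taken from the patch text):

    BEGIN;
    DECLARE c CURSOR WITH HOLD FOR SELECT * FROM foo;
    COMMIT;            -- results are materialized into the tuplestore
    FETCH 10 FROM c;   -- the cursor remains usable after commit
    CLOSE c;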
Neil Conway
some of the algorithms for higher functions. I see about a factor of ten
speedup on the 'numeric' regression test, but it's unlikely that that test
is representative of real-world applications.
initdb forced due to change of on-disk representation for NUMERIC.
utility statement (DeclareCursorStmt) with a SELECT query dangling from
it, rather than a SELECT query with a few unusual fields in it. Add
code to determine whether a planned query can safely be run backwards.
If DECLARE CURSOR specifies SCROLL, ensure that the plan can be run
backwards by adding a Materialize plan node if it can't. Without SCROLL,
you get an error if you try to fetch backwards from a cursor that can't
handle it. (There is still some discussion about what the exact
behavior should be, but this is necessary infrastructure in any case.)
Along the way, make EXPLAIN DECLARE CURSOR work.
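A sketch of the resulting behavior (table name hypothetical):

    BEGIN;
    EXPLAIN DECLARE c SCROLL CURSOR FOR SELECT * FROM foo;  -- now works
    DECLARE c SCROLL CURSOR FOR SELECT * FROM foo;
    FETCH FORWARD 5 FROM c;
    FETCH BACKWARD 2 FROM c;  -- safe: SCROLL guarantees backward fetch,
                              -- adding a Materialize node if necessary
    COMMIT;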
entire contents of the subplan into the tuplestore before we can return
any tuples. Instead, the tuplestore holds what we've already read, and
we fetch additional rows from the subplan as needed. Random access to
the previously-read rows works with the tuplestore, and doesn't affect
the state of the partially-read subplan. This is a step towards fixing
the problems with cursors over complex queries --- we don't want to
stick in Materialize nodes if they'll prevent quick startup for a cursor.
answer when SET TIMEZONE has been done since the start of the current
transaction. Per bug report from Robert Haas.
I plan some further cleanup in HEAD, but this is a low-risk patch for
the immediate issue in 7.3.
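If I understand the report correctly, the scenario being fixed runs along
these lines (illustrative only; the affected function is named in the
truncated context above):

    BEGIN;
    SET TIMEZONE TO 'UTC';
    SELECT current_timestamp;  -- must honor the zone set mid-transaction
    COMMIT;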
functions which limited the maximum date for a timestamp to AD 1465001.
The new limit is AD 5874897.
The files affected are:
doc/src/sgml/datatype.sgml:
Documentation change due to patch. Included is a notice about
the reduced range when using an eight-byte integer for timestamps.
src/backend/utils/adt/datetime.c:
Replacement functions for j2date() and date2j() functions.
src/include/utils/datetime.h:
Corrected a bug with the limit on the earliest possible date:
Nov 23, -4713 has a Julian day count of -1, so the earliest possible
date should be Nov 24, -4713, with a day count of 0.
src/test/regress/expected/horology-no-DST-before-1970.out:
src/test/regress/expected/horology-solaris-1947.out:
src/test/regress/expected/horology.out:
Copies of expected output for regression testing.
Note: Only horology.out has been physically tested. I do not have access
to a Solaris box and I don't know how to provoke the "pre-1970" test.
src/test/regress/sql/horology.sql:
Added some test cases to check extended range.
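A quick check of the new bounds (note that SQL BC notation is offset by
one from the astronomical years used above, so year -4713 reads as 4714 BC):

    SELECT date '4714-11-24 BC';   -- Julian day 0; earliest accepted date
    SELECT date '5874897-12-31';   -- at the new AD 5874897 upper limit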
John Cochran
takes two parameters, an OID x and an integer y, and returns "true" with
probability 1/y (the OID argument is ignored). This can be useful -- for
example, it can be used to select a random sampling of the rows in a
table (which is what the "random" regression test uses it for).
This patch removes that function, because it was old and messy. The old
function had the following problems:
- it was undocumented
- it was poorly named
- it was designed to work around an optimizer bug that no longer exists
(the OID argument is to ensure that the optimizer won't optimize away
calls to the function; AFAIK marking the function as 'volatile' suffices
nowadays)
- it used a different random-number generation technique than the other
PRNG-related functions in the backend do (it called random() like they
do, but it had its own logic for setting the seed and deciding when to
reseed the RNG).
Ok, this patch removes oidrand(), oidsrand(), and userfntest(), and
improves the SGML docs a little bit (un-commenting the setseed()
documentation).
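With oidrand() gone, the same sampling effect is available from the
documented functions (onek is the regression-test table; the 10% figure
is just an example):

    SELECT setseed(0.5);                       -- make the sequence reproducible
    SELECT * FROM onek WHERE random() < 0.1;   -- ~10% sample; replaces oidrand(oid, 10)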
Neil Conway
expression accepted by the regex operators, per discussion yesterday.
Along the way, reduce deadlock_timeout from PGC_POSTMASTER to PGC_SIGHUP
category. It is probably best to insist that all backends share the same
setting, but that doesn't mean it has to be frozen at startup.
startup, not in the parser; this allows ALTER DOMAIN to work correctly
with domain constraint operations stored in rules. Rod Taylor;
code review by Tom Lane.
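A small example of the kind of domain constraint operation this covers
(the names are mine):

    CREATE DOMAIN posint AS integer;
    ALTER DOMAIN posint ADD CONSTRAINT posint_pos CHECK (VALUE > 0);
    SELECT (-1)::posint;  -- fails the CHECK; the constraint list is now
                          -- fetched at executor startup, so rules created
                          -- before the ALTER DOMAIN still see it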
value of MAX_TIME_PRECISION in floating-point-timestamp-storage case
from 13 to 10, which is as much as time_out is actually willing to print.
(The alternative of increasing the number of digits we are willing to
print looks risky; we might find ourselves printing roundoff garbage.)
passed to join selectivity estimators. Make use of this in eqjoinsel
to derive non-bogus selectivity for IN clauses. Further tweaking of
cost estimation for IN.
initdb forced because of pg_proc.h changes.
There are two implementation techniques: the executor understands a new
JOIN_IN jointype, which emits at most one matching row per left-hand row,
or the result of the IN's sub-select can be fed through a DISTINCT filter
and then joined as an ordinary relation.
Along the way, some minor code cleanup in the optimizer; notably, break
out most of the jointree-rearrangement preprocessing in planner.c and
put it in a new file prep/prepjointree.c.
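Roughly, the two strategies amount to the following (tables and columns
hypothetical):

    -- Original form; the executor can run this with the new JOIN_IN
    -- jointype, emitting at most one match per foo row:
    SELECT * FROM foo WHERE id IN (SELECT fid FROM bar);
    -- Or the sub-select is fed through DISTINCT and joined normally:
    SELECT foo.* FROM foo JOIN (SELECT DISTINCT fid FROM bar) AS b
        ON foo.id = b.fid;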
datetime token tables. Even more embarrassing, the regression tests
revealed some of the problems --- but evidently the bogus output wasn't
questioned. Add code to postmaster startup to directly check the tables
for correct ordering, in hopes of not being embarrassed like this again.
containing a volatile function), rather than only on 'Var = Var' clauses
as before. This makes it practical to do flatten_join_alias_vars at the
start of planning, which in turn eliminates a bunch of klugery inside the
planner to deal with alias vars. As a free side effect, we now detect
implied equality of non-Var expressions; for example in
SELECT ... WHERE a.x = b.y and b.y = 42
we will deduce a.x = 42 and use that as a restriction qual on a. Also,
we can remove the restriction introduced 12/5/02 to prevent pullup of
subqueries whose targetlists contain sublinks.
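For instance (the tables and the lower() call are my own illustration):

    SELECT * FROM a, b WHERE lower(a.x) = b.y AND b.y = 'foo';
    -- the planner can now deduce lower(a.x) = 'foo' and apply it as a
    -- restriction qual on a, since implied equality is no longer limited
    -- to Var = Var clauses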
Still TODO: make statistical estimation routines in selfuncs.c and costsize.c
smarter about expressions that are more complex than plain Vars. The need
for this is considerably greater now that we have to be able to estimate
the suitability of merge and hash join techniques on such expressions.
practice of evaluating MemSet's arguments multiple times, except for
the special case of newNode(), where we can assume the argument is
a constant sizeof() operator.
Also, add GetMemoryChunkContext() to mcxt.c's API, in preparation for
fixing recent GEQO breakage.
given any malloc block until something is first allocated in it; but
thereafter, MemoryContextReset won't release that first malloc block.
This preserves the quick-reset property of the original policy, without
forcing 8K to be allocated to every context whether any of it is ever
used or not. Also, remove some more no-longer-needed explicit freeing
during ExecEndPlan.
execution state trees, and ExecEvalExpr takes an expression state tree
not an expression plan tree. The plan tree is now read-only as far as
the executor is concerned. Next step is to begin actually exploiting
this property.
documentation and regression test mods. It seemed small and unobtrusive enough
to not require a specific proposal on the hackers list -- but if not, let me
know and I'll make a pitch. Otherwise, if there are no objections please apply.
Joe Conway
to plan nodes, not vice-versa. All executor state nodes now inherit from
struct PlanState. Copying of plan trees has been simplified by not
storing a list of SubPlans in Plan nodes (eliminating duplicate links).
The executor still needs such a list, but it can build it during
ExecutorStart since it has to scan the plan tree anyway.
No initdb forced since no stored-on-disk structures changed, but you
will need a full recompile because of node-numbering changes.
('SELECT expression') inline, like macros, during the constant-folding
phase of planning. The actual expansion is not difficult, but checking
that we're not changing the semantics of the call turns out to be more
subtle than one might think; in particular, we must pay attention to
permissions issues, strictness, and volatility.
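A sketch of the kind of function that now qualifies (add_one is
hypothetical):

    CREATE FUNCTION add_one(integer) RETURNS integer
        AS 'SELECT $1 + 1' LANGUAGE sql IMMUTABLE;
    SELECT add_one(41);  -- the call can be expanded to 41 + 1 and then
                         -- constant-folded to 42 at plan time, once the
                         -- permissions/strictness/volatility checks pass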
-hackers a couple days ago.
Notes/caveats:
- added regression tests for the new functionality, all
regression tests pass on my machine
- added pg_dump support
- updated PL/PgSQL to support per-statement triggers (see the
example after this list); didn't look at the other procedural
languages.
- there's (even) more code duplication in trigger.c than there
was previously. Any suggestions on how to refactor the
ExecXXXTriggers() functions to reuse more code would be
welcome -- I took a brief look at it, but couldn't see an
easy way to do it (there are several subtly-different
versions of the code in question)
- updated the documentation. I also took the liberty of
removing a big chunk of duplicated syntax documentation in
the Programmer's Guide on triggers, and moving that
information to the CREATE TRIGGER reference page.
- I also included some spelling fixes and similar small
cleanups I noticed while making the changes. If you'd like
me to split those into a separate patch, let me know.
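For reference, the per-statement syntax exercised above (function and
table names are mine):

    CREATE FUNCTION log_stmt() RETURNS trigger AS '
        BEGIN
            RAISE NOTICE ''accounts updated'';
            RETURN NULL;  -- return value is ignored for statement triggers
        END;' LANGUAGE plpgsql;

    CREATE TRIGGER accounts_log AFTER UPDATE ON accounts
        FOR EACH STATEMENT EXECUTE PROCEDURE log_stmt();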
Neil Conway
of groups produced by GROUP BY. This improves the accuracy of planning
estimates for grouped subselects, and is needed to check whether a
hashed aggregation plan risks memory overflow.
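For example (schema hypothetical):

    EXPLAIN SELECT dept, count(*) FROM emp GROUP BY dept;
    -- the estimated number of dept groups now drives both the size
    -- estimate for the grouped subselect and the check that a hashed
    -- aggregate will fit in memory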
precision for float4, float8, and geometric types. Set it in pg_dump
so that float data can be dumped/reloaded exactly (at least on platforms
where the float I/O support is properly implemented). Initial patch by
Pedro Ferreira, some additional work by Tom Lane.
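Assuming this is the extra_float_digits knob (the parameter name does not
appear in the text above), usage looks like:

    SET extra_float_digits = 2;  -- roughly what pg_dump now does
    SELECT 0.1::float8;          -- printed with enough digits to reload exactly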
now)" item on the open items, and subsequent plpgsql function I sent in,
made me realize it was too hard to get the upper and lower bound of an
array. The attached creates two functions that I think will be very
useful when combined with the ability of plpgsql to return sets.
array_lower(array, dim_num)
- and -
array_upper(array, dim_num)
They return the value (as an int) of the lower or upper bound,
respectively, of the requested dimension of the provided array.
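Example usage (the last literal shows a non-default lower bound):

    SELECT array_lower(ARRAY[1, 2, 3], 1);          -- 1
    SELECT array_upper(ARRAY[1, 2, 3], 1);          -- 3
    SELECT array_lower('[0:2]={7,8,9}'::int[], 1);  -- 0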
Joe Conway
where it's safe to do database access. Along the way, fix core dump
for 'DEFAULT' parameters to CREATE DATABASE. initdb forced due to
change in pg_proc entry.
Ray Ontko 28-June-02. Also, fix prefix_selectivity for NAME lefthand
variables (it was bogusly assuming binary compatibility), and adjust
make_greater_string() to not call pg_mbcliplen() with invalid multibyte
data (this last per bug report that I can't find at the moment, but it
was in July '02).