conditions. There are some pretty bogus heuristics in prepqual.c that
try to decide whether to output CNF or DNF format; they need to be replaced,
likely. Right now the code is probably too willing to choose DNF form,
which might hurt performance in some cases that used to work OK.
But at least we have a foundation to build on.
in or_normalize, remove detection of duplicate subexpressions (since it's
highly unlikely to be worth the amount of time it takes), and introduce
a dnfify() entry point so that unintelligible backwards logic in UNION
processing can be eliminated. This is just an intermediate step ---
next thing is to look at not forcing the qual into CNF form when it would
be better off in DNF form.
This change seems necessary in conjunction with long queries, and it
cleans up some bogosity in connection with long EXPLAIN texts anyway.
Note that current libpq will accept any length error message (at least
until it runs out of memory); prior versions have a limit of 8K, but
will cleanly discard excess error text, so there shouldn't be any
big compatibility problems with old clients.
transaction abort --- before it only worked if there was exactly one level
of allocation context stacked in the blank portal. Now it does the right
thing for any depth, including zero...
before comparison; if fields being joined are different widths then hashing
will yield wrong answer. Also, remove hashjoinable mark from all uses of
array_eq, because array structures may have padding bytes between elements
and the pad bytes are of uncertain content. This could be revisited if
array code is cleaned up.
Modify opr_sanity regress test to complain if array_eq operator is marked
hashjoinable.
offended my aesthestic sensibility that there was so much unreadable code
doing so little. Rewritten code is about half the size, faster, and
(I hope) much more intelligible.
has positive refcount, it is rebuilt from pg_class data. This ensures
that relcache entries will track changes made by other backends. Formerly,
a shared inval report would just be ignored if it happened to arrive while
the relcache entry was in use. Also, fix relcache to reset ref counts
to zero during transaction abort. Finally, change LockRelation() so that
it checks for shared inval reports after obtaining the lock. In this way,
once any kind of lock has been obtained on a rel, we can trust the relcache
entry to be up-to-date.
the SInval spinlock while it is calling the passed invalFunction or
resetFunction. This is necessary to avoid deadlock with lmgr change;
InvalidateSharedInvalid can be called recursively now. It should be
a good performance improvement anyway --- holding a spinlock for more
than a very short interval is a no-no.
and 1370 (timestamp(datetime)). This does not force an initdb, exactly,
but you won't see the effects of the bug fix until you do one.
BTW, OID 1358 for timespan(time) is still broken:
select timespan('21:11:26'::time);
ERROR: No such function 'time_timespan' with the specified attributes
But I couldn't figure out what it ought to be defined as, so I left it be.
Most parts of the planner should ignore, or indeed never even see, uplevel
Vars because they will be or have been replaced by Params. There were a
couple of places that got it wrong though, probably my fault from recent
changes...
documented intepretation of the lefthand and oper fields. Fix a number of
obscure problems while at it --- for example, the old code failed if the parser
decided to insert a type-coercion function just below the operator of a
SubLink.
CAUTION: this will break stored rules that contain subplans. You may
need to initdb.
It will keep track the number of pages allocated so that
vacuum could allocate twice of the previous allocation.
This will greatly reduce the total memory consumption of
vacuum.
ALLOC_BIGCHUNK_LIMIT are always allocated as separate malloc() blocks,
and are free()d immediately upon pfree(). Also, if such a chunk is enlarged
with repalloc(), translate the operation into a realloc() so as to
minimize memory usage. Of course, these large chunks still get freed
automatically if the alloc set is reset.
I have set ALLOC_BIGCHUNK_LIMIT at 64K for now, but perhaps another
size would be better?
match then it tried for a self-commutative operator with the reversed input
data types. This is pretty silly; there could never be such an operator,
except maybe in binary-compatible-type scenarios, and we have oper_inexact
for that. Besides which, the oprsanity regress test would complain about
such an operator. Remove nonfunctional code and simplify routine calling
convention accordingly.
and fix_opids processing to a single recursive pass over the plan tree
executed at the very tail end of planning, rather than haphazardly here
and there at different places. Now that tlist Vars do not get modified
until the very end, it's possible to get rid of the klugy var_equal and
match_varid partial-matching routines, and just use plain equal()
throughout the optimizer. This is a step towards allowing merge and
hash joins to be done on expressions instead of only Vars ...
sort order down into planner, instead of handling it only at the very top
level of the planner. This fixes many things. An explicit sort is now
avoided if there is a cheaper alternative (typically an indexscan) not
only for ORDER BY, but also for the internal sort of GROUP BY. It works
even when there is no other reason (such as a WHERE condition) to consider
the indexscan. It works for indexes on functions. It works for indexes
on functions, backwards. It's just so cool...
CAUTION: I have changed the representation of SortClause nodes, therefore
THIS UPDATE BREAKS STORED RULES. You will need to initdb.
store all ordering information in pathkeys lists (which are now lists of
lists of PathKeyItem nodes, not just lists of lists of vars). This was
a big win --- the code is smaller and IMHO more understandable than it
was, even though it handles more cases. I believe the node changes will
not force an initdb for anyone; planner nodes don't show up in stored
rules.
commuted (ie, the index var appears on the right). These are now handled
the same way as merge and hash join quals that need to be commuted: the
actual reversing of the clause only happens if we actually choose the path
and generate a plan from it. Furthermore, the clause is only reversed in
the 'indexqual' field of the plan, not in the 'indxqualorig' field. This
allows the clause to still be recognized and removed from qpquals of upper
level join plans. Also, simplify and generalize match_clause_to_indexkey;
now it recognizes binary-compatible indexes for join as well as restriction
clauses.
> >
> > was implemented by Jan Wieck.
> > His work is for ascending order cases.
> >
> > Here is a patch to prevent sorting also in descending
> > order cases.
> > Because I had already changed _bt_first() to position
> > backward correctly before v6.5,this patch would work.
> >
Hiroshi Inoue
Inoue@tpf.co.jp
to go along with expression_tree_walker. (_walker is not suitable for
routines that need to alter the tree structure significantly.) Other minor
cleanups in clauses.c.
Also, move responsibility for calling vc_abort into main xact.c list of
things-to-call-at-abort. What in the world was it doing down inside of
TransactionIdAbort()?
hashjoinable clause, not one path for a randomly-chosen element of each
set of clauses with the same join operator. That is, if you wrote
SELECT ... WHERE t1.f1 = t2.f2 and t1.f3 = t2.f4,
and both '=' ops were the same opcode (say, all four fields are int4),
then the system would either consider hashing on f1=f2 or on f3=f4,
but it would *not* consider both possibilities. Boo hiss.
Also, revise estimation of hashjoin costs to include a penalty when the
inner join var has a high disbursion --- ie, the most common value is
pretty common. This tends to lead to badly skewed hash bucket occupancy
and way more comparisons than you'd expect on average.
I imagine that the cost calculation still needs tweaking, but at least
it generates a more reasonable plan than before on George Young's example.
neqsel now behave as per my suggestions in pghackers a few days ago.
selectivity for < > <= >= should work OK for integral types as well, but
still need work for nonintegral types. Since these routines have never
actually executed before :-(, this may result in some significant changes
in the optimizer's choices of execution plans. Let me know if you see
any serious misbehavior.
CAUTION: THESE CHANGES REQUIRE INITDB. pg_statistic table has changed.
rels that the inner path needs to join to, but it was only checking for
the first one. Failure could only have been observed with an OR-clause
that mentions 3 or more tables, and then only if the bogus path was
actually selected as cheapest ...
optimizer rather than parser. This has many advantages, such as not
getting fooled by chance uses of operator names ~ and ~~ (the operators
are identified by OID now), and not creating useless comparison operations
in contexts where the comparisons will not actually be used as indexquals.
The new code also recognizes exact-match LIKE and regex patterns, and
produces an = indexqual instead of >= and <=.
This change does NOT fix the problem with non-ASCII locales: the code
still doesn't know how to generate an upper bound indexqual for non-ASCII
collation order. But it's no worse than before, just the same deficiency
in a different place...
Also, dike out loc_restrictinfo fields in Plan nodes. These were doing
nothing useful in the absence of 'expensive functions' optimization,
and they took a considerable amount of processing to fill in.
The only place it was being used was as temporary storage in indxpath.c,
and the logic was wrong: the same restrictinfo node could get chosen to
carry the info for two different joins. Right fix is to return a second
list of unjoined-relids parallel to the list of clause groups.
identified by Hiroshi (incorrect cost attributed to OR clauses
after multiple passes through set_rest_selec()). I think the code
was trying to allow selectivities of OR subclauses to be passed in
from outside, but noplace was actually passing any useful data, and
set_rest_selec() was passing wrong data.
Restructure representation of "indexqual" in IndexPath nodes so that
it is the same as for indxqual in completed IndexScan nodes: namely,
a toplevel list with an entry for each pass of the index scan, having
sublists that are implicitly-ANDed index qual conditions for that pass.
You don't want to know what the old representation was :-(
Improve documentation of OR-clause indexscan functions.
Remove useless 'notclause' field from RestrictInfo nodes. (This might
force an initdb for anyone who has stored rules containing RestrictInfos,
but I do not think that RestrictInfo ever appears in completed plans.)