postgresql

mirror of https://git.postgresql.org/git/postgresql.git synced 2026-02-23 06:47:14 +08:00

Author	SHA1	Message	Date
Richard Guo	34740b90bc	Fix unsafe pushdown of quals referencing grouping Vars When checking a subquery's output expressions to see if it's safe to push down an upper-level qual, check_output_expressions() previously treated grouping Vars as opaque Vars. This implicitly assumed they were stable and scalar. However, a grouping Var's underlying expression corresponds to the grouping clause, which may be volatile or set-returning. If an upper-level qual references such an output column, pushing it down into the subquery is unsafe. This can cause strange results due to multiple evaluation of a volatile function, or introduce SRFs into the subquery's WHERE/HAVING quals. This patch teaches check_output_expressions() to look through grouping Vars to their underlying expressions. This ensures that any volatility or set-returning properties in the grouping clause are detected, preventing the unsafe pushdown. We do not need to recursively examine the Vars contained in these underlying expressions. Even if they reference outputs from lower-level subqueries (at any depth), those references are guaranteed not to expand to volatile or set-returning functions, because subqueries containing such functions in their targetlists are never pulled up. Backpatch to v18, where this issue was introduced. Reported-by: Eric Ridge <eebbrr@gmail.com> Diagnosed-by: Tom Lane <tgl@sss.pgh.pa.us> Author: Richard Guo <guofenglinux@gmail.com> Discussion: https://postgr.es/m/7900964C-F99E-481E-BEE5-4338774CEB9F@gmail.com Backpatch-through: 18	2026-01-19 11:13:23 +09:00
Andres Freund	75609fded3	Fix buggy interaction between array subscripts and subplan params In a7f107df2 I changed subplan param evaluation to happen within the containing expression. As part of that, ExecInitSubPlanExpr() was changed to evaluate parameters via a new EEOP_PARAM_SET expression step. These parameters were temporarily stored into ExprState->resvalue/resnull, with some reasoning why that would be fine. Unfortunately, that analysis was wrong - ExecInitSubscriptionRef() evaluates the input array into "resv"/"resnull", which will often point to ExprState->resvalue/resnull. This means that the EEOP_PARAM_SET, if inside an array subscript, would overwrite the input array to array subscript. The fix is fairly simple - instead of evaluating into ExprState->resvalue/resnull, store the temporary result of the subplan in the subplan's return value. Bug: #19370 Reported-by: Zepeng Zhang <redraiment@gmail.com> Diagnosed-by: Tom Lane <tgl@sss.pgh.pa.us> Diagnosed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/19370-7fb7a5854b7618f1@postgresql.org Backpatch-through: 18	2026-01-06 19:51:10 -05:00
Robert Haas	014f9a831a	Don't reset the pathlist of partitioned joinrels. apply_scanjoin_target_to_paths wants to avoid useless work and platform-specific dependencies by throwing away the path list created prior to applying the final scan/join target and constructing a whole new one using the final scan/join target. However, this is only valid when we'll consider all the same strategies after the pathlist reset as before. After resetting the path list, we reconsider Append and MergeAppend paths with the modified target list; therefore, it's only valid for a partitioned relation. However, what the previous coding missed is that it cannot be a partitioned join relation, because that also has paths that are not Append or MergeAppend paths and will not be reconsidered. Thus, before this patch, we'd sometimes choose a partitionwise strategy with a higher total cost than cheapest non-partitionwise strategy, which is not good. We had a surprising number of tests cases that were relying on this bug to work as they did. A big part of the reason for this is that row counts in regression test cases tend to be low, which brings the cost of partitionwise and non-partitionwise strategies very close together, especially for merge joins, where the real and perceived advantages of a partitionwise approach are minimal. In addition, one test case included a row-count-inflating join. In such cases, a partitionwise join can easily be a loser on cost, because the total number of tuples passing through an Append node is much higher than it is with a non-partitionwise strategy. That test case is adjusted by adding additional join clauses to avoid the row count inflation. Although the failure of the planner to choose the lowest-cost path is a bug, we generally do not back-patch fixes of this type, because planning is not an exact science and there is always a possibility that some user will end up with a plan that has a lower estimated cost but actually runs more slowly. Hence, no backpatch here, either. The code change here is exactly what was originally proposed by Ashutosh, but the changes to the comments and test cases have been very heavily rewritten by me, helped along by some very useful advice from Richard Guo. Reported-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Author: Robert Haas <rhaas@postgresql.org> Reviewed-by: Jakub Wartak <jakub.wartak@enterprisedb.com> Reviewed-by: Arne Roland <arne.roland@malkut.net> Reviewed-by: Richard Guo <guofenglinux@gmail.com> Discussion: http://postgr.es/m/CAExHW5toze58+jL-454J3ty11sqJyU13Sz5rJPQZDmASwZgWiA@mail.gmail.com	2025-12-05 12:00:18 -05:00
Richard Guo	b63a822452	Treat JsonConstructorExpr as non-strict JsonConstructorExpr can produce non-NULL output with a NULL input, so it should be treated as a non-strict construct. Failing to do so can lead to incorrect query behavior. For example, in the reported case, when pulling up a subquery that is under an outer join, if the subquery's target list contains a JsonConstructorExpr that uses subquery variables and it is mistakenly treated as strict, it will be pulled up without being wrapped in a PlaceHolderVar. As a result, the expression will be evaluated at the wrong place and will not be forced to null when the outer join should do so. Back-patch to v16 where JsonConstructorExpr was introduced. Bug: #19046 Reported-by: Runyuan He <runyuan@berkeley.edu> Author: Tender Wang <tndrwang@gmail.com> Co-authored-by: Richard Guo <guofenglinux@gmail.com> Discussion: https://postgr.es/m/19046-765b6602b0a8cfdf@postgresql.org Backpatch-through: 16	2025-09-16 18:42:20 +09:00
Richard Guo	aba8f61c30	Fix planner error when estimating SubPlan cost SubPlan nodes are typically built very early, before any RelOptInfos have been constructed for the parent query level. As a result, the simple_rel_array in the parent root has not yet been initialized. Currently, during cost estimation of a SubPlan's testexpr, we may call examine_variable() to look up statistical data about the expressions. This can lead to "no relation entry for relid" errors. To fix, pass root as NULL to cost_qual_eval() in cost_subplan(), since the root does not yet contain enough information to safely consult statistics. One exception is SubPlan nodes built for the initplans of MIN/MAX aggregates from indexes. In this case, having a NULL root is safe because testexpr will be NULL. Additionally, an initplan will by definition not consult anything from the parent plan. Backpatch to all supported branches. Although the reported call path that triggers this error is not reachable prior to v17, there's no guarantee that other code paths -- especially in extensions -- could not encounter the same issue when cost_qual_eval() is called with a root that lacks a valid simple_rel_array. The test case is not included in pre-v17 branches though. Bug: #19037 Reported-by: Alexander Lakhin <exclusion@gmail.com> Diagnosed-by: Tom Lane <tgl@sss.pgh.pa.us> Author: Richard Guo <guofenglinux@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/19037-3d1c7bb553c7ce84@postgresql.org Backpatch-through: 13	2025-09-03 16:00:38 +09:00
Richard Guo	24225ad9aa	Pathify RHS unique-ification for semijoin planning There are two implementation techniques for semijoins: one uses the JOIN_SEMI jointype, where the executor emits at most one matching row per left-hand side (LHS) row; the other unique-ifies the right-hand side (RHS) and then performs a plain inner join. The latter technique currently has some drawbacks related to the unique-ification step. * Only the cheapest-total path of the RHS is considered during unique-ification. This may cause us to miss some optimization opportunities; for example, a path with a better sort order might be overlooked simply because it is not the cheapest in total cost. Such a path could help avoid a sort at a higher level, potentially resulting in a cheaper overall plan. * We currently rely on heuristics to choose between hash-based and sort-based unique-ification. A better approach would be to generate paths for both methods and allow add_path() to decide which one is preferable, consistent with how path selection is handled elsewhere in the planner. * In the sort-based implementation, we currently pay no attention to the pathkeys of the input subpath or the resulting output. This can result in redundant sort nodes being added to the final plan. This patch improves semijoin planning by creating a new RelOptInfo for the RHS rel to represent its unique-ified version. It then generates multiple paths that represent elimination of distinct rows from the RHS, considering both a hash-based implementation using the cheapest total path of the original RHS rel, and sort-based implementations that either exploit presorted input paths or explicitly sort the cheapest total path. All resulting paths compete in add_path(), and those deemed worthy of consideration are added to the new RelOptInfo. Finally, the unique-ified rel is joined with the other side of the semijoin using a plain inner join. As a side effect, most of the code related to the JOIN_UNIQUE_OUTER and JOIN_UNIQUE_INNER jointypes -- used to indicate that the LHS or RHS path should be made unique -- has been removed. Besides, the T_Unique path now has the same meaning for both semijoins and upper DISTINCT clauses: it represents adjacent-duplicate removal on presorted input. This patch unifies their handling by sharing the same data structures and functions. This patch also removes the UNIQUE_PATH_NOOP related code along the way, as it is dead code -- if the RHS rel is provably unique, the semijoin should have already been simplified to a plain inner join by analyzejoins.c. Author: Richard Guo <guofenglinux@gmail.com> Reviewed-by: Alexandra Wang <alexandra.wang.oss@gmail.com> Reviewed-by: wenhui qiu <qiuwenhuifx@gmail.com> Discussion: https://postgr.es/m/CAMbWs4-EBnaRvEs7frTLbsXiweSTUXifsteF-d3rvv01FKO86w@mail.gmail.com	2025-08-19 09:35:40 +09:00
Tom Lane	931766aaec	Simplify COALESCE() with one surviving argument. If, after removal of useless null-constant arguments, a CoalesceExpr has exactly one remaining argument, we can just take that argument as the result, without bothering to wrap a new CoalesceExpr around it. This isn't likely to produce any great improvement in runtime per se, but it can lead to better plans since the planner no longer has to treat the expression as non-strict. However, there were a few regression test cases that intentionally wrote COALESCE(x) as a shorthand way of creating a non-strict subexpression. To avoid ruining the intent of those tests, write COALESCE(x,x) instead. (If anyone ever proposes de-duplicating COALESCE arguments, we'll need another iteration of this arms race. But it seems pretty unlikely that such an optimization would be worthwhile.) Author: Maksim Milyutin <maksim.milyutin@tantorlabs.ru> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/8e8573c3-1411-448d-877e-53258b7b2be0@tantorlabs.ru	2025-07-03 17:39:53 -04:00
Tom Lane	3ba2cdaa45	Stabilize regression test from c0962a113. Per buildfarm. Co-authored-by: Alena Rybakina <a.rybakina@postgrespro.ru> Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/srnuqlttuimzmvoulhsrbgvj4vnul6b65osswvua7sfkqsvmuy@yg7apybpxp34	2025-04-04 11:57:26 -04:00
Alexander Korotkov	c0962a113d	Convert 'x IN (VALUES ...)' to 'x = ANY ...' then appropriate This commit implements the automatic conversion of 'x IN (VALUES ...)' into ScalarArrayOpExpr. That simplifies the query tree, eliminating the appearance of an unnecessary join. Since VALUES describes a relational table, and the value of such a list is a table row, the optimizer will likely face an underestimation problem due to the inability to estimate cardinality through MCV statistics. The cardinality evaluation mechanism can work with the array inclusion check operation. If the array is small enough (< 100 elements), it will perform a statistical evaluation element by element. We perform the transformation in the convert_ANY_sublink_to_join() if VALUES RTE is proper and the transformation is convertible. The conversion is only possible for operations on scalar values, not rows. Also, we currently support the transformation only when it ends up with a constant array. Otherwise, the evaluation of non-hashed SAOP might be slower than the corresponding Hash Join with VALUES. Discussion: https://postgr.es/m/0184212d-1248-4f1f-a42d-f5cb1c1976d2%40tantorlabs.com Author: Alena Rybakina <a.rybakina@postgrespro.ru> Author: Andrei Lepikhov <lepihov@gmail.com> Reviewed-by: Ivan Kush <ivan.kush@tantorlabs.com> Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>	2025-04-04 16:01:50 +03:00
Tom Lane	8d96f57d5c	Improve planner's handling of SetOp plans. Remove the code for inserting flag columns in the inputs of a SetOp. That was the only reason why there would be resjunk columns in a set-operations plan tree, so we can get rid of some code that supported that, too. Get rid of choose_hashed_setop() in favor of building Paths for the hashed and sorted alternatives, and letting them fight it out within add_path(). Remove set_operation_ordered_results_useful(), which was giving wrong answers due to examining the wrong ancestor node: we need to examine the immediate SetOperationStmt parent not the topmost node. Instead make each caller of recurse_set_operations() pass down the relevant parent node. (This thinko seems to have led only to wasted planning cycles and possibly-inferior plans, not wrong query answers. Perhaps we should back-patch it, but I'm not doing so right now.) Teach generate_nonunion_paths() to consider pre-sorted inputs for sorted SetOps, rather than always generating a Sort node. Patch by me; thanks to Richard Guo and David Rowley for review. Discussion: https://postgr.es/m/1850138.1731549611@sss.pgh.pa.us	2024-12-19 17:02:25 -05:00
Richard Guo	60be3f9f0a	Avoid unnecessary wrapping for more complex expressions When pulling up a subquery that is under an outer join, if the subquery's target list contains a strict expression that uses a subquery variable, it's okay to pull up the expression without wrapping it in a PlaceHolderVar: if the subquery variable is forced to NULL by the outer join, the expression result will come out as NULL too. If the strict expression does not contain any subquery variables, the current code always wraps it in a PlaceHolderVar. While this is not incorrect, the analysis could be tighter: if the strict expression contains any variables of rels that are under the same lowest nulling outer join as the subquery, we can also avoid wrapping it. This is safe because if the subquery variable is forced to NULL by the outer join, the variables of rels that are under the same lowest nulling outer join will also be forced to NULL, resulting in the expression evaluating to NULL as well. Therefore, it's not necessary to force the expression to be evaluated below the outer join. It could be beneficial to get rid of such PHVs because they could imply lateral dependencies, which force us to resort to nestloop joins. This patch checks if the lateral references in the strict expression contain any variables of rels under the same lowest nulling outer join as the subquery, and avoids wrapping the expression in that case. This is fundamentally a generalization of the optimizations for bare Vars and PHVs introduced in commit f64ec81a8. No backpatch as this could result in plan changes. Author: Richard Guo Discussion: https://postgr.es/m/CAMbWs4_ENtfRdLaM_bXAxiKRYO7DmwDBDG4_2=VTDi0mJP-jAw@mail.gmail.com	2024-12-17 19:53:01 +09:00
David Rowley	c2a4078eba	Enable BUFFERS with EXPLAIN ANALYZE by default The topic of turning EXPLAIN's BUFFERS option on with the ANALYZE option has come up a few times over the past few years. In many ways, doing this seems like a good idea as it may be more obvious to users why a given query is running more slowly than they might expect. Also, from my own (David's) personal experience, I've seen users posting to the mailing lists with two identical plans, one slow and one fast asking why their query is sometimes slow. In many cases, this is due to additional reads. Having BUFFERS on by default may help reduce some of these questions, and if not, make it more obvious to the user before they post, or save a round-trip to the mailing list when additional I/O effort is the cause of the slowness. The general consensus is that we want BUFFERS on by default with ANALYZE. However, there were more than zero concerns raised with doing so. The primary reason against is the additional verbosity, making it harder to read large plans. Another concern was that buffer information isn't always useful so may not make sense to have it on by default. It's currently December, so let's commit this to see if anyone comes forward with a strong objection against making this change. We have over half a year remaining in the v18 cycle where we could still easily consider reverting this if someone were to come forward with a convincing enough reason as to why doing this is a bad idea. There were two patches independently submitted to achieve this goal, one by me and the other by Guillaume. This commit is a mix of both of these patches with some additional work done by me to adjust various additional places in the documentation which include EXPLAIN ANALYZE output. Author: Guillaume Lelarge, David Rowley Reviewed-by: Robert Haas, Greg Sabino Mullane, Michael Christofides Discussion: https://postgr.es/m/CANNMO++W7MM8T0KyXN3ZheXXt-uLVM3aEtZd+WNfZ=obxffUiA@mail.gmail.com	2024-12-11 22:35:11 +13:00
Richard Guo	f64ec81a81	Avoid unnecessary wrapping for Vars and PHVs When pulling up a lateral subquery that is under an outer join, the current code always wraps a Var or PHV in the subquery's targetlist into a new PlaceHolderVar if it is a lateral reference to something outside the subquery. This is necessary when the Var/PHV references the non-nullable side of the outer join from the nullable side: we need to ensure that it is evaluated at the right place and hence is forced to null when the outer join should do so. However, if the referenced rel is under the same lowest nulling outer join, we can actually omit the wrapping. That's safe because if the subquery variable is forced to NULL by the outer join, the lateral reference variable will come out as NULL too. It could be beneficial to get rid of such PHVs because they imply lateral dependencies, which force us to resort to nestloop joins. This patch leverages the newly introduced nullingrel_info to check if the nullingrels of the subquery RTE are a subset of those of the laterally referenced rel, in order to determine if the referenced rel is under the same lowest nulling outer join. No backpatch as this could result in plan changes. Author: Richard Guo Reviewed-by: James Coleman, Dmitry Dolgov, Andrei Lepikhov Discussion: https://postgr.es/m/CAMbWs48uk6C7Z9m_FNT8_21CMCk68hrgAsz=z6zpP1PNZMkeoQ@mail.gmail.com	2024-12-09 20:38:22 +09:00
Tom Lane	e032e4c7dd	Avoid mislabeling of lateral references, redux. As I'd feared, commit 5c9d8636d was still a few bricks shy of a load. We can't just leave pulled-up lateral-reference Vars with no new nullingrels: we have to carefully compute what subset of the to-be-replaced Var's nullingrels apply to them, else we still get "wrong varnullingrels" errors. This is a bit tedious, but it looks like we can use the nullingrel data this patch computes for other purposes, enabling better optimization. We don't want to inject unnecessary plan changes into stable branches though, so leave that idea for a later HEAD-only patch. Patch by me, but thanks to Richard Guo for devising a test case that broke 5c9d8636d, and for preliminary investigation about how to fix it. As before, back-patch to v16. Discussion: https://postgr.es/m/E1tGn4j-0003zi-MP@gemulon.postgresql.org	2024-11-30 12:42:19 -05:00
Tom Lane	5c9d8636d3	Avoid mislabeling of lateral references when pulling up a subquery. If we are pulling up a subquery that's under an outer join, and the subquery's target list contains a strict expression that uses both a subquery variable and a lateral-reference variable, it's okay to pull up the expression without wrapping it in a PlaceHolderVar. That's safe because if the subquery variable is forced to NULL by the outer join, the expression result will come out as NULL too, so we don't have to force that outcome by evaluating the expression below the outer join. It'd be correct to wrap in a PHV, but that can lead to very significantly worse plans, since we'd then have to use a nestloop plan to pass down the lateral reference to where the expression will be evaluated. However, when we do that, we should not mark the lateral reference variable as being nulled by the outer join, because it isn't after we pull up the expression in this way. So the marking logic added by cb8e50a4a was incorrect in this detail, leading to "wrong varnullingrels" errors from the consistency-checking logic in setrefs.c. It seems to be sufficient to just not mark lateral references at all in this case. (I have a nagging feeling that more complexity may be needed in cases where there are several levels of outer join, but some attempts to break it with that didn't succeed.) Per report from Bertrand Mamasam. Back-patch to v16, as the previous patch was. Discussion: https://postgr.es/m/CACZ67_UA_EVrqiFXJu9XK50baEpH=ofEPJswa2kFxg6xuSw-ww@mail.gmail.com	2024-11-28 17:33:16 -05:00
Tom Lane	cb8e50a4a0	Avoid inserting PlaceHolderVars in cases where pre-v16 PG did not. Commit 2489d76c4 removed some logic from pullup_replace_vars() that avoided wrapping a PlaceHolderVar around a pulled-up subquery output expression if the expression could be proven to go to NULL anyway (because it contained Vars or PHVs of the pulled-up relation and did not contain non-strict constructs). But removing that logic turns out to cause performance regressions in some cases, because the extra PHV blocks subexpression folding, and will do so even if outer-join reduction later turns it into a no-op with no phnullingrels bits. This can for example prevent an expression from being matched to an index. The reason for always adding a PHV was to ensure we had someplace to put the varnullingrels marker bits of the Var being replaced. However, it turns out we can optimize in exactly the same cases that the previous code did, because we can instead attach the needed varnullingrels bits to the contained Var(s)/PHV(s). This is not a complete solution --- it would be even better if we could remove PHVs after reducing them to no-ops. It doesn't look practical to back-patch such an improvement, but this change seems safe and at least gets rid of the performance-regression cases. Per complaint from Nikhil Raj. Back-patch to v16 where the problem appeared. Discussion: https://postgr.es/m/CAG1ps1xvnTZceKK24OUfMKLPvDP2vjT-d+F2AOCWbw_v3KeEgg@mail.gmail.com	2024-08-30 12:42:12 -04:00
Tom Lane	c2df2ed90a	Try to stabilize flappy test result. This recently-added test case checks the plan of an inner join between two identical tables. It's just chance which join order the planner will pick, and in the presence of any variation in the underlying statistics, the displayed plan might change. Add a WHERE condition to break the cost symmetry and hopefully stabilize matters. (We're still trying to understand exactly why the underlying statistics aren't as stable as intended, but this seems like a good change anyway, since this test would surely bite us again in future.) While here, clean up assorted comment spelling, grammar, and whitespace problems. Discussion: https://postgr.es/m/4168116.1711720146@sss.pgh.pa.us	2024-03-29 10:40:31 -04:00
Tom Lane	fd0398fcb0	Improve EXPLAIN's display of SubPlan nodes and output parameters. Historically we've printed SubPlan expression nodes as "(SubPlan N)", which is pretty uninformative. Trying to reproduce the original SQL for the subquery is still as impractical as before, and would be mighty verbose as well. However, we can still do better than that. Displaying the "testexpr" when present, and adding a keyword to indicate the SubLinkType, goes a long way toward showing what's really going on. In addition, this patch gets rid of EXPLAIN's use of "$n" to represent subplan and initplan output Params. Instead we now print "(SubPlan N).colX" or "(InitPlan N).colX" to represent the X'th output column of that subplan. This eliminates confusion with the use of "$n" to represent PARAM_EXTERN Params, and it's useful for the first part of this change because it eliminates needing some other indication of which subplan is referenced by a SubPlan that has a testexpr. In passing, this adds simple regression test coverage of the ROWCOMPARE_SUBLINK code paths, which were entirely unburdened by testing before. Tom Lane and Dean Rasheed, reviewed by Aleksander Alekseev. Thanks to Chantal Keller for raising the question of whether this area couldn't be improved. Discussion: https://postgr.es/m/2838538.1705692747@sss.pgh.pa.us	2024-03-19 18:19:24 -04:00
Alexander Korotkov	9f13376396	Pull up ANY-SUBLINK with the necessary lateral support. For ANY-SUBLINK, we adopted a two-stage pull-up approach to handle different types of scenarios. In the first stage, the sublink is pulled up as a subquery. Because of this, when writing this code, we did not have the ability to perform lateral joins, and therefore, we were unable to pull up Var with varlevelsup=1. Now that we have the ability to use lateral joins, we can eliminate this limitation. Author: Andy Fan <zhihui.fan1213@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Richard Guo <guofenglinux@gmail.com> Reviewed-by: Alena Rybakina <lena.ribackina@yandex.ru> Reviewed-by: Andrey Lepikhov <a.lepikhov@postgrespro.ru>	2024-02-15 12:06:12 +02:00
Daniel Gustafsson	1f9e3a9be5	Fix typo in test comment s/currect/correct/, accidentally introduced in 608b167f9f9.	2023-09-23 09:56:38 +02:00
Dean Rasheed	bcedd8f5fc	Make subquery aliases optional in the FROM clause. This allows aliases for sub-SELECTs and VALUES clauses in the FROM clause to be omitted. This is an extension of the SQL standard, supported by some other database systems, and so eases the transition from such systems, as well as removing the minor inconvenience caused by requiring these aliases. Patch by me, reviewed by Tom Lane. Discussion: https://postgr.es/m/CAEZATCUCGCf82=hxd9N5n6xGHPyYpQnxW8HneeH+uP7yNALkWA@mail.gmail.com	2022-07-20 09:29:42 +01:00
Tom Lane	6478896675	Teach hash_ok_operator() that record_eq is only sometimes hashable. The need for this was foreseen long ago, but when record_eq actually became hashable (in commit 01e658fa7), we missed updating this spot. Per bug #17363 from Elvis Pranskevichus. Back-patch to v14 where the faulty commit came in. Discussion: https://postgr.es/m/17363-f6d42fd0d726be02@postgresql.org	2022-01-16 16:39:26 -05:00
Tom Lane	e8638d78a2	Fix planner error with multiple copies of an AlternativeSubPlan. It's possible for us to copy an AlternativeSubPlan expression node into multiple places, for example the scan quals of several partition children. Then it's possible that we choose a different one of the alternatives as optimal in each place. Commit 41efb8340 failed to consider this scenario, so its attempt to remove "unused" subplans could remove subplans that were still used elsewhere. Fix by delaying the removal logic until we've examined all the AlternativeSubPlans in a given query level. (This does assume that AlternativeSubPlans couldn't get copied to other query levels, but for the foreseeable future that's fine; cf qual_is_pushdown_safe.) Per report from Rajkumar Raghuwanshi. Back-patch to v14 where the faulty logic came in. Discussion: https://postgr.es/m/CAKcux6==O3NNZC3bZ2prRYv3cjm3_Zw1GfzmOjEVqYN4jub2+Q@mail.gmail.com	2021-09-14 15:11:21 -04:00
Peter Eisentraut	c06d6aa4c3	Clean up ancient test style Many older tests where written in a style like SELECT '' AS two, i.* FROM INT2_TBL where the first column indicated the number of expected result rows. This has gotten increasingly out of date, as the test data fixtures have expanded, so a lot of these were wrong and misleading. Moreover, this style isn't really necessary, since the psql output already shows the number of result rows. To clean this up, remove all those extra columns. Discussion: https://www.postgresql.org/message-id/flat/1a25312b-2686-380d-3c67-7a69094a999f%40enterprisedb.com	2020-12-15 22:03:39 +01:00
Tom Lane	41efb83408	Move resolution of AlternativeSubPlan choices to the planner. When commit bd3daddaf introduced AlternativeSubPlans, I had some ambitions towards allowing the choice of subplan to change during execution. That has not happened, or even been thought about, in the ensuing twelve years; so it seems like a failed experiment. So let's rip that out and resolve the choice of subplan at the end of planning (in setrefs.c) rather than during executor startup. This has a number of positive benefits: * Removal of a few hundred lines of executor code, since AlternativeSubPlans need no longer be supported there. * Removal of executor-startup overhead (particularly, initialization of subplans that won't be used). * Removal of incidental costs of having a larger plan tree, such as tree-scanning and copying costs in the plancache; not to mention setrefs.c's own costs of processing the discarded subplans. * EXPLAIN no longer has to print a weird (and undocumented) representation of an AlternativeSubPlan choice; it sees only the subplan actually used. This should mean less confusion for users. * Since setrefs.c knows which subexpression of a plan node it's working on at any instant, it's possible to adjust the estimated number of executions of the subplan based on that. For example, we should usually estimate more executions of a qual expression than a targetlist expression. The implementation used here is pretty simplistic, because we don't want to expend a lot of cycles on the issue; but it's better than ignoring the point entirely, as the executor had to. That last point might possibly result in shifting the choice between hashed and non-hashed EXISTS subplans in a few cases, but in general this patch isn't meant to change planner choices. Since we're doing the resolution so late, it's really impossible to change any plan choices outside the AlternativeSubPlan itself. Patch by me; thanks to David Rowley for review. Discussion: https://postgr.es/m/1992952.1592785225@sss.pgh.pa.us	2020-09-27 12:51:28 -04:00
Tom Lane	1e7629d2c9	Be more careful about the shape of hashable subplan clauses. nodeSubplan.c expects that the testexpr for a hashable ANY SubPlan has the form of one or more OpExprs whose LHS is an expression of the outer query's, while the RHS is an expression over Params representing output columns of the subquery. However, the planner only went as far as verifying that the clauses were all binary OpExprs. This works 99.99% of the time, because the clauses have the right shape when emitted by the parser --- but it's possible for function inlining to break that, as reported by PegoraroF10. To fix, teach the planner to check that the LHS and RHS contain the right things, or more accurately don't contain the wrong things. Given that this has been broken for years without anyone noticing, it seems sufficient to just give up hashing when it happens, rather than go to the trouble of commuting the clauses back again (which wouldn't necessarily work anyway). While poking at that, I also noticed that nodeSubplan.c had a baked-in assumption that the number of hash clauses is identical to the number of subquery output columns. Again, that's fine as far as parser output goes, but it's not hard to break it via function inlining. There seems little reason for that assumption though --- AFAICS, the only thing it's buying us is not having to store the number of hash clauses explicitly. Adding code to the planner to reject such cases would take more code than getting nodeSubplan.c to cope, so I fixed it that way. This has been broken for as long as we've had hashable SubPlans, so back-patch to all supported branches. Discussion: https://postgr.es/m/1549209182255-0.post@n3.nabble.com	2020-08-14 22:14:03 -04:00
Tom Lane	a742ecf9c6	Cope with lateral references in the quals of a subquery RTE. The qual pushdown logic assumed that all Vars in a restriction clause must be Vars referencing subquery outputs; but since we introduced LATERAL, it's possible for such a Var to be a lateral reference instead. This led to an assertion failure in debug builds. In a non-debug build, there might be no ill effects (if qual_is_pushdown_safe decided the qual was unsafe anyway), or we could get failures later due to construction of an invalid plan. I've not gone to much length to characterize the possible failures, but at least segfaults in the executor have been observed. Given that this has been busted since 9.3 and it took this long for anybody to notice, I judge that the case isn't worth going to great lengths to optimize. Hence, fix by just teaching qual_is_pushdown_safe that such quals are unsafe to push down, matching the previous behavior when it accidentally didn't fail. Per report from Tom Ellis. Back-patch to all supported branches. Discussion: https://postgr.es/m/20200713175124.GQ8220@cloudinit-builder	2020-07-13 20:38:20 -04:00
Tom Lane	58c47ccfff	Correctly re-use hash tables in buildSubPlanHash(). Commit 356687bd8 omitted to remove leftover code for destroying a hashed subplan's hash tables, with the result that the tables were always rebuilt not reused; this leads to severe memory leakage if a hashed subplan is re-executed enough times. Moreover, the code for reusing the hashnulls table had a typo that would have made it do the wrong thing if it were reached. Looking at the code coverage report shows severe under-coverage of the potential callers of ResetTupleHashTable, so add some test cases that exercise them. Andreas Karlsson and Tom Lane, per reports from Ranier Vilela and Justin Pryzby. Backpatch to v11, as the faulty commit was. Discussion: https://postgr.es/m/edb62547-c453-c35b-3ed6-a069e4d6b937@proxel.se Discussion: https://postgr.es/m/CAEudQAo=DCebm1RXtig9OH+QivpS97sMkikt0A9qHmMUs+g6ZA@mail.gmail.com Discussion: https://postgr.es/m/20200210032547.GA1412@telsasoft.com	2020-02-29 13:48:09 -05:00
Tom Lane	41c6f9db25	Repair more failures with SubPlans in multi-row VALUES lists. Commit 9b63c13f0 turns out to have been fundamentally misguided: the parent node's subPlan list is by no means the only way in which a child SubPlan node can be hooked into the outer execution state. As shown in bug #16213 from Matt Jibson, we can also get short-lived tuple table slots added to the outer es_tupleTable list. At this point I have little faith that there aren't other possible connections as well; the long time it took to notice this problem shows that this isn't a heavily-exercised situation. Therefore, revert that fix, returning to the coding that passed a NULL parent plan pointer down to the transiently-built subexpressions. That gives us a pretty good guarantee that they won't hook into the outer executor state in any way. But then we need some other solution to make SubPlans work. Adopt the solution speculated about in the previous commit's log message: do expression initialization at plan startup for just those VALUES rows containing SubPlans, abandoning the goal of reclaiming memory intra-query for those rows. In practice it seems unlikely that queries containing a vast number of VALUES rows would be using SubPlans in them, so this should not give up much. (BTW, this test case also refutes my claim in connection with the prior commit that the issue only arises with use of LATERAL. That was just wrong: some variants of SubLink always produce SubPlans.) As with previous patch, back-patch to all supported branches. Discussion: https://postgr.es/m/16213-871ac3bc208ecf23@postgresql.org	2020-01-17 16:17:31 -05:00
Tom Lane	b925a00f4e	Fix "force_parallel_mode = regress" to work with ANALYZE + VERBOSE. force_parallel_mode = regress is supposed to force use of a Gather node without having any impact on EXPLAIN output. But it failed to accomplish that if both ANALYZE and VERBOSE are given, because that enables per-worker output data that you wouldn't see if the Gather hadn't been inserted. Improve the logic so that we suppress the per-worker data too. This allows putting the new test case added by commit 5935917ce back into the originally intended form (cf. 776a2c887, 22864f6e0). We can also get rid of a kluge in subselect.sql, which previously had to clean up after force_parallel_mode's failure to do what it said on the tin. Discussion: https://postgr.es/m/18445.1576177309@sss.pgh.pa.us	2019-12-16 20:14:35 -05:00
Tom Lane	4766dce0dd	Fix choice of comparison operators for cross-type hashed subplans. Commit bf6c614a2 rearranged the lookup of the comparison operators needed in a hashed subplan, and in so doing, broke the cross-type case: it caused the original LHS-vs-RHS operator to be used to compare hash table entries too (which of course are all of the RHS type). This leads to C functions being passed a Datum that is not of the type they expect, with the usual hazards of crashes and unauthorized server memory disclosure. For the set of hashable cross-type operators present in v11 core Postgres, this bug is nearly harmless on 64-bit machines, which may explain why it escaped earlier detection. But it is a live security hazard on 32-bit machines; and of course there may be extensions that add more hashable cross-type operators, which would increase the risk. Reported by Andreas Seltenreich. Back-patch to v11 where the problem came in. Security: CVE-2019-10209	2019-08-05 11:20:31 -04:00
Tom Lane	9476131278	Prevent inlining of multiply-referenced CTEs with outer recursive refs. This has to be prevented because inlining would result in multiple self-references, which we don't support (and in fact that's disallowed by the SQL spec, see statements about linearly vs. nonlinearly recursive queries). Bug fix for commit 608b167f9. Per report from Yaroslav Schekin (via Andrew Gierth) Discussion: https://postgr.es/m/87wolmg60q.fsf@news-spur.riddles.org.uk	2019-04-09 15:47:35 -04:00
Peter Eisentraut	5e1963fb76	Collations with nondeterministic comparison This adds a flag "deterministic" to collations. If that is false, such a collation disables various optimizations that assume that strings are equal only if they are byte-wise equal. That then allows use cases such as case-insensitive or accent-insensitive comparisons or handling of strings with different Unicode normal forms. This functionality is only supported with the ICU provider. At least glibc doesn't appear to have any locales that work in a nondeterministic way, so it's not worth supporting this for the libc provider. The term "deterministic comparison" in this context is from Unicode Technical Standard #10 (https://unicode.org/reports/tr10/#Deterministic_Comparison). This patch makes changes in three areas: - CREATE COLLATION DDL changes and system catalog changes to support this new flag. - Many executor nodes and auxiliary code are extended to track collations. Previously, this code would just throw away collation information, because the eventually-called user-defined functions didn't use it since they only cared about equality, which didn't need collation information. - String data type functions that do equality comparisons and hashing are changed to take the (non-)deterministic flag into account. For comparison, this just means skipping various shortcuts and tie breakers that use byte-wise comparison. For hashing, we first need to convert the input string to a canonical "sort key" using the ICU analogue of strxfrm(). Reviewed-by: Daniel Verite <daniel@manitou-mail.org> Reviewed-by: Peter Geoghegan <pg@bowt.ie> Discussion: https://www.postgresql.org/message-id/flat/1ccc668f-4cbc-0bef-af67-450b47cdfee7@2ndquadrant.com	2019-03-22 12:12:43 +01:00
Tom Lane	3396138a6d	Check we don't misoptimize a NOT IN where the subquery returns no rows. Future-proofing against a common mistake in attempts to optimize NOT IN. We don't have such an optimization right now, but attempts to do so are in the works, and some of 'em are buggy. Add a regression test case covering the point. David Rowley Discussion: https://postgr.es/m/CAKJS1f90E9agVZryVyUpbHQbjTt5ExqS2Fsodmt5_A7E_cEyVA@mail.gmail.com	2019-03-01 17:57:20 -05:00
Tom Lane	608b167f9f	Allow user control of CTE materialization, and change the default behavior. Historically we've always materialized the full output of a CTE query, treating WITH as an optimization fence (so that, for example, restrictions from the outer query cannot be pushed into it). This is appropriate when the CTE query is INSERT/UPDATE/DELETE, or is recursive; but when the CTE query is non-recursive and side-effect-free, there's no hazard of changing the query results by pushing restrictions down. Another argument for materialization is that it can avoid duplicate computation of an expensive WITH query --- but that only applies if the WITH query is called more than once in the outer query. Even then it could still be a net loss, if each call has restrictions that would allow just a small part of the WITH query to be computed. Hence, let's change the behavior for WITH queries that are non-recursive and side-effect-free. By default, we will inline them into the outer query (removing the optimization fence) if they are called just once. If they are called more than once, we will keep the old behavior by default, but the user can override this and force inlining by specifying NOT MATERIALIZED. Lastly, the user can force the old behavior by specifying MATERIALIZED; this would mainly be useful when the query had deliberately been employing WITH as an optimization fence to prevent a poor choice of plan. Andreas Karlsson, Andrew Gierth, David Fetter Discussion: https://postgr.es/m/87sh48ffhb.fsf@news-spur.riddles.org.uk	2019-02-16 16:11:12 -05:00
Tom Lane	a391ff3c3d	Build out the planner support function infrastructure. Add support function requests for estimating the selectivity, cost, and number of result rows (if a SRF) of the target function. The lack of a way to estimate selectivity of a boolean-returning function in WHERE has been a recognized deficiency of the planner since Berkeley days. This commit finally fixes it. In addition, non-constant estimates of cost and number of output rows are now possible. We still fall back to looking at procost and prorows if the support function doesn't service the request, of course. To make concrete use of the possibility of estimating output rowcount for SRFs, this commit adds support functions for array_unnest(anyarray) and the integer variants of generate_series; the lack of plausible rowcount estimates for those, even when it's obvious to a human, has been a repeated subject of complaints. Obviously, much more could now be done in this line, but I'm mostly just trying to get the infrastructure in place. Discussion: https://postgr.es/m/15193.1548028093@sss.pgh.pa.us	2019-02-09 18:32:23 -05:00
Andrew Gierth	520acab171	Set scan direction appropriately for SubPlans (bug #15336 ) When executing a SubPlan in an expression, the EState's direction field was left alone, resulting in an attempt to execute the subplan backwards if it was encountered during a backwards scan of a cursor. Also, though much less likely, it was possible to reach the execution of an InitPlan while in backwards-scan state. Repair by saving/restoring estate->es_direction and forcing forward scan mode in the relevant places. Backpatch all the way, since this has been broken since 8.3 (prior to commit c7ff7663e, SubPlans had their own EStates rather than sharing the parent plan's, so there was no confusion over scan direction). Per bug #15336 reported by Vladimir Baranoff; analysis and patch by me, review by Tom Lane. Discussion: https://postgr.es/m/153449812167.1304.1741624125628126322@wrigleys.postgresql.org	2018-08-17 15:44:13 +01:00
Andres Freund	4f63e85eb1	Add COSTS off to two EXPLAIN using tests. Discussion: https://postgr.es/m/20180312222023.i4sgkbl4oqtstus3@alap3.anarazel.de	2018-03-13 16:12:31 -07:00
Tom Lane	9b63c13f0a	Repair failure with SubPlans in multi-row VALUES lists. When nodeValuesscan.c was written, it was impossible to have a SubPlan in VALUES --- any sub-SELECT there would have to be uncorrelated and thereby would produce an InitPlan instead. We therefore took a shortcut in the logic that throws away a ValuesScan's per-row expression evaluation data structures. This was broken by the introduction of LATERAL however; a sub-SELECT containing a lateral reference produces a correlated SubPlan. The cleanest fix for this would be to give up the optimization of discarding the expression eval state. But that still seems pretty unappetizing for long VALUES lists. It seems to work to just prevent the subexpressions from hooking into the ValuesScan node's subPlan list, so let's do that and see how well it works. (If this breaks, due to additional connections between the subexpressions and the outer query structures, we might consider compromises like throwing away data only for VALUES rows not containing SubPlans.) Per bug #14924 from Christian Duta. Back-patch to 9.3 where LATERAL was introduced. Discussion: https://postgr.es/m/20171124120836.1463.5310@wrigleys.postgresql.org	2017-11-25 14:15:48 -05:00
Tom Lane	896eb5efbd	In the planner, delete joinaliasvars lists after we're done with them. Although joinaliasvars lists coming out of the parser are quite simple, those lists can contain arbitrarily complex expressions after subquery pullup. We do not perform expression preprocessing on them, meaning that expressions in those lists will not meet the expectations of later phases of the planner (for example, that they do not contain SubLinks). This had been thought pretty harmless, since we don't intentionally touch those lists in later phases --- but Andreas Seltenreich found a case in which adjust_appendrel_attrs() could recurse into a joinaliasvars list and then die on its assertion that it never sees a SubLink. We considered a couple of localized fixes to prevent that specific case from looking at the joinaliasvars lists, but really this seems like a generic hazard for all expression processing in the planner. Therefore, probably the best answer is to delete the joinaliasvars lists from the parsetree at the end of expression preprocessing, so that there are no reachable expressions that haven't been through preprocessing. The case Andreas found seems to be harmless in non-Assert builds, and so far there are no field reports suggesting that there are user-visible effects in other cases. I considered back-patching this anyway, but it turns out that Andreas' test doesn't fail at all in 9.4-9.6, because in those versions adjust_appendrel_attrs contains code (added in commit 842faa714 and removed again in commit 215b43cdc) to process SubLinks rather than complain about them. Barring discovery of another path by which unprocessed joinaliasvars lists can cause trouble, the most prudent compromise seems to be to patch this into v10 but not further. Patch by me, with thanks to Amit Langote for initial investigation and review. Discussion: https://postgr.es/m/87r2tvt9f1.fsf@ansel.ydns.eu	2017-10-24 18:42:47 -04:00
Robert Haas	bf11e7ee2e	Propagate sort instrumentation from workers back to leader. Up until now, when parallel query was used, no details about the sort method or space used by the workers were available; details were shown only for any sorting done by the leader. Fix that. Commit 1177ab1dabf72bafee8f19d904cee3a299f25892 forced the test case added by commit 1f6d515a67ec98194c23a5db25660856c9aab944 to run without parallelism; now that we have this infrastructure, allow that again, with a little tweaking to make it pass with and without force_parallel_mode. Robert Haas and Tom Lane Discussion: http://postgr.es/m/CA+Tgmoa2VBZW6S8AAXfhpHczb=Rf6RqQ2br+zJvEgwJ0uoD_tQ@mail.gmail.com	2017-08-29 13:26:33 -04:00
Tom Lane	1177ab1dab	Make new regression test case parallel-safe, and improve its output. The test case added by commit 1f6d515a6 fails on buildfarm members that have force_parallel_mode turned on, because we currently don't report sort performance details from worker processes back to the master. To fix that, just make the test table be temp rather than regular; that's a good idea anyway to forestall any possible interference from auto-analyze. (The restriction that workers can't access temp tables might go away someday, but almost certainly not before the other thing gets fixed.) Also, improve the test so that we retain as much as possible of the EXPLAIN ANALYZE output. This aids debugging failures, and might also expose problems that the preceding version masked. Discussion: http://postgr.es/m/CADE5jYLuugnEEUsyW6Q_4mZFYTxHxaVCQmGAsF0yiY8ZDggi-w@mail.gmail.com	2017-08-24 13:39:58 -04:00
Robert Haas	1f6d515a67	Push limit through subqueries to underlying sort, where possible. Douglas Doole, reviewed by Ashutosh Bapat and by me. Minor formatting change by me. Discussion: http://postgr.es/m/CADE5jYLuugnEEUsyW6Q_4mZFYTxHxaVCQmGAsF0yiY8ZDggi-w@mail.gmail.com	2017-08-21 14:19:44 -04:00
Tom Lane	555494d1bc	Fix placement of initPlans when forcibly materializing a subplan. If we forcibly place a Material node atop a finished subplan, we need to move any initPlans attached to the subplan up to the Material node, in order to keep SS_finalize_plan() happy. I'd figured this out in commit 7b67a0a49 for the case of materializing a cursor plan, but out of an abundance of caution, I put the initPlan movement hack at the call site for that case, rather than inside materialize_finished_plan(). That was the wrong thing, because it turns out to also be necessary for the only other caller of materialize_finished_plan(), ie subselect.c. We lacked any test cases that exposed the mistake, but bug#14524 from Wei Congrui shows that it's possible to get an initPlan reference into the top tlist in that case too, and then SS_finalize_plan() complains. Hence, move the hack into materialize_finished_plan(). In HEAD, also relocate some recently-added tests in subselect.sql, which I'd unthinkingly dropped into the middle of a sequence of related tests. Report: https://postgr.es/m/20170202060020.1400.89021@wrigleys.postgresql.org	2017-02-02 19:11:32 -05:00
Tom Lane	1e7c4bb004	Change unknown-type literals to type text in SELECT and RETURNING lists. Previously, we left such literals alone if the query or subquery had no properties forcing a type decision to be made (such as an ORDER BY or DISTINCT clause using that output column). This meant that "unknown" could be an exposed output column type, which has never been a great idea because it could result in strange failures later on. For example, an outer query that tried to do any operations on an unknown-type subquery output would generally fail with some weird error like "failed to find conversion function from unknown to text" or "could not determine which collation to use for string comparison". Also, if the case occurred in a CREATE VIEW's query then the view would have an unknown-type column, causing similar failures in queries trying to use the view. To fix, at the tail end of parse analysis of a query, forcibly convert any remaining "unknown" literals in its SELECT or RETURNING list to type text. However, provide a switch to suppress that, and use it in the cases of SELECT inside a set operation or INSERT command. In those cases we already had type resolution rules that make use of context information from outside the subquery proper, and we don't want to change that behavior. Also, change creation of an unknown-type column in a relation from a warning to a hard error. The error should be unreachable now in CREATE VIEW or CREATE MATVIEW, but it's still possible to explicitly say "unknown" in CREATE TABLE or CREATE (composite) TYPE. We want to forbid that because it's nothing but a foot-gun. This change creates a pg_upgrade failure case: a matview that contains an unknown-type column can't be pg_upgraded, because reparsing the matview's defining query will now decide that the column is of type text, which doesn't match the cstring-like storage that the old materialized column would actually have. Add a checking pass to detect that. While at it, we can detect tables or composite types that would fail, essentially for free. Those would fail safely anyway later on, but we might as well fail earlier. This patch is by me, but it owes something to previous investigations by Rahila Syed. Also thanks to Ashutosh Bapat and Michael Paquier for review. Discussion: https://postgr.es/m/CAH2L28uwwbL9HUM-WR=hromW1Cvamkn7O-g8fPY2m=_7muJ0oA@mail.gmail.com	2017-01-25 09:17:24 -05:00
Tom Lane	72daabc7a3	Disallow pushing volatile quals past set-returning functions. Pushing an upper-level restriction clause into an unflattened subquery-in-FROM is okay when the subquery contains no SRFs in its targetlist, or when it does but the SRFs are unreferenced by the clause and the clause is not volatile. Otherwise, we're changing the number of times the clause is evaluated, which is bad for volatile quals, and possibly changing the result, since a volatile qual might succeed for some SRF output rows and not others despite not referencing any of the changing columns. (Indeed, if the clause is something like "random() > 0.5", the user is probably expecting exactly that behavior.) We had most of these restrictions down, but not the one about the upper clause not being volatile. Fix that, and add a regression test to illustrate the expected behavior. Although this is definitely a bug, it doesn't seem like back-patch material, since possibly some users don't realize that the broken behavior is broken and are relying on what happens now. Also, while the added test is quite cheap in the wake of commit a4c35ea1c, it would be much more expensive (or else messier) in older branches. Per report from Tom van Tilburg. Discussion: <CAP3PPDiucxYCNev52=YPVkrQAPVF1C5PFWnrQPT7iMzO1fiKFQ@mail.gmail.com>	2016-09-27 18:43:36 -04:00
Andres Freund	168d5805e4	Add support for INSERT ... ON CONFLICT DO NOTHING/UPDATE. The newly added ON CONFLICT clause allows to specify an alternative to raising a unique or exclusion constraint violation error when inserting. ON CONFLICT refers to constraints that can either be specified using a inference clause (by specifying the columns of a unique constraint) or by naming a unique or exclusion constraint. DO NOTHING avoids the constraint violation, without touching the pre-existing row. DO UPDATE SET ... [WHERE ...] updates the pre-existing tuple, and has access to both the tuple proposed for insertion and the existing tuple; the optional WHERE clause can be used to prevent an update from being executed. The UPDATE SET and WHERE clauses have access to the tuple proposed for insertion using the "magic" EXCLUDED alias, and to the pre-existing tuple using the table name or its alias. This feature is often referred to as upsert. This is implemented using a new infrastructure called "speculative insertion". It is an optimistic variant of regular insertion that first does a pre-check for existing tuples and then attempts an insert. If a violating tuple was inserted concurrently, the speculatively inserted tuple is deleted and a new attempt is made. If the pre-check finds a matching tuple the alternative DO NOTHING or DO UPDATE action is taken. If the insertion succeeds without detecting a conflict, the tuple is deemed inserted. To handle the possible ambiguity between the excluded alias and a table named excluded, and for convenience with long relation names, INSERT INTO now can alias its target table. Bumps catversion as stored rules change. Author: Peter Geoghegan, with significant contributions from Heikki Linnakangas and Andres Freund. Testing infrastructure by Jeff Janes. Reviewed-By: Heikki Linnakangas, Andres Freund, Robert Haas, Simon Riggs, Dean Rasheed, Stephen Frost and many others.	2015-05-08 05:43:10 +02:00
Tom Lane	b62f94c603	Allow simplification of EXISTS() subqueries containing LIMIT. The locution "EXISTS(SELECT ... LIMIT 1)" seems to be rather common among people who don't realize that the database already performs optimizations equivalent to putting LIMIT 1 in the sub-select. Unfortunately, this was actually making things worse, because it prevented us from optimizing such EXISTS clauses into semi or anti joins. Teach simplify_EXISTS_query() to suppress constant-positive LIMIT clauses. That fixes the semi/anti-join case, and may help marginally even for cases that have to be left as sub-SELECTs. Marti Raudsepp, reviewed by David Rowley	2014-11-22 19:12:38 -05:00
Tom Lane	d685814835	Fix bug with whole-row references to append subplans. ExecEvalWholeRowVar incorrectly supposed that it could "bless" the source TupleTableSlot just once per query. But if the input is coming from an Append (or, perhaps, other cases?) more than one slot might be returned over the query run. This led to "record type has not been registered" errors when a composite datum was extracted from a non-blessed slot. This bug has been there a long time; I guess it escaped notice because when dealing with subqueries the planner tends to expand whole-row Vars into RowExprs, which don't have the same problem. It is possible to trigger the problem in all active branches, though, as illustrated by the added regression test.	2014-07-11 19:12:35 -04:00
Tom Lane	9e2f2d7a05	Don't assume a subquery's output is unique if there's a SRF in its tlist. While the x output of "select x from t group by x" can be presumed unique, this does not hold for "select x, generate_series(1,10) from t group by x", because we may expand the set-returning function after the grouping step. (Perhaps that should be re-thought; but considering all the other oddities involved with SRFs in targetlists, it seems unlikely we'll change it.) Put a check in query_is_distinct_for() so it's not fooled by such cases. Back-patch to all supported branches. David Rowley	2014-07-08 14:03:56 -04:00

1 2

77 Commits