pick from master #36478
Introduce a new rule, VARIANT_SUB_PATH_PRUNING, to prune variant sub-paths.
For example, if variant slot v in table t has two sub-paths, 'c1' and 'c2',
then after this rule `select v['c1'] from t` will only scan the single
sub-path 'c1' of v, reducing scan time.
This rule accomplishes all the work with two components. The Collector
traverses the plan from the top down, collecting all the element_at
functions on variant types and recording the required path from the
original variant slot to the current element_at. The Replacer traverses
from the bottom up, generating the slots for the required sub-paths on
scan, union, and CTE consumer nodes, and then replacing each element_at
with the corresponding slot.
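For illustration, a minimal standalone sketch of the collect-then-replace idea (hypothetical classes and names, not the actual Nereids rule):

```java
import java.util.*;

// Hypothetical, simplified model of the collect-then-replace flow; not the real Nereids rule.
class VariantPathPruningSketch {
    // A reference to a variant sub-path, e.g. element_at(v, 'c1') for v['c1'].
    record ElementAt(String variantSlot, String subPath) {}

    // Collector: walk the plan top-down and record which sub-paths each variant slot needs.
    static Map<String, Set<String>> collect(List<ElementAt> elementAts) {
        Map<String, Set<String>> requiredPaths = new HashMap<>();
        for (ElementAt e : elementAts) {
            requiredPaths.computeIfAbsent(e.variantSlot(), k -> new TreeSet<>()).add(e.subPath());
        }
        return requiredPaths;
    }

    // Replacer: bottom-up, produce one slot per required sub-path (as the scan would),
    // then map each element_at to the corresponding sub-path slot.
    static Map<ElementAt, String> replace(Map<String, Set<String>> requiredPaths) {
        Map<ElementAt, String> replacement = new HashMap<>();
        for (Map.Entry<String, Set<String>> entry : requiredPaths.entrySet()) {
            for (String path : entry.getValue()) {
                // e.g. v['c1'] -> a slot "v.c1" produced directly by the scan
                replacement.put(new ElementAt(entry.getKey(), path), entry.getKey() + "." + path);
            }
        }
        return replacement;
    }

    public static void main(String[] args) {
        // select v['c1'] from t: only sub-path 'c1' is required, so 'c2' is never scanned.
        Map<ElementAt, String> mapping = replace(collect(List.of(new ElementAt("v", "c1"))));
        System.out.println(mapping); // {ElementAt[variantSlot=v, subPath=c1]=v.c1}
    }
}
```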
cherry-pick #36161 to branch-2.1
NormalizeAggregate's rewrite logic has a bug for SQL like this:
SELECT
  CASE 1
    WHEN CAST( NULL AS SIGNED ) THEN NULL
    WHEN COUNT( DISTINCT CAST( NULL AS SIGNED ) ) THEN NULL
    ELSE NULL
  END;
This is the plan after NormalizeAggregate: the LogicalAggregate only
outputs `count(DISTINCT cast(NULL as SIGNED))`#3 and does not output
cast(NULL as SIGNED)#2, but the upper project uses cast(NULL as
SIGNED)#2, so Doris reports the error "cast(NULL as SIGNED) not in
aggregate's output".
LogicalResultSink[29] ( outputExprs=[__case_when_0#1] )
+--LogicalProject[26] ( distinct=false, projects=[CASE WHEN (1 = cast(NULL as SIGNED)#2) THEN NULL WHEN (1 = count(DISTINCT cast(NULL as SIGNED))#3) THEN NULL ELSE NULL END AS `CASE WHEN (1 = cast(NULL as SIGNED)) THEN NULL WHEN (1 = count(DISTINCT cast(NULL as SIGNED))) THEN NULL ELSE NULL END`#1], excepts=[] )
   +--LogicalAggregate[25] ( groupByExpr=[], outputExpr=[count(DISTINCT cast(NULL as SIGNED)#2) AS `count(DISTINCT cast(NULL as SIGNED))`#3], hasRepeat=false )
      +--LogicalProject[24] ( distinct=false, projects=[cast(NULL as SIGNED) AS `cast(NULL as SIGNED)`#2], excepts=[] )
         +--LogicalOneRowRelation ( projects=[0 AS `0`#0] )
The problem is that cast(NULL as SIGNED)#2 should not be output by the
LogicalAggregate; cast(NULL as SIGNED) should be computed in the
LogicalProject.
This PR changes the rewrite logic for the upper project's projections:
aggregateOutputs is rewritten and becomes the upper-level LogicalProject
projections. During the rewriting process, expressions inside an agg
function can be rewritten with the aggregate function arguments and
group-by expressions, but expressions outside an agg function can only
be rewritten with group-by expressions.
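For illustration, a simplified standalone sketch of that scope rule (a toy expression model, not the real NormalizeAggregate code; the agg function call itself is separately replaced by its own output slot):

```java
import java.util.*;

// Toy model: substitution scope differs inside vs. outside aggregate functions.
class RewriteScopeSketch {
    sealed interface Expr permits Slot, Call {}
    record Slot(String name) implements Expr {}
    record Call(String fn, List<Expr> args) implements Expr {}

    static final Set<String> AGG_FUNCTIONS = Set.of("count", "sum", "min", "max");

    // groupBySubst: group-by expression -> slot; aggArgSubst: agg argument expression -> slot.
    static Expr rewrite(Expr e, Map<Expr, Slot> groupBySubst, Map<Expr, Slot> aggArgSubst,
                        boolean insideAgg) {
        // Inside an agg function both maps are legal; outside, only group-by slots are legal.
        Map<Expr, Slot> scope = new HashMap<>(groupBySubst);
        if (insideAgg) {
            scope.putAll(aggArgSubst);
        }
        if (scope.containsKey(e)) {
            return scope.get(e);
        }
        if (e instanceof Call c) {
            boolean childInsideAgg = insideAgg || AGG_FUNCTIONS.contains(c.fn());
            List<Expr> newArgs = c.args().stream()
                    .map(a -> rewrite(a, groupBySubst, aggArgSubst, childInsideAgg))
                    .toList();
            return new Call(c.fn(), newArgs);
        }
        return e;
    }

    public static void main(String[] args) {
        // cast(NULL as SIGNED), and count(cast(NULL as SIGNED)) (DISTINCT omitted for brevity).
        Expr castNull = new Call("cast", List.of(new Slot("NULL")));
        Expr count = new Call("count", List.of(castNull));
        Map<Expr, Slot> aggArgSubst = Map.of(castNull, new Slot("#2"));
        // Inside count(...): the argument becomes slot #2 produced by the child project.
        System.out.println(rewrite(count, Map.of(), aggArgSubst, false));    // Call[fn=count, args=[Slot[name=#2]]]
        // Outside any agg function: the cast stays an expression, it is NOT replaced by #2.
        System.out.println(rewrite(castNull, Map.of(), aggArgSubst, false)); // Call[fn=cast, args=[Slot[name=NULL]]]
    }
}
```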
---------
Co-authored-by: moailing <moailing@selectdb.com>
When an equality predicate is applied to a column whose inverted index
is built with a parser, remaining_after_evaluate is required. In this
situation, we cannot optimize the column without reading the data.
## Proposed changes
From (#36637)
This PR
1. picked #35630, which was previously reverted by #36098.
2. picked #36344 from master.
These two PRs fix existing bugs about auto partition load.
---------
Co-authored-by: Kaijie Chen <ckj@apache.org>
pick from master #36316
The datatype of the expression cast( xx as decimal ) may be decimalv3
or decimalv2 depending on the enable_decimal_conversion value in the FE
conf file. If enable_decimal_conversion is true, the datatype is
decimalv3(9, 0), but the datatype was decimalv3(38, 9) in the 2.0
releases. So this PR changes the datatype to match the 2.0 releases and
keep the behavior consistent.
pick from master #35773
This PR introduces an optimization that adjusts the penalty applied
during join operations based on the volume of data on the build side.
Specifically, when the number of rows and width of the tables being
joined are equal, the materialization costs are now considered more
accurately. The update ensures that joins with a larger dataset on the
build side incur a higher penalty, improving overall query performance
and resource allocation.
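Roughly, the penalty should scale with the volume of data materialized on the build side. A purely illustrative sketch (hypothetical formula and names, not the actual Nereids cost model):

```java
// Purely illustrative: a penalty term that grows with build-side materialization volume.
class JoinPenaltySketch {
    // Hypothetical formula: rows on the build (right) side times average row width.
    static double buildSidePenalty(double buildRows, double buildAvgRowBytes, double penaltyFactor) {
        return buildRows * buildAvgRowBytes * penaltyFactor;
    }

    public static void main(String[] args) {
        // A larger build side yields a larger penalty, steering the optimizer toward
        // plans that materialize less data in the hash table.
        System.out.println(buildSidePenalty(1_000_000, 64, 0.5)); // 3.2E7
        System.out.println(buildSidePenalty(10_000, 64, 0.5));    // 320000.0
    }
}
```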
cherry-pick #36193
Problem:
When using a leading hint like:
leading(t1 {t2 t3} {t4 t5} t6)
it would not generate the correct plan, because the level list cannot
express enough information about the braces.
Solved:
Remove the level-list representation of leading levels and use a
reverse Polish expression instead.
Algorithm:
leading(t1 {t2 t3} {t4 t5} t6)
==>
stack, top to bottom: (t1 t2 t3 join join t4 t5 join t6 join)
When generating the leading join, we pop items from the stack: when the
item is a table, we make a logical scan; when it is a join operator, we
make a logical join and push it back onto the stack.
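For illustration, a small standalone sketch of building the join tree from a postfix (reverse Polish) token sequence; the classes are hypothetical stand-ins for the real plan nodes, and a standard postfix encoding is assumed (six tables need five join operators):

```java
import java.util.*;

// Hypothetical stand-ins for the real plan nodes built by the leading hint.
class LeadingRpnSketch {
    sealed interface Plan permits Scan, Join {}
    record Scan(String table) implements Plan {
        public String toString() { return table; }
    }
    record Join(Plan left, Plan right) implements Plan {
        public String toString() { return "(" + left + " JOIN " + right + ")"; }
    }

    // Evaluate the postfix tokens: a table token becomes a scan; each "join" pops the
    // two most recent sub-plans, joins them, and pushes the result back on the stack.
    static Plan build(List<String> postfix) {
        Deque<Plan> stack = new ArrayDeque<>();
        for (String token : postfix) {
            if (token.equals("join")) {
                Plan right = stack.pop();
                Plan left = stack.pop();
                stack.push(new Join(left, right));
            } else {
                stack.push(new Scan(token));
            }
        }
        return stack.pop();
    }

    public static void main(String[] args) {
        // leading(t1 {t2 t3} {t4 t5} t6) in postfix form.
        List<String> postfix = List.of(
                "t1", "t2", "t3", "join", "join", "t4", "t5", "join", "join", "t6", "join");
        System.out.println(build(postfix));
        // (((t1 JOIN (t2 JOIN t3)) JOIN (t4 JOIN t5)) JOIN t6)
    }
}
```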
1. `std::string` to `std::wstring` conversion only supports ASCII
characters. For non-ASCII characters, we need to use
`StringUtil::string_to_wstring`
2. Fix index_tool check_terms_stats_v2 and add field info to the printed output
pick from master #36321
cherry-pick #34313 to branch-2.1
MergePercentileToArray performs a transformation like this:
select ss_item_sk, percentile(ss_quantity,0.9), percentile(ss_quantity,0.6), percentile(ss_quantity,0.3)
from store_sales group by ss_item_sk;
==>
select ss_item_sk, percentile_array(ss_quantity,[0.3,0.6,0.9]) from store_sales group by ss_item_sk;
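For illustration, a simplified standalone sketch of the merge step (strings stand in for real expressions; presumably each original percentile call then reads its value back out of the array result):

```java
import java.util.*;

// Group percentile(col, p) calls that share the same column and merge them into one
// percentile_array(col, [p1, p2, ...]) call. Strings stand in for real expressions.
class MergePercentileSketch {
    record Percentile(String column, double fraction) {}

    static Map<String, List<Double>> merge(List<Percentile> calls) {
        Map<String, List<Double>> byColumn = new TreeMap<>();
        for (Percentile p : calls) {
            byColumn.computeIfAbsent(p.column(), k -> new ArrayList<>()).add(p.fraction());
        }
        byColumn.values().forEach(Collections::sort); // e.g. [0.3, 0.6, 0.9]
        return byColumn;
    }

    public static void main(String[] args) {
        List<Percentile> calls = List.of(
                new Percentile("ss_quantity", 0.9),
                new Percentile("ss_quantity", 0.6),
                new Percentile("ss_quantity", 0.3));
        merge(calls).forEach((col, fractions) ->
                System.out.println("percentile_array(" + col + ", " + fractions + ")"));
        // percentile_array(ss_quantity, [0.3, 0.6, 0.9])
    }
}
```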