Commit Graph

2757 Commits

Author SHA1 Message Date
07f296734a [regression](insert)add hive DDL and CTAS regression case (#32924)
Issue Number: #31442

dependent on #32824

* add ddl (create and drop) test
* add ctas test
* add complex type test

TODO:
* bucketed table test
* truncate test
* add/drop partition test
2024-04-12 10:24:23 +08:00
716c146750 [fix](insert)fix hive external return msgs and exception and pass all columns to BE (#32824)
2024-04-12 10:23:52 +08:00
9ada38327b [feature](txn insert) txn insert support insert into select (#31666) 2024-04-12 10:11:22 +08:00
3d66723214 [branch-2.1](auto-partition) pick auto partition and some more prs (#33523) 2024-04-11 17:12:17 +08:00
e4eb76212a [fix](Nereids): add order for constraint test (#33323) 2024-04-11 09:31:50 +08:00
Pxl
3070eda58c [Bug](load) fix stream load file on hll type mv column (#33373)
2024-04-11 09:31:50 +08:00
ea1e542e31 [fix](partial-update) remove unnecessary DCHECK on IndexChannel::num_rows_filtered (#33160) 2024-04-11 09:31:50 +08:00
Pxl
6462264e77 [Improvement](materialized-view) adjust priority of materialized view match rule (#33305)
2024-04-10 16:23:04 +08:00
4eee1a1f0d [fix](nereids) make runtime filter targets in fixed order (#33191)
2024-04-10 16:22:39 +08:00
741d4ff97e [fix](group commit) Fix syntax error when inserting into a table whose column names contain a keyword (#33322) 2024-04-10 16:22:09 +08:00
4079a7b6ab [fix](txn insert) Fix txn insert into values when there is a sequence column or a column name is a keyword (#33336) 2024-04-10 16:21:31 +08:00
5e73d7a281 [fix](compaction) fix incorrect grouping of vertical compaction columns in tables only with key columns (#32896) (#33470) 2024-04-10 16:04:33 +08:00
d61b9f7091 [chore](test) nereids support window function but some cases does not open yet (#33098) 2024-04-10 16:00:12 +08:00
Pxl
2092a862fc [Bug](materialized-view) fix wrong result when alias name is the same as a base slot on mv (#33198)
2024-04-10 16:00:05 +08:00
b85bf3b6b0 [test](cast) add test for stream load cast (#33189) 2024-04-10 15:26:09 +08:00
b696909775 [fix](plsql) Fix plsql variable initialization (#33186) 2024-04-10 15:26:09 +08:00
9670422d61 [fix](inverted index) fix the incorrect result issue of COUNT_ON_INDEX for key columns (#33164) 2024-04-10 15:26:09 +08:00
5e59c09a60 [Fix](nereids) modify the binding aggregate function in order by (#32758)
modify the binding logic so that ORDER BY has the same behavior as MySQL when the sort child is an aggregate.
when an ORDER BY expression contains an aggregate function, all slots in that expression should bind to the LogicalAggregate's non-AggFunction outputs first, then bind to the LogicalAggregate's child
e.g.
select 2*abs(sum(c1)) as c1, c1,sum(c1)+c1 from t_order_by_bind_priority group by c1 order by sum(c1)+c1 asc;
in this SQL, both occurrences of c1 in the ORDER BY bind to the c1 column of t_order_by_bind_priority
2024-04-10 15:26:09 +08:00
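A hedged reading of the example above (the qualified rewrite below is illustrative only, not the committed test case): after this change the two c1 inside the ORDER BY expression resolve to the group-by key column of t_order_by_bind_priority rather than to the select alias `2*abs(sum(c1)) as c1`, matching the MySQL behavior described in the commit.

```sql
-- illustrative only: the ORDER BY is resolved as if the base column were written explicitly
select 2*abs(sum(c1)) as c1, c1, sum(c1)+c1
from t_order_by_bind_priority
group by c1
order by sum(t_order_by_bind_priority.c1) + t_order_by_bind_priority.c1 asc;
```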
6798a24a27 [Enhancement](Nereids) reduce child output rows if agg child is literal (#32188)
with group by:
select max(1) from t1 group by c1; -> select 1 from (select c1 from t1 group by c1);
without group by:
select max(1) from t1; -> select max(1) from (select 1 from t1 limit 1) tmp;
2024-04-10 15:26:08 +08:00
0ab8b57db7 [enhance](mtmv)support create mtmv with other mtmv (#32984) 2024-04-10 15:26:08 +08:00
9bc7902e5a [fix](Nereids) fix bind group by int literal (#33117)
This SQL will fail because:

    the 2 in the group by binds to `1 AS col2` in BindExpression
    ResolveOrdinalInOrderByAndGroupBy then replaces that 1 with MIN(LENGTH(cast(age as varchar)))
    CheckAnalysis throws an exception because group by cannot contain an aggregate function

select MIN (LENGTH (cast(age as varchar))), 1 AS col2
from test_bind_groupby_slots
group by 2

we should move ResolveOrdinalInOrderByAndGroupBy into BindExpression

(cherry picked from commit 3fab4496c3fefe95b4db01f300bf747080bfc3d8)
2024-04-10 14:59:46 +08:00
ff990eb869 [enhancement](Nereids) refactor expression rewriter to pattern match (#32617)
This PR improves the performance of the Nereids planner in the plan stage.

1. refactor the expression rewriter to pattern match, so that lots of expression rewrite rules can be applied criss-cross in one big bottom-up iteration, rewriting until the expression becomes stable. Now we can handle more cases, because originally there was no loop and some rules only processed the top expression, like `SimplifyArithmeticRule`.
2. replace `Collection.stream()` with `ImmutableXxx.Builder` to avoid useless method calls
3. unroll loops in some code, like `Expression.<init>`, `PlanTreeRewriteBottomUpJob.pushChildrenJobs`
4. use type/arity-specialized code, like `OneRangePartitionEvaluator.toNereidsLiterals()`, `PartitionRangeExpander.tryExpandRange()`, `PartitionRangeExpander.enumerableCount()`
5. refactor `ExtractCommonFactorRule`; now we can extract more cases (see the sketch after this entry), and fix the dead loop when `ExtractCommonFactorRule` and `SimplifyRange` are used in one iteration, because `SimplifyRange` generates a right-deep tree while `ExtractCommonFactorRule` generates a left-deep tree
6. refactor `FoldConstantRuleOnFE` to support both visitor and pattern-match modes: in ExpressionNormalization, pattern match can be applied criss-cross with the other rules; in PartitionPruner, the visitor can evaluate expressions faster
7. lazily compute and cache some operations
8. use an int field to compare dates
9. use a BitSet to find disableNereidsRules
10. a two-level loop is usually faster than building a Multimap when binding slots in Scope, so I reverted that code
11. `PlanTreeRewriteBottomUpJob` doesn't need to clearStatePhase any more

### test case
100 threads continuously send this SQL in parallel, querying an empty table; tested on my Mac (M2 chip, 8 cores) with SQL cache enabled
```sql
select  count(1),date_format(time_col,'%Y%m%d'),varchar_col1
from tbl
where  partition_date>'2024-02-15'  and (varchar_col2 ='73130' or varchar_col3='73130') and time_col>'2024-03-04'
  and  time_col<'2024-03-05'
group by date_format(time_col,'%Y%m%d'),varchar_col1
order by date_format(time_col,'%Y%m%d') desc, varchar_col1 desc,count(1) asc
limit 1000
```

before this pr: 3100 peak QPS, about 2700 avg QPS
after this pr: 4800 peak QPS, about 4400 avg QPS

(cherry picked from commit 7338683fdbdf77711f2ce61e580c19f4ea100723)
2024-04-10 14:59:45 +08:00
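As a rough illustration of the common-factor extraction mentioned in point 5 (the predicate and table below are made up, not taken from the PR), a conjunct shared by both sides of an OR can be pulled out:

```sql
-- hypothetical predicate: the shared factor k1 = 1 can be extracted, rewriting
--   (k1 = 1 and k2 = 2) or (k1 = 1 and k3 = 3)
-- into
--   k1 = 1 and (k2 = 2 or k3 = 3)
select count(1)
from tbl
where (k1 = 1 and k2 = 2) or (k1 = 1 and k3 = 3);
```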
bf022f9d8d [enhancement](function truncate) truncate can use column as scale argument (#32746)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2024-04-10 14:53:56 +08:00
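A hedged usage sketch (the table and column names are hypothetical, not from the PR): with this change the scale argument of truncate no longer has to be a constant and can come from another column.

```sql
-- hypothetical table t_price(price double, scale_col int):
-- truncate each value to a per-row number of decimal places, e.g. truncate(3.4567, 2) -> 3.45
select truncate(price, scale_col) from t_price;
```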
a7c8abe58c [feature](nereids) support common sub expression by multi-layer projections (fe part) (#33087)
* cse fe part
2024-04-10 14:53:56 +08:00
b0b5f84e40 [feature](load) support compressed JSON format data for broker load (#30809) 2024-04-10 14:20:53 +08:00
Pxl
09db427eed [Feature](materialized-view) support ignore not slot is null when count(slot) not has key in mv (#32912)
2024-04-10 11:59:36 +08:00
61e214c327 [Fix](Hive-Metastore) fix NPE when JDBC reads a NULL value (#32831) 2024-04-10 11:55:17 +08:00
5116724494 [Fix](hive-writer) Fix the issue that the block was not copied for filtering when the hive partition writer writes a block to file. (#32775) (#33447)
backport #32775
2024-04-10 11:42:23 +08:00
4963d60a07 [Fix](multi-catalog)Fix the issue of not initializing the writer caused by refactoring and add hive writing regression test. (#32721) (#33446)
backport #32721.
2024-04-10 11:42:22 +08:00
285e2fcb5a [fix] (vectorization) regexp all_pass string (#32515) 2024-04-10 11:34:30 +08:00
fb910e5304 [fix](planner) retain groupingSlotIds as materialized for aggregate (#33060) 2024-04-10 11:34:30 +08:00
c5ab7ca573 [fix](planner) remove and retain input slot for aggregate slot which is not materialized (#33033)
Signed-off-by: nextdreamblue <zxw520blue1@163.com>
2024-04-10 11:34:30 +08:00
Pxl
5b162a80f2 [Improvement](materialized-view) The materialized view cannot involve an auto-increment column (#32885)
2024-04-10 11:34:30 +08:00
1f1932c6b7 [enhancement](nereids)add some date functions for constant fold (#32772) 2024-04-10 11:34:30 +08:00
814e4ed3ec [fix](nereids)partition prune should consider <=> operator (#32965) 2024-04-10 11:34:30 +08:00
97a2977f2a [improvement](executor)Add tag property for workload group #32874 2024-04-10 11:34:29 +08:00
e980cd3e7f [feature](Nereids): add ColumnPruningPostProcessor. (#32800) 2024-04-10 11:34:29 +08:00
26e86d53a4 [enhance](mtmv)support olap table partition column is null (#32698) 2024-04-10 11:34:29 +08:00
bb8bc75af4 [feature](agg) add aggregate function sum0 (#32541) 2024-04-10 11:34:29 +08:00
2a0644f442 [Fix](function) Fix unix_timestamp core for string input (#32871) 2024-04-09 12:48:35 +08:00
ebbfb06162 [Bug](array) fix array column core dump in get_shrinked_column caused by a missing type check (#33295)

* add function could_shrinked_column
2024-04-08 07:27:40 +08:00
1b3e4322e8 [improvement](serde) Handle NaN values in number for MySQL result write (#33227) 2024-04-07 23:24:23 +08:00
fae55e0e46 [Feature](information_schema) add processlist table for information_schema db (#32511) 2024-04-07 23:24:22 +08:00
29556f758e [fix](parquet) fix time zone error in parquet reader (#33217)
`isAdjustedToUTC` is handled exactly the opposite way in the parquet reader (https://github.com/apache/parquet-format/blob/master/LogicalTypes.md), so the time with `isAdjustedToUTC=true` is increased by eight hours (UTC+8).

Parquet files with `isAdjustedToUTC=true` can be produced by spark-sql with the following configuration:
```
--conf spark.sql.session.timeZone=UTC
--conf spark.sql.parquet.outputTimestampType=TIMESTAMP_MICROS
```

However, with the following configuration there is no logical or converted type in the parquet metadata, so the time read by Doris will also be increased by eight hours (UTC+8). Users need to set the appropriate UTC time zone in Doris themselves (https://doris.apache.org/docs/dev/advanced/time-zone/)
```
--conf spark.sql.session.timeZone=UTC
--conf spark.sql.parquet.outputTimestampType=INT96
```
2024-04-07 23:24:22 +08:00
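For the second case, a hedged sketch of adjusting the Doris session time zone as the commit message suggests (the exact value depends on how the files were written; '+00:00' here is only illustrative):

```sql
-- set the session time zone before querying such parquet data (value is illustrative)
set time_zone = '+00:00';
```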
d9d950d98e [fix](iceberg) fix iceberg predicate conversion bug (#33283)
Follow-up to #32923.

Some cases were not covered in #32923.
2024-04-07 22:12:38 +08:00
190763e301 [bugfix](iceberg)Convert the datetime type in the predicate according to the target column (#32923)
Convert the datetime type in the predicate according to the target column, and add a test case for #32194.
Related: #30478, #30162
2024-04-07 22:12:33 +08:00
62699c8eea [improve](function) the offset param in lead/lag functions can be 0 (#33174) 2024-04-07 12:58:03 +08:00
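A hedged sketch of the relaxed check (hypothetical table, not from the PR): with an offset of 0, lead/lag simply return the current row's value.

```sql
-- hypothetical table t(k int, v int); offset 0 makes lead/lag return the current row's v
select k, v,
       lead(v, 0, -1) over (order by k) as lead0,
       lag(v, 0, -1)  over (order by k) as lag0
from t;
```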
797b8fa456 [FIX](agg) fix vertical_compaction_reader for agg table with array/map type (#33130) 2024-04-03 18:09:45 +08:00
425c00a0d1 [fix](agg) incorrect result with having conjuncts and limit (#33040) 2024-03-30 10:14:44 +08:00
9d6fb39573 [regression-test](Variant) add order by to make test stable (#33014) (#33039) 2024-03-29 17:25:26 +08:00