Commit Graph

18263 Commits

Author SHA1 Message Date
4eee1a1f0d [fix](nereids) make runtime filter targets in fixed order (#33191)
* make runtime filter targets in fixed order
2024-04-10 16:22:39 +08:00
159ebc76e7 [fix](npe) fix kafka be id npe (#33151) 2024-04-10 16:22:27 +08:00
741d4ff97e [fix](group commit) Fix syntax error when insert into table which column names contain keyword (#33322) 2024-04-10 16:22:09 +08:00
d667df2d06 [improvement](spill) avoid unnecessary spilling in hash join build phase (#33277) 2024-04-10 16:21:50 +08:00
4079a7b6ab [fix](txn insert) Fix txn insert into values for sequence column or column name is keyword (#33336) 2024-04-10 16:21:31 +08:00
29777bc3a8 [fix](fe)reduce memory usage in alter (#32810) (#33474)
Co-authored-by: kylinmac <kylinmac@163.com>
2024-04-10 16:04:50 +08:00
5e73d7a281 [fix](compaction) fix incorrect grouping of vertical compaction columns in tables only with key columns (#32896) (#33470) 2024-04-10 16:04:33 +08:00
f8d1fa2be3 [chore](multi-table-load) add context info in log when using single-stream-multi-table load (#33317) 2024-04-10 16:03:05 +08:00
d1099852b5 [fix](Nereids) partial update generate column in wrong way (#33326)
intro by PR #31461
2024-04-10 16:02:54 +08:00
f31e273ae8 [fix](Nereids) variant column prune push down failed on variant literal (#33328) 2024-04-10 16:02:54 +08:00
be9fe12b26 Fix compatibility issues with GLIBC(>= 2.34) for prebuilt thirdparty packages (#33314)
Some symbols changed after GLIBC 2.34 according to the release notes(https://lists.gnu.org/archive/html/info-gnu/2021-08/msg00001.html).

This may cause linkage errors if we use GLIBC(>= 2.34).
2024-04-10 16:02:34 +08:00
Pxl
5c0256e4bf [Bug](case) fix wrong case test_mv_partition (#33324)
fix wrong case test_mv_partition
2024-04-10 16:02:20 +08:00
93b20f0cc4 [chore](Nereids) create policy always allow fallback (#33226) 2024-04-10 16:01:58 +08:00
bcc819ddd9 [fix](Nereids) array_range not support amount without unit (#33231) 2024-04-10 16:01:58 +08:00
b8d4a87703 [chore](Nereids) load command always could fallback (#33233) 2024-04-10 16:00:53 +08:00
14c5247fb7 [feature](replica) support force set replicate allocation for olap tables (#32916)
Add a config to force set replication allocation for all OLAP tables and partitions.
2024-04-10 16:00:15 +08:00
Pxl
6412753517 [improve](exec) reduce copy on store_string_ref (#33232) 2024-04-10 16:00:12 +08:00
d61b9f7091 [chore](test) nereids support window function but some cases does not open yet (#33098) 2024-04-10 16:00:12 +08:00
e6e2099256 [fix](spill) fix hash join error 'invalid slot id' (#33273) 2024-04-10 16:00:12 +08:00
Pxl
2092a862fc [Bug](materialized-view) fix wrong result when salias name same with base slot on mv (#33198)
fix wrong result when salias name same with base slot on mv
2024-04-10 16:00:05 +08:00
0e99926b28 (httpaction) log response of http (#33270) 2024-04-10 15:58:14 +08:00
2785269d36 [Improvement](executor)Add BypassWorkloadGroup to pass query queue #33101 2024-04-10 15:56:41 +08:00
febdfb1c63 [fix](inverted index) fix incorrect case test_index_delete (#33246) 2024-04-10 15:54:50 +08:00
5ecce2fff2 [fix](plsql) Fix regression test execute the same name procedure in parallel #33234 2024-04-10 15:53:17 +08:00
16f8afc408 [refactor](coordinator) split profile logic and instance report logic (#32010)
Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-04-10 15:51:32 +08:00
b85bf3b6b0 [test](cast) add test for stream load cast (#33189) 2024-04-10 15:26:09 +08:00
7fae123b01 [FIX](inverted_index) fix inverted index write array with _doc is empty (#33170) 2024-04-10 15:26:09 +08:00
96867ff3fd [fix](Nereids) support update without filter (#33214) 2024-04-10 15:26:09 +08:00
2b1ab89b5b [fix](memory) Fix memory log compile by ASAN (#33162)
ASAN compiles BE, add markers in memory logs
2024-04-10 15:26:09 +08:00
b696909775 [fix](plsql) Fix plsql variable initialization (#33186) 2024-04-10 15:26:09 +08:00
edd1701963 [fix](Nereids) convert agg state type failed in some cases (#33208) 2024-04-10 15:26:09 +08:00
9670422d61 [fix](inverted index) fix the incorrect result issue of COUNT_ON_INDEX for key columns (#33164) 2024-04-10 15:26:09 +08:00
5e59c09a60 [Fix](nereids) modify the binding aggregate function in order by (#32758)
Modify the binding logic so that ORDER BY behaves the same as in MySQL when the sort's child is an aggregate.
When an ORDER BY expression contains an aggregate function, every slot in that expression should bind to the LogicalAggregate's non-aggregate-function outputs first, and then to the outputs of the LogicalAggregate's child.
e.g.
select 2*abs(sum(c1)) as c1, c1,sum(c1)+c1 from t_order_by_bind_priority group by c1 order by sum(c1)+c1 asc;
In this SQL, both occurrences of c1 in the ORDER BY clause bind to the c1 column of t_order_by_bind_priority.
2024-04-10 15:26:09 +08:00
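The binding priority described in commit 5e59c09a60 can be pictured as a two-level name lookup. The following is a toy model in Python, not the Nereids implementation: `bind_slot`, the scope dictionaries, and the extra column `c2` are hypothetical names introduced only for illustration.

```python
def bind_slot(name, agg_non_agg_outputs, agg_child_outputs):
    """Resolve a slot name for an ORDER BY expression that contains an aggregate.

    The aggregate's non-aggregate-function outputs are searched first;
    the aggregate child's outputs are only a fallback.
    """
    scopes = (
        ("aggregate_output", agg_non_agg_outputs),
        ("aggregate_child", agg_child_outputs),
    )
    for scope_name, scope in scopes:
        if name in scope:
            return scope_name, scope[name]
    raise KeyError(f"cannot bind slot {name!r}")


# Outputs of the aggregate that do not contain an aggregate function.
# The alias `2*abs(sum(c1)) AS c1` is excluded because it wraps sum().
non_agg_outputs = {"c1": "t_order_by_bind_priority.c1"}

# Outputs of the aggregate's child (the table scan); c2 is a made-up column.
child_outputs = {
    "c1": "t_order_by_bind_priority.c1",
    "c2": "t_order_by_bind_priority.c2",
}

first_choice = bind_slot("c1", non_agg_outputs, child_outputs)
fallback = bind_slot("c2", non_agg_outputs, child_outputs)
```

In this sketch `c1` is found in the first scope, while a name that only exists below the aggregate falls through to the child scope.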
67bb519613 [Fix](nereids) forward the user define variables to master (#33013) 2024-04-10 15:26:08 +08:00
2e40e39584 [chore](spill) add timers for performance tuning (#33185) 2024-04-10 15:26:08 +08:00
Pxl
8fd6d4c41b [Chore](build) add -Wconversion and remove some unused code (#33127)
add -Wconversion and remove some unused code
2024-04-10 15:26:08 +08:00
6798a24a27 [Enhencement](Nereids) reduce child output rows if agg child is literal (#32188)
with group by:
select max(1) from t1 group by c1; -> select 1 from (select c1 from t1 group by c1);
without group by:
select max(1) from t1; -> select max(1) from (select 1 from t1 limit 1) tmp;
2024-04-10 15:26:08 +08:00
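The two rewrites listed in commit 6798a24a27 can be checked for equivalence on a small table. This sketch uses Python's built-in `sqlite3` module as a stand-in engine, which is an assumption made for convenience: it only shows that the original and rewritten queries return the same rows, and says nothing about how Doris plans them. The helper `compare` and the sample data are hypothetical.

```python
import sqlite3


def compare(values):
    """Run each original/rewritten pair against a fresh table t1(c1)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t1 (c1 INTEGER)")
    conn.executemany("INSERT INTO t1 VALUES (?)", [(v,) for v in values])

    def run(sql):
        return conn.execute(sql).fetchall()

    result = {
        "group_by": (
            run("SELECT max(1) FROM t1 GROUP BY c1"),
            run("SELECT 1 FROM (SELECT c1 FROM t1 GROUP BY c1)"),
        ),
        "no_group_by": (
            run("SELECT max(1) FROM t1"),
            run("SELECT max(1) FROM (SELECT 1 FROM t1 LIMIT 1) tmp"),
        ),
    }
    conn.close()
    return result


populated = compare([1, 1, 2, 3])
empty = compare([])
```

The empty-table case suggests why the rewrite without GROUP BY keeps `max(1)` on the outside: an aggregate without GROUP BY still returns one row (NULL) on empty input, whereas a bare `select 1 ... limit 1` would return no rows.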
0ab8b57db7 [enhance](mtmv)support create mtmv with other mtmv (#32984) 2024-04-10 15:26:08 +08:00
77ad3f6a19 [feature](hive)Get updated information from coordinate and commit (#32441) (#33466)
issue: #31442
1. Get updated information from coordinate and commit
2. refresh table after commit

Co-authored-by: wuwenchi <wuwenchihdu@hotmail.com>
2024-04-10 15:07:18 +08:00
9bc7902e5a [fix](Nereids) fix bind group by int literal (#33117)
The SQL below fails because:

    the 2 in the GROUP BY is bound to `1 AS col2` in BindExpression
    ResolveOrdinalInOrderByAndGroupBy then replaces that 1 with MIN (LENGTH (cast(age as varchar)))
    CheckAnalysis throws an exception because GROUP BY cannot contain an aggregate function

select MIN (LENGTH (cast(age as varchar))), 1 AS col2
from test_bind_groupby_slots
group by 2

we should move ResolveOrdinalInOrderByAndGroupBy into BindExpression

(cherry picked from commit 3fab4496c3fefe95b4db01f300bf747080bfc3d8)
2024-04-10 14:59:46 +08:00
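The failure described in commit 9bc7902e5a comes from interpreting an integer as a GROUP BY ordinal twice. The following toy model in Python illustrates that, under heavy simplification: expressions are plain strings, integer literals are Python ints, and `resolve_ordinal`, `two_pass`, and `single_pass` are hypothetical helpers rather than Nereids classes.

```python
AGG = "MIN(LENGTH(CAST(age AS VARCHAR)))"
SELECT_LIST = [AGG, 1]  # the second item models the literal `1 AS col2`
GROUP_BY = [2]          # models `GROUP BY 2`


def resolve_ordinal(key, select_list):
    """Replace an integer key with the select item at that 1-based position."""
    if isinstance(key, int):
        return select_list[key - 1]
    return key


def two_pass(group_by, select_list):
    """Model of the bug: the result of the first resolution is resolved again."""
    bound = [resolve_ordinal(k, select_list) for k in group_by]   # 2 -> 1
    return [resolve_ordinal(k, select_list) for k in bound]       # 1 -> AGG


def single_pass(group_by, select_list):
    """Model of the fix: ordinals are resolved exactly once, while binding."""
    return [resolve_ordinal(k, select_list) for k in group_by]


buggy_keys = two_pass(GROUP_BY, SELECT_LIST)
fixed_keys = single_pass(GROUP_BY, SELECT_LIST)
```

In the two-pass model the GROUP BY key ends up as the aggregate expression, which is what the analysis check rejects; in the single-pass model the key stays the literal that `col2` stands for.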
cc363f26c2 [fix](Nereids) fix group concat (#33091)
Fix the failure in regression_test/suites/query_p0/group_concat/test_group_concat.groovy

select
group_concat( distinct b1, '?'), group_concat( distinct b3, '?')
from
table_group_concat
group by
b2

exception:

lowestCostPlans with physicalProperties(GATHER) doesn't exist in root group

The root cause is that NormalizeAggregate pushes the literal '?' down to a slot; AggregateStrategies then treats that slot as a distinct parameter and generates an invalid PhysicalHashAggregate, which is rejected by ChildOutputPropertyDeriver.

This fix avoids pushing literals down to slots in NormalizeAggregate and forbids generating a stream aggregate node when the group-by slots are empty.
2024-04-10 14:59:46 +08:00
38d580dfb7 [fix](Nereids) fix link children failed (#33134)
#32617 introduced a bug: the rewrite may not work when a plan's arity is >= 3.
This PR fixes it.

(cherry picked from commit 8b070d1a9d43aa7d25225a79da81573c384ee825)
2024-04-10 14:59:45 +08:00
ff990eb869 [enhancement](Nereids) refactor expression rewriter to pattern match (#32617)
This PR improves the performance of the Nereids planner in the plan stage.

1. Refactor the expression rewriter to pattern matching, so that the many expression rewrite rules can be applied criss-cross in one big bottom-up iteration, rewriting until the expression becomes stable. This handles more cases, because originally there was no loop and some rules, like `SimplifyArithmeticRule`, only processed the top expression.
2. Replace `Collection.stream()` with `ImmutableXxx.Builder` to avoid useless method calls.
3. Unroll loops in some code, like `Expression.<init>` and `PlanTreeRewriteBottomUpJob.pushChildrenJobs`.
4. Use type/arity-specific code, like `OneRangePartitionEvaluator.toNereidsLiterals()`, `PartitionRangeExpander.tryExpandRange()`, and `PartitionRangeExpander.enumerableCount()`.
5. Refactor `ExtractCommonFactorRule` so that it can extract more cases, and fix the dead loop that occurred when `ExtractCommonFactorRule` and `SimplifyRange` were used in one iteration, because `SimplifyRange` generates a right-deep tree but `ExtractCommonFactorRule` generates a left-deep tree.
6. Refactor `FoldConstantRuleOnFE` to support both visitor and pattern-match modes: in ExpressionNormalization, pattern matching can be applied criss-cross with other rules; in PartitionPruner, the visitor can evaluate expressions faster.
7. Lazily compute and cache some operations.
8. Use an int field to compare dates.
9. Use a BitSet to find disableNereidsRules.
10. A two-level loop is usually faster than building a Multimap when binding slots in Scope, so that code is reverted.
11. `PlanTreeRewriteBottomUpJob` no longer needs to clearStatePhase.
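
The first item above, applying many rules in one bottom-up pass until the expression is stable, can be illustrated with a toy rewriter. This is a minimal Python sketch and not the Nereids pattern-match framework: expressions are nested tuples such as `("+", left, right)`, and the three rules and the `rewrite` function are hypothetical.

```python
def fold_add(expr):
    """Fold `a + b` when both operands are integer constants."""
    if (isinstance(expr, tuple) and expr[0] == "+"
            and isinstance(expr[1], int) and isinstance(expr[2], int)):
        return expr[1] + expr[2]
    return expr


def drop_add_zero(expr):
    """Simplify `x + 0` and `0 + x` to `x`."""
    if isinstance(expr, tuple) and expr[0] == "+":
        if expr[2] == 0:
            return expr[1]
        if expr[1] == 0:
            return expr[2]
    return expr


def drop_mul_one(expr):
    """Simplify `x * 1` and `1 * x` to `x`."""
    if isinstance(expr, tuple) and expr[0] == "*":
        if expr[2] == 1:
            return expr[1]
        if expr[1] == 1:
            return expr[2]
    return expr


RULES = [fold_add, drop_add_zero, drop_mul_one]


def rewrite(expr):
    # Bottom-up: stabilize the children first.
    if isinstance(expr, tuple):
        expr = (expr[0],) + tuple(rewrite(child) for child in expr[1:])
    # Then keep applying rules at this node until none of them changes it.
    while True:
        for rule in RULES:
            rewritten = rule(expr)
            if rewritten != expr:
                expr = rewritten
                break
        else:
            return expr


# (x + (1 + -1)) * (0 + 1)
simplified = rewrite(("*", ("+", "x", ("+", 1, -1)), ("+", 0, 1)))
```

Here folding `1 + -1` to `0` is what lets `drop_add_zero` fire on the parent, and folding `0 + 1` is what lets `drop_mul_one` fire at the root, which is the kind of interaction between rules that a single non-looping pass over the top expression would miss.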

### test case
100 threads continuously send this SQL in parallel against an empty table; tested on my Mac (M2 chip, 8 cores) with the SQL cache enabled.
```sql
select  count(1),date_format(time_col,'%Y%m%d'),varchar_col1
from tbl
where  partition_date>'2024-02-15'  and (varchar_col2 ='73130' or varchar_col3='73130') and time_col>'2024-03-04'
  and  time_col<'2024-03-05'
group by date_format(time_col,'%Y%m%d'),varchar_col1
order by date_format(time_col,'%Y%m%d') desc, varchar_col1 desc,count(1) asc
limit 1000
```

before this pr: 3100 peak QPS, about 2700 avg QPS
after this pr: 4800 peak QPS, about 4400 avg QPS
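
For reference, the relative gain implied by these figures works out to roughly 1.55x at peak and 1.63x on average. The short Python calculation below uses only the four QPS numbers quoted above; the variable names are illustrative.

```python
# QPS figures as reported in the commit message.
before = {"peak": 3100, "avg": 2700}
after = {"peak": 4800, "avg": 4400}

# Ratio of QPS after the PR to QPS before it.
speedup = {kind: after[kind] / before[kind] for kind in before}

for kind, ratio in speedup.items():
    print(f"{kind}: {ratio:.2f}x")
```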

(cherry picked from commit 7338683fdbdf77711f2ce61e580c19f4ea100723)
2024-04-10 14:59:45 +08:00
6c5dd820c0 [improvement](spill) improve spill timers (#33156) 2024-04-10 14:55:11 +08:00
7f2fdf78ac [Enhancement](inverted index) set need to read data only when delete predicate contains the column (#33172) 2024-04-10 14:53:56 +08:00
c61d6ad1e2 [Feature] support function uuid_to_int and int_to_uuid #33005 2024-04-10 14:53:56 +08:00
bf022f9d8d [enhancement](function truncate) truncate can use column as scale argument (#32746)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2024-04-10 14:53:56 +08:00
a69f3eb870 [fix](fe) partitionInfo is null, fe can not start (#33108) 2024-04-10 14:53:56 +08:00
8b1d174b13 [Optimize] Move strings_pool from individual tree nodes to the tree itself (#33089)
Previously, strings_pool was allocated within each tree node. However, due to the Arena's alignment of allocated chunks to at least 4K, this allocation size was excessively large for a single tree node. Consequently, when there are numerous nodes within the SubcolumnTree, a significant portion of memory was wasted. Moving strings_pool to the tree itself optimizes memory usage and reduces wastage, improving overall efficiency.
2024-04-10 14:53:56 +08:00
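The memory effect described in commit 8b1d174b13 can be estimated with back-of-the-envelope arithmetic. The sketch below is a simplified model in Python and rests on several assumptions: each per-node pool is taken to hold exactly one minimum-size chunk, the shared pool is taken to grow in 4096-byte chunks, and the node count and bytes-per-node figures are invented for illustration. Only the "at least 4K" chunk size comes from the commit message.

```python
ARENA_MIN_CHUNK = 4096  # "at least 4K" per the commit message


def bytes_with_pool_per_node(node_count):
    """Each node owns a pool, so each node pays for at least one chunk."""
    return node_count * ARENA_MIN_CHUNK


def bytes_with_pool_per_tree(node_count, bytes_per_node):
    """One shared pool: chunks are filled before a new one is needed."""
    needed = node_count * bytes_per_node
    chunks = max(1, -(-needed // ARENA_MIN_CHUNK))  # ceiling division
    return chunks * ARENA_MIN_CHUNK


# Hypothetical workload: 100,000 nodes that each store about 32 bytes.
per_node_layout = bytes_with_pool_per_node(100_000)
per_tree_layout = bytes_with_pool_per_tree(100_000, 32)
```

Under these assumed numbers the per-node layout reserves about 410 MB while the shared pool needs about 3.2 MB; real savings depend on the actual node count, string sizes, and the arena's growth policy.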
02b24abed2 [Fix](Nereids) ntile function should check argument (#32994)
Problem:
When ntile is called with 0 as its parameter, the BE core dumps because the parameter is not checked.
Solution:
Check the parameter during FE analysis.
2024-04-10 14:53:56 +08:00
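The idea behind commit 02b24abed2, validating the ntile argument before execution, can be sketched as follows. This Python version is illustrative only: `check_ntile_argument` and `ntile` are hypothetical functions, and rejecting every non-positive value (rather than only 0, the case named in the commit) is an assumption.

```python
def check_ntile_argument(buckets):
    """Reject bucket counts that the execution step cannot handle."""
    if isinstance(buckets, bool) or not isinstance(buckets, int):
        raise TypeError("ntile() bucket count must be an integer")
    if buckets <= 0:
        raise ValueError(f"ntile() bucket count must be positive, got {buckets}")
    return buckets


def ntile(row_count, buckets):
    """Return the bucket number (1-based) for each of row_count ordered rows."""
    check_ntile_argument(buckets)
    # Without the check above, buckets == 0 would fail here with a division error.
    base, extra = divmod(row_count, buckets)
    assignment = []
    for bucket in range(1, buckets + 1):
        # The first `extra` buckets receive one additional row.
        size = base + (1 if bucket <= extra else 0)
        assignment.extend([bucket] * size)
    return assignment


five_rows_two_buckets = ntile(5, 2)
```

Placing the check in front of the computation mirrors the commit's approach of failing during analysis with a clear error instead of crashing during execution.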