Commit Graph

7105 Commits

Author SHA1 Message Date
88e02c836d [Fix]Fix insert select missing audit log when connect follower FE (#36481)
## Proposed changes

pick #36472
2024-06-20 15:16:16 +08:00
c5bb0e3a21 [bug](prepared statement) fix prepared statement throw exception when inserting null value (#36484)
## Proposed changes

bp #36426

2024-06-20 11:31:59 +08:00
9c1f34359d [fix](point query) should check it is Slot before check it is DELETE_SIGN (#36566)
pick from master #36564

introduced by #36443
2024-06-20 10:29:21 +08:00
7b36e81b7a [fix](split) FileSystemCacheKey are always different in overload equals (#36431)
bp: #36432
## Proposed changes

## Fixed bugs introduced by #34307
1. `FileSystemCacheKey.equals()` compares properties by `==`, so a new
file system is created for each partition.
2. `dfsFileSystem` is not synchronized, so more file systems are created
than needed.
3. `jobConf.iterator()` will produce more than 2000 key-value pairs.
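The first bug above can be illustrated with a minimal sketch. `CacheKeySketch`, its fields, and the helper method are hypothetical stand-ins written for illustration, not the actual `FileSystemCacheKey` code. The sketch contrasts comparing a properties map by reference (`==`) with comparing it by content, and shows that content equality lets a map-based cache reuse one entry.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for a file system cache key; not the actual Doris class.
final class CacheKeySketch {
    private final String fsIdent;
    private final Map<String, String> properties;

    CacheKeySketch(String fsIdent, Map<String, String> properties) {
        this.fsIdent = fsIdent;
        this.properties = properties;
    }

    // Buggy variant: `==` is a reference comparison, so two keys built from
    // separate but identical maps never compare equal.
    boolean equalsByReference(CacheKeySketch other) {
        return fsIdent.equals(other.fsIdent) && properties == other.properties;
    }

    // Content-based comparison: equal maps give equal keys.
    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof CacheKeySketch)) {
            return false;
        }
        CacheKeySketch other = (CacheKeySketch) o;
        return fsIdent.equals(other.fsIdent)
                && Objects.equals(properties, other.properties);
    }

    // hashCode must agree with equals so hash-based caches find the entry.
    @Override
    public int hashCode() {
        return Objects.hash(fsIdent, properties);
    }

    // Simulates two lookups with equal-content keys and returns the cache size.
    static int cacheSizeForTwoEqualKeys() {
        Map<CacheKeySketch, String> cache = new HashMap<>();
        Map<String, String> p1 = new HashMap<>();
        p1.put("fs.defaultFS", "hdfs://ns1");
        Map<String, String> p2 = new HashMap<>(p1); // separate object, same content
        cache.computeIfAbsent(new CacheKeySketch("hdfs://ns1", p1), k -> "fs-A");
        cache.computeIfAbsent(new CacheKeySketch("hdfs://ns1", p2), k -> "fs-B");
        return cache.size();
    }
}
```

With content-based `equals` and `hashCode`, the second lookup hits the entry created by the first, so only one cached value exists.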
2024-06-20 10:08:05 +08:00
c1f15f7e4c [fix](catalog) fix wrong check when using "use_meta_cache=true" (#36533)
bp #36530
2024-06-19 18:03:03 +08:00
f59dc4fb37 [opt](split) generate and get split batch concurrently (#36044)
bp #36045, and turn on batch split, which was turned off in #36109.
Generate and get split batches concurrently.
Remove the synchronization from `SplitSource.getNextBatch` so that callers get their splits concurrently, and have `SplitAssignment` generate splits asynchronously.
2024-06-19 16:16:02 +08:00
9c896efe0b [fix](race) fix access colocate group ids race #36444 (#36501)
cherry pick from #36444
2024-06-19 15:38:50 +08:00
349b943e12 [opt](Nereids) Optimize Join Penalty Calculation Based on Build Side Data Volume (#36107)
pick from master #35773

This PR introduces an optimization that adjusts the penalty applied
during join operations based on the volume of data on the build side.
Specifically, when the number of rows and width of the tables being
joined are equal, the materialization costs are now considered more
accurately. The update ensures that joins with a larger dataset on the
build side incur a higher penalty, improving overall query performance
and resource allocation.
2024-06-19 14:49:09 +08:00
1e54a5a66e [Fix](Nereids) fix leading with brace can not generate correct plan (#36328)
cherry-pick #36193

Problem:
When using a leading hint such as:
leading(t1 {t2 t3} {t4 t5} t6)
the correct plan is not generated, because the level list cannot express
enough information about the braces.
Solution:
Remove the level-list representation of leading levels and use a reverse
Polish expression instead.
Algorithm:
leading(t1 {t2 t3} {t4 t5} t6)
==>
stack, top to bottom: (t1 t2 t3 join join t4 t5 join t6 join)
When generating the leading join, pop items from the stack: when the item
is a table, make a logical scan; when it is a join operator, make a
logical join and push it back onto the stack.
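The stack-based idea can be sketched as a standard postfix (reverse Polish) evaluation. This is an illustration under assumptions, not the PR's implementation: `LeadingPostfixSketch`, `buildJoinTree`, and the `"join"` marker are hypothetical names, and the join tree is rendered as a string instead of real plan nodes. The token list in `main` is a well-formed postfix encoding chosen for this sketch (five joins for six tables) and is not copied from the commit message.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch: evaluate a postfix token list into a join tree string.
final class LeadingPostfixSketch {
    // Hypothetical marker for a join operator in the token list.
    static final String JOIN = "join";

    static String buildJoinTree(List<String> postfix) {
        Deque<String> stack = new ArrayDeque<>();
        for (String token : postfix) {
            if (JOIN.equals(token)) {
                if (stack.size() < 2) {
                    throw new IllegalArgumentException("join needs two operands");
                }
                // The most recently pushed operand is the right child.
                String right = stack.pop();
                String left = stack.pop();
                stack.push("(" + left + " JOIN " + right + ")");
            } else {
                // A table name stands in for a logical scan.
                stack.push(token);
            }
        }
        if (stack.size() != 1) {
            throw new IllegalArgumentException("malformed postfix expression");
        }
        return stack.pop();
    }

    public static void main(String[] args) {
        List<String> postfix = Arrays.asList(
                "t1", "t2", "t3", "join", "join",
                "t4", "t5", "join", "join",
                "t6", "join");
        System.out.println(buildJoinTree(postfix));
    }
}
```

Running `main` prints `(((t1 JOIN (t2 JOIN t3)) JOIN (t4 JOIN t5)) JOIN t6)`: each braced group becomes its own subtree, and the groups are joined left to right.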
2024-06-19 14:47:55 +08:00
3c952c75be [refactor](Nereids) New expression extractor for partitions pruning (#36407)
cherry pick from #36326

TryEliminateUninterestedPredicates throws an exception for this case:

 CREATE TABLE `tbltest` (
  `id` INT NULL,
  `col2` VARCHAR(255) NULL,
  `col3` VARCHAR(255) NULL,
  `dt` DATE NULL
) ENGINE=OLAP
DUPLICATE KEY(`id`, `col2`)
PARTITION BY RANGE(`dt`)
(PARTITION p20240617 VALUES [('2024-06-17'), ('2024-06-18')))
DISTRIBUTED BY HASH(`id`) BUCKETS 10
PROPERTIES (
"replication_allocation" = "tag.location.default: 1"
);

select * from tbltest
where
  case
    when col2 = 'xxx' and col3='yyy' then false -- note this is not about partition column
    when col2 in ('xxx') then false
    when col2 like 'xxx%' then false
    else true
  end

`CaseWhen` requires its children to be `WhenClause`s, but TryEliminateUninterestedPredicates may rewrite a `WhenClause` into a true/false predicate, which causes this exception:

ERROR 1105 (HY000): errCode = 2, detailMessage = The children format needs to be [WhenClause+, DefaultValue?]

The original extractor (TryEliminateUninterestedPredicates.java) caused some errors while trying to derive the expressions that can be used for pruning partitions.
I wrote a new extractor (with unit tests) for pruning partitions; it is simpler and, I think, more reliable.

The theory behind the extractor is simple:
A: Sort the expressions into two kinds:
  1. evaluable-expression (let's mark it as E).
    Expressions that can be evaluated in the partition pruning stage.
    In other words: it contains neither non-partition slots nor non-deterministic expressions.
  2. un-evaluable-expression (let's mark it as UE).
    Expressions that can NOT be evaluated in the partition pruning stage.
    In other words: it contains non-partition slots or non-deterministic expressions.

B: Traverse the predicate, looking only at the AND and OR operators, and apply the following rules:
  (E and UE) -> (E and TRUE) -> E
  (UE and UE) -> TRUE
  (E and E) -> (E and E)
  (E or UE) -> TRUE
  (UE or UE) -> TRUE
  (E or E) -> (E or E)
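The rewrite rules above can be sketched on a toy expression tree. Everything here is a hypothetical illustration, not the extractor added by the PR: `PruneExtractorSketch`, `Leaf`, `And`, `Or`, and `extract` are invented names, and each leaf simply carries a flag saying whether it is evaluable (E) or not (UE). Any node other than AND/OR is assumed to be modeled as a single leaf.

```java
// Hypothetical sketch of the E/UE rewrite rules on a toy expression tree.
final class PruneExtractorSketch {
    abstract static class Expr {}

    // A leaf predicate; `evaluable` marks it as E (true) or UE (false).
    static final class Leaf extends Expr {
        final String text;
        final boolean evaluable;

        Leaf(String text, boolean evaluable) {
            this.text = text;
            this.evaluable = evaluable;
        }

        @Override
        public String toString() {
            return text;
        }
    }

    static final class And extends Expr {
        final Expr left;
        final Expr right;

        And(Expr left, Expr right) {
            this.left = left;
            this.right = right;
        }

        @Override
        public String toString() {
            return "(" + left + " AND " + right + ")";
        }
    }

    static final class Or extends Expr {
        final Expr left;
        final Expr right;

        Or(Expr left, Expr right) {
            this.left = left;
            this.right = right;
        }

        @Override
        public String toString() {
            return "(" + left + " OR " + right + ")";
        }
    }

    // Sentinel that replaces anything that cannot be used for pruning.
    static final Leaf TRUE = new Leaf("TRUE", true);

    static Expr extract(Expr e) {
        if (e instanceof Leaf) {
            // UE -> TRUE, E stays as it is.
            return ((Leaf) e).evaluable ? e : TRUE;
        }
        if (e instanceof And) {
            Expr l = extract(((And) e).left);
            Expr r = extract(((And) e).right);
            if (l == TRUE) {
                return r; // (UE and E) -> E, (UE and UE) -> TRUE
            }
            if (r == TRUE) {
                return l; // (E and UE) -> E
            }
            return new And(l, r); // (E and E) -> (E and E)
        }
        Or or = (Or) e;
        Expr l = extract(or.left);
        Expr r = extract(or.right);
        if (l == TRUE || r == TRUE) {
            return TRUE; // (E or UE) -> TRUE, (UE or UE) -> TRUE
        }
        return new Or(l, r); // (E or E) -> (E or E)
    }
}
```

For example, `dt = '2024-06-17' AND (col2 = 'xxx' OR dt < '2024-06-18')` reduces to `dt = '2024-06-17'`: the OR contains a UE operand and collapses to TRUE, and the AND then keeps only its evaluable side.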
2024-06-19 14:47:26 +08:00
38d750a7e0 [Fix](Row Store) all filter should match key columns condition (#36400) (#36443)
Previously, queries like `select * from tbl` would pass the
`LogicalResultSinkToShortCircuitPointQuery` rule.
Introduced by #35823
2024-06-19 14:06:53 +08:00
bdba954e1f [Fix](nereids)make agg output unchanged after normalized repeat (#36367)
cherry-pick #36207 to branch-2.1

Co-authored-by: feiniaofeiafei <moailing@selectdb.com>
2024-06-19 12:23:56 +08:00
97ac46d2be [fix](mtmv) Mapping materialization statistics's expressionToColumnStats to mv scan plan based (#36058)
bp #35749
2024-06-19 11:25:13 +08:00
74162a1b7e [enhancement](prepared statement) Handle unsigned numeric type in prepare statement (#36388)
## Proposed changes

Issue Number: bp #36133

2024-06-18 19:33:12 +08:00
Pxl
dda25cceb6 [Bug](information-schema) fix some bug of information_schema.PROCESSLIST (#36447)
## Proposed changes
pick from #36409
2024-06-18 16:45:48 +08:00
e2350403a6 [fix](plan) fix wrong result for random distributed agg table with all keys not null (#36271) 2024-06-18 11:25:31 +08:00
4ae8607b2e [fix](hudi) disable fs.impl.cache to avoid FE OOM (#36402) (#36403)
bp #36402
2024-06-17 22:20:23 +08:00
4008a04da7 [bugfix](paimon)Fix field case issues for 2.1 (#36288)
bp:  #36239
2024-06-17 18:38:00 +08:00
98fccb1809 [improvement](build index)Make build index and clone mutually exclusive and add timeout for index change job (#36293)
Currently the index change job and the clone task can be executed at the
same time. If the clone task gets stuck at this point, the index change
job gets stuck as well and keeps retrying. To solve this problem, we
follow the alter job's approach and make the index change job mutually
exclusive with the clone task, and we introduce a timeout to prevent
infinite retries of build index.

Add the following checks and status in FE.
1. Check whether the table is stable (build index is not allowed while a
clone is in progress).
1.1. The tablet is HEALTHY.
1.2. The tablet is not in the tablet scheduler; if it is, the tablet is
currently being cloned.
2. When creating the index change job, set the timeout at the same time.
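The two checks can be sketched as follows. This is a minimal illustration with hypothetical names (`IndexChangeJobSketch`, `TabletStatus`, `isTableStable`, `isTimedOut`) and plain collections in place of the FE's real tablet metadata and scheduler; it only shows the shape of the stability check and the timeout check.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the stability and timeout checks; not the FE code.
final class IndexChangeJobSketch {
    // Hypothetical tablet health states.
    enum TabletStatus { HEALTHY, UNHEALTHY }

    private final long createTimeMs;
    private final long timeoutMs;

    // The timeout is fixed when the job is created.
    IndexChangeJobSketch(long createTimeMs, long timeoutMs) {
        this.createTimeMs = createTimeMs;
        this.timeoutMs = timeoutMs;
    }

    // A job that has run longer than its timeout should stop retrying.
    boolean isTimedOut(long nowMs) {
        return nowMs - createTimeMs > timeoutMs;
    }

    // The table is stable only if every tablet is HEALTHY and none of them
    // is queued in the scheduler (which would mean a clone is in progress).
    static boolean isTableStable(Map<Long, TabletStatus> tabletStatus,
                                 Set<Long> tabletsInScheduler) {
        for (Map.Entry<Long, TabletStatus> entry : tabletStatus.entrySet()) {
            if (entry.getValue() != TabletStatus.HEALTHY) {
                return false;
            }
            if (tabletsInScheduler.contains(entry.getKey())) {
                return false;
            }
        }
        return true;
    }
}
```

In this sketch a build index request would be rejected whenever `isTableStable` returns false, and a running job would be cancelled once `isTimedOut` returns true.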

pick from master #35724
2024-06-16 09:34:32 +08:00
55b4cf1658 [fix](load) fix NPE in LoadManager#jobRemovedTrigger() (#36173) (#36337)
cherry-pick #36173
2024-06-15 23:06:31 +08:00
bfab7a2537 [fix](shuffle) fix tablets num calculation in shuffle condition (#36050) (#36339)
cherry-pick #36050
2024-06-15 23:06:00 +08:00
bfb41c15de [fix](statistics)Fix sync analyze job timeout block bug. (#36199)
Fix the sync analyze job timeout blocking bug. When a task of an analyze
job times out, it should throw an exception instead of finishing silently.
2024-06-14 09:47:51 +08:00
a23aee2883 [fix](broker) fix no error url when broker data quality error (#35643) (#36089)
## Proposed changes

cherry-pick from #35643
2024-06-14 09:29:14 +08:00
e2f7e0da0a [Fix](nereids) fix merge aggregate rule, rules should not have mutable members (#36223)
cherry-pick #36145  to branch-2.1
2024-06-13 17:49:57 +08:00
d70751a808 [fix](planner)remove constant expr in window function's partition and order exprs (#36185)
pick from master https://github.com/apache/doris/pull/36184
2024-06-13 15:05:21 +08:00
e51cd58d6e [fix](clone) fix check replica failed due to replica had drop #35994 (#36219)
cherry pick from #35994
2024-06-13 13:39:09 +08:00
375770f2b4 [fix](hudi) move wrong members in HMSExternalTable (#36187)
Previously, there were two members, TableScanParams and IncrementalRelation,
in HMSExternalTable.
These two members are for Hudi's incremental query, so their lifecycle
should follow the query task, and they should not be saved in
HMSExternalTable.

This PR mainly changes:

- Add LogicalHudiScan and PhysicalHudiScan, which extend LogicalFileScan
and PhysicalFileScan.
- Move TableScanParams and IncrementalRelation from HMSExternalTable to
XXXHudiScan.
- Add or modify related Nereids rules.
2024-06-13 11:50:40 +08:00
226775f059 [Feature](Point Query) fully support in nereids #35823 (#36205) 2024-06-13 08:37:31 +08:00
3a3c8cd9ee [cherry-pick](branch-2.1) fix inverted index format is lost during a schema change #36059 (#36100) 2024-06-12 23:06:51 +08:00
6d54527395 [fix](dynamic partition) fix dynamic partition thread met uncatch exception #35778 (#36166)
cherry pick from #35778
2024-06-12 22:16:51 +08:00
9708ca8fcb [Feature](Prepared Statment) Implement in nereids planner (#35318) (#36172) 2024-06-12 19:54:17 +08:00
0b28420e1c [pick](Variant) make remote schema fetch rpc timeout configurable (#35296) (#36174) 2024-06-12 19:51:53 +08:00
c78c7f6b45 [branch-2.1](test) fix some tests in external p0 (#36127)
Also move the analysis exception "Not support insert with partition
spec in hive catalog." from the create sink phase to the bind sink
phase, so that the returned error is correct when
`set enable_fallback_to_original_planner=false;`.
2024-06-11 22:15:28 +08:00
acbfcf7ad9 [fix](Nereids) fix four phase aggregation compute wrong result (#36131)
cherry pick from #36128
2024-06-11 20:40:18 +08:00
d2a6911791 [opt](split) close the batch mode of file split in default (#36109)
bp: #36108
2024-06-11 19:19:09 +08:00
3b23eee37c Revert "[fix](auto-partition) fix auto partition load lost data in multi sender (#35287)" (#36098)
Reverts apache/doris#35630 because it introduced more damaging bugs.
We will fix it and merge it in the next version.
2024-06-11 17:11:42 +08:00
afe2c57e05 [Fix](explain) fix tablet showing problem (#35830) (#36028)
cherry-pick: #35830
2024-06-11 10:55:35 +08:00
75a6f28f2e [cherry-pick]Add query type when report (#35918)
pick #34978
2024-06-11 10:51:59 +08:00
936bf65622 [fix](nereids)decimal and datetime literal comparison should compare datatype too (#36064)
pick from master #36055
2024-06-08 22:01:37 +08:00
9e972cb0b9 [bugfix](iceberg)Fix the datafile path error issue for 2.1 (#36066)
bp: #35957
2024-06-08 21:51:46 +08:00
075481faf1 [opt](Nereids) use date signature for date arithmetic as far as possible (#36060)
pick from master #35863
2024-06-08 09:05:34 +08:00
16fcdcd4b7 [fix](Nereids) not do distinct when aggregate with distinct project (#36057)
pick from master #35899
2024-06-08 09:04:56 +08:00
240d8938f8 [bugfix](iceberg)Fixed missing type of iceberg table for timetravel for 2.1 (#36048)
bp:  #36047
2024-06-07 21:13:56 +08:00
5148c3908e [fix](mtmv)fix mtmv show partition error when base table dropped (#35729) (#36051)
bp #35729
2024-06-07 21:09:41 +08:00
67f4d88988 [enhancement](Nereids) support 4 phases distinct aggregate with full distribution (#36016)
cherry pick from #35871
2024-06-07 21:08:33 +08:00
19bc98a11a [pick 2.1 ][fix ut][fix](inverted index) cloud mode supports lowercase (#32841) (#36034)
pick from master #32841 
Co-authored-by: zzzxl <33418555+zzzxl1993@users.noreply.github.com>
2024-06-07 17:08:29 +08:00
a518915626 [fix](pipeline) Do not push data in local exchange if eos (#35972) (#36010)
pick #35972 and #34536
2024-06-07 15:40:55 +08:00
9f3fe3e57c [fix](DDL) not set table type as default comment when create table (#36025)
pick from master #35855
2024-06-07 15:29:10 +08:00
f751ca4e04 [branch-2.1](functions) fix be crash for function random_bytes and mark_first/last_n (#36003)
pick #35884
2024-06-07 10:30:41 +08:00
c794ea18c8 [fix](multi-catalog)put java udf to custom lib (#35984)
bp #34990
2024-06-06 22:54:24 +08:00