Commit Graph

19109 Commits

Author SHA1 Message Date
c8e4c404fa [Fix]check if fe set thrift field current_connect_fe (#36681)
bp #36678
2024-06-21 22:15:25 +08:00
c939781411 [Pick 2.1](inverted index) fix wrong no need read data when need_remaining_after_evaluate (#36684)
An equal predicate on a column that has an inverted index with a parser
requires remaining_after_evaluate. In this situation, the optimization
that skips reading the column's data cannot be applied.

## Proposed changes

From (#36637)
2024-06-21 22:01:39 +08:00
0cff539810 [feature](function) support new function replace_empty (#36283) (#36656)
#36283
2024-06-21 16:46:22 +08:00
c8f2a3f952 [fix](eq_for_null) fix incorrect logic in function eq_for_null #36004 (#36124)
cherry pick from #36004
cherry pick from #36164
2024-06-21 14:31:21 +08:00
8105dc7de8 [Pick 2.1](inverted index) fix wrong opt for pk no need read data (#36634)
## Proposed changes
 
Pick from #36618
2024-06-21 00:57:23 +08:00
58cc1dca7f [improve](fe) Support to config max msg/frame size of the thrift server (#36594)
Cherry-pick #35845
2024-06-21 00:15:15 +08:00
3febac1d91 [fix](connection) kill connection when meeting Write mysql packet failed error #36559 (#36616)
bp #36559
2024-06-20 22:27:01 +08:00
c28c243c98 [Fix](Variant) forbit create variant as key #36555 (#36578) 2024-06-20 20:33:48 +08:00
a79b56ac23 [chore](be) Support config max message size for be thrift server (#36595)
Cherry-pick #36467
2024-06-20 20:15:43 +08:00
b3dcfae864 [chore](be) Improve ingesting binlog error checking (#36596)
Cherry-pick #36487
2024-06-20 20:15:26 +08:00
26b1ef428a [branch-2.1](doris compose) fix docker start failed (#36534) 2024-06-20 20:14:17 +08:00
838af13001 [fix](auth)ldap set passwd need forward to master (#36436) (#36598)
pick from master: #36436
2024-06-20 18:35:37 +08:00
3ee259fc00 [branch-2.1][fix](jdbc catalog) fix jdbc mysql client match jsonb type (#36180)
bp #36177
2024-06-20 18:33:27 +08:00
ac0f6e75d2 [bugfix](iceberg)Read error when timestamp does not have time zone for 2.1 (#36435)
bp: #36141
2024-06-20 18:32:31 +08:00
22d37ba3fe [fix](auth)Auth support case insensitive (#36381) (#36557)
pick from: #36381
2024-06-20 18:31:30 +08:00
f7f7b2b738 [Enhancement](multi-catalog) Add more error msgs for wrong data types in orc and parquet reader. (#36580)
Backport #36417
2024-06-20 18:10:25 +08:00
fbcf63e1f5 [cherry-pick] (branch-2.1)fix variant index (#36577)
pick from master #36163
2024-06-20 17:57:26 +08:00
64a94e883d [fix](nereids)NullSafeEqualToEqual rule should keep <=> unchanged if it has none-literal child (#36523)
pick from master #36521

convert:
expr <=> null to expr is null
null <=> null to true
null <=> 1 to false
literal <=> literal to literal = literal ( 1 <=> 2 to 1 = 2 )
others are unchanged.
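
The conversion table above can be sketched roughly as follows. This is only an illustration in Python, not the Nereids rule itself: `Literal`, `Slot`, `rewrite_null_safe_equal` and the tuple-shaped results are hypothetical names, and treating `null <=> expr` the same as `expr <=> null` is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Literal:
    value: object  # None stands for SQL NULL (hypothetical modelling)


@dataclass(frozen=True)
class Slot:
    name: str  # a non-literal child, e.g. a column reference


def rewrite_null_safe_equal(left, right):
    """Rewrite `left <=> right` following the table in the commit message."""
    left_null = isinstance(left, Literal) and left.value is None
    right_null = isinstance(right, Literal) and right.value is None
    if left_null and right_null:
        return ("literal", True)           # null <=> null  ->  true
    if left_null or right_null:
        other = right if left_null else left
        if isinstance(other, Literal):
            return ("literal", False)      # null <=> 1     ->  false
        return ("is_null", other)          # expr <=> null  ->  expr is null
    if isinstance(left, Literal) and isinstance(right, Literal):
        return ("=", left, right)          # 1 <=> 2        ->  1 = 2
    return ("<=>", left, right)            # non-literal child: keep <=>
```

The last branch is the case this fix is about: as soon as one child is not a literal (and the other is not NULL), `<=>` stays as it is.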
2024-06-20 17:55:36 +08:00
bd47d5a681 [branch-2.1](auto-partition) Fix auto partition load failure in multi replica (#36586)
This PR:
1. picks #35630, which was previously reverted by #36098.
2. picks #36344 from master.

These two PRs fix existing bugs in auto partition load.

---------

Co-authored-by: Kaijie Chen <ckj@apache.org>
2024-06-20 17:51:18 +08:00
6df1a9ab75 [branch-2.1](auto-partition) fix auto partition expr change unexpected (#36345) (#36514)
pick #36345
2024-06-20 17:50:31 +08:00
cbaff8a700 [fix](nereids)change the decimal's precision and scale for cast(xx as decimal) (#36540)
pick from master #36316

The data type of the expression cast(xx as decimal) may be decimalv3 or
decimalv2, depending on the value of enable_decimal_conversion in the FE
conf file. If enable_decimal_conversion is true, the data type is
decimalv3(9, 0), whereas it was decimalv3(38, 9) in the 2.0 releases. This
PR changes the data type to match the 2.0 releases so the behavior stays consistent.
2024-06-20 17:46:11 +08:00
1a242b8ae0 [cherry-pick](branch2.1) fix week/yearweek function get wrong result (#36538)
## Proposed changes
cherry-pick from master #36000 #36159
2024-06-20 15:48:19 +08:00
88e02c836d [Fix]Fix insert select missing audit log when connect follower FE (#36481)
## Proposed changes

pick #36472
2024-06-20 15:16:16 +08:00
c5bb0e3a21 [bug](prepared statement) fix prepared statement throw exception when inserting null value (#36484)
## Proposed changes

bp #36426

2024-06-20 11:31:59 +08:00
9c1f34359d [fix](point query) should check it is Slot before check it is DELETE_SIGN (#36566)
pick from master #36564

introduced by #36443
2024-06-20 10:29:21 +08:00
7b36e81b7a [fix](split) FileSystemCacheKey are always different in overload equals (#36431)
bp: #36432
## Proposed changes

## Fixed bugs introduced by #34307
1. `FileSystemCacheKey.equals()` compares properties with `==`, so a new
file system is created for each partition.
2. `dfsFileSystem` is not synchronized, so more file systems are created
than needed.
3. `jobConf.iterator()` produces more than 2000 key-value pairs.
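
The first two points can be illustrated with a small sketch. It is written in Python rather than the Java of the FE, and the class and parameter names (`FileSystemCacheKey`, `FileSystemCache`, `fs_ident`, `factory`) are only borrowed or invented for illustration: the key compares its properties by value instead of by identity, and a lock guards creation so that concurrent callers share one file system.

```python
import threading


class FileSystemCacheKey:
    """Hypothetical cache key: equal when identifier and properties are equal by value."""

    def __init__(self, fs_ident, properties):
        self.fs_ident = fs_ident
        self.properties = dict(properties)  # assumes hashable (string) values

    def __eq__(self, other):
        if not isinstance(other, FileSystemCacheKey):
            return NotImplemented
        # Value comparison; an identity check here would make every key distinct.
        return self.fs_ident == other.fs_ident and self.properties == other.properties

    def __hash__(self):
        return hash((self.fs_ident, frozenset(self.properties.items())))


class FileSystemCache:
    """Hypothetical cache that creates at most one file system per distinct key."""

    def __init__(self, factory):
        self._factory = factory
        self._lock = threading.Lock()
        self._cache = {}

    def get(self, key):
        # The lock makes check-then-create atomic across threads.
        with self._lock:
            fs = self._cache.get(key)
            if fs is None:
                fs = self._factory(key)
                self._cache[key] = fs
            return fs
```

With this shape, two partitions that carry equal properties resolve to the same cached file system, even when their key objects are different instances.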
2024-06-20 10:08:05 +08:00
dabd27edd2 [opt](inverted index) performance optimization for need_read_data in compound #35346 #36292 (#36404)
pick from master
https://github.com/apache/doris/pull/35346
https://github.com/apache/doris/pull/36292
2024-06-20 08:43:16 +08:00
0be5331b28 [Fix](Variant) fix variant schema change may cause invalid block schema and write missing blocks #36317 (#36536) 2024-06-19 19:09:16 +08:00
5b7d93df5e [Pick](Variant) pick 2 PRs to correct tmp column name to go fast execute #36277 #36313 (#36527) 2024-06-19 19:07:47 +08:00
c1f15f7e4c [fix](catalog) fix wrong check when using "use_meta_cache=true" (#36533)
bp #36530
2024-06-19 18:03:03 +08:00
8d5b621021 [improvement](inverted index) Change inverted index field_name from column_name to id in format v2 #36470 (#36516)
pick from master #36470
2024-06-19 17:29:26 +08:00
f59dc4fb37 [opt](split) generate and get split batch concurrently (#36044)
bp #36045, and turn batch split back on; it was turned off in #36109.
Generate and get split batches concurrently.
`SplitSource.getNextBatch` is no longer synchronized, so callers fetch their splits concurrently, and `SplitAssignment` generates splits asynchronously.
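
As a rough picture of the idea (not the FE code), the sketch below generates splits on a background thread and lets several consumers pull batches without a global lock around the whole fetch. It is Python, and `SplitAssignment`, `get_next_batch`, `max_pending` and the end marker `_DONE` are hypothetical names chosen for the example.

```python
import queue
import threading

_DONE = object()  # hypothetical end-of-stream marker


class SplitAssignment:
    """Generates splits asynchronously and hands them out in batches."""

    def __init__(self, split_iter, max_pending=1024):
        # queue.Queue is thread-safe, so consumers need no extra synchronization.
        self._queue = queue.Queue(maxsize=max_pending)
        self._thread = threading.Thread(
            target=self._generate, args=(split_iter,), daemon=True
        )
        self._thread.start()

    def _generate(self, split_iter):
        # Runs in the background; blocks when max_pending splits are waiting.
        for split in split_iter:
            self._queue.put(split)
        self._queue.put(_DONE)

    def get_next_batch(self, max_batch_size):
        """Return up to max_batch_size splits; an empty list means exhausted."""
        batch = []
        while len(batch) < max_batch_size:
            item = self._queue.get()
            if item is _DONE:
                self._queue.put(_DONE)  # leave the marker for other consumers
                break
            batch.append(item)
        return batch
```

Each consumer simply calls `get_next_batch` in a loop until it receives an empty batch; generation and consumption overlap instead of running one after the other.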
2024-06-19 16:16:02 +08:00
9c896efe0b [fix](race) fix access colocate group ids race #36444 (#36501)
cherry pick from #36444
2024-06-19 15:38:50 +08:00
349b943e12 [opt](Nereids) Optimize Join Penalty Calculation Based on Build Side Data Volume (#36107)
pick from master #35773

This PR introduces an optimization that adjusts the penalty applied
during join operations based on the volume of data on the build side.
Specifically, when the number of rows and width of the tables being
joined are equal, the materialization costs are now considered more
accurately. The update ensures that joins with a larger dataset on the
build side incur a higher penalty, improving overall query performance
and resource allocation.
2024-06-19 14:49:09 +08:00
1e54a5a66e [Fix](Nereids) fix leading with brace can not generate correct plan (#36328)
cherry-pick #36193

Problem:
When using a leading hint such as:
leading(t1 {t2 t3} {t4 t5} t6)
the correct plan is not generated, because the level list cannot carry
enough information about the braces.
Solution:
Remove the level-list representation of leading levels and use a reverse
Polish expression instead.
Algorithm:
leading(t1 {t2 t3} {t4 t5} t6)
==>
stack, top to bottom: (t1 t2 t3 join join t4 t5 join join t6 join). When
generating the leading join, pop items from the stack: for a table, make
a logical scan; for a join operator, make a logical join and push it
back onto the stack.
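
A minimal sketch of the reverse Polish idea, in Python and independent of the actual Nereids classes: `to_postfix` and `build_plan` are hypothetical helpers, the hint body is assumed to be well-formed (balanced, non-empty braces), and the evaluation below is the textbook left-to-right postfix evaluation, which may differ in detail from how the FE walks its stack.

```python
def to_postfix(hint_body):
    """Turn e.g. 't1 {t2 t3} {t4 t5} t6' into a postfix token list."""
    tokens = hint_body.replace("{", " { ").replace("}", " } ").split()
    output = []
    operands = [0]  # number of operands seen so far at each brace depth
    for tok in tokens:
        if tok == "{":
            operands.append(0)      # start counting inside the new group
            continue
        if tok == "}":
            operands.pop()          # the finished group is one operand outside
        else:
            output.append(tok)      # a table name
        operands[-1] += 1
        if operands[-1] > 1:
            output.append("join")   # join the new operand with what came before
    return output


def build_plan(postfix):
    """Evaluate the postfix list into a nested tuple standing in for the plan."""
    stack = []
    for tok in postfix:
        if tok == "join":
            right = stack.pop()
            left = stack.pop()
            stack.append(("join", left, right))  # logical join, pushed back
        else:
            stack.append(("scan", tok))          # logical scan for a table
    if len(stack) != 1:
        raise ValueError("malformed postfix expression")
    return stack[0]
```

For the hint above, `build_plan(to_postfix("t1 {t2 t3} {t4 t5} t6"))` joins t1 with the group (t2 t3), then with the group (t4 t5), then with t6, so the braces survive as nested joins rather than being lost in a flat level list.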
2024-06-19 14:47:55 +08:00
3c952c75be [refactor](Nereids) New expression extractor for partitions pruning (#36407)
cherry pick from #36326

An exception is thrown in TryEliminateUninterestedPredicates for this case:

 CREATE TABLE `tbltest` (
  `id` INT NULL,
  `col2` VARCHAR(255) NULL,
  `col3` VARCHAR(255) NULL,
  `dt` DATE NULL
) ENGINE=OLAP
DUPLICATE KEY(`id`, `col2`)
PARTITION BY RANGE(`dt`)
(PARTITION p20240617 VALUES [('2024-06-17'), ('2024-06-18')))
DISTRIBUTED BY HASH(`id`) BUCKETS 10
PROPERTIES (
"replication_allocation" = "tag.location.default: 1"
);

select * from tbltest
where
  case
    when col2 = 'xxx' and col3='yyy' then false -- note this is not about partition column
    when col2 in ('xxx') then false
    when col2 like 'xxx%' then false
    else true
  end

`CaseWhen` requires its children to be `WhenClause`s, but TryEliminateUninterestedPredicates may rewrite a WhenClause into a true/false predicate, which causes this exception:

ERROR 1105 (HY000): errCode = 2, detailMessage = The children format needs to be [WhenClause+, DefaultValue?]

The original extractor (TryEliminateUninterestedPredicates.java) caused some errors while trying to derive the expressions that can be used for pruning partitions.
I tried to write a new extractor (with unit tests) for pruning partitions; it is simpler and more reliable (I think).

The theory behind the extractor is pretty simple:
A: Sort expressions into two kinds:
  1. evaluable-expression (let's mark it as E).
    Expressions that can be evaluated in the partition pruning stage.
    In other words: it does not contain non-partition slots or non-deterministic expressions.
  2. un-evaluable-expression (let's mark it as UE).
    Expressions that can NOT be evaluated in the partition pruning stage.
    In other words: it contains non-partition slots or non-deterministic expressions.

B: Traverse the predicate, looking only at AND and OR operators, and apply the following rules:
  (E and UE) -> (E and TRUE) -> E
  (UE and UE) -> TRUE
  (E and E) -> (E and E)
  (E or UE) -> TRUE
  (UE or UE) -> TRUE
  (E or E) -> (E or E)
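
The rules above can be sketched in a few lines. This is an illustration in Python, not the extractor from the PR: `Pred`, `TRUE` and `extract` are hypothetical names, and each leaf predicate is assumed to arrive already classified as E or UE (the `evaluable` flag).

```python
from dataclasses import dataclass

TRUE = "TRUE"  # stands for the constant predicate TRUE


@dataclass(frozen=True)
class Pred:
    text: str
    evaluable: bool  # True for E, False for UE (assumed to be precomputed)


def extract(expr):
    """Keep only the parts usable for partition pruning.

    expr is either a Pred leaf or a tuple ("and" | "or", left, right).
    """
    if isinstance(expr, Pred):
        return expr if expr.evaluable else TRUE    # UE -> TRUE
    op, left, right = expr
    l, r = extract(left), extract(right)
    if op == "and":
        if l == TRUE:
            return r                               # (UE and X) -> X
        if r == TRUE:
            return l                               # (E and UE) -> E
        return ("and", l, r)                       # (E and E) stays
    if op == "or":
        if l == TRUE or r == TRUE:
            return TRUE                            # (E or UE), (UE or UE) -> TRUE
        return ("or", l, r)                        # (E or E) stays
    raise ValueError("only AND and OR are traversed")
```

Anything that is not an AND or OR node, such as the CASE WHEN in the query above, is treated as a single leaf, so it is either kept whole or replaced by TRUE and never rewritten internally.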
2024-06-19 14:47:26 +08:00
38d750a7e0 [Fix](Row Store) all filter should match key columns condition (#36400) (#36443)
Previously, queries like `select * from tbl` would pass the
`LogicalResultSinkToShortCircuitPointQuery` rule.
Introduced by #35823
2024-06-19 14:06:53 +08:00
bdba954e1f [Fix](nereids)make agg output unchanged after normalized repeat (#36367)
cherry-pick #36207 to branch-2.1

Co-authored-by: feiniaofeiafei <moailing@selectdb.com>
2024-06-19 12:23:56 +08:00
df22344550 [opt](tools) update tools schema (#36114)
pick from master #35873

Update the bucket column of the tpcds tools table customer_demographics to
its primary key column, avoiding performance issues caused by data skew.
2024-06-19 12:23:48 +08:00
97ac46d2be [fix](mtmv) Mapping materialization statistics's expressionToColumnStats to mv scan plan based (#36058)
bp #35749
2024-06-19 11:25:13 +08:00
8d71a30595 [FIX](thrift)fix thrift for match-element-xxx (#36439) 2024-06-18 22:00:04 +08:00
da0138a412 [Pick 2.1](segment iterator) fix shrink non-char column coredump #36275 (#36468) 2024-06-18 21:59:15 +08:00
74162a1b7e [enhancement](prepared statement) Handle unsigned numeric type in prepare statement (#36388)
## Proposed changes

Issue Number: bp #36133

2024-06-18 19:33:12 +08:00
8149b2b00d [fix](regression)Disable auto analyze before running mtmv test. (#36457)
Disable auto analyze before running the mtmv test, because the auto analyze
result may overwrite the manual analyze result.
backport: https://github.com/apache/doris/pull/36449
2024-06-18 17:09:17 +08:00
Pxl
dda25cceb6 [Bug](information-schema) fix some bug of information_schema.PROCESSLIST (#36447)
## Proposed changes
pick from #36409
2024-06-18 16:45:48 +08:00
33540ec87b [Pick 2.1](inverted index) fix inverted index compound reader memory leak (#36387)
## Proposed changes

Pick from #36146 #36420
2024-06-18 16:13:21 +08:00
e2350403a6 [fix](plan) fix wrong result for random distributed agg table with all keys not null (#36271) 2024-06-18 11:25:31 +08:00
4a117800ca [Bug](Function) fix json contains with empty value (#36320) (#36418) 2024-06-18 10:20:45 +08:00
4ae8607b2e [fix](hudi) disable fs.impl.cache to avoid FE OOM (#36402) (#36403)
bp #36402
2024-06-17 22:20:23 +08:00
3810861bb1 [branch-2.1](cherry-pick) add _pk_index_meta's size to Segment::_meta_mem_usage (#36329) (#36399)
cherry-pick #36329

add _pk_index_meta's size to Segment::_meta_mem_usage to make memory
estimation more accurate.
2024-06-17 20:41:38 +08:00