Commit Graph

19141 Commits

Author SHA1 Message Date
0051832f91 [fix](statistics)Use ConcurrentHashMap to avoid ConcurrentModificationException (#36452) (#36950)
OlapTable idToPartition Map should use ConcurrentHashMap to avoid
ConcurrentModificationException.
backport: https://github.com/apache/doris/pull/36452
2024-06-27 23:06:03 +08:00
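
Illustration for commit 0051832f91 above: a minimal sketch of why a plain HashMap can raise ConcurrentModificationException when it is modified during iteration, while ConcurrentHashMap does not. The class PartitionMapDemo and its method are hypothetical stand-ins, not the actual OlapTable code, and the modification is done on a single thread only to keep the behavior deterministic.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a map like idToPartition; not the actual OlapTable code.
class PartitionMapDemo {
    // Adds an entry while iterating over the key set.
    // Returns true if the iteration completes, false if the map's
    // fail-fast iterator throws ConcurrentModificationException.
    static boolean iterateWhileAdding(Map<Long, String> idToPartition) {
        idToPartition.put(1L, "p1");
        idToPartition.put(2L, "p2");
        try {
            for (Long id : idToPartition.keySet()) {
                // The first pass inserts a new key, which is a structural modification.
                idToPartition.put(999L, "added-while-iterating");
            }
            return true;
        } catch (ConcurrentModificationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("HashMap completes: "
                + iterateWhileAdding(new HashMap<>()));
        System.out.println("ConcurrentHashMap completes: "
                + iterateWhileAdding(new ConcurrentHashMap<>()));
    }
}
```

ConcurrentHashMap iterators are weakly consistent, so they tolerate modification during traversal at the cost of possibly not reflecting the latest changes.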
474295cf31 [chore](autobucket) add autobucket test and log #36874 (#36907)
cherry pick from #36874
2024-06-27 22:30:13 +08:00
ee7f9a4f26 [fix](oom) avoid oom when a lot of tablets fail on load (#36944)
pick #36873
2024-06-27 22:12:42 +08:00
46eef9d948 [build](docker) add repo for new version of git (#35892) (#36909)
bp #35892
2024-06-27 21:00:14 +08:00
fcc26cc671 [test](migrate) move some cases from p2 to p0 (#36750)(#36787) (#36922)
bp #36750 and #36787
2024-06-27 20:59:50 +08:00
5c1eef5f06 [feature](tvf) support max_filter_ratio (#35431) (#36911)
bp #35431

Co-authored-by: 苏小刚 <suxiaogang223@icloud.com>
2024-06-27 20:58:53 +08:00
bfd634f9c7 [fix](protocol) only return multi result when CLIENT_MULTI_STATEMENTS been set (#36759) (#36919)
pick from master #36759

Multi-statement support was added by PR #3050,
but there is a minor issue in the implementation.

As the MySQL developer documentation says in


https://dev.mysql.com/doc/dev/mysql-server/latest/page_protocol_command_phase_sp.html#sect_protocol_command_phase_sp_multi_statement

the server should only process multiple statements
when the client sets CLIENT_MULTI_STATEMENTS.
When the client does not set CLIENT_MULTI_STATEMENTS, the server should
treat the query as a single statement.

Doris behaves slightly differently from the MySQL server. Doris always
treats the query as multi-statement, but only returns multiple results when
the client sets CLIENT_MULTI_STATEMENTS. When the client does not set
CLIENT_MULTI_STATEMENTS, Doris returns only the result of the last statement.

Co-authored-by: Mingyu Chen <morningman.cmy@gmail.com>
2024-06-27 18:35:40 +08:00
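
Illustration for commit bfd634f9c7 above: a minimal sketch of the described behavior, assuming the per-statement results are already available as a list. The class MultiStatementDemo and the method resultsToReturn are hypothetical names, not Doris code; the flag value 1 << 16 is the CLIENT_MULTI_STATEMENTS capability bit from the MySQL protocol.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; not the actual Doris connection-handling code.
class MultiStatementDemo {
    // CLIENT_MULTI_STATEMENTS capability bit from the MySQL client/server protocol.
    static final int CLIENT_MULTI_STATEMENTS = 1 << 16;

    // All statements are executed either way; this only decides which
    // results are sent back to the client.
    static List<String> resultsToReturn(int clientCapabilities, List<String> statementResults) {
        boolean multi = (clientCapabilities & CLIENT_MULTI_STATEMENTS) != 0;
        if (multi || statementResults.isEmpty()) {
            return statementResults;
        }
        // Flag not set: return only the result of the last statement.
        int last = statementResults.size() - 1;
        return statementResults.subList(last, last + 1);
    }

    public static void main(String[] args) {
        List<String> results = Arrays.asList("result of stmt 1", "result of stmt 2");
        System.out.println(resultsToReturn(CLIENT_MULTI_STATEMENTS, results));
        System.out.println(resultsToReturn(0, results));
    }
}
```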
a05d5cc75e [refactor](variant) refactor sub path push down on variant type (#36478) (#36923)
pick from master #36478

Introduce a new rule, VARIANT_SUB_PATH_PRUNING, to prune variant sub paths.

For example, if variant slot v in table t has two sub paths, 'c1' and 'c2',
then after this rule, select v['c1'] from t scans only the sub path 'c1'
of v, which reduces scan time.

This rule accomplishes all the work using two components. The Collector
traverses from the top down, collecting all the element_at functions on
the variant types, and recording the required path from the original
variant slot to the current element_at. The Replacer traverses from the
bottom up, generating the slots for the required sub path on scan,
union, and cte consumer. Then, it replaces the element_at with the
corresponding slot.
2024-06-27 17:48:43 +08:00
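
Illustration for commit a05d5cc75e above: a toy sketch of the two-step idea (collect the required sub paths, then replace each element_at with a generated slot). Everything here is an assumption made for illustration: the Access class, the string keys, and the "v.c1" slot naming are hypothetical and do not reflect the real Nereids rule or its data structures.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only; not the actual VARIANT_SUB_PATH_PRUNING implementation.
class SubPathPruningDemo {
    // Slot "v" with path ["c1"] stands for the expression v['c1'].
    static final class Access {
        final String slot;
        final List<String> path;
        Access(String slot, List<String> path) { this.slot = slot; this.path = path; }
        String key() { return slot + "['" + String.join("']['", path) + "']"; }
    }

    // "Collector" step: record every distinct element_at access and assign it
    // a generated sub-path slot name (hypothetical naming scheme: v.c1).
    static Map<String, String> collect(List<Access> accesses) {
        Map<String, String> required = new LinkedHashMap<>();
        for (Access a : accesses) {
            required.putIfAbsent(a.key(), a.slot + "." + String.join(".", a.path));
        }
        return required;
    }

    // "Replacer" step: rewrite each access to the slot generated for its sub path.
    static List<String> replace(List<Access> accesses, Map<String, String> required) {
        List<String> rewritten = new ArrayList<>();
        for (Access a : accesses) {
            rewritten.add(required.get(a.key()));
        }
        return rewritten;
    }

    public static void main(String[] args) {
        List<Access> accesses = Arrays.asList(
                new Access("v", Arrays.asList("c1")),
                new Access("v", Arrays.asList("c1")));
        Map<String, String> required = collect(accesses);
        System.out.println("sub paths to scan: " + required.values());
        System.out.println("rewritten accesses: " + replace(accesses, required));
    }
}
```

In this toy model only the sub paths that appear in the collected map would be scanned, so 'c2' is never read for a query that touches only v['c1'].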
8a1ebba1cc [Improvement](multicatalog) support read tencent dlc table on lakefs (#36891)
bp #36823
2024-06-27 14:03:48 +08:00
22cb7b8fcb [improvement](compaction) be do not compact invisible version to avoid query error -230 #28082 (#36222)
cherry pick from #28082
2024-06-27 13:45:21 +08:00
89fc55d833 [improvement](balance) partition rebalance chose disk by rr #36826 (#36900)
cherry pick from #36826
2024-06-27 13:43:30 +08:00
f80750faed [improvement](clone) dead be will abort sched task #36795 (#36897)
cherry pick from #36795
2024-06-27 13:35:51 +08:00
a8e9c89dc6 [Fix](nereids) fix NormalizeAgg, change the upper project projections rewrite logic (#36161) (#36622)
cherry-pick #36161 to branch-2.1

The NormalizeAggregate rewrite logic has a bug for SQL like this:

SELECT
	CASE
		1 WHEN CAST( NULL AS SIGNED ) THEN NULL
		WHEN COUNT( DISTINCT CAST( NULL AS SIGNED ) ) THEN NULL
		ELSE null
	END ;

In the plan after NormalizeAggregate (shown below), the LogicalAggregate
only outputs `count(DISTINCT cast(NULL as SIGNED))`#3 and does not output
cast(NULL as SIGNED)#2, but the upper project uses cast(NULL as SIGNED)#2,
so Doris reports the error "cast(NULL as SIGNED) not in aggregate's output".

LogicalResultSink[29] ( outputExprs=[__case_when_0#1] )
+--LogicalProject[26] ( distinct=false, projects=[CASE WHEN (1 = cast(NULL as SIGNED)#2) THEN NULL WHEN (1 = count(DISTINCT cast(NULL as SIGNED))#3) THEN NULL ELSE NULL END AS `CASE WHEN (1 = cast(NULL as SIGNED)) THEN NULL WHEN (1 = count(DISTINCT cast(NULL as SIGNED))) THEN NULL ELSE NULL END`#1], excepts=[] )
   +--LogicalAggregate[25] ( groupByExpr=[], outputExpr=[count(DISTINCT cast(NULL as SIGNED)#2) AS `count(DISTINCT cast(NULL as SIGNED))`#3], hasRepeat=false )
      +--LogicalProject[24] ( distinct=false, projects=[cast(NULL as SIGNED) AS `cast(NULL as SIGNED)`#2], excepts=[] )
         +--LogicalOneRowRelation ( projects=[0 AS `0`#0] )

The problem is that cast(NULL as SIGNED)#2 should not be output by
LogicalAggregate; cast(NULL as SIGNED) should be computed in
LogicalProject.
This PR changes the rewrite logic for the upper project's projections:
aggregateOutputs is rewritten and becomes the projections of the
upper-level LogicalProject. During the rewriting process, expressions
inside an aggregate function can be rewritten with the aggregate
function's arguments and the group-by expressions, but expressions
outside an aggregate function can only be rewritten with group-by
expressions.

---------

Co-authored-by: moailing <moailing@selectdb.com>
2024-06-27 12:17:18 +08:00
23cf494b48 [fix](schema-change) Fix schema-change from non-null to null (#36389)
https://github.com/apache/doris/pull/32913
2024-06-26 20:20:50 +08:00
a6a84b8ecc [improvement](stream load)(cherry-pick) support hll_from_base64 for stream load column mapping (#36819)
picked from https://github.com/apache/doris/pull/35923
2024-06-26 20:12:40 +08:00
25fb30c723 [fix](intersect) fix coredump caused by intersect of nullable and not nullable children #36401 (#36441)
## Proposed changes

Pick #36765
2024-06-26 17:45:21 +08:00
8fc70e32bc [fix](planner) fix wrong resut of function ifnull/coalesce caused by … (#36727) 2024-06-25 17:32:04 +08:00
695d58f354 [cherry-pick](scan)scanner could eos early when reached limit (#36535) (#36736)
## Proposed changes
cherry-pick from master #36535
2024-06-25 17:22:43 +08:00
11201feae5 [fix](spill join) fix coredump of debug_string (#36723)
## Proposed changes

Pick #36715

2024-06-25 16:33:33 +08:00
785a1f49f5 [fix](txn) Fix coordidator be restart not abort txn #35342 (#36437)
cherry pick from #35342
2024-06-25 13:35:01 +08:00
07ce9cf52c [fix](schema change) reduce memory usage in schema change process #30231 #36285 #33073 (#36756)
pick
https://github.com/apache/doris/pull/30231
https://github.com/apache/doris/pull/36285
https://github.com/apache/doris/pull/33073
2024-06-25 12:21:17 +08:00
3652fc31c3 [Pick 2.1] "Fix data loss when node channel been cancelled before close wait (#36662)" (#36744)
## Proposed changes

Pick from https://github.com/apache/doris/pull/36662
2024-06-25 11:36:31 +08:00
6ec9a731e8 [branch-2.1](cherry-pick) partial update should not read old fileds from rows with delete sign (#36210) (#36755)
cherry-pick #36210
2024-06-24 21:13:24 +08:00
67adbdae75 [branch-2.1] Pick "[Fix](JournalEntity) re-add a line of code that is accidentally removed in #19917" (#36427)
## Proposed changes

pick https://github.com/apache/doris/pull/36423
2024-06-24 20:45:02 +08:00
e0088df3b3 [case](udf) support run java udf case on cluster with multiple BEs (#36669) (#36742)

Co-authored-by: stephen <hello-stephen@qq.com>
2024-06-24 17:32:08 +08:00
e4b6dac0c1 [fix](ubsan) reinterpret_cast fix length types to int8 is not safe (#36725)
## Proposed changes

Fix a type-check error reported by UBSan.
```
/root/doris/be/src/vec/exec/format/parquet/fix_length_plain_decoder.h:75:78: runtime error: member call on address 0x5582f35db5c0 which does not point to an object of type 'doris::vectorized::ColumnVector<signed char>'
0x5582f35db5c0: note: object is of type 'doris::vectorized::ColumnVector<int>'
 83 55 00 00  78 c0 b0 5a 82 55 00 00  02 00 00 00 00 00 00 00  10 a0 00 d7 83 55 00 00  10 a0 00 d7
              ^~~~~~~~~~~~~~~~~~~~~~~
              vptr for 'doris::vectorized::ColumnVector<int>'
doris::Status doris::vectorized::FixLengthPlainDecoder::_decode_values<false>(COW<doris::vectorized::IColumn>::mutable_ptr<doris::vectorized::IColumn>&, std::shared_ptr<doris::vectorized::IDataType const>&, doris::vectorized::ColumnSelectVector&, bool) at fix_length_plain_decoder.h:75:78
```
2024-06-24 14:03:41 +08:00
Pxl
c6205783fa [Bug](function) fix wrong output_char_size on hll_to_base64 (#36572)
## Proposed changes
pick from #36529
2024-06-24 13:19:28 +08:00
aeec08639d [branch-2.1](tag) 2.1.4-rc03 (#36706)
change version to 2.1.4-rc03
2024-06-23 00:26:13 +08:00
02fad48870 [Fix](upgrade) Fix fields not handled correctly during upgrade and downgrade (#36691)
master version is #36690
2024-06-22 14:23:04 +08:00
17cf34b244 [Fix](multi-catalog) Fix core in orc and parquet reader sometimes after low mem exception. (#36575)
## Proposed changes

Backport #36574.
2024-06-22 11:28:21 +08:00
90a4dd09f3 [Fix](func) CoreDump and Result Error in percentile function (#36647)
cherry pick #36643
2024-06-21 23:42:45 +08:00
445d42a57d [fix](topn-opt) remove redundant check for fetch phase (#36676)
#36629
2024-06-21 22:28:38 +08:00
c8e4c404fa [Fix]check if fe set thrift field current_connect_fe (#36681)
bp #36678
2024-06-21 22:15:25 +08:00
c939781411 [Pick 2.1](inverted index) fix wrong no need read data when need_remaining_after_evaluate (#36684)
An equal predicate on a column that has an inverted index with a parser
requires remaining_after_evaluate. In this situation, the "no need to
read data" optimization cannot be applied to the column.

## Proposed changes

From (#36637)
2024-06-21 22:01:39 +08:00
0cff539810 [feature](function) support new function replace_empty (#36283) (#36656)
#36283
2024-06-21 16:46:22 +08:00
c8f2a3f952 [fix](eq_for_null) fix incorrect logic in function eq_for_null #36004 (#36124)
cherry pick from #36004
cherry pick from #36164
2024-06-21 14:31:21 +08:00
8105dc7de8 [Pick 2.1](inverted index) fix wrong opt for pk no need read data (#36634)
## Proposed changes
 
Pick from #36618
2024-06-21 00:57:23 +08:00
58cc1dca7f [improve](fe) Support to config max msg/frame size of the thrift server (#36594)
Cherry-pick #35845
2024-06-21 00:15:15 +08:00
3febac1d91 [fix](connection) kill connection when meeting Write mysql packet failed error #36559 (#36616)
bp #36559
2024-06-20 22:27:01 +08:00
c28c243c98 [Fix](Variant) forbit create variant as key #36555 (#36578) 2024-06-20 20:33:48 +08:00
a79b56ac23 [chore](be) Support config max message size for be thrift server (#36595)
Cherry-pick #36467
2024-06-20 20:15:43 +08:00
b3dcfae864 [chore](be) Improve ingesting binlog error checking (#36596)
Cherry-pick #36487
2024-06-20 20:15:26 +08:00
26b1ef428a [branch-2.1](doris compose) fix docker start failed (#36534) 2024-06-20 20:14:17 +08:00
838af13001 [fix](auth)ldap set passwd need forward to master (#36436) (#36598)
pick from master: #36436
2024-06-20 18:35:37 +08:00
3ee259fc00 [branch-2.1][fix](jdbc catalog) fix jdbc mysql client match jsonb type (#36180)
bp #36177
2024-06-20 18:33:27 +08:00
ac0f6e75d2 [bugfix](iceberg)Read error when timestamp does not have time zone for 2.1 (#36435)
bp: #36141
2024-06-20 18:32:31 +08:00
22d37ba3fe [fix](auth)Auth support case insensitive (#36381) (#36557)
pick from: #36381
2024-06-20 18:31:30 +08:00
f7f7b2b738 [Enhancement](multi-catalog) Add more error msgs for wrong data types in orc and parquet reader. (#36580)
Backport #36417
2024-06-20 18:10:25 +08:00
fbcf63e1f5 [cherry-pick] (branch-2.1)fix variant index (#36577)
pick from master #36163
2024-06-20 17:57:26 +08:00
64a94e883d [fix](nereids)NullSafeEqualToEqual rule should keep <=> unchanged if it has none-literal child (#36523)
pick from master #36521

convert:
expr <=> null to expr is null
null <=> null to true
null <=> 1 to false
literal <=> literal to literal = literal ( 1 <=> 2 to 1 = 2 )
others are unchanged.
2024-06-20 17:55:36 +08:00
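
Illustration for commit 64a94e883d above: a minimal sketch of the listed conversions, modeling operands as plain strings. The Operand class and the rewrite method are hypothetical, not the Nereids expression API, and treating the null-literal cases symmetrically (for example 1 <=> null the same as null <=> 1) is an assumption that the commit message does not state.

```java
// Toy model only; not the actual NullSafeEqualToEqual rule.
class NullSafeEqualDemo {
    static final class Operand {
        final String text;
        final boolean literal;
        final boolean isNull;
        Operand(String text, boolean literal, boolean isNull) {
            this.text = text;
            this.literal = literal;
            this.isNull = isNull;
        }
    }

    static Operand nullLiteral() { return new Operand("NULL", true, true); }
    static Operand literal(String text) { return new Operand(text, true, false); }
    static Operand column(String name) { return new Operand(name, false, false); }

    // Rewrites "left <=> right" following the conversions listed in the commit.
    static String rewrite(Operand left, Operand right) {
        if (left.isNull && right.isNull) {
            return "TRUE";                                  // null <=> null
        }
        if (left.isNull || right.isNull) {
            Operand other = left.isNull ? right : left;
            // Non-null literal vs null is FALSE; expression vs null is IS NULL.
            // Symmetric handling is an assumption of this sketch.
            return other.literal ? "FALSE" : other.text + " IS NULL";
        }
        if (left.literal && right.literal) {
            return left.text + " = " + right.text;          // literal <=> literal
        }
        return left.text + " <=> " + right.text;            // unchanged
    }

    public static void main(String[] args) {
        System.out.println(rewrite(column("a"), nullLiteral()));
        System.out.println(rewrite(literal("1"), literal("2")));
        System.out.println(rewrite(column("a"), column("b")));
    }
}
```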