Commit Graph

3392 Commits

Author SHA1 Message Date
28066a0854 [fix](mtmv) Fix compensate union all wrongly when query rewrite by materialized view #40803 (#42019)
## Proposed changes

pr: https://github.com/apache/doris/pull/40803
commitId: d7e5d461
2024-10-18 12:10:53 +08:00
fb12e10272 [fix](array-funcs)fix array agg func with decimal type (#40839) (#42023)
## Proposed changes
backport: (https://github.com/apache/doris/pull/40839)
2024-10-17 20:47:39 +08:00
5fe37c0708 [Feat](Nereids) support fold constant by fe (#40441)(#40772)(#40744)(#40745)(40820) (#41837)
cherry-pick from master
#40441 
#40772 
#40744 
#40745
#40820
2024-10-17 20:43:17 +08:00
80d7523a62 [Feat](Nereids) support use cbo rule hint #35925 #39715 #40167 #40958 (#41869)
pick: #35925 #39715 #40167 #40958
Add hints that force a CBO rule to be used or not used, plus the follow-up fix PRs.

Introduction
Without this hint, a CBO rule such as INFER_SET_OPERATOR_DISTINCT
generates two plans, and the Nereids optimizer compares their costs
and picks the better one. To control the behavior of a CBO rule
directly, use the force CBO rule hint.
Usage example
explain shape plan
select /*+ USE_CBO_RULE(INFER_SET_OPERATOR_DISTINCT) */
*
from t1
union
select * from t2;
The USE_CBO_RULE(INFER_SET_OPERATOR_DISTINCT) hint forces the rule
INFER_SET_OPERATOR_DISTINCT to be applied and generates a plan like the
following, in which the hashAgg below the union is produced by this
rule:

-- !with_hint_union_distinct --
----hashAgg[GLOBAL]
--------hashAgg[LOCAL]
----------PhysicalUnion
--------------hashAgg[LOCAL]
----------------PhysicalOlapScan[t1]
--------------hashAgg[LOCAL]
----------------PhysicalOlapScan[t2]
Hint log:
Used: INFER_SET_OPERATOR_DISTINCT
UnUsed:
SyntaxError:
To force-disable this rule, use
explain shape plan select /*+
NO_USE_CBO_RULE(INFER_SET_OPERATOR_DISTINCT) */ * from t1 union select *
from t2;
which generates a plan without this rule:

-- !with_hint_no_union_distinct --
----hashAgg[GLOBAL]
--------hashAgg[LOCAL]
----------PhysicalUnion
--------------PhysicalOlapScan[t1]
--------------PhysicalOlapScan[t2]
Hint log:
Used: NO_INFER_SET_OPERATOR_DISTINCT
UnUsed:
SyntaxError:
Change the session variable enableNereidsRules to varType.remove.
2024-10-17 20:36:03 +08:00
669f59ce5a [branch-2.1][feat](job)Implementing Job in Nereids (#41391) (#42012)
## Proposed changes

The JOB's execution SQL is currently defined by an older CUP file, which
causes some issues with lexical analysis in the new optimizer as it
doesn't pass under the old optimizer. Since the JOB's underlying
execution already uses the new optimizer, we're planning to fully
migrate to ANTLR4 for consistency.

(cherry picked from commit 334b473deb5ff2e5c29c5eedcfac95dd806ae622)

#41391
2024-10-17 16:56:36 +08:00
67d057a711 [cherry-pick](branch-21) fix conv function parser string failure return wrong result (#40530) (#41964)
## Proposed changes

Issue Number: close #39618
cherry-pick from master (#40530)
2024-10-17 14:45:46 +08:00
968e33f07e [cherry-pick](branch-21) pick (#39057) (#41352) (#41958)
## Proposed changes

pick from master (#39057) (#41352)

---------

Co-authored-by: Mryange <59914473+Mryange@users.noreply.github.com>
2024-10-17 14:30:40 +08:00
1b901f6fcc [cherry-pick](branch-2.1) add parquet tvf cases and fix some parquet bug (#41931)
## Proposed changes
pick pr:
  https://github.com/apache/doris/pull/41683
  https://github.com/apache/doris/pull/41506
  https://github.com/apache/doris/pull/41338
  https://github.com/apache/doris/pull/39326

---------

Co-authored-by: morningman <morningman@163.com>
2024-10-17 14:20:58 +08:00
19784d420c [opt](inverted index) Improved top-N optimization by refining the sorting column check. (#39496) (#41954)
https://github.com/apache/doris/pull/39496
2024-10-17 11:31:11 +08:00
cf2ec26bc2 [fix](catalog) should return error if try using a unknown database (#40479) (#41971)
bp #40479
2024-10-17 11:13:56 +08:00
e62e47700d [fix](Nereids) fixed the limit offset error pick 39316 (#41878)
2024-10-16 10:38:05 +08:00
a4b7d93ded [bugfix](iceberg)add prefix for endpoint with s3 client for 2.1 (#41336) (#41877)
bp: #41336
2024-10-15 19:59:10 +08:00
94687a2f3c [fix](array/map) fix resize impl in array/map (#41595) (#41699)
backport: https://github.com/apache/doris/pull/41595
2024-10-15 09:50:11 +08:00
d97642e9b5 [cherry-pick](branch-21) fix tablet sink shuffle without project not match the output tuple (#40299)(#41293) (#41327)
## Proposed changes

cherry-pick from master (#40299)(#41293)
2024-10-15 00:12:23 +08:00
4888c632f4 [cherry-pick](branch2.1) support escape.delim and serialization.null.format for hive text (#41684)
## Proposed changes
pick from master:
https://github.com/apache/doris/pull/40291
2024-10-15 00:08:23 +08:00
ff52e73a07 [Fix](inverted index) fix match null for inverted index #41746 (#41787)
cherry pick from #41746
2024-10-14 14:45:36 +08:00
f112af0fd2 [pick](branch-2.1) pick #41555 #41592 #38204 (#41781)
pick #41555 #41592 #38204
2024-10-14 14:05:08 +08:00
ec0c008317 [feature](paimon)support paimon with dlf for 2.1 (#41247) (#41694)
bp: #41247
2024-10-13 20:04:01 +08:00
cfe7a8302b [enhance](mtmv) mtmv query sql expand star (#36543) (#41744)
pick: https://github.com/apache/doris/pull/36543
2024-10-12 17:23:13 +08:00
203f00ef1d [fix](bloom filter)Fix drop column with bloom filter (#41369) (#41711)
bp #41369
2024-10-12 17:14:31 +08:00
ae56739f88 [enhancement](sequence col) add session variable to skip sequence column check while INSERT INTO (#41655) (#41720)
cp #41655
2024-10-12 15:30:20 +08:00
90d6985f91 [Fix](bug) Is null predicate get error query result (#41704)
cherry-pick #41668
2024-10-12 13:18:14 +08:00
b2bac26c17 [fix](jdbc catalog) Disable oracle scan null operator pushdown (#41563) (#41712)
Because Oracle versions below Oracle21 do not support null as an
operator, and considering that most users' Oracle versions are below
Oracle21, we disable Oracle's null operator pushdown by default.
pick (#41563)
2024-10-11 21:01:05 +08:00
18cb395496 [fix] (inverted index) fix the error result in the query when using count on index (#41375) (#41690)
## Proposed changes

pick from master #41375
2024-10-11 17:15:14 +08:00
4ac07fe918 [Feature](json) Support json_search function in 2.1 (#41590)
cherry-pick #40948 

Like MySQL, json_search returns the path that points to a JSON string
which matches the pattern.
`SELECT JSON_SEARCH('["A",[{"B":"1"}],{"C":"AB"},{"D":"BC"}]', 'one',
'A_') as res;`
```
+----------+
| res      |
+----------+
| "$[2].C" |
+----------+
```
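
The lookup described above can be sketched in Python as a rough illustration. This is not the Doris implementation: the helper names `like_to_regex` and `json_search_one` are hypothetical, only the 'one' mode is modeled, and escape characters and quoting of unusual object keys are ignored.

```python
import json
import re


def like_to_regex(pattern):
    """Translate a SQL LIKE pattern (% and _) into a compiled regex."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")   # % matches any run of characters
        elif ch == "_":
            parts.append(".")    # _ matches exactly one character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def json_search_one(doc, pattern):
    """Return the path of the first matching string value, or None."""
    regex = like_to_regex(pattern)

    def walk(node, path):
        if isinstance(node, str):
            return path if regex.fullmatch(node) else None
        if isinstance(node, list):
            for i, item in enumerate(node):
                found = walk(item, f"{path}[{i}]")
                if found is not None:
                    return found
        elif isinstance(node, dict):
            for key, value in node.items():
                found = walk(value, f"{path}.{key}")
                if found is not None:
                    return found
        return None

    return walk(json.loads(doc), "$")


print(json_search_one('["A",[{"B":"1"}],{"C":"AB"},{"D":"BC"}]', "A_"))
```

With the document from the example, `"A"` is too short for `A_`, so the first match is `"AB"` at `$[2].C`. The sketch returns the bare path, whereas the SQL result above is shown as a quoted JSON string.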

Co-authored-by: liutang123 <liulijia@gmail.com>
2024-10-11 16:33:07 +08:00
8c0f73cb90 [Enhancement](MaxCompute)Refactoring maxCompute catalog using Storage API.(#40225 , #40888 ,#41386 ) (#41610)
bp #40225 , #40888 ,#41386

## Proposed changes
Of these, #40225 introduces the new MaxCompute (mc) API,
#40888 fixes a bug when reading null that differed between the new and
old APIs, and
#41386 provides compatibility between the new and old versions.
2024-10-11 11:55:41 +08:00
6dddd4c499 [function](cast)Make string casting to integers more like MySQL's beh… (#41541)
…avior (#38847)
https://github.com/apache/doris/pull/38847
## Proposed changes

There are two issues here. First, the results of casting are
inconsistent between FE and BE.
```
FE
mysql [(none)]>select cast('3.000' as int); 
+----------------------+
| cast('3.000' as INT) |
+----------------------+
|                    3 |
+----------------------+

mysql [(none)]>set debug_skip_fold_constant = true;

BE
mysql [(none)]>select cast('3.000' as int);
+----------------------+
| cast('3.000' as INT) |
+----------------------+
|                 NULL |
+----------------------+
```
The second issue is that casting on BE converts '3.0' to null. The
casting logic for FE and BE has now been unified.
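
As a rough illustration of the unified behavior, the Python sketch below casts a decimal string to an integer. The source only shows that '3.000' should yield 3 rather than NULL. The function name `cast_string_to_int`, the truncation toward zero of a non-zero fractional part, and returning `None` for unparseable input are assumptions, and integer range checks are omitted.

```python
import re

# Optional sign, integer digits, optional fractional part (assumed grammar).
_DECIMAL = re.compile(r"\s*([+-]?\d+)(?:\.\d*)?\s*")


def cast_string_to_int(text):
    """Cast a decimal string to int, dropping the fraction; None if invalid."""
    match = _DECIMAL.fullmatch(text)
    if match is None:
        return None  # stands in for SQL NULL
    # Assumption: the fractional digits are simply discarded.
    return int(match.group(1))


print(cast_string_to_int("3.000"))
print(cast_string_to_int("abc"))
```

Under these assumptions '3.000' and '3.0' both cast to 3, matching the FE result shown above, while a non-numeric string maps to `None`.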


---------

Co-authored-by: Xinyi Zou <zouxinyi02@gmail.com>
2024-10-11 09:32:00 +08:00
0fb42d3a48 [Enhancement](tvf)catalog tvf implements user permission checks and hides sensitive information (#41497) (#41604)
bp #41497 

before #21790
## Proposed changes
This PR unifies the duplicate parts of `catalog tvf` and `show
catalogs`, adds permission check when querying `catalog tvf`, and hides
sensitive information.
2024-10-10 17:55:40 +08:00
1db0aef9b7 [feature](array_agg) support array_agg with param is array/map/struct… (#41651)
… (#40697)

This PR adds support for array_agg with parameters of array, map, and
struct type.

2024-10-10 17:54:54 +08:00
33fad04341 [opt](Nereids) use 1 instead narrowest column when do column pruning (#41548) (#41627)
pick from master #41548
2024-10-10 14:02:23 +08:00
a45dc8796a [fix](Nereids) simplify decimal comparison wrong when cast to smaller scale (#41151) (#41618)
pick from master #41151
2024-10-09 23:03:01 +08:00
649cefd70f [opt](Nereids) forbid distribute under project and filter (#39812) (#41622)
pick from master #39812
2024-10-09 23:02:06 +08:00
308700f0ca [fix](test) fix unstable test_export_external_table cases (#41523) (#41570)
bp #41523
2024-10-09 11:53:22 +08:00
afb477c66d [Fix](inverted index) Fix wrong need read data opt when enable_common_expr_pushdown is disabled #40689 (#41562)
cherry pick from #40689
2024-10-08 22:12:10 +08:00
4f81fc474c [bugfix](paimon)Get the file format by file name (#41020) (#41487)
bp #41020
2024-09-30 15:46:13 +08:00
b7db357847 [test](inverted index) refine test_ignore_above case, add compound query sql #40355 (#41445)
cherry pick from #40355
2024-09-30 07:04:08 +08:00
d659750fd9 [pick](Serde-2.1) fix variant serde may lost num_rows when subcolumns empty (#41438)
Serializing an object with empty subcolumns may lose num_rows, so
num_rows must be recorded and set back in the serdes.

backport #38413
2024-09-29 09:45:37 +08:00
727f0374be [Refactor](inverted index) refactor inverted index compound predicates evaluate logic #38908 (#41385)
cherry pick from #38908
2024-09-29 09:19:17 +08:00
0b4552f74b [cherry-pick](branch-2.1) pick hive text write from master (#40537)
## Proposed changes
pick prs:
https://github.com/apache/doris/pull/38549
https://github.com/apache/doris/pull/40183
https://github.com/apache/doris/pull/40315

---------

Co-authored-by: Calvin Kirs <kirs@apache.org>
2024-09-27 20:57:07 +08:00
82228358b9 [Fix](nereids) fix create view with nullable column (#41234) (#41393)
cherry-pick from master #41234
2024-09-27 19:13:54 +08:00
0c51ee26ea [fix](function) add time type in conditional-functions (#41270) (#41379)
## Proposed changes
https://github.com/apache/doris/pull/41270
2024-09-27 17:19:54 +08:00
eb13cd4154 [branch-2.1] Picks "[Fix](partial update) Fix __DORIS_SEQUENCE_COL__ is not set for newly inserted rows in partial update #40272" (#40964)
picks https://github.com/apache/doris/pull/40272
2024-09-26 22:54:27 +08:00
bf3d4240be [fix](window_func) fix bug of agg function used in window function and add many test cases (#40678) (#41328)
## Proposed changes

BP #40678
2024-09-26 22:50:34 +08:00
4deda2fce7 [improvement](nereids) Simplify ScanNode projection handling by removing redundant conditions (#40801) (#41315)
pick from master #40801

This PR simplifies the handling of `ScanNode` projection logic.
Previously, the code included multiple conditional checks to determine
whether a `projectionTuple` should be generated. These conditions have
been removed, and now `projectionTuple` is always generated for
`ScanNode`, ensuring a consistent projection setup. Additionally,
redundant handling of `SlotId` and `SlotRef` has been eliminated, making
the code cleaner and easier to maintain. The behavior for `OlapScanNode`
remains unchanged.
2024-09-26 10:35:01 +08:00
a11fd62043 [fix](window function) Fix illegal frame range (#41147) (#41305)
pick #41147

0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /home/zcp/repo_center/doris_master/doris/be/src/common/signal_handler.h:421
1# 0x00007F591D573520 in /lib/x86_64-linux-gnu/libc.so.6
2# pthread_kill at ./nptl/pthread_kill.c:89
3# raise at ../sysdeps/posix/raise.c:27
4# abort at ./stdlib/abort.c:81
5# _nl_load_domain at ./intl/loadmsgcat.c:1177
6# 0x00007F591D56AE96 in /lib/x86_64-linux-gnu/libc.so.6
7# doris::vectorized::PODArray<unsigned char, 4096ul, Allocator<false, false, false, DefaultMemoryAllocator>, 16ul, 15ul>::operator[](long) const at /home/zcp/repo_center/doris_master/doris/be/src/vec/common/pod_array.h:365
8# doris::vectorized::ColumnNullable::is_null_at(unsigned long) const at /home/zcp/repo_center/doris_master/doris/be/src/vec/columns/column_nullable.h:158
9# doris::vectorized::ReaderFirstAndLastData<doris::vectorized::ColumnVector<double>, true, true, false>::insert_result_into(doris::vectorized::IColumn&) const at /home/zcp/repo_center/doris_master/doris/be/src/vec/aggregate_functions/aggregate_function_reader_first_last.h:125
10# doris::pipeline::AnalyticLocalState::_insert_result_info(long) in /mnt/hdd01/PERFORMANCE_ENV/be/lib/doris_be
11# std::_Function_handler<void (long), std::_Bind_result<void, void (doris::pipeline::AnalyticLocalState::*(doris::pipeline::AnalyticLocalState*, std::_Placeholder<1>))(long)> >::_M_invoke(std::_Any_data const&, long&&) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_function.h:291
12# std::function<void (long)>::operator()(long) const at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_function.h:560
13# doris::pipeline::AnalyticLocalState::_get_next_for_rows(unsigned long) in /mnt/hdd01/PERFORMANCE_ENV/be/lib/doris_be
14# std::enable_if<is_invocable_r_v<doris::Status, doris::Status (doris::pipeline::AnalyticLocalState::*&)(unsigned long), doris::pipeline::AnalyticLocalState*&, unsigned long>, doris::Status>::type std::__invoke_r<doris::Status, doris::Status (doris::pipeline::AnalyticLocalState::*&)(unsigned long), doris::pipeline::AnalyticLocalState*&, unsigned long>(doris::Status (doris::pipeline::AnalyticLocalState::*&)(unsigned long), doris::pipeline::AnalyticLocalState*&, unsigned long&&) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:114
15# doris::Status std::_Bind_result<doris::Status, doris::Status (doris::pipeline::AnalyticLocalState::*(doris::pipeline::AnalyticLocalState*, std::_Placeholder<1>))(unsigned long)>::__call<doris::Status, unsigned long&&, 0ul, 1ul>(std::tuple<unsigned long&&>&&, std::_Index_tuple<0ul, 1ul>) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/functional:570
16# std::_Function_handler<doris::Status (unsigned long), std::_Bind_result<doris::Status, doris::Status (doris::pipeline::AnalyticLocalState::*(doris::pipeline::AnalyticLocalState*, std::_Placeholder<1>))(unsigned long)> >::_M_invoke(std::_Any_data const&, unsigned long&&) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_function.h:291
17# std::function<doris::Status (unsigned long)>::operator()(unsigned long) const at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_function.h:560
18# doris::pipeline::AnalyticSourceOperatorX::get_block(doris::RuntimeState*, doris::vectorized::Block*, bool*) in /mnt/hdd01/PERFORMANCE_ENV/be/lib/doris_be
19# doris::pipeline::OperatorXBase::get_block_after_projects(doris::RuntimeState*, doris::vectorized::Block*, bool*) at /home/zcp/repo_center/doris_master/doris/be/src/pipeline/exec/operator.cpp:322
20# doris::pipeline::PipelineTask::execute(bool*) in /mnt/hdd01/PERFORMANCE_ENV/be/lib/doris_be
21# doris::pipeline::TaskScheduler::_do_work(unsigned long) at /home/zcp/repo_center/doris_master/doris/be/src/pipeline/task_scheduler.cpp:138
22# doris::ThreadPool::dispatch_thread() in /mnt/hdd01/PERFORMANCE_ENV/be/lib/doris_be
23# doris::Thread::supervise_thread(void*) at /home/zcp/repo_center/doris_master/doris/be/src/util/thread.cpp:499
24# start_thread at ./nptl/pthread_create.c:442
25# 0x00007F591D657850 at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:83
2024-09-26 09:55:33 +08:00
5b3b2cec80 [feat](metatable) support table$partitions for hive table (#40774) (#41230)
bp #40774
and pick part of #34552, add `isPartitionedTable()` interface in `TableIf`
2024-09-25 09:52:07 +08:00
8bb57bcc3e [fix] (inverted index) fix the error in the query result when using count on index (#41200)
## Proposed changes

Introduced by #39473
2024-09-24 19:47:18 +08:00
2b427c316a [feature](functions) impl scalar functions normal_cdf,to_iso8601,from_iso8601_date (#40695) (#41049)
bp #40695
2024-09-24 09:52:39 +08:00
0d38a9a36d [feature](restore) support atomic restore (#41107)
Cherry-pick #40353, #40734, #40817, #40876, #40921, #41017, #41083
2024-09-24 09:41:41 +08:00
48e60f3ff3 [Fix](inverted index) fix wrong opt for count_on_index #41127 (#41154)
cherry pick from #41127
2024-09-23 22:45:52 +08:00