Commit Graph

20446 Commits

Author SHA1 Message Date
8409f24062 [fix](Nereids) fix fold constant by be return type mismatched (#39723)(#41164)(#41331)(#41546) (#41838)
cherry-pick: #39723 #41164 #41331 #41546. Each later problem was introduced by the previous one, so they are picked together.
When folding constants on BE,
the return type of substring('123456', 1, 3) would be changed to text, whereas we want it to be 3.
Remove the window frame in window expressions to avoid folding constants on BE.
2024-10-18 20:34:03 +08:00
03136baacf [fix](scanner) Fix incorrect _max_thread_num in scanner context when many queries are running. #41273 (#42016)
cherry pick from #41273
2024-10-18 18:08:07 +08:00
e64f2e68e0 [opt](nereids) refine stats derive (#40654) (#40698) (#42050)
pick from master #40654 #40698
2024-10-18 16:18:10 +08:00
1236cfd159 [fix](Nereids) fix Is Cached is not Yes in Profile when enable_sql_cache=true (#42032) (#42034)
Fix `Is Cached` not being Yes in Profile when enable_sql_cache=true, introduced by #33262
2024-10-18 16:17:15 +08:00
28066a0854 [fix](mtmv) Fix compensate union all wrongly when query rewrite by materialized view #40803 (#42019)
## Proposed changes

pr: https://github.com/apache/doris/pull/40803
commitId: d7e5d461
2024-10-18 12:10:53 +08:00
cec0458860 [branch-2.1][chore](dependencies)upgrade fe dependencies (#41142) (#42056)
## Proposed changes
upgrade commons-configuration2 to 2.11.0
upgrade logging-interceptor to 4.12.0
upgrade commons-compress to 1.27.1
upgrade jetty-bom to 9.4.56.v20240826
upgrade azure-sdk to 1.2.27

Iceberg depends on configuration2, and configuration2 relies on a newer
version of commons-lang3. However, there were significant breaking
changes in commons-lang3, which made it incompatible
(https://issues.apache.org/jira/browse/LANG-1705). As a result, I rewrote
the clone method.

(cherry picked from commit 945edf8dbffaa25c987bcefad59b6cde52772d4f)

2024-10-18 09:54:12 +08:00
1332f286a5 [chore](dependencies)upgrade some dependencies (#41901) (#42047)
(cherry picked from commit 9cfc3f16681cd1ef5b8371e03d88b014e4c0a3a0)

#41901
2024-10-17 21:49:21 +08:00
fb12e10272 [fix](array-funcs)fix array agg func with decimal type (#40839) (#42023)
## Proposed changes
backport: (https://github.com/apache/doris/pull/40839)
2024-10-17 20:47:39 +08:00
5fe37c0708 [Feat](Nereids) support fold constant by fe (#40441)(#40772)(#40744)(#40745)(#40820) (#41837)
cherry-pick from master
#40441 
#40772 
#40744 
#40745
#40820
2024-10-17 20:43:17 +08:00
80d7523a62 [Feat](Nereids) support use cbo rule hint #35925 #39715 #40167 #40958 (#41869)
pick: #35925 #39715 #40167 #40958
Add hints that force a CBO rule to be used or not used, and pick the related fix PRs.

Introduction
Without this hint, a CBO rule such as INFER_SET_OPERATOR_DISTINCT
generates two plans, and the Nereids optimizer compares their costs
to decide which is better. To control the behavior of a CBO rule
directly, use the force CBO rule hint.
Usage example
explain shape plan
select /*+ USE_CBO_RULE(INFER_SET_OPERATOR_DISTINCT) */
*
from t1
union
select * from t2;
The USE_CBO_RULE(INFER_SET_OPERATOR_DISTINCT) hint forces the rule
INFER_SET_OPERATOR_DISTINCT to be used
and generates a plan like the following, in which the hashAgg below the
union is generated by this rule:

-- !with_hint_union_distinct --
----hashAgg[GLOBAL]
--------hashAgg[LOCAL]
----------PhysicalUnion
--------------hashAgg[LOCAL]
----------------PhysicalOlapScan[t1]
--------------hashAgg[LOCAL]
----------------PhysicalOlapScan[t2]
Hint log:
Used: INFER_SET_OPERATOR_DISTINCT
UnUsed:
SyntaxError:
To force-disable this rule, use
explain shape plan select /*+ NO_USE_CBO_RULE(INFER_SET_OPERATOR_DISTINCT) */ * from t1 union select * from t2;
which generates a plan without this rule applied:

-- !with_hint_no_union_distinct --
----hashAgg[GLOBAL]
--------hashAgg[LOCAL]
----------PhysicalUnion
--------------PhysicalOlapScan[t1]
--------------PhysicalOlapScan[t2]
Hint log:
Used: NO_INFER_SET_OPERATOR_DISTINCT
UnUsed:
SyntaxError:
change sessionvariable enableNereidsRules to varType.remove
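
The selection behavior described in this message can be sketched roughly as follows. This is a minimal illustration, not the Nereids implementation: the `Plan` class, the `choose_plan` function, and the `"USE"` / `"NO_USE"` encoding of the hint are all hypothetical names introduced here, and the cost numbers are made up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Plan:
    """Hypothetical stand-in for a candidate plan with an estimated cost."""
    name: str
    cost: float


def choose_plan(original: Plan, rewritten: Plan, hint: Optional[str] = None) -> Plan:
    """Pick between the plan without the rule and the plan with the rule.

    hint is "USE", "NO_USE", or None (an assumed encoding of the
    USE_CBO_RULE / NO_USE_CBO_RULE hints for one specific rule).
    """
    if hint == "USE":
        return rewritten   # forced on, regardless of cost
    if hint == "NO_USE":
        return original    # forced off, regardless of cost
    # No hint: fall back to the cost comparison.
    return rewritten if rewritten.cost < original.cost else original


plain_union = Plan("union", cost=10.0)
agg_below_union = Plan("union over hashAgg[LOCAL]", cost=12.0)

print(choose_plan(plain_union, agg_below_union).name)
print(choose_plan(plain_union, agg_below_union, hint="USE").name)
```

With these made-up costs the unhinted call keeps the plain union, while the forced call returns the rewritten plan even though it is estimated to be more expensive.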
2024-10-17 20:36:03 +08:00
1245df670f [feat](nereids) adjust stats derive by delta row #39222 (2.1) (#42025)
## Proposed changes
pick #39222
wait JiBin merge updateRows

2024-10-17 19:54:52 +08:00
Pxl
4d04db467e [Bug](predicate) Fixed the problem that the number of rows in inlist #41824 (#41910)
pick from #41824
2024-10-17 17:13:00 +08:00
Pxl
f4d9ddcb00 [Improvement](runtime-filter) set some rf brpc request to ignore_eovercrowded #41698 (#41897)
pick from #41698
2024-10-17 16:57:26 +08:00
669f59ce5a [branch-2.1][feat](job)Implementing Job in Nereids (#41391) (#42012)
## Proposed changes

The JOB's execution SQL is currently defined by an older CUP file, which
causes some issues with lexical analysis in the new optimizer as it
doesn't pass under the old optimizer. Since the JOB's underlying
execution already uses the new optimizer, we're planning to fully
migrate to ANTLR4 for consistency.

(cherry picked from commit 334b473deb5ff2e5c29c5eedcfac95dd806ae622)

#41391
2024-10-17 16:56:36 +08:00
5806dae467 [fix](move-memtable) do not retry open streams (#41550) (#41999)
backport #41550
2024-10-17 15:56:56 +08:00
4521404849 [fix](case) test_limit result is unstable (#41938) (#41977)
pick from master #41938
2024-10-17 15:36:59 +08:00
3fcd64366f [opt](Nereids) use 1 as narrowest column when do column pruning on union (#41719) (#41975)
pick from master #41719

Just like the previous PR #41548,

this PR processes the union node so that it does not require any column
from its children when its parent requires an empty slot set.
2024-10-17 15:28:27 +08:00
b4875c2789 [fix](jni)fix jni use timezone_obj get timezone be core. (#41956) (#42003)
bp #41956 

PR #40225 tried to pass time zone info from BE to JNI, and it uses
`_state->timezone_obj().name()`
to get the time zone name.
But during a rolling upgrade of BE, it may coredump like:

```
*** SIGSEGV address not mapped to object (@0x610) received by PID 72661 (TID 73538 OR 0x7f2e898d1640) from PID 1552; stack trace: ***
 0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /home/zcp/repo_center/doris_branch-2.1/doris/be/src/common/signal_handler.h:421
 1# os::Linux::chained_handler(int, siginfo_t*, void*) in /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
 2# JVM_handle_linux_signal in /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
 3# signalHandler(int, siginfo_t*, void*) in /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
 4# 0x00007F3070D3E520 in /lib/x86_64-linux-gnu/libc.so.6
 5# cctz::time_zone::name[abi:cxx11]() const in /mnt/hdd01/ci/compatibility-deploy/be/lib/doris_be
 6# doris::vectorized::JniConnector::open(doris::RuntimeState*, doris::RuntimeProfile*) at /home/zcp/repo_center/doris_branch-2.1/doris/be/src/vec/exec/jni_connector.cpp:87
 7# doris::vectorized::AvroJNIReader::init_fetch_table_schema_reader() at /home/zcp/repo_center/doris_branch-2.1/doris/be/src/vec/exec/format/avro/avro_jni_reader.cpp:119
 8# std::_Function_handler::_M_invoke(std::_Any_data const&) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_function.h:291
 9# doris::WorkThreadPool::work_thread(int) at /home/zcp/repo_center/doris_branch-2.1/doris/be/src/util/work_thread_pool.hpp:159
10# execute_native_thread_routine at ../../../../../libstdc++-v3/src/c++11/thread.cc:84
11# start_thread at ./nptl/pthread_create.c:442
12# 0x00007F3070E22850 at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:83
172.20.50.206 last coredump sql: 2024-10-13 04:12:23,985 [query] 
```

This PR uses another method, `_state->timezone()`, which just returns a
string instead of reading and initializing the
time zone info file, to avoid a potential coredump.
2024-10-17 14:47:33 +08:00
67d057a711 [cherry-pick](branch-21) fix conv function parser string failure return wrong result (#40530) (#41964)
## Proposed changes

Issue Number: close #39618
cherry-pick from master (#40530)
2024-10-17 14:45:46 +08:00
0b41cd2472 [fix](serde)fix the bug in DataTypeNullableSerDe.deserialize_column_from_fixed_json (#41217) (#41960)
bp #41217 

2024-10-17 14:36:01 +08:00
1c5c27eceb [Enhancement](ExternalTable)Optimize the performance of getCachedRowCount when reading ExternalTable (#41659) (#41959)
bp #41659 
## Proposed changes

ExternalTable initializes a previously uninitialized table when
`getCachedRowCount()` is called, which is unnecessary. For an
uninitialized table, we now return -1 directly.
This speeds up queries on `information_schema.tables`.
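
The idea can be illustrated with a small sketch. It assumes a hypothetical `ExternalTableSketch` class with an `initialize()` step; none of these names come from the Doris code base, and the row count of 42 is a placeholder.

```python
from typing import Optional

UNKNOWN_ROW_COUNT = -1  # sentinel returned when no count is available


class ExternalTableSketch:
    """Hypothetical table object whose initialization is expensive."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.initialized = False
        self.init_calls = 0
        self._row_count: Optional[int] = None

    def initialize(self) -> None:
        # Stands in for a costly metadata fetch from the external system.
        self.init_calls += 1
        self.initialized = True
        self._row_count = 42  # placeholder value

    def get_cached_row_count(self) -> int:
        # Do not trigger initialization just to answer a cache lookup.
        if not self.initialized:
            return UNKNOWN_ROW_COUNT
        return self._row_count if self._row_count is not None else UNKNOWN_ROW_COUNT


table = ExternalTableSketch("t")
print(table.get_cached_row_count(), table.init_calls)
```

A listing over many tables then only pays the initialization cost for tables that something else has already initialized.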
2024-10-17 14:34:23 +08:00
968e33f07e [cherry-pick](branch-21) pick (#39057) (#41352) (#41958)
## Proposed changes

pick from master (#39057) (#41352)

---------

Co-authored-by: Mryange <59914473+Mryange@users.noreply.github.com>
2024-10-17 14:30:40 +08:00
1b901f6fcc [cherry-pick](branch-2.1) add parquet tvf cases and fix some parquet bug (#41931)
## Proposed changes
pick pr:
  https://github.com/apache/doris/pull/41683
  https://github.com/apache/doris/pull/41506
  https://github.com/apache/doris/pull/41338
  https://github.com/apache/doris/pull/39326

---------

Co-authored-by: morningman <morningman@163.com>
2024-10-17 14:20:58 +08:00
eea916e466 [cherry-pick](branch-21) should check the expr of auto range partition (#41626) (#41872)
## Proposed changes

cherry-pick from master (#41626)

2024-10-17 12:45:49 +08:00
b8214952a1 [branch-2.1] Fix is_partial_update parameter is not set in append_block_with_partial_content() (#41865)
https://github.com/apache/doris/pull/41439 forgot to set the
`is_partial_update` parameter of `Tablet::lookup_row_key()` in
`append_block_with_partial_content()`
2024-10-17 12:44:41 +08:00
3ff67350d0 [opt](Nereids) support all syntax to avoid fallback in multi-statement query (#41811)
2024-10-17 12:42:01 +08:00
5736dc537c [fix](mtmv) Fix duplicate column name not check when create materialized view #40658 (#41822)
## Proposed changes

pr: https://github.com/apache/doris/pull/40658
commitId: 252aeeb6
2024-10-17 12:19:15 +08:00
95c0a7a8e3 [chore](planner) change decimal literal toSql as plainString (#41809) (#41976)
pick from master #41809

for example, 0 with decimal(38,4) will return 0.0000
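
As an analogy for the plain-string rendering, the sketch below uses Python's `decimal` module rather than the FE's Java code. The function name `decimal_literal_to_sql` is hypothetical, and the precision of 76 is just a generous working value chosen here.

```python
from decimal import Decimal, localcontext


def decimal_literal_to_sql(value: Decimal, scale: int) -> str:
    """Render a decimal with exactly `scale` fractional digits, never in
    scientific notation (hypothetical helper, not a Doris API)."""
    with localcontext() as ctx:
        ctx.prec = 76  # generous working precision for wide decimals
        quantized = value.quantize(Decimal(1).scaleb(-scale))
    # The "f" format always produces fixed-point (plain) output.
    return format(quantized, "f")


print(decimal_literal_to_sql(Decimal(0), 4))
print(str(Decimal("0E-10")))
print(decimal_literal_to_sql(Decimal(0), 10))
```

The second line shows why a plain form matters: the default string form of a zero with a large scale falls back to exponent notation, while the fixed-point form keeps all the zeros.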
2024-10-17 12:16:10 +08:00
5521a25392 [enhance](insertoverwrite)insert overwrite not fallback (#41799)
- Insert overwrite on NEREIDS can automatically clean up garbage
temporary partitions after restart, which is not available on old
optimizers
- When insert fails, no longer throw nereids exceptions
2024-10-17 12:14:17 +08:00
7daf423e98 [opt](profile) Move ExecutedByFrontend to execution summary profile #41761 (#41831)
cherry pick from #41761
2024-10-17 11:37:48 +08:00
19784d420c [opt](inverted index) Improved top-N optimization by refining the sorting column check. (#39496) (#41954)
https://github.com/apache/doris/pull/39496
2024-10-17 11:31:11 +08:00
cf2ec26bc2 [fix](catalog) should return error if try using a unknown database (#40479) (#41971)
bp #40479
2024-10-17 11:13:56 +08:00
169a12058b [chore](Variant) forbid variant type as hash join key (#41673) (#41974)
pick from master #41673
2024-10-17 11:06:37 +08:00
f98aa1d08b [Fix](Branch-2.1) fix fallback to legacy planner when set group commit in session variable (#41984)
2024-10-17 10:40:33 +08:00
0b6447faeb [Fix](SchemaChange) refactor variant root column iterator to make row… (#41941)
pick #41700
2024-10-17 10:39:07 +08:00
d04082f685 [improvement](statistics)Use min row count of all replicas as tablet/table row count. (#41894) (#41978)
backport: https://github.com/apache/doris/pull/41894
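
A rough sketch of the aggregation named in the title, under stated assumptions: the function names are hypothetical, the table count is assumed to be the sum of per-tablet counts, and an empty replica list is assumed to count as 0. None of these details are confirmed by the commit message.

```python
from typing import Dict, List


def tablet_row_count(replica_row_counts: List[int]) -> int:
    """Use the smallest row count reported by any replica of a tablet.
    Returning 0 for an empty list is an assumption of this sketch."""
    return min(replica_row_counts) if replica_row_counts else 0


def table_row_count(tablets: Dict[int, List[int]]) -> int:
    """Assumed aggregation: sum the per-tablet minimums over all tablets."""
    return sum(tablet_row_count(counts) for counts in tablets.values())


# Tablet id -> row counts reported by each replica (made-up numbers).
replicas_by_tablet = {1: [100, 98, 100], 2: [50, 50, 49]}
print(table_row_count(replicas_by_tablet))
```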
2024-10-16 21:45:37 +08:00
7d99d5fcc4 [fix](analytic) Fix data distribution after analytic operator (#41902) (#41949)
Fix data distribution after analytic operator

pick #41902
2024-10-16 18:41:56 +08:00
5bd33fc88c [pick](branch-2.1) pick #41292 #41350 #41589 #41628 #41743 #41601 #41667 #41751 (#41927)
## Proposed changes

pick #41292 #41350 #41589 #41628 #41743 #41601 #41667 #41751

---------

Co-authored-by: Pxl <pxl290@qq.com>
2024-10-16 15:41:28 +08:00
e56216211e [pick](branch-2.1) pick #40667 #40714 (#41905)
pick
#40667
#40714

---------

Co-authored-by: wangbo <wangbo@apache.org>
2024-10-16 14:09:03 +08:00
95d429b7de [pick](branch-2.1) pick #41891 (#41929)
pick #41891
2024-10-16 13:59:46 +08:00
e6545a36a3 [improvement](iceberg)Parallelize splits for count(*) for 2.1 (#41169) (#41880)
bp: #41169
2024-10-16 10:52:06 +08:00
e62e47700d [fix](Nereids) fixed the limit offset error pick 39316 (#41878)
2024-10-16 10:38:05 +08:00
b185dfcbf6 [pick](branch-2.1) pick #41676 #41740 #41857 (#41904)
pick #41676 #41740 #41857
2024-10-15 22:41:17 +08:00
a4b7d93ded [bugfix](iceberg)add prefix for endpoint with s3 client for 2.1 (#41336) (#41877)
bp: #41336
2024-10-15 19:59:10 +08:00
b91d8e2327 [Improvement](minor) Reduce locking scope (#41845) (#41844)
pick #41845
2024-10-15 18:39:53 +08:00
78b6157aa9 [fix](ip/variant) fix information meta (#41871)
Fix datatype information meta for ip/variant (#41666)

2024-10-15 18:01:14 +08:00
5fbefa084c [opt](hive) make supported hive table error msg clearer (#41616) (#41851)
bp #41616
2024-10-15 17:36:27 +08:00
f3389973e0 [fix](heartbeat) fill default value for required field in TFrontendPingFrontendResult (#41609) (#41854)
bp #41609
2024-10-15 17:33:46 +08:00
24ceb60ac7 [fix](glue) support glue on aws (#41084) (#41855)
bp #41084
2024-10-15 17:33:25 +08:00
abcba778ff [fix](cancel) Fix cancel msg on branch-2.1 (#41798)
Make sure the cancel reason distinguishes between:
1. user cancel
2. timeout
3. others

```text
mysql [demo]>set query_timeout=1;
--------------
set query_timeout=1
--------------

Query OK, 0 rows affected (0.00 sec)

mysql [demo]>select sleep(5);
--------------
select sleep(5)
--------------

ERROR 1105 (HY000): errCode = 2, detailMessage = Timeout

mysql [demo]>select sleep(5);
--------------
select sleep(5)
--------------

^C^C -- sending "KILL QUERY 0" to server ...
^C -- query aborted
ERROR 1105 (HY000): errCode = 2, detailMessage = cancel query by user from 127.0.0.1:64208
```
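
On the client side, the three cases above could be told apart from the `detailMessage` text with something like the sketch below. The `CancelReason` enum and `classify_cancel` function are hypothetical, and the matching relies only on the two message strings shown in the session above.

```python
from enum import Enum


class CancelReason(Enum):
    USER_CANCEL = "user cancel"
    TIMEOUT = "timeout"
    OTHER = "other"


def classify_cancel(detail_message: str) -> CancelReason:
    """Map an error detail message to a coarse cancel reason.
    The substrings are taken from the sample output; any other
    message falls through to OTHER."""
    msg = detail_message.lower()
    if "timeout" in msg:
        return CancelReason.TIMEOUT
    if "cancel query by user" in msg:
        return CancelReason.USER_CANCEL
    return CancelReason.OTHER


print(classify_cancel("Timeout"))
print(classify_cancel("cancel query by user from 127.0.0.1:64208"))
```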
2024-10-15 17:15:05 +08:00