Commit Graph

8043 Commits

Author SHA1 Message Date
db0724dfe0 [Fix-2.1](function) fix function covar core for not null input (#39943)
## Proposed changes

Issue Number: close #xxx

add testcases like:
```groovy
    qt_notnull1 "select covar_samp(non_nullable(x), non_nullable(y)) from test_covar_samp"
    qt_notnull2 "select covar_samp(x, non_nullable(y)) from test_covar_samp"
    qt_notnull3 "select covar_samp(non_nullable(x), y) from test_covar_samp"
```

before they will all coredump in 2.1
2024-08-27 08:39:47 +08:00
21bd4a4ac8 [bug](function)fix json_replace check return type error (#37014) (#39938)
1. fix the return type dcheck error:
```
mysql [test]>select (json_replace(a, '$.fparam.nested_2', "qwe")) from json_table_2 limit 1;
ERROR 1105 (HY000): errCode = 2, detailMessage = (10.16.10.8)[INTERNAL_ERROR]Function json_replace get failed, expr is VectorizedFnCall[json_replace](arguments=a, String, String, String,return=Nullable(String)) and return type is Nullable(String).
```

2. improve the json_replace/json_insert/json_set function execute of not
convert const column, test about could faster 1s on 1000w table rows

## Proposed changes

Issue Number: close #xxx

<!--Describe your changes.-->
2024-08-27 08:03:48 +08:00
a5c8ed1cde [branch2.1][fix](cache) Catch the directory_iterator's error_code (#39922)
## Proposed changes

Catch the directory_iterator's error_code to avoid exceptions causing
core dump
2024-08-27 08:00:52 +08:00
a6e485e54f [fix](followup) Fix wrong partition num (#39917) 2024-08-26 18:25:36 +08:00
357394bb3e [branch-2.1]Reset io limit default value (#39898)
pick #39842
2024-08-26 14:27:35 +08:00
5acd1279a9 [fix](2.1) Fix correctness in branch-2.1 (#39901)
## Proposed changes

Issue Number: close #xxx

<!--Describe your changes.-->
2024-08-26 14:12:59 +08:00
4c1c67e03a [improvemen](overflow) Provide the user with a suggestion to avoid th… (#39631) (#39897)
cherry-pick #39631 to branch-2.1
2024-08-26 08:10:32 +08:00
f2a37d58fb [fix](stat) handle overflow of memory stat if load failed (#39621) (#39887)
## Proposed changes

pick #39621 

Issue Number: close #xxx

<!--Describe your changes.-->
2024-08-25 18:24:08 +08:00
8dbd73988a [fix](recvr) catch exception of transmit_block (#39882)
BP #39881
2024-08-25 00:25:20 +08:00
2dea859bdb [debug](rpc) debug rpc time consumption problem (#39852)
## Proposed changes

Issue Number: close #xxx

Add detail RPC time info for each channel, sorted by max rpc time of
channels:
```
                     DATA_STREAM_SINK_OPERATOR  (id=1,dst_id=1):
                          -  Partitioner:  Crc32HashPartitioner(64)
                          -  BlocksProduced:  74
                          -  BrpcSendTime:  2.689us
                          -  BrpcSendTime.Wait:  0ns
                          -  BytesSent:  89.35  KB
                          -  CloseTime:  680.152us
                          -  CompressTime:  0ns
                          -  ExecTime:  160.663ms
                          -  InitTime:  263.608us
                          -  InputRows:  32.512K  (32512)
                          -  LocalBytesSent:  0.00  
                          -  LocalSendTime:  0ns
                          -  LocalSentRows:  0
                          -  MemoryUsage:  
                              -  PeakMemoryUsage:  80.00  KB
                          -  MergeBlockTime:  0ns
                          -  OpenTime:  4.113ms
                          -  OverallThroughput:  0.0  /sec
                          -  PendingFinishDependency:  41.179ms
                          -  RowsProduced:  32.512K  (32512)
                          -  RpcAvgTime:  11.850ms
                          -  RpcCount:  10
                          -  RpcMaxTime:  86.891ms
                          -  RpcMinTime:  15.200ms
                          -  RpcSumTime:  118.503ms
                          -  SerializeBatchTime:  13.517ms
                          -  SplitBlockDistributeByChannelTime:  38.923ms
                          -  SplitBlockHashComputeTime:  2.659ms
                          -  UncompressedRowBatchSize:  135.19  KB
                          -  WaitForDependencyTime:  0ns
                              -  WaitForRpcBufferQueue:  0ns
                        RpcInstanceDetails:
                              -  Instance  85d4f75b72a9ea61:  Count:  4,  MaxTime:  36.238ms,  MinTime:  12.107ms,  AvgTime:  21.722ms,  SumTime:  86.891ms
                              -  Instance  85d4f75b72a9ea91:  Count:  3,  MaxTime:  11.107ms,  MinTime:  2.431ms,  AvgTime:  5.470ms,  SumTime:  16.412ms
                              -  Instance  85d4f75b72a9eac1:  Count:  3,  MaxTime:  7.554ms,  MinTime:  3.160ms,  AvgTime:  5.066ms,  SumTime:  15.200m
```
2024-08-24 19:59:39 +08:00
5a810122a2 [debug](load) check the column type when string column is invalid (#39337)
## Proposed changes

Issue Number: close #xxx

<!--Describe your changes.-->
2024-08-24 18:14:21 +08:00
263746b04b [fix](paimon) fix crash when enable cache with paimon deletion vector(#39877) (#39875)
bp #39877
2024-08-24 17:58:20 +08:00
14a2a66106 [fix](paimon) fix not able to read paimon data from hdfs with HA (#39806) (#39876)
bp #39806
2024-08-24 17:51:15 +08:00
460605ae3c [branch-2.1] pick some prs (#39860)
## Proposed changes

Issue Number: close #xxx

https://github.com/apache/doris/pull/38385 optimize parsing datetime
https://github.com/apache/doris/pull/38978 make stream load failure
message more clear and disable some error's stacktrace by default
https://github.com/apache/doris/pull/39255 fix random function coredump
https://github.com/apache/doris/pull/39324 fix function corr
inconsistency with doc
https://github.com/apache/doris/pull/39449 check auto partitoin nullity
when creating partition
https://github.com/apache/doris/pull/39695 make
DynamicPartitionScheduler immediately know interval's change
https://github.com/apache/doris/pull/39754 Add some partition expr check
on creating table
2024-08-24 17:26:42 +08:00
564d3cd647 [Performance](opt) opt the order by performance in permutation (#39089)
## Proposed changes

Issue Number: cherry pick #38985

<!--Describe your changes.-->
2024-08-24 16:05:46 +08:00
a6f267c479 [pick](Variant) fix element_at should return nullable if result type is nullable (#39846)
#39732
2024-08-24 09:22:03 +08:00
ae4d747c13 [branch-2.1](memory) Modify memory gc conf and add crash_in_alloc_large_memory_bytes (#39834)
pick #39611
2024-08-24 09:21:35 +08:00
8cf6c6a2b5 [fix](agg function) incorrect result of map agg(#39743) (#39854)
## Proposed changes

pick #39743
2024-08-24 09:19:52 +08:00
3103bb08dc [pick](Variant) casting to decimal type may lost precision (#39843)
#39650
2024-08-23 22:47:32 +08:00
37443aa7e1 [improve](move-memtable) reuse connection in load_stream_stub (#39231) (#39762)
backport #39231
2024-08-23 22:46:28 +08:00
5b124a03ba [enhancement](err-msg) Add detailed column and schema info when failed to create a column iterator (#38689) (#39861)
As title.
2024-08-23 21:39:19 +08:00
6ceb574aa0 [branch-2.1]Pick IO limit/workload group usage table (#39839) 2024-08-23 18:51:47 +08:00
baf5b71b39 [branch-2.1](memory) Modify thedefault JEMALLOC_CONF and support flush Jemalloc tcache (#39829)
pick #38185
2024-08-23 17:21:42 +08:00
c40246efa9 [bugfix](iceberg)Fixed random core with writing iceberg partitioned table for 2.1 (#39808)(#39569) (#39832)
## Proposed changes

bp: #39808 #39569
2024-08-23 17:19:48 +08:00
7a7292ad5a [branch-2.1][Refactor]use async to get be resource (#38389) (#39826)
pick #38389
2024-08-23 17:16:19 +08:00
e03b887a97 [opt](MultiCast) Avoid copying while holding a lock (#37462) (#39816)
Previously, copying was done while holding a lock; Now, get block while
holding the lock and then copy
https://github.com/apache/doris/pull/37462
## Proposed changes

Issue Number: close #xxx

<!--Describe your changes.-->
2024-08-23 17:15:34 +08:00
0934fbee7e [improvement](query) prefer to chose tablet on alive disk #39467 (#39654)
cherry pick from #39467
2024-08-23 12:23:12 +08:00
1f16daa5f6 Revert "[bugfix](iceberg)clear block for partition values for 2.1 (#39569)" (#39815)
Reverts apache/doris#39729
2024-08-23 11:58:42 +08:00
9d5468d198 [branch-2.1](memory) BE memory info compatible with CgroupV2 (#39799)
pick #39256
2024-08-23 02:03:00 +08:00
1367f74e7a [branch-2.1](memory) Optimize ClearCacheActionimplementation (#39796)
pick #38438
2024-08-23 01:51:14 +08:00
0f8bd33077 [fix](scan) fix predicate contains cast that results in null, the pr… (#39809)
…edicate will be miss. (#39550)
https://github.com/apache/doris/pull/39550
```
drop table datetest;

create table datetest (
  id int,
  dt date
)
DUPLICATE key (id)
distributed by hash(id) buckets 1
properties(
  "replication_num" = "1"
);
insert into datetest values (1, '2024-01-01');

mysql [test10]>select dt from datetest  WHERE dt = 1 ;
+------------+
| dt         |
+------------+
| 2024-01-01 |
+------------+
```

now

```
mysql [test10]>select dt from datetest  WHERE dt = 1 ;
Empty set (0.16 sec)
```

<!--Describe your changes.-->

## Proposed changes

Issue Number: close #xxx

<!--Describe your changes.-->
2024-08-23 01:46:22 +08:00
dc732fe33f [bugfix](iceberg)clear block for partition values for 2.1 (#39569) (#39729)
## Proposed changes

bp: #39569

clear block, or we will get wrong partition values.
2024-08-22 22:43:02 +08:00
06a0b35704 [chore] Comment for tv_nsec (#39752)
just comment.
2024-08-22 22:16:21 +08:00
04e993c1de [refine](pipeline) refine some VDataStreamRecvr code (#35063) (#37802)
## Proposed changes
https://github.com/apache/doris/pull/35063
https://github.com/apache/doris/pull/35428
2024-08-22 19:55:17 +08:00
13b882a4cc [branch-2.1](memory) Add memory metrics to bvar (#39763)
pick #38391
2024-08-22 17:34:30 +08:00
8ce8887b75 [branch-2.1](memory) Refactor refresh workload groups weighted memory ratio and record refresh interval memory growth (#39760)
pick #38168
overwrites changes in #37221 on workload_group_manager.cpp. If need to
pick 37221, ignore it.
2024-08-22 17:33:11 +08:00
ba7baa7e6b [fix](window_funnel) fix upgrading problem caused by behaviour change of window_funnel (#39766)
## Proposed changes

Issue Number: close #xxx

For the latest published 2.1 version `2.1.5`:
```
max_be_exec_version=4;
AGG_FUNCTION_NEW=2;
```
and `branch-2.1`:
```
max_be_exec_version=5;
AGG_FUNCTION_NEW=2;
```
It will cause problem when upgrading.
This PR fix the problem, set `AGG_FUNCTION_NEW` to `5`.
2024-08-22 17:26:51 +08:00
1c566253a8 [Pick][Improment]Query queued by be memory (#37559) (#39733)
pick #37559
2024-08-22 15:14:47 +08:00
a55e109e97 [pick][Improment]Add schema table workload_group_privileges (#38436) (#39708)
pick #38436
2024-08-22 00:44:43 +08:00
0e694f19db [fix](merge-on-write) segcompaction should process delete bitmap if necessary (#38369) (#39707)
## Proposed changes

Issue Number: close #xxx

cherry-pick #38369 and #38800
2024-08-22 00:42:56 +08:00
935d0eb110 [cherry-pick](branch-2.1) [Bug](http-api) fix core dump on API check_rpc_channel coz exec_env not initialized #39519 (#39692)
…rpc_channel coz exec_env not initialized #39519

## Proposed changes

Issue Number: close #xxx
backport #39519  #39520
2024-08-22 00:42:12 +08:00
56cc9cc304 [fix](cancel)) Fix pipeline task leak cancel (#39697)
pick #39737
2024-08-22 00:40:22 +08:00
Pxl
1e47d11560 [Improvement](runtime-filter) send RUNTIME_BLOOM_FILTER_MAX_SIZE to backends (#39686)
…ackends (#38972)

## Proposed changes
pick from #38972
2024-08-22 00:37:25 +08:00
Pxl
5e91fc6a8f [Bug](runtime-filter) set inited to true on BloomFilterFuncBase::assi… (#39674)
…gn (#39335)

## Proposed changes
pick from #39335
2024-08-22 00:29:16 +08:00
Pxl
63d45f5d89 [Bug](predicate) fix wrong result of AcceptNullPredicate (#39497) (#39672)
pick from #39497
2024-08-22 00:24:57 +08:00
e51dd68b93 [fix](local shuffle) Fix correctness for bucket hash shuffle exchange… (#39691)
…r (#39568)

For query plan


![image](https://github.com/user-attachments/assets/334cc4c4-49ae-4330-83ff-03b9bae00e3c)

we will plan local exchangers  and get a new plan


![image](https://github.com/user-attachments/assets/2b8ece64-3aa0-423c-9db0-fd02024957db)

and the hash join operator will get probe and build data which are
different distributed (one is HASH shuffle and another is Bucket hash
shuffle). This PR fix it.
<!--Describe your changes.-->

## Proposed changes

Issue Number: close #xxx

<!--Describe your changes.-->

Co-authored-by: Gabriel <gabrielleebuaa@gmail.com>
2024-08-22 00:23:39 +08:00
610f69432a [improvement](segmentcache) limit segment cache by fd limit or memory… (#39689)
… (#39658)

remove a useless config.
2024-08-21 15:19:52 +08:00
1e30d4ebaf Revert "[Bug](compatibility) fix window funnel function coredump when upgrade" (#39681)
Reverts apache/doris#39646
2024-08-21 14:47:27 +08:00
0bfcee1251 [opt](file-cache) support system table file_cache_statistics (#39552)
1. Add new system table: `file_cache_statistics`

	This table is used for viewing metrics related to file cache on BE side

	```
	mysql> select * from information_schema.file_cache_statistics limit 10;

+-------+---------------+----------------------------+--------------------------------+--------------------+
| BE_ID | BE_IP | CACHE_PATH | METRIC_NAME | METRIC_VALUE |

+-------+---------------+----------------------------+--------------------------------+--------------------+
| 10003 | 172.20.32.136 | /mnt/output/be/file_cache/ |
disposable_queue_curr_elements | 0 |
| 10003 | 172.20.32.136 | /mnt/output/be/file_cache/ |
disposable_queue_curr_size | 0 |
| 10003 | 172.20.32.136 | /mnt/output/be/file_cache/ |
disposable_queue_max_elements | 102400 |
| 10003 | 172.20.32.136 | /mnt/output/be/file_cache/ |
disposable_queue_max_size | 21474836480 |
| 10003 | 172.20.32.136 | /mnt/output/be/file_cache/ | hits_ratio |
0.8539634687001242 |
| 10003 | 172.20.32.136 | /mnt/output/be/file_cache/ | hits_ratio_1h | 0
|
| 10003 | 172.20.32.136 | /mnt/output/be/file_cache/ | hits_ratio_5m | 0
|
| 10003 | 172.20.32.136 | /mnt/output/be/file_cache/ |
index_queue_curr_elements | 0 |
| 10003 | 172.20.32.136 | /mnt/output/be/file_cache/ |
index_queue_curr_size | 0 |
| 10003 | 172.20.32.136 | /mnt/output/be/file_cache/ |
index_queue_max_elements | 102400 |

+-------+---------------+----------------------------+--------------------------------+--------------------+
	```

	It will show metrics of file caches on each BE.

2. Add new metrics `hits_ratio_1h` and `hits_ratio_5m` for file cache

This 2 metrics will show the hit ratio of file cache in recent 1 hour or
5 minutes.
So that we can know recent hit ratio instead of global historical hit
ratio.
2024-08-21 10:03:39 +08:00
bb687bd69c [cherry-pick](branch-2.1) add function regexp_extract_or_null (#39561)
# Proposed changes

pick https://github.com/apache/doris/pull/38296
2024-08-21 09:14:58 +08:00