8181 Commits

Author SHA1 Message Date
06a0b35704 [chore] Comment for tv_nsec (#39752)
just comment.
2024-08-22 22:16:21 +08:00
04e993c1de [refine](pipeline) refine some VDataStreamRecvr code (#35063) (#37802)
## Proposed changes
https://github.com/apache/doris/pull/35063
https://github.com/apache/doris/pull/35428
2024-08-22 19:55:17 +08:00
13b882a4cc [branch-2.1](memory) Add memory metrics to bvar (#39763)
pick #38391
2024-08-22 17:34:30 +08:00
8ce8887b75 [branch-2.1](memory) Refactor refresh workload groups weighted memory ratio and record refresh interval memory growth (#39760)
pick #38168
Overwrites the changes from #37221 in workload_group_manager.cpp. If
#37221 needs to be picked, ignore it.
2024-08-22 17:33:11 +08:00
ba7baa7e6b [fix](window_funnel) fix upgrading problem caused by behaviour change of window_funnel (#39766)
## Proposed changes

For the latest published 2.1 version `2.1.5`:
```
max_be_exec_version=4;
AGG_FUNCTION_NEW=2;
```
and `branch-2.1`:
```
max_be_exec_version=5;
AGG_FUNCTION_NEW=2;
```
This causes problems when upgrading.
This PR fixes the problem by setting `AGG_FUNCTION_NEW` to `5`.
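
A minimal sketch of why the threshold matters, assuming the new aggregate behaviour is selected by comparing a node's exec version against `AGG_FUNCTION_NEW` (the function name and the comparison are illustrative assumptions, not the Doris source):

```cpp
// Hypothetical names; only the numeric values come from the description above.
constexpr int kAggFunctionNewBefore = 2;  // threshold before this fix
constexpr int kAggFunctionNewAfter = 5;   // threshold set by this fix

// Assumed rule: a node is treated as using the new aggregate behaviour
// when its exec version has reached the threshold.
constexpr bool uses_new_agg(int be_exec_version, int agg_function_new) {
    return be_exec_version >= agg_function_new;
}
```

Under this assumption, a 2.1.5 node (exec version 4) passes the old threshold of 2 and is treated like a branch-2.1 node (exec version 5), although the two differ in `window_funnel` behaviour. With the threshold at 5, only version 5 nodes qualify.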
2024-08-22 17:26:51 +08:00
1c566253a8 [Pick][Improment]Query queued by be memory (#37559) (#39733)
pick #37559
2024-08-22 15:14:47 +08:00
a55e109e97 [pick][Improment]Add schema table workload_group_privileges (#38436) (#39708)
pick #38436
2024-08-22 00:44:43 +08:00
0e694f19db [fix](merge-on-write) segcompaction should process delete bitmap if necessary (#38369) (#39707)
## Proposed changes

cherry-pick #38369 and #38800
2024-08-22 00:42:56 +08:00
935d0eb110 [cherry-pick](branch-2.1) [Bug](http-api) fix core dump on API check_rpc_channel coz exec_env not initialized #39519 (#39692)
## Proposed changes

backport #39519  #39520
2024-08-22 00:42:12 +08:00
56cc9cc304 [fix](cancel)) Fix pipeline task leak cancel (#39697)
pick #39737
2024-08-22 00:40:22 +08:00
Pxl
1e47d11560 [Improvement](runtime-filter) send RUNTIME_BLOOM_FILTER_MAX_SIZE to backends (#39686)
…ackends (#38972)

## Proposed changes
pick from #38972
2024-08-22 00:37:25 +08:00
Pxl
5e91fc6a8f [Bug](runtime-filter) set inited to true on BloomFilterFuncBase::assi… (#39674)
…gn (#39335)

## Proposed changes
pick from #39335
2024-08-22 00:29:16 +08:00
Pxl
63d45f5d89 [Bug](predicate) fix wrong result of AcceptNullPredicate (#39497) (#39672)
pick from #39497
2024-08-22 00:24:57 +08:00
e51dd68b93 [fix](local shuffle) Fix correctness for bucket hash shuffle exchange… (#39691)
…r (#39568)

For the query plan


![image](https://github.com/user-attachments/assets/334cc4c4-49ae-4330-83ff-03b9bae00e3c)

we plan local exchangers and get a new plan


![image](https://github.com/user-attachments/assets/2b8ece64-3aa0-423c-9db0-fd02024957db)

and the hash join operator receives probe and build data that are
distributed differently (one by HASH shuffle, the other by bucket hash
shuffle). This PR fixes it.

Co-authored-by: Gabriel <gabrielleebuaa@gmail.com>
2024-08-22 00:23:39 +08:00
610f69432a [improvement](segmentcache) limit segment cache by fd limit or memory… (#39689)
… (#39658)

remove a useless config.
2024-08-21 15:19:52 +08:00
1e30d4ebaf Revert "[Bug](compatibility) fix window funnel function coredump when upgrade" (#39681)
Reverts apache/doris#39646
2024-08-21 14:47:27 +08:00
0bfcee1251 [opt](file-cache) support system table file_cache_statistics (#39552)
1. Add new system table: `file_cache_statistics`

	This table is used for viewing metrics related to file cache on BE side

	```
	mysql> select * from information_schema.file_cache_statistics limit 10;
	+-------+---------------+----------------------------+--------------------------------+--------------------+
	| BE_ID | BE_IP         | CACHE_PATH                 | METRIC_NAME                    | METRIC_VALUE       |
	+-------+---------------+----------------------------+--------------------------------+--------------------+
	| 10003 | 172.20.32.136 | /mnt/output/be/file_cache/ | disposable_queue_curr_elements | 0                  |
	| 10003 | 172.20.32.136 | /mnt/output/be/file_cache/ | disposable_queue_curr_size     | 0                  |
	| 10003 | 172.20.32.136 | /mnt/output/be/file_cache/ | disposable_queue_max_elements  | 102400             |
	| 10003 | 172.20.32.136 | /mnt/output/be/file_cache/ | disposable_queue_max_size      | 21474836480        |
	| 10003 | 172.20.32.136 | /mnt/output/be/file_cache/ | hits_ratio                     | 0.8539634687001242 |
	| 10003 | 172.20.32.136 | /mnt/output/be/file_cache/ | hits_ratio_1h                  | 0                  |
	| 10003 | 172.20.32.136 | /mnt/output/be/file_cache/ | hits_ratio_5m                  | 0                  |
	| 10003 | 172.20.32.136 | /mnt/output/be/file_cache/ | index_queue_curr_elements      | 0                  |
	| 10003 | 172.20.32.136 | /mnt/output/be/file_cache/ | index_queue_curr_size          | 0                  |
	| 10003 | 172.20.32.136 | /mnt/output/be/file_cache/ | index_queue_max_elements       | 102400             |
	+-------+---------------+----------------------------+--------------------------------+--------------------+
	```

	It will show metrics of file caches on each BE.

2. Add new metrics `hits_ratio_1h` and `hits_ratio_5m` for file cache

These two metrics show the hit ratio of the file cache over the most
recent 1 hour and 5 minutes, so that we can see the recent hit ratio
instead of only the global historical hit ratio.
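
One way to compute a recent hit ratio is a sliding time window over individual cache accesses. The sketch below is a hypothetical helper (the class `WindowedHitRatio` and its methods are assumptions for illustration, not the Doris implementation); a window of 300 seconds would correspond to `hits_ratio_5m` and 3600 seconds to `hits_ratio_1h`.

```cpp
#include <cstdint>
#include <deque>

// Hypothetical helper: tracks the hit ratio over the last `window_seconds`.
class WindowedHitRatio {
public:
    explicit WindowedHitRatio(int64_t window_seconds) : window_(window_seconds) {}

    // Record one cache access observed at `now_seconds`.
    void record(int64_t now_seconds, bool hit) {
        evict(now_seconds);
        samples_.push_back({now_seconds, hit});
        if (hit) ++hits_;
    }

    // Hit ratio over the window ending at `now_seconds`; 0 when no samples remain.
    double ratio(int64_t now_seconds) {
        evict(now_seconds);
        if (samples_.empty()) return 0.0;
        return static_cast<double>(hits_) / static_cast<double>(samples_.size());
    }

private:
    struct Sample {
        int64_t ts;
        bool hit;
    };

    // Drop samples that are at least one full window old.
    void evict(int64_t now_seconds) {
        while (!samples_.empty() && samples_.front().ts <= now_seconds - window_) {
            if (samples_.front().hit) --hits_;
            samples_.pop_front();
        }
    }

    int64_t window_;
    int64_t hits_ = 0;
    std::deque<Sample> samples_;
};
```

Storing every access is the simplest form; a production metric would more likely aggregate counts per time bucket to bound memory.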
2024-08-21 10:03:39 +08:00
bb687bd69c [cherry-pick](branch-2.1) add function regexp_extract_or_null (#39561)
# Proposed changes

pick https://github.com/apache/doris/pull/38296
2024-08-21 09:14:58 +08:00
7bb83ae379 [cherry-pick](branch-21) fix append_data_by_selector_impl reserve too mush useless memory (#39581) (#39635)
## Proposed changes

cherry-pick from master #39581
2024-08-21 08:47:30 +08:00
75eded04d7 [Bug](compatibility) fix window funnel function coredump when upgrade (#39646)
## Proposed changes
PR https://github.com/apache/doris/pull/39270 changed the agg
of window funnel.
This PR updates max_be_exec_version to 5 to keep the agg function
compatible when upgrading.
2024-08-21 08:46:50 +08:00
a3fd13fee6 [fix](catalog) set timeout for split fetch (#39346) (#39624)
bp #39346
2024-08-20 21:59:55 +08:00
12ed2951c4 [fix] (inverted index) remove tmp columns in block (#39369) (#39533) 2024-08-20 20:53:23 +08:00
5fcd6e6270 [Fix](load) Fix the incorrect src value printed in the error log when strict mode is true #39447 (#39587)
cherry pick from #39447
2024-08-20 12:02:13 +08:00
3922fdddb6 [cherry-pick](branch-2.1) Pick "[Fix](core) Fix wal mgr heap use after free when stop doris (#33131)" (#39545)
Pick #33131
2024-08-19 22:12:09 +08:00
85f97a745a [fix](s3) Fix fmt in s3 file wirter S3FileWriter::_dump_completed_part OOM (#39562) 2024-08-19 22:02:06 +08:00
fb17f204d7 [fix](http) fix http url with incorrect character notation (#38420) (#39535)
## Proposed changes

pick from master #38420
2024-08-19 15:03:19 +08:00
830f250a80 [opt](query cancel) cancel query if it has pipeline task leakage #39223 (#39537)
pick #39223 with some modifications. Optimization will only be applied
to pipeline x.
2024-08-19 14:33:59 +08:00
273a62584c [opt](inverted index) unified optimization judgment to prevent omissions (#39473)
https://github.com/apache/doris/pull/38027
2024-08-17 16:57:19 +08:00
0ff2a9df15 [fix](delete) Only apply regex check on delete condition str for non-lsc tables (#39357) (#39500)
## Proposed changes

Tables capable of light schema change work on delete sub predicate v2
and don't need this check.
2024-08-17 16:56:37 +08:00
b0da8430bc [opt](inverted index) Optimize the usage of the multi_match function (#39472)
## Proposed changes

https://github.com/apache/doris/pull/39193
2024-08-17 16:53:52 +08:00
79aa079cc6 [fix](array-funcs) array min/max #39307 (#39484) 2024-08-17 10:56:44 +08:00
7687f2c53a [fix](ip-funcs) fix ip inet6_aton funcs #39415 (#39513) 2024-08-17 10:56:06 +08:00
824f035b98 [pick](Row store) fix row store with invalid json string in variant ty… (#39456)
#39394
2024-08-16 14:43:11 +08:00
eea3676791 [fix](group commit) fix group commit insert rpc may stuck (#39391) (#39458)
pick https://github.com/apache/doris/pull/39391
2024-08-16 13:19:00 +08:00
021678c7c3 [fix](window_funnel) fix wrong result of window_funnel #38954 (#39270)
## Proposed changes

BP #38954
2024-08-16 09:59:31 +08:00
6257e706fa [improve](ip)update ip for bloom_filter (#39414)
## Proposed changes
backport: https://github.com/apache/doris/pull/39253
2024-08-16 08:20:19 +08:00
fff26fe2fc [fix](group commit) fix group commit core if be inject FragmentMgr.exec_plan_fragment.failed (#39339) (#39396)
pick https://github.com/apache/doris/pull/39339
2024-08-15 17:54:11 +08:00
0680c8d314 [improve](cache) File cache async init (#39036)
## Proposed changes

Run `load_cache_info_into_memory()` asynchronously in a background thread
in `LRUFileCache::initialize()`.
While the cache is not ready, `LRUFileCache::get_or_set()` returns
a FileBlock whose state is SKIP_CACHE.
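
The pattern can be sketched as follows. This is a simplified, hypothetical stand-in (the class `AsyncInitCache`, the `BlockState` enum, and `wait_until_ready()` are assumptions for illustration, not the actual `LRUFileCache`): initialization starts a loader thread and returns at once, and lookups skip the cache until a ready flag is set.

```cpp
#include <atomic>
#include <thread>

// Hypothetical stand-in for the block state; only SKIP_CACHE is named in the PR.
enum class BlockState { SKIP_CACHE, CACHED };

class AsyncInitCache {
public:
    ~AsyncInitCache() {
        if (loader_.joinable()) loader_.join();
    }

    // Start loading in a background thread and return immediately.
    void initialize() {
        loader_ = std::thread([this] {
            load_cache_info_into_memory();
            ready_.store(true, std::memory_order_release);
        });
    }

    // Until loading has finished, callers are told to bypass the cache.
    BlockState get_or_set() const {
        return ready_.load(std::memory_order_acquire) ? BlockState::CACHED
                                                      : BlockState::SKIP_CACHE;
    }

    // Illustration-only helper: block until the loader thread has finished.
    void wait_until_ready() {
        if (loader_.joinable()) loader_.join();
    }

private:
    // Placeholder for scanning the on-disk cache files.
    void load_cache_info_into_memory() {}

    std::atomic<bool> ready_{false};
    std::thread loader_;
};
```

The trade-off is that startup no longer blocks on a slow disk scan, at the cost of cache misses for requests that arrive before loading completes.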
2024-08-15 16:27:51 +08:00
a44a274563 [Fix](parquet-reader) Fix and optimize parquet min-max filtering. (#39375)
Backport #38277.
2024-08-15 14:12:54 +08:00
acf07cab6f [refactor](minor) Init counter in prepare phase (#39287) (#39385)
pick #39287
2024-08-15 13:36:12 +08:00
226e01889c [fix](array_apply) pick array apply fix (#39328)
## Proposed changes
backport: https://github.com/apache/doris/pull/39105
2024-08-14 18:52:29 +08:00
78d6e318fb [fix](ip)pick ip rowstore (#39345)
## Proposed changes
backport: https://github.com/apache/doris/pull/39258
2024-08-14 18:51:58 +08:00
b26af32934 [fix](function) fix error return type in corr(float32,float32) (#39251) (#39350)
https://github.com/apache/doris/pull/39251
```
mysql [test11]>select corr(cast(x as float),cast(y as float)) from test_corr;
ERROR 1105 (HY000): errCode = 2, detailMessage = (127.0.0.1)[INTERNAL_ERROR]column_type not match data_types in agg node, column_type=Nullable(Float64), data_types=Nullable(Float32),column name=

```
2024-08-14 18:47:14 +08:00
187461e2fd [Fix](Export) Export delete multiple times when specify the delete_existing_files property () (#39304)
bp: #38400

When the `Export` statement specifies the `delete_existing_files`
property, each `Outfile` statement generated by the `Export` will carry
this property. This causes each `Outfile` statement to delete existing
files, so only the result of the last Outfile statement will be
retained.

To fix this, we add an RPC method that deletes existing files for the
`Export` statement, and the `Outfile` statements generated by the
`Export` no longer carry the `delete_existing_files` property.
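
The idea can be sketched as below, using hypothetical types (`OutfileStmt`, `ExportPlan`, and `plan_export` are illustrative assumptions, not Doris classes): the delete request is lifted to the export level so it runs once, and every generated outfile statement has the flag cleared.

```cpp
#include <string>
#include <vector>

// Hypothetical model of one generated Outfile statement.
struct OutfileStmt {
    std::string path;
    bool delete_existing_files;
};

// Hypothetical model of an Export job: one optional up-front delete,
// followed by the generated Outfile statements.
struct ExportPlan {
    bool delete_before_run;
    std::vector<OutfileStmt> outfiles;
};

// Build the plan so that deletion happens at most once, before any Outfile
// runs, instead of once per Outfile.
ExportPlan plan_export(const std::vector<std::string>& paths,
                       bool delete_existing_files) {
    ExportPlan plan{delete_existing_files, {}};
    for (const auto& path : paths) {
        plan.outfiles.push_back({path, false});
    }
    return plan;
}
```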
2024-08-13 22:26:02 +08:00
Pxl
6f1d9812bb [Bug](brpc) fix sync_filter_size/apply_filterv2 has wrong closure (#39299)
pick from #39155
2024-08-13 19:01:22 +08:00
677435cef8 [Pick](Branch-2.1) pick json reader fix and support specify $. as column (#39271)
#39206
#38213
2024-08-13 17:44:45 +08:00
7e7729c4b0 [cherry-pick](branch-21) fix partition-topn calculate partition input rows have error (#39100) (#39281)
## Proposed changes

cherry-pick from master: #39100 
2024-08-13 17:42:29 +08:00
c9949f24e5 [fix](compaction) fix the longest continuous rowsets cannot be selected when missing rowsets (#38728) (#39262)
pick master #38728
2024-08-13 17:41:11 +08:00
b976cbf14d [opt](log) avoid lots of json parse error logs (#39190) (#39246)
pick https://github.com/apache/doris/pull/39190 to branch-2.1
2024-08-13 17:01:10 +08:00
000ea20562 [fix](inverted index)Add exception check when write bkd index (#39248) (#39277)
bp #39248
2024-08-13 15:14:16 +08:00