87912de93f
[fix](scan) catch exceptions thrown in scanner ( #36101 ) ( #37408 )
...
## Proposed changes
pick #36101
Uncaught exceptions thrown in the scanner cause the BE to crash.
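A minimal sketch of the idea, not the actual patch: wrap the scan step in a `try`/`catch` and turn any escaping exception into an error status, so the worker thread returns an error instead of terminating the process. `ScanStatus` and `run_scan_guarded` are hypothetical names used only for illustration; they are not Doris types or functions.

```cpp
#include <cassert>
#include <exception>
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a status type; not the Doris Status class.
struct ScanStatus {
    bool ok;
    std::string msg;
};

// Runs one scan step and converts any escaping exception into an error
// status, so an exception never propagates out of the worker thread.
inline ScanStatus run_scan_guarded(const std::function<ScanStatus()>& scan_step) {
    try {
        return scan_step();
    } catch (const std::exception& e) {
        return ScanStatus{false, std::string("scanner threw: ") + e.what()};
    } catch (...) {
        return ScanStatus{false, "scanner threw an unknown exception"};
    }
}
```

The caller can then treat a thrown exception like any other failed scan and finish the query with the returned error.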
2024-07-12 08:49:39 +08:00
d97788dec8
[Refactor](Status) Refactor the scanner scheduler code make return error msg means ( #35286 )
...
## Proposed changes
Before error msg:
```
Failed to submit scanner to scanner pool
```
After error msg:
```
Failed to submit scanner to scanner pool reason:Scan thread pool had shutdown|type 1
```
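A rough sketch of how such a message could be assembled, assuming the submit call reports a reason string and a numeric type. `SubmitResult` and `describe_submit_failure` are hypothetical names, not the Doris API; only the message text is taken from the example above.

```cpp
#include <cassert>
#include <string>

// Hypothetical result of a thread pool submit call.
struct SubmitResult {
    bool ok;
    std::string reason;
    int type;
};

// Appends the underlying reason to the generic message so that the caller
// can see why the submit failed, not only that it failed.
inline std::string describe_submit_failure(const SubmitResult& result) {
    std::string msg = "Failed to submit scanner to scanner pool";
    if (!result.ok) {
        msg += " reason:" + result.reason + "|type " + std::to_string(result.type);
    }
    return msg;
}
```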
2024-05-28 18:49:55 +08:00
eb49cd839b
[refactor](datalake) return the error status instead of static_cast<void> ( #34873 )
...
Followup #34797
`static_cast<void>` silently discards error statuses, some of which should make the query finish with an error status, so replace `static_cast<void>` with `RETURN_IF_ERROR`.
The following three scenarios need to be handled separately and cannot be simply replaced:
1. The outer function returns void;
2. The status-returning function is called inside a constructor or destructor;
3. The status-returning function is called on a best-effort basis, and its error status should be ignored.
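The difference between the two styles can be sketched as follows. `MiniStatus` and `SKETCH_RETURN_IF_ERROR` are simplified, hypothetical stand-ins written for this example; they only imitate the shape of a status type and an early-return macro and are not the Doris definitions.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical minimal status type, marked [[nodiscard]] so that ignoring a
// returned status requires an explicit cast.
class [[nodiscard]] MiniStatus {
public:
    static MiniStatus OK() { return MiniStatus(); }
    static MiniStatus Error(std::string msg) {
        MiniStatus s;
        s._ok = false;
        s._msg = std::move(msg);
        return s;
    }
    bool ok() const { return _ok; }
    const std::string& msg() const { return _msg; }

private:
    bool _ok = true;
    std::string _msg;
};

// Hypothetical early-return macro: returns the status to the caller on error.
#define SKETCH_RETURN_IF_ERROR(expr) \
    do {                             \
        MiniStatus _st = (expr);     \
        if (!_st.ok()) return _st;   \
    } while (false)

inline MiniStatus open_reader(bool fail) {
    return fail ? MiniStatus::Error("open failed") : MiniStatus::OK();
}

// Before: the error is cast away and the caller sees success.
inline MiniStatus prepare_ignoring(bool fail) {
    static_cast<void>(open_reader(fail));
    return MiniStatus::OK();
}

// After: the error is propagated to the caller.
inline MiniStatus prepare_propagating(bool fail) {
    SKETCH_RETURN_IF_ERROR(open_reader(fail));
    return MiniStatus::OK();
}
```

The macro only works when the enclosing function itself returns a status, which is why functions returning void, constructors, and destructors need separate handling.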
2024-05-23 19:06:21 +08:00
47b54d4bd5
Fix remote scan pool ( #33976 )
2024-04-25 15:04:43 +08:00
6600e92b12
[scan](status) Finish execution if scanner failed ( #32966 )
2024-03-29 10:51:15 +08:00
352617a34d
[fix](scanner) cached blocks may be empty when VFileScanner return NOT_FOUND ( #32745 )
...
Cached blocks may be empty when VFileScanner returns NOT_FOUND. This feature was introduced by https://github.com/apache/doris/pull/15226 . Move this logic into `VFileScanner`.
2024-03-27 10:01:05 +08:00
6e62017ed5
[fix](scanner) allocated_bytes should be called after success ( #31428 )
...
allocated_bytes should be called after success
2024-02-27 10:12:36 +08:00
c34639245e
[Improvement](executor)add remote scan thread pool ( #31376 )
...
* add remote scan thread pool
* +1
2024-02-27 10:12:33 +08:00
35333d7a77
[opt](scanner) scan enough blocks in each scan task ( #31277 )
2024-02-27 10:12:18 +08:00
97c9d75af3
[Feature](executor)Add scan_thread_num property for workload group ( #31106 )
2024-02-20 16:24:05 +08:00
366a6792bf
[refactor](scanner) refactoring and optimizing scanner scheduling ( #30746 )
2024-02-16 10:12:24 +08:00
bedad15f03
[enhancement](scanner) add a lower bound for bytes in scanner queue ( #29624 )
2024-01-27 09:13:21 +08:00
d3bf23d70d
[chore](removelogs) remove debug query timeout logs ( #30006 )
...
---------
Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-01-16 18:48:18 +08:00
5e697990a8
[bugfix](timeout) serving_blocks_num may cause timeout, try to fix it ( #29912 )
...
Although serving_blocks_num is an atomic variable, its ++ and -- are not protected by the transfer lock.
I am not sure about the memory order of ++ and --.
I think this may be the root cause of the query timeout, so I removed the check and will test it in the GitHub pipeline.
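For reference, `++` and `--` on a `std::atomic` integer are sequentially consistent read-modify-write operations, so each single update is safe on its own. What atomicity does not give is consistency between the counter and other state that is only updated under a lock. The commit itself removes the check; the sketch below shows a different, general alternative in which the counter is a plain integer guarded by the same lock as the check. `ServingCounter` and its members are hypothetical names, not the Doris code.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical counter whose updates and checks share one lock, so a check
// never observes the counter in the middle of a related state change.
class ServingCounter {
public:
    void begin_serving() {
        std::lock_guard<std::mutex> guard(_transfer_lock);
        ++_serving_blocks_num;
    }

    void end_serving() {
        std::lock_guard<std::mutex> guard(_transfer_lock);
        assert(_serving_blocks_num > 0);
        --_serving_blocks_num;
    }

    // Reads the counter under the same lock that protects the updates.
    bool is_idle() {
        std::lock_guard<std::mutex> guard(_transfer_lock);
        return _serving_blocks_num == 0;
    }

private:
    std::mutex _transfer_lock;
    int _serving_blocks_num = 0;
};
```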
2024-01-16 18:34:19 +08:00
e4e57e9b05
[chore](removelogs) remove debug query timeout logs
2024-01-12 14:37:20 +08:00
ca75c9b8ab
add more logs to debug timeout
...
Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-01-12 11:48:39 +08:00
abb7640d37
[debug](timeout) add more log in scanner ctx to find timeout problem #29704
...
Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-01-12 11:44:21 +08:00
9ef4e49307
[bugfix](scannerdeadloop) there is a dead loop in scanner ctx ( #29794 )
...
Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-01-11 16:47:54 +08:00
bd8113f424
[bugfix](scannerscheduler) should minus num_of_scanners before check should schedule #28926 ( #29331 )
...
---------
Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-01-03 20:47:35 +08:00
2ed122b787
[improvement](task exec context) add parent class HasTaskExecutionCtx to own the task ctx ( #29388 )
...
---------
Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-01-02 15:28:27 +08:00
c75e63a2a5
[Improvement](scan) Use scanner to do projection of scan node ( #29124 )
2023-12-27 16:00:52 +08:00
f30e50676e
[opt](scanner) optimize the number of threads of scanners ( #28640 )
...
1. Remove `doris_max_remote_scanner_thread_pool_thread_num`, use `doris_scanner_thread_pool_thread_num` only.
2. Set the default value `doris_scanner_thread_pool_thread_num` as `std::max(48, CpuInfo::num_cores() * 4)`
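The default in item 2 can be worked through with a small helper. `default_scanner_threads` is a hypothetical name, and the core count is passed in as a parameter here instead of calling `CpuInfo::num_cores()`.

```cpp
#include <algorithm>
#include <cassert>

// Computes std::max(48, num_cores * 4): never fewer than 48 threads, and
// four threads per core once the machine has more than 12 cores.
inline int default_scanner_threads(int num_cores) {
    assert(num_cores > 0);
    return std::max(48, num_cores * 4);
}
```

With this formula an 8-core machine gets 48 threads (8 * 4 = 32 is below the floor), while a 16-core machine gets 64.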
2023-12-26 10:24:12 +08:00
7081139bdc
[fix](block) fix be core while mutable block merge may cause different row size between columns in origin block ( #27943 )
2023-12-25 20:35:22 +08:00
1545c36d16
Revert "[bugfix](scannercore) scanner will core in deconstructor during collect profile ( #28727 )" ( #28931 )
...
This reverts commit 4066de375efe6ff8e156a61df4f9316b3d9eaa4e.
2023-12-24 20:37:33 +08:00
4066de375e
[bugfix](scannercore) scanner will core in deconstructor during collect profile ( #28727 )
2023-12-23 11:09:46 +08:00
aca8406e31
[refactor](executor)remove scan group #28847
2023-12-22 17:05:50 +08:00
73f7b61019
[refactor](scanner) use weak ptr to lock task execution context to avoid core in scanner dctor ( #28493 )
...
Use a weak ptr as a lock between the fragment execution thread and the scanner thread, to fix the core dump that occurs when the scanner's dtor accesses the scan node's profile.
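The pattern can be sketched with the standard library alone. The scanner holds a `std::weak_ptr` to the context and calls `lock()` before touching it; if the owner has already released the context, `lock()` yields an empty pointer and the access is skipped. `TaskExecContext` and `SketchScanner` are hypothetical types for illustration, not the Doris classes.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical context owned by the fragment execution side.
struct TaskExecContext {
    std::string profile;
};

class SketchScanner {
public:
    explicit SketchScanner(std::weak_ptr<TaskExecContext> ctx) : _ctx(std::move(ctx)) {}

    // Appends to the profile only while the context is still alive.
    // Returns false if the owner has already destroyed it.
    bool report(const std::string& line) {
        // lock() pins the context for the duration of this call.
        std::shared_ptr<TaskExecContext> pinned = _ctx.lock();
        if (!pinned) {
            return false;
        }
        pinned->profile += line;
        return true;
    }

private:
    std::weak_ptr<TaskExecContext> _ctx;
};
```

While `pinned` is in scope the context cannot be destroyed, so the same guard can be used inside a destructor that needs to update the profile.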
2023-12-18 14:09:32 +08:00
8df94f0d07
[fix](remote-scanner-pool) missing _remote_thread_pool_max_size value ( #28057 )
2023-12-07 11:18:42 +08:00
54fe1a166b
[Refactor](scan) refactor scan scheduler to improve performance ( #27948 )
...
* [Refactor](scan) refactor scan scheduler to improve performance
* fix pipeline x core
2023-12-05 13:03:16 +08:00
7398c3daf1
[Feature-Variant](Variant Type) support variant type query and index ( #27676 )
2023-11-29 10:37:28 +08:00
b457856bd2
[chore](be) remove bthread scanner related codes ( #27417 )
2023-11-23 15:18:49 +08:00
0491437a86
[Opt](scanner-scheduler) Optimize BlockingQueue, BlockingPriorityQueue and change remote scan thread pool. ( #26784 )
...
## Proposed changes
- Optimize `BlockingQueue`, `BlockingPriorityQueue` by swapping the order of `notify` and `unlock` to reduce lock contention. Ref: https://www.boost.org/doc/libs/1_54_0/boost/thread/sync_bounded_queue.hpp
- Change remote scan thread pool to `PriorityQueue`.
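A minimal sketch of the queue change, assuming the swap means releasing the mutex before notifying, so that the woken thread does not immediately block on a mutex the notifier still holds. `SketchBlockingQueue` is a hypothetical class written for this example; it is not the Doris `BlockingQueue`.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <utility>

template <typename T>
class SketchBlockingQueue {
public:
    explicit SketchBlockingQueue(std::size_t capacity) : _capacity(capacity) {}

    // Blocks while the queue is full.
    void put(T value) {
        std::unique_lock<std::mutex> lock(_mutex);
        _not_full.wait(lock, [this] { return _items.size() < _capacity; });
        _items.push_back(std::move(value));
        lock.unlock();            // release the mutex first
        _not_empty.notify_one();  // then wake one waiting consumer
    }

    // Blocks while the queue is empty.
    T take() {
        std::unique_lock<std::mutex> lock(_mutex);
        _not_empty.wait(lock, [this] { return !_items.empty(); });
        T value = std::move(_items.front());
        _items.pop_front();
        lock.unlock();           // release the mutex first
        _not_full.notify_one();  // then wake one waiting producer
        return value;
    }

    std::size_t size() {
        std::lock_guard<std::mutex> guard(_mutex);
        return _items.size();
    }

private:
    const std::size_t _capacity;
    std::mutex _mutex;
    std::condition_variable _not_empty;
    std::condition_variable _not_full;
    std::deque<T> _items;
};
```

Both waits use a predicate, so a notification sent after the unlock cannot be lost: a waiter re-checks the condition under the mutex before sleeping.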
### Test result
Before:
```
mysql> select sum(lo_partkey) from lineorder;
+-----------------+
| sum(lo_partkey) |
+-----------------+
| 300021444265405 |
+-----------------+
1 row in set (1.11 sec)
```
After:
```
mysql> select sum(lo_partkey) from lineorder;
+-----------------+
| sum(lo_partkey) |
+-----------------+
| 300021444265405 |
+-----------------+
1 row in set (0.80 sec)
```
2023-11-15 18:24:36 +08:00
5ad49dceaa
[fix](scanner_schedule) scanner hangs due to negative num_running_scanners ( #26816 )
...
* [fix] scanner hangs due to negative num_running_scanners
Before the patch, num_running_scanners was increased after submitting,
so it could be decreased before it was increased. get_block_from_queue
could then see a negative value, and an expected submit did not happen.
Co-authored-by: Mingyu Chen <morningman.cmy@gmail.com>
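The ordering described in this commit can be sketched as follows: count the scanner as running before handing it to the pool, and roll the count back if the pool rejects it. `ScanCounter`, `Task`, and `Executor` are hypothetical names for illustration. The executor is modelled as a callable that may run the task before `submit` returns, which is the case that produced negative values when the increment came after the submit.

```cpp
#include <atomic>
#include <cassert>
#include <functional>

using Task = std::function<void()>;
// Hypothetical pool interface: returns false when the task is rejected.
using Executor = std::function<bool(Task)>;

class ScanCounter {
public:
    bool submit(const Executor& executor) {
        _num_running.fetch_add(1);  // count first, then submit
        const bool accepted = executor([this] { on_scanner_done(); });
        if (!accepted) {
            _num_running.fetch_sub(1);  // roll back a rejected submit
        }
        return accepted;
    }

    int num_running() const { return _num_running.load(); }
    bool went_negative() const { return _went_negative.load(); }

private:
    void on_scanner_done() {
        // fetch_sub returns the previous value, so subtract 1 for the new one.
        if (_num_running.fetch_sub(1) - 1 < 0) {
            _went_negative.store(true);
        }
    }

    std::atomic<int> _num_running{0};
    std::atomic<bool> _went_negative{false};
};
```

Even when the task finishes before `submit` returns, the decrement now always follows its matching increment, so the counter never drops below zero.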
2023-11-13 23:03:49 +08:00
66054a5c78
[opt](scanner) increase the connection num of s3 client ( #26795 )
2023-11-12 00:29:11 -06:00
196fadc044
[enhancement](metrics) enhance visibility of flush thread pool ( #26544 )
2023-11-11 19:53:24 +08:00
a5565f68b2
[Refactor](opentelemetry) Remove opentelemetry ( #26605 )
2023-11-09 18:05:34 +08:00
a4e415ab09
[feature](hive)Support hive tables after alter type. ( #25138 )
...
1. Reconstruct the decoding logic for reading parquet. The parquet reader first reads the data according to the parquet physical type, and then performs a type conversion.
2. Support hive alter table.
2023-11-02 00:24:21 +08:00
46d40b1952
[refactor](executor)Remove empty group logic #26005
2023-10-27 14:24:41 +08:00
54780c62e0
[improvement](executor)Using cgroup to implement cpu hard limit ( #25489 )
...
* Using cgroup to implement cpu hard limit
* code style
2023-10-19 18:56:26 +08:00
7b2ff38401
query cpu hard limit based on doris scheduler ( #24844 )
2023-10-07 12:03:07 +08:00
642e5cdb69
[Fix](Status) Make Status [[nodiscard]] and handle returned Status correctly ( #23395 )
2023-09-29 22:38:52 +08:00
39e6512a21
[bug](scanner) Fix memory out of bound in scanner scheduler ( #24840 )
2023-09-25 09:58:26 +08:00
c9b2f4cb92
[workload](pipeline) Add cgroup cpu controller ( #24052 )
2023-09-21 21:49:33 +08:00
1405b7ca82
[improve](scan) support lower the thread priority of scan thread ( #24526 )
...
The configuration item is used to lower the priority of the scanner thread,
typically employed to ensure CPU scheduling for write operations.
2023-09-20 17:00:24 +08:00
71dcb58db9
[improvement](scanner_schedule) reduce memory consumption of scanner ( #24199 )
...
* [improvement](scanner_schedule) reduce memory consumption of scanner
1. limit scanner by memory consumption rather than blocks.
2. scheduler runs correctly instead of at least 1.
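One way to picture item 1 is a byte budget on the scanner queue: scheduling continues while the queued bytes are under a limit, regardless of how many blocks that is. This is an illustrative sketch only; `QueueByteBudget` and its methods are hypothetical names and the real accounting in the patch may differ.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical budget that tracks queued bytes instead of queued blocks.
class QueueByteBudget {
public:
    explicit QueueByteBudget(std::size_t max_bytes) : _max_bytes(max_bytes) {}

    // More scanners may be scheduled while the queue is under its byte limit.
    bool can_schedule_more() const { return _queued_bytes < _max_bytes; }

    void on_block_queued(std::size_t bytes) { _queued_bytes += bytes; }

    void on_block_consumed(std::size_t bytes) {
        assert(bytes <= _queued_bytes);
        _queued_bytes -= bytes;
    }

    std::size_t queued_bytes() const { return _queued_bytes; }

private:
    const std::size_t _max_bytes;
    std::size_t _queued_bytes = 0;
};
```

A few large blocks can exhaust the budget just as many small ones can, which a block-count limit would not capture.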
2023-09-19 21:36:23 +08:00
35c5d71549
[Improvement](join) some improvement of hash join ( #23972 )
...
some improvement of hash join
2023-09-14 17:55:35 +08:00
c7ae2a7d22
[Refactor & Bugfix](static variables) move some static vairables to exec_env ( #24029 )
2023-09-13 09:27:03 +08:00
3317909141
[pipelineX](join) support nested loop join operator ( #23756 )
2023-09-04 10:08:22 +08:00
962221cb18
[test](log) add log for debug case failure ( #23506 )
2023-08-28 10:45:25 +08:00
cf1865a1c8
[Bug](scan) fix core dump due to store_path_map ( #23084 )
...
fix core dump due to store_path_map
2023-08-17 15:24:43 +08:00