doris

Author	SHA1	Message	Date
yiguolei	d3bf23d70d	[chore](removelogs) remove debug query timeout logs (#30006 ) --------- Co-authored-by: yiguolei <yiguolei@gmail.com>	2024-01-16 18:48:18 +08:00
yiguolei	5e697990a8	[bugfix](timeout) serving_blocks_num may cause timeout, try to fix it (#29912 ) Although serving_blocks_num is an atomic variable. It's ++ and -- are not protected by transfer lock. I am not sure the memory order of ++ and --. I think it maybe the root cause of query timeout. So that I remove the check and test it in github pipeline.	2024-01-16 18:34:19 +08:00
yiguolei	e4e57e9b05	[chore](removelogs) remove debug query timeout logs	2024-01-12 14:37:20 +08:00
yiguolei	ca75c9b8ab	add more logs to debug timeout Co-authored-by: yiguolei <yiguolei@gmail.com>	2024-01-12 11:48:39 +08:00
yiguolei	abb7640d37	[debug](timeout) add more log in scanner ctx to find timeout problem #29704 Co-authored-by: yiguolei <yiguolei@gmail.com>	2024-01-12 11:44:21 +08:00
yiguolei	9ef4e49307	[bugfix](scannerdeadloop) there is a dead loop in scanner ctx (#29794 ) Co-authored-by: yiguolei <yiguolei@gmail.com>	2024-01-11 16:47:54 +08:00
yiguolei	bd8113f424	[bugfix](scannerscheduler) should minus num_of_scanners before check should schedule #28926 (#29331 ) --------- Co-authored-by: yiguolei <yiguolei@gmail.com>	2024-01-03 20:47:35 +08:00
yiguolei	2ed122b787	[improvement](task exec context) add parent class HasTaskExecutionCtx to own the task ctx (#29388 ) --------- Co-authored-by: yiguolei <yiguolei@gmail.com>	2024-01-02 15:28:27 +08:00
Gabriel	c75e63a2a5	[Improvement](scan) Use scanner to do projection of scan node (#29124 )	2023-12-27 16:00:52 +08:00
Ashin Gau	f30e50676e	[opt](scanner) optimize the number of threads of scanners (#28640 ) 1. Remove `doris_max_remote_scanner_thread_pool_thread_num`, use `doris_scanner_thread_pool_thread_num` only. 2. Set the default value `doris_scanner_thread_pool_thread_num` as `std::max(48, CpuInfo::num_cores() * 4)`	2023-12-26 10:24:12 +08:00
caiconghui	7081139bdc	[fix](block) fix be core while mutable block merge may cause different row size between columns in origin block (#27943 )	2023-12-25 20:35:22 +08:00
yiguolei	1545c36d16	Revert "[bugfix](scannercore) scanner will core in deconstructor during collect profile (#28727 )" (#28931 ) This reverts commit 4066de375efe6ff8e156a61df4f9316b3d9eaa4e.	2023-12-24 20:37:33 +08:00
yiguolei	4066de375e	[bugfix](scannercore) scanner will core in deconstructor during collect profile (#28727 )	2023-12-23 11:09:46 +08:00
wangbo	aca8406e31	[refactor](executor)remove scan group #28847	2023-12-22 17:05:50 +08:00
yiguolei	73f7b61019	[refactor](scanner) use weak ptr to lock task execution context to avoid core in scanner dctor (#28493 ) using weak ptr as a lock between fragment execute thread and scanner thread, to solve the core problem in scanner's dctor to access scannode's profile.	2023-12-18 14:09:32 +08:00
Mingyu Chen	8df94f0d07	[fix](remote-scanner-pool) missing _remote_thread_pool_max_size value (#28057 )	2023-12-07 11:18:42 +08:00
HappenLee	54fe1a166b	[Refactor](scan) refactor scan scheduler to improve performance (#27948 ) * [Refactor](scan) refactor scan scheduler to improve performance * fix pipeline x core	2023-12-05 13:03:16 +08:00
lihangyu	7398c3daf1	[Feature-Variant](Variant Type) support variant type query and index (#27676 )	2023-11-29 10:37:28 +08:00
walter	b457856bd2	[chore](be) remove bthread scanner related codes (#27417 )	2023-11-23 15:18:49 +08:00
Qi Chen	0491437a86	[Opt](scanner-scheduler) Optimize `BlockingQueue`, `BlockingPriorityQueue` and change remote scan thread pool. (#26784 ) ## Proposed changes - Optimize `BlockingQueue`, `BlockingPriorityQueue` by swapping `notify` and `unlock` to reduce lock competition. Ref: https://www.boost.org/doc/libs/1_54_0/boost/thread/sync_bounded_queue.hpp - Change remote scan thread pool to `PriorityQueue`. ### Test result Before: ``` mysql> select sum(lo_partkey) from lineorder; +-----------------+ \| sum(lo_partkey) \| +-----------------+ \| 300021444265405 \| +-----------------+ 1 row in set (1.11 sec) ``` After: ``` mysql> select sum(lo_partkey) from lineorder; +-----------------+ \| sum(lo_partkey) \| +-----------------+ \| 300021444265405 \| +-----------------+ 1 row in set (0.80 sec) ```	2023-11-15 18:24:36 +08:00
Yongqiang YANG	5ad49dceaa	[fix](scanner_schedule) scanner hangs due to negative num_running_scanners (#26816 ) * [fix] scanner hangs due to negative num_running_scanners Before the patch, num_running_scanners is increased after submitting, then it may be decreased before increasing then negative values can be seen by get_block_from_queue and a expected submit does not happend. Co-authored-by: Mingyu Chen <morningman.cmy@gmail.com>	2023-11-13 23:03:49 +08:00
Mingyu Chen	66054a5c78	[opt](scanner) increase the connection num of s3 client (#26795 )	2023-11-12 00:29:11 -06:00
Siyang Tang	196fadc044	[enhancement](metrics) enhance visibility of flush thread pool (#26544 )	2023-11-11 19:53:24 +08:00
zhiqiang	a5565f68b2	[Refactor](opentelemetry) Remove opentelemetry (#26605 )	2023-11-09 18:05:34 +08:00
daidai	a4e415ab09	[feature](hive)Support hive tables after alter type. (#25138 ) 1.Reconstruct the logic of decode to read parquet. The parquet reader first reads the data according to the parquet physical type, and then performs a type conversion. 2.Support hive alter table.	2023-11-02 00:24:21 +08:00
wangbo	46d40b1952	[refactor](executor)Remove empty group logic #26005	2023-10-27 14:24:41 +08:00
wangbo	54780c62e0	[improvement](executor)Using cgroup to implement cpu hard limit (#25489 ) * Using cgroup to implement cpu hard limit * code style	2023-10-19 18:56:26 +08:00
wangbo	7b2ff38401	query cpu hard limit based on doris scheduler (#24844 )	2023-10-07 12:03:07 +08:00
bobhan1	642e5cdb69	[Fix](Status) Make `Status` `[[nodiscard]]` and handle returned `Status` correctly (#23395 )	2023-09-29 22:38:52 +08:00
xy720	39e6512a21	[bug](scanner) Fix memory out of bound in scanner scheduler (#24840 )	2023-09-25 09:58:26 +08:00
wangbo	c9b2f4cb92	[workload](pipeline) Add cgroup cpu controller (#24052 )	2023-09-21 21:49:33 +08:00
lihangyu	1405b7ca82	[improve](scan) support lower the thread priority of scan thread (#24526 ) The configuration item is used to lower the priority of the scanner thread, typically employed to ensure CPU scheduling for write operations.	2023-09-20 17:00:24 +08:00
Yongqiang YANG	71dcb58db9	[improvement](scanner_schedule) reduce memory consumption of scanner (#24199 ) * [improvement](scanner_schedule) reduce memory consumption of scanner 1. limit scanner by memory consumptin rather than blocks. 2. scheduler run correcty instread of at lest 1.	2023-09-19 21:36:23 +08:00
Pxl	35c5d71549	[Improvement](join) some improvement of hash join (#23972 ) some improvement of hash join	2023-09-14 17:55:35 +08:00
zhiqqqq	c7ae2a7d22	[Refactor & Bugfix](static variables) move some static vairables to exec_env (#24029 )	2023-09-13 09:27:03 +08:00
Gabriel	3317909141	[pipelineX](join) support nested loop join operator (#23756 )	2023-09-04 10:08:22 +08:00
TengJianPing	962221cb18	[test](log) add log for debug case failure (#23506 )	2023-08-28 10:45:25 +08:00
Pxl	cf1865a1c8	[Bug](scan) fix core dump due to store_path_map (#23084 ) fix core dump due to store_path_map	2023-08-17 15:24:43 +08:00
Pxl	7839a0e708	[Bug](brpc) fix brpc failed on big query came concurrently (#22600 ) fix PriorityThreadPool get_info get wrong number change brpc pool from priority to fifo do not use brpc pool when send eos	2023-08-05 21:24:32 +08:00
Xinyi Zou	bc87002028	[opt](conf) remote scanner thread num is changed to core num * 10 (#22427 )	2023-08-01 23:09:49 +08:00
daidai	ae8a26335c	[opt](hive)opt select count() stmt push down agg on parquet in hive . (#22115 ) Optimization "select count() from table" stmtement , push down "count" type to BE. support file type : parquet ，orc in hive . 1. 4kfiles , 60kwline num before: 1 min 37.70 sec after: 50.18 sec 2. 50files , 60kwline num before: 1.12 sec after: 0.82 sec	2023-07-29 00:31:01 +08:00
Pxl	4171309b9b	[Bug](scanner) fix core dump due to release ScannerContext too early #21946	2023-07-19 00:53:23 +08:00
Pxl	b3d3ffa2de	[Bug](pipeline) adjust scanner scheduler.submit and _num_scheduling_ctx maintain (#21843 ) adjust scanner scheduler.submit and _num_scheduling_ctx maintain	2023-07-18 11:55:21 +08:00
wangbo	d3317aa33b	[Fix](executor)Fix scan entity core #21696 After the last time to call scan_task.scan_func()，the should be ended, this means PipelineFragmentContext could be released. Then after PipelineFragmentContext is released, visiting its field such as query_ctx or _state may cause core dump. But it can only explain core 2 void ScannerScheduler::_task_group_scanner_scan(ScannerScheduler* scheduler, taskgroup::ScanTaskTaskGroupQueue* scan_queue) { while (!_is_closed) { taskgroup::ScanTask scan_task; auto success = scan_queue->take(&scan_task); if (success) { int64_t time_spent = 0; { SCOPED_RAW_TIMER(&time_spent); scan_task.scan_func(); } scan_queue->update_statistics(scan_task, time_spent); } } }	2023-07-11 15:56:13 +08:00
airborne12	009b300abd	[Fix](ScannerScheduler) fix dead lock when shutdown group_local_scan_thread_pool (#21553 )	2023-07-06 13:09:37 +08:00
Lijia Liu	76bdcf1d26	[improvement](pipeline) task group scan entity (#19924 )	2023-06-25 14:43:35 +08:00
Gabriel	81abdeffbc	[Improvement](pipeline) Improve shared scan performance (#20785 )	2023-06-21 14:36:05 +08:00
Mingyu Chen	0c98355fff	[fix](catalog) fix create catalog with resource replay issue and kerberos auth issue (#20137 ) 1. Fix create catalog with resource replay bug. If user create catalog using `create catalog hive with resource xxx`, when replaying edit log, there is a bug that resource may be dropped, causing NPE and FE will fail to start. In this PR, I add a new FE config `disallow_create_catalog_with_resource`, default is true. So that `with resource` will not be allowed, and it will be deprecated later. And also fix the replay bug to avoid NPE. 2. Fix issue when creating 2 hive catalogs to connect with and without kerberos authentication. When user create 2 hive catalogs, one use simple auth, the other use kerberos auth. The query may fail with error like: `Server asks us to fall back to SIMPLE auth, but this client is configured to only allow secure connections.` So I add a default property for hive catalog: `"ipc.client.fallback-to-simple-auth-allowed" = "true"`. Which means this property will be added automatically when user creating hive catalog, to avoid such problem. 3. Fix calling `hdfsExists()` issue When calling `hdfsExists()` with non-zero return code, should check if it encounters error or is file not found. 3. Some code refactor Avoid import `org.apache.parquet.Strings`	2023-05-30 16:57:39 +08:00
yiguolei	1d421a26d9	[bugfix](memory) merge block may allocate failed (#19507 )	2023-05-11 10:42:47 +08:00
Xinyi Zou	cf8ceb8586	[fix](scan) fix scanner mem tracker (#19354 )	2023-05-10 09:56:41 +08:00

1 2

78 Commits