VScanNode::get_next checks whether the ScanNode has reached its limit condition and, if so, sends eos to the TaskScheduler, which then tries to close the ScanNode.
However, the ScanNode must wait for all running scanners to finish, so even after reaching the limit condition it cannot be closed immediately.
This PR interrupts the running readers so that the ScanNode can end as soon as possible.
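A minimal sketch of the idea, assuming a hypothetical `should_stop` flag shared between the scan node and its scanners (names are illustrative, not the actual Doris classes):

```cpp
#include <atomic>
#include <cstdio>
#include <memory>

// Hypothetical shared state between a scan node and its scanners.
struct ScannerContext {
    std::atomic<bool> should_stop{false};  // set once the limit is reached
};

// Each scanner re-checks the flag between batches instead of reading to EOF.
void scanner_loop(const std::shared_ptr<ScannerContext>& ctx) {
    for (int batch = 0; batch < 1000; ++batch) {
        if (ctx->should_stop.load(std::memory_order_relaxed)) {
            std::printf("scanner interrupted after %d batches\n", batch);
            return;  // exit promptly so the ScanNode can be closed sooner
        }
        // produce_one_block(); -- omitted in this sketch
    }
}

int main() {
    auto ctx = std::make_shared<ScannerContext>();
    ctx->should_stop.store(true);  // simulate the limit being reached
    scanner_loop(ctx);
    return 0;
}
```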
Currently in the pipeline engine, when the result block queue is empty the scan node is rescheduled and a batch of scanners is chosen.
But sometimes `get_available_thread_slot_num()` returns a thread_slot_num <= 0, so nothing is submitted,
the block queue stays empty,
and there is no chance to reschedule again until the query times out.
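A minimal sketch of the failure mode and one way to avoid the hang, under the assumption that the scheduler can simply re-enqueue the scan context when no thread slots are available (the types and function are hypothetical, not the actual Doris code):

```cpp
#include <cstdio>
#include <deque>

// Hypothetical scan context waiting to be scheduled.
struct ScanContext {
    int id;
};

// Before: if no thread slots are free, the context is dropped and never
// rescheduled, so the block queue stays empty until the query times out.
// Sketch of a fix: re-enqueue the context so it gets another chance later.
void schedule(std::deque<ScanContext>& pending, int available_thread_slots) {
    if (pending.empty()) return;
    ScanContext ctx = pending.front();
    pending.pop_front();
    if (available_thread_slots <= 0) {
        pending.push_back(ctx);  // keep it schedulable instead of doing nothing
        std::printf("no slots, re-queued scan context %d\n", ctx.id);
        return;
    }
    std::printf("submitted scan context %d\n", ctx.id);  // submit_scanners(ctx);
}

int main() {
    std::deque<ScanContext> pending;
    pending.push_back({1});
    schedule(pending, 0);  // no free slots: context is re-queued, not lost
    schedule(pending, 4);  // slots available: context is submitted
    return 0;
}
```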
Add:
- `select * from jobs("type"="mv");`
- `select * from tasks("type"="mv");`
- `select * from jobs("type"="insert");`
- `select * from tasks("type"="insert");`

Add a privilege check for `mv_infos("database"="xxx")`.
Change JobType MTMV ==> MV.
1. MaxCompute partition prune.
Previously we only supported filtering MaxCompute partitions by '=', which can select just one partition.
To support multiple partition filters and range operators ('>', '<', '>=', ...), partition prune needs to be supported (see the sketch after this list).
2. Add MaxCompute row count cache and partitionValues cache.
3. Add MaxCompute regression case.
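A minimal sketch of what range-operator partition pruning amounts to, using plain C++ over string partition values (the real implementation lives in the MaxCompute catalog code; this is only illustrative):

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Keep only partitions whose value satisfies a simple predicate,
// e.g. dt >= "2024-01-02" (string comparison works for zero-padded dates).
std::vector<std::string> prune_partitions(const std::vector<std::string>& parts,
                                          const std::string& op,
                                          const std::string& bound) {
    std::vector<std::string> kept;
    for (const auto& p : parts) {
        bool keep = (op == "=")  ? p == bound
                  : (op == ">")  ? p > bound
                  : (op == "<")  ? p < bound
                  : (op == ">=") ? p >= bound
                  : (op == "<=") ? p <= bound
                  : true;  // unknown operator: keep the partition (no pruning)
        if (keep) kept.push_back(p);
    }
    return kept;
}

int main() {
    std::vector<std::string> parts{"2024-01-01", "2024-01-02", "2024-01-03"};
    for (const auto& p : prune_partitions(parts, ">=", "2024-01-02"))
        std::printf("%s\n", p.c_str());
    return 0;
}
```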
Introduction to Main Classes:
- MTMVService: MTMV services for other modules to call
- MTMVHookService: all operations that affect the MTMV
- MTMVJobManager: all operations that affect the MTMV job
- MTMVCacheManager: all operations that affect the MTMV cache
- MTMVTask & MTMVJob: inherit from the job framework
Fix two bugs:
1. Missing-column matching is case sensitive; change the column name to lower case in FE for hive/iceberg/hudi (see the sketch after this list).
2. Iceberg uses a custom method to encode special characters in column names. Decode the column name to match the right column in the parquet reader.
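A minimal sketch of the idea behind the first fix, lower-casing the column name so matching is case-insensitive; the helper name is made up for illustration:

```cpp
#include <algorithm>
#include <cctype>
#include <cstdio>
#include <string>

// Illustrative helper: normalize a column name to lower case so that a
// column can be matched case-insensitively against the file schema.
std::string to_lower_column_name(std::string name) {
    std::transform(name.begin(), name.end(), name.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return name;
}

int main() {
    std::printf("%s\n", to_lower_column_name("UserId").c_str());  // prints "userid"
    return 0;
}
```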
* [fix] scanner hangs due to negative num_running_scanners
Before the patch, num_running_scanners was increased after submitting, so it could be
decremented before being incremented. get_block_from_queue could then see a negative
value, and an expected submit did not happen.
Co-authored-by: Mingyu Chen <morningman.cmy@gmail.com>
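A minimal sketch of the race and the fixed ordering, using an atomic counter; the names are illustrative rather than the actual Doris members:

```cpp
#include <atomic>
#include <cassert>

std::atomic<int> num_running_scanners{0};

// Buggy order: increment after submit. If the scanner finishes (and
// decrements) before the increment runs, the counter goes negative and
// get_block_from_queue may skip an expected submit.
//
// Fixed order: increment before submit, decrement when the scanner ends.
void submit_scanner_fixed() {
    num_running_scanners.fetch_add(1);   // count it before it can possibly finish
    // submit_to_thread_pool(scanner);   // omitted in this sketch
}

void on_scanner_finished() {
    int prev = num_running_scanners.fetch_sub(1);
    assert(prev > 0);  // with the fixed order this can never underflow
}

int main() {
    submit_scanner_fixed();
    on_scanner_finished();
    return 0;
}
```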
This commit overhauls the JDBC connector logic within our project, transitioning from the previous mechanism of fetching data through JNI calls for individual ResultSet items to a more efficient and unified approach using the VectorTable data structure.
`VFileScanner` tries to append late-arriving runtime filters in each loop of `ScannerScheduler::_scanner_scan`. However, `VFileScanner::_get_next_reader` only generates the `_push_down_conjuncts` in the first loop, so the late-arriving runtime filters are ignored.
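A minimal sketch of what appending late-arriving filters per loop implies: rebuilding the push-down conjuncts from whatever runtime filters have arrived so far on every iteration (the names below are illustrative, not the actual Doris code):

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Illustrative stand-ins for runtime filters and push-down conjuncts.
using Conjunct = std::string;

std::vector<Conjunct> arrived_runtime_filters;  // grows as filters arrive late

// Rebuild push-down conjuncts from whatever filters have arrived so far,
// instead of generating them only on the first loop iteration.
std::vector<Conjunct> build_push_down_conjuncts() {
    return arrived_runtime_filters;
}

int main() {
    for (int loop = 0; loop < 2; ++loop) {
        if (loop == 1) arrived_runtime_filters.push_back("k1 IN (...)");  // late filter
        auto conjuncts = build_push_down_conjuncts();  // re-generated every loop
        std::printf("loop %d: %zu push-down conjunct(s)\n", loop, conjuncts.size());
    }
    return 0;
}
```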
1. Reconstruct the decoding logic for reading parquet: the parquet reader first reads the data according to the parquet physical type, and then performs a type conversion (see the sketch after this list).
2. Support hive alter table.
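A minimal sketch of the read-then-convert idea, using a DATE column as an example (parquet stores DATE as an INT32 count of days since 1970-01-01); the helpers and the `Date` type are illustrative:

```cpp
#include <cstdint>
#include <cstdio>

// Illustrative logical date type.
struct Date {
    int64_t epoch_day;
};

int32_t read_physical_int32() {
    return 19700;  // pretend this came from the parquet page decoder
}

Date convert_to_date(int32_t days_since_epoch) {
    return Date{days_since_epoch};
}

int main() {
    int32_t raw = read_physical_int32();  // step 1: decode by physical type
    Date d = convert_to_date(raw);        // step 2: convert to the logical type
    std::printf("days since epoch: %lld\n", static_cast<long long>(d.epoch_day));
    return 0;
}
```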
When shared scan is enabled, all scanners are created by one instance. When the main instance reaches eos and quits, all of its state is released. But other instances may still get blocks from those scanners, so we must ensure that scanners do not depend on any state of the main instance after it quits.
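A minimal sketch of the lifetime issue, showing shared ownership so scanner state survives the main instance quitting (the types are illustrative, not the Doris classes, and the actual fix may manage lifetimes differently):

```cpp
#include <cstdio>
#include <memory>

// Illustrative scanner state that must outlive the instance that created it.
struct ScannerState {
    int blocks_produced = 0;
};

struct Instance {
    // Shared ownership: the state stays alive as long as any instance
    // (not only the main one) still holds a reference to it.
    std::shared_ptr<ScannerState> state;
};

int main() {
    auto main_instance = std::make_unique<Instance>();
    main_instance->state = std::make_shared<ScannerState>();

    Instance other{main_instance->state};  // another instance shares the state
    main_instance.reset();                 // main instance reaches eos and quits

    other.state->blocks_produced++;        // still valid: no dangling state
    std::printf("blocks produced: %d\n", other.state->blocks_produced);
    return 0;
}
```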
Previously, when using a file scan node (e.g. querying a hive table), the max number of scanners for each scan node
was `doris_scanner_thread_pool_thread_num` (default is 48).
If the query parallelism is N, the total number of scanners would be 48 * N, which is too many.
In this PR, the logic is changed so that the max number of scanners for each scan node
is `doris_scanner_thread_pool_thread_num / query parallelism`, so the total number of scanners
is capped at `doris_scanner_thread_pool_thread_num`.
Reducing the number of scanners can significantly reduce the memory usage of a query; a minimal sketch of the calculation follows.
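This sketch only shows the arithmetic, assuming the per-node cap is floored at 1 (the exact clamping in the PR may differ):

```cpp
#include <algorithm>
#include <cstdio>

// Per-scan-node scanner cap: thread pool size divided by query parallelism,
// so that the total across N parallel instances stays near the pool size.
int max_scanners_per_node(int pool_size, int parallelism) {
    return std::max(1, pool_size / std::max(1, parallelism));
}

int main() {
    const int pool_size = 48;  // doris_scanner_thread_pool_thread_num default
    for (int parallelism : {1, 4, 16}) {
        std::printf("parallelism %2d -> %2d scanners per node, %2d total\n",
                    parallelism, max_scanners_per_node(pool_size, parallelism),
                    max_scanners_per_node(pool_size, parallelism) * parallelism);
    }
    return 0;
}
```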