doris

Author	SHA1	Message	Date
plat1ko	8191cd1dad	[Bug](ScanNode) Fix potential incorrect query result caused by concurrent NewOlapScanNode initialization and Compaction (#24638 ) * Optimize fetch delete predicates * Fix incorrect query result when compaction eliminate delete predicates between `NewOlapScanNode::_init_scanners` and `NewOlapScanner::init` * Fix be ut	2023-09-25 22:24:35 +08:00
Gabriel	3b4d8b4ac8	[pipelineX](feature) Support schema scan operator (#24850 )	2023-09-25 14:42:25 +08:00
wangbo	9412775686	remove useless variable in scanctx (#24849 )	2023-09-25 14:36:18 +08:00
xy720	39e6512a21	[bug](scanner) Fix memory out of bound in scanner scheduler (#24840 )	2023-09-25 09:58:26 +08:00
HappenLee	9579634eac	[Debug](pipeline) add log of pipeline scan bug (#24804 )	2023-09-25 08:38:31 +08:00
Gabriel	27eed937b3	[pipelineX](es scan) Support ES scan operator (#24824 )	2023-09-24 00:32:38 +08:00
Jerry Hu	8a85a75b8b	[chore](scanner) check columns' nullable with schema (#24724 ) Add a validation to prevent potential schema inconsistency issues.	2023-09-22 11:34:53 +08:00
wangbo	c9b2f4cb92	[workload](pipeline) Add cgroup cpu controller (#24052 )	2023-09-21 21:49:33 +08:00
lihangyu	1405b7ca82	[improve](scan) support lower the thread priority of scan thread (#24526 ) The configuration item is used to lower the priority of the scanner thread, typically employed to ensure CPU scheduling for write operations.	2023-09-20 17:00:24 +08:00
Gabriel	c0df8fca20	[pipelineX](fix) Fix potential concurrent problem (#24651 )	2023-09-20 13:00:58 +08:00
Yongqiang YANG	71dcb58db9	[improvement](scanner_schedule) reduce memory consumption of scanner (#24199 ) 1. limit scanner by memory consumption rather than blocks. 2. scheduler run correctly instead of at least 1.	2023-09-19 21:36:23 +08:00
HappenLee	6a33e4639a	[schedule](pipeline) Remove wait schedule time in pipeline query engine and change current queue to std::mutex (#24525 ) This reverts commit 591aeaa98d1178e2e277278c7afeafef9bdb88d6.	2023-09-18 23:57:56 +08:00
Gabriel	d24f3efd4a	[pipelineX](profile) Phase 1: refactor pipelineX detailed profile (#24322 )	2023-09-15 16:14:05 +08:00
Pxl	35c5d71549	[Improvement](join) some improvement of hash join (#23972 )	2023-09-14 17:55:35 +08:00
神技圈子	d8feca2530	[Enhancement]The page cache can be parameterized by the session variable of fe. (#23981 )	2023-09-14 14:28:19 +08:00
zhiqqqq	c7ae2a7d22	[Refactor & Bugfix](static variables) move some static variables to exec_env (#24029 )	2023-09-13 09:27:03 +08:00
HappenLee	dbf509edc0	[Debug](scan) Add debug log for find p0 scan coredump in pipeline (#24202 )	2023-09-12 12:17:44 +08:00
zhangdong	dbb9365556	[Enhance](ip)optimize priority_network matching logic for be (#23795 ) If the user has configured the wrong priority_network, fail startup directly so that users do not mistakenly assume the configuration is correct. If the user has not configured priority_network, select only the first IP from the IPv4 list, rather than selecting from all IPs, to avoid users' servers not supporting IPv4. extends #23784	2023-09-11 18:32:31 +08:00
Mingyu Chen	f85da7d942	[improvement](jdbc) add profile for jdbc read and convert phase (#23962 ) Add 2 metrics in jdbc scan node profile: - `CallJniNextTime`: call get next from jdbc result set - `ConvertBatchTime`: call convert jobject to column block Also fix a potential concurrency issue when init jdbc connection cache pool	2023-09-10 21:42:06 +08:00
meiyi	82dc970916	[feature](insert) Support group commit insert (#22829 )	2023-09-08 15:51:03 +08:00
Gabriel	3317909141	[pipelineX](join) support nested loop join operator (#23756 )	2023-09-04 10:08:22 +08:00
airborne12	347cceb530	[Feature](inverted index) push count on index down to scan node (#22687 ) Co-authored-by: airborne12 <airborne12@gmail.com>	2023-09-02 22:24:43 +08:00
airborne12	95488c4d93	[Fix](vscanner) remove TEMP column in block after filter (#23778 )	2023-09-02 21:54:27 +08:00
Gabriel	65f41f71c1	[pipelineX](refactor) refine codes (#23726 )	2023-09-01 07:57:35 +08:00
Pxl	f35ab37e1e	[Bug](materialized-view) fix load db use analyzer to analyze different metaindex (#23673 )	2023-08-31 12:35:38 +08:00
TengJianPing	962221cb18	[test](log) add log for debug case failure (#23506 )	2023-08-28 10:45:25 +08:00
Mingyu Chen	40be6a0b05	[fix](hive) do not split compress data file and support lz4/snappy block codec (#23245 ) 1. do not split compress data file Some data files in hive are compressed with gzip, deflate, etc. These kinds of files cannot be split. 2. Support lz4 block codec for hive scan node, use lz4 block codec instead of lz4 frame codec 4. Support snappy block codec For hadoop snappy 5. Optimize the `count()` query of csv file For query like `select count() from tbl`, only need to split the line, no need to split the column. Need to pick to branch-2.0 after this PR: #22304	2023-08-26 12:59:05 +08:00
HappenLee	d331bfc513	[Performance](pipeline) support shared scan segment in mow (#23305 )	2023-08-25 10:43:02 +08:00
Pxl	d9db3f5431	[Improvement](scan) Remove redundant predicates on scan node (#23374 ) * Remove redundant predicates on scan node * update * fix	2023-08-25 10:41:37 +08:00
lihangyu	527293aa41	[refactor](dynamic table) remove dynamic table (#23298 )	2023-08-23 14:15:14 +08:00
Pxl	8ed4045df9	[Chore](primitive-type) remove VecPrimitiveTypeTraits (#22842 )	2023-08-23 08:37:40 +08:00
Gabriel	dcd6c3c022	[pipelineX](refactor) propose a new pipeline execution model (#22562 )	2023-08-21 15:38:45 +08:00
plat1ko	d4694167a8	[Enhancement](chore) Some Status relevant enhancement (#23072 )	2023-08-21 14:14:38 +08:00
HappenLee	433a6103ab	[Enhancement](scanner) allocate blocks in scanner_context on demand and free them on close (#23182 ) Introduced #19389 , removed #20785	2023-08-19 12:13:24 +08:00
wuwenchi	a5ca6cadd6	[Improvement] Optimize count operation for iceberg (#22923 ) Iceberg has its own metadata information, which includes count statistics for table data. If the table does not contain equality deletes, we can get the count data of the current table directly from the count statistics.	2023-08-18 09:57:51 +08:00
Pxl	cf1865a1c8	[Bug](scan) fix core dump due to store_path_map (#23084 )	2023-08-17 15:24:43 +08:00
DongLiang-0	db69457576	[fix](avro)Fix S3 TVF avro format reading failure (#22199 ) This pr fixes two issues: 1. when using s3 TVF to query files in AVRO format, due to the change of `TFileType`, the originally queried `FILE_S3` becomes `FILE_LOCAL`, causing the query to fail. 2. currently, both parameters `s3.virtual.key` and `s3.virtual.bucket` are removed. A new `S3Utils` in jni-avro parses the bucket and key of s3. The purpose of this change is mainly to unify the parameters of s3.	2023-08-11 17:22:48 +08:00
Pxl	56392e21ae	[Bug](decimalv3) fix decimalv3 keyrange set wrong number #22818	2023-08-10 18:15:40 +08:00
Qi Chen	f2658dc7bd	[Feature](multi-catalog) Truncate char or varchar columns if size is smaller than file columns or not found in the file column schema. (#22318 ) Truncate char or varchar columns if size is smaller than file columns or not found in the file column schema by session var `truncate_char_or_varchar_columns`.	2023-08-10 14:37:20 +08:00
herry2038	eafdab0cfd	[Enhancement](tvf) Add frontends_disks table-valued-function (#22568 ) --------- Co-authored-by: yuxianbing <yuxianbing@yy.com> Co-authored-by: yuxianbing <iloveqaz123>	2023-08-10 10:40:24 +08:00
Mingyu Chen	c9dc715c5d	[fix](broker-load) fix error when using multi data description for same table in load stmt (#22666 ) For a load request, there are 2 tuples on the scan node: the input tuple and the output tuple. The input tuple is for reading the file, and it is converted to the output tuple based on user-specified column mappings. Broker load supports different column mappings in different data descriptions for the same table (or partition). So for each scanner, the output tuples are the same but the input tuple can be different. The previous implementation saved the input tuple at the scan node level, causing different scanners to use the same input tuple, which is incorrect. This PR removes the input tuple from the scan node and saves it in each scanner.	2023-08-07 20:03:03 +08:00
Pxl	7839a0e708	[Bug](brpc) fix brpc failed on big query came concurrently (#22600 ) * fix PriorityThreadPool get_info get wrong number * change brpc pool from priority to fifo * do not use brpc pool when send eos	2023-08-05 21:24:32 +08:00
Pxl	c1c38c956d	[exec] fix coredump when limit<0 and limit!=-1 with 1.2 fe (#22622 )	2023-08-04 22:18:45 +08:00
Mingyu Chen	1ed1b69485	[refactor](reader) move reader from vec/exec/scan to vec/exec/format (#22371 ) This readers should be in vec/exec/format	2023-08-04 09:47:20 +08:00
Jerry Hu	4bc65aa921	[fix](load) PrefetchBufferedReader Crashing caused updating counter with an invalid runtime profile (#22464 )	2023-08-02 18:19:48 +08:00
Xinyi Zou	bc87002028	[opt](conf) remote scanner thread num is changed to core num * 10 (#22427 )	2023-08-01 23:09:49 +08:00
Ashin Gau	89433f6a13	[fix](complex_type) throw error when reading complex types in broker/stream load (#22331 ) Check whether there are complex types in parquet/orc reader in broker/stream load. Broker/stream load will cast any type to string type, and complex types will be cast incorrectly. This is a temporary method, and will be replaced by tvf.	2023-07-31 22:23:08 +08:00
Pxl	210f6661b4	[Bug](profile) add lock on add_filter_info #22355 multiple scanner may update profile at same time	2023-07-29 12:45:50 +08:00
daidai	ae8a26335c	[opt](hive)opt select count() stmt push down agg on parquet in hive . (#22115 ) Optimize the "select count() from table" statement by pushing the "count" type down to BE. Supported file types: parquet, orc in hive. 1. 4kfiles , 60kwline num before: 1 min 37.70 sec after: 50.18 sec 2. 50files , 60kwline num before: 1.12 sec after: 0.82 sec	2023-07-29 00:31:01 +08:00
Qi Chen	8caa5a9ba4	[Fix](multi-catalog) Fix null partitions error in iceberg tables. (#22185 ) ### Issue When a table has null partitions, it throws the error `Failed to fill partition column: t_int=null` ### Resolution - Fix the null partitions error in iceberg tables by replacing the null partition with '\N'. - Add regression test for hive null partition.	2023-07-27 23:57:35 +08:00

346 Commits