doris

Author	SHA1	Message	Date
Mingyu Chen	3e8c75e246	[minor](orc) opt the log info in orc reader (#27951 )	2023-12-06 20:47:36 +08:00
Gabriel	1be513b927	[pipelineX](local shuffle) Fix local shuffle for colocate/bucket join (#28032 )	2023-12-06 10:02:36 +08:00
TengJianPing	fd1db4da3d	[agg](profile) fix incorrect profile (#28004 )	2023-12-05 20:48:10 +08:00
zhangdong	6074cddcf8	[feature](mtmv)add Job and task tvf (#27967 ) Add: select * from jobs("type"="mv"); select * from tasks("type"="mv"); select * from jobs("type"="insert"); select * from tasks("type"="insert"). Add privilege check for mv_infos("database"="xxx"). Change JobType MTMV ==> MV.	2023-12-05 15:12:36 +08:00
HappenLee	54fe1a166b	[Refactor](scan) refactor scan scheduler to improve performance (#27948 ) * fix pipeline x core	2023-12-05 13:03:16 +08:00
Qi Chen	2b4c4bb442	[Fix][Opt](parquet-reader) Fix filter push down with decimal types in parquet reader. (#27897 ) Fix filter push down with decimal types in parquet reader introduced by #22842	2023-12-04 22:25:39 +08:00
Pxl	e3d2425d47	[Improvement](join) remove insert_indices_from_join and special judge for -1 (#27779 )	2023-12-04 11:03:22 +08:00
zhangstar333	d2a99aa03b	[refactor](scan) change scan reschedule into scan context (#27766 )	2023-12-04 10:25:52 +08:00
HHoflittlefish777	97d36b4f38	[fix](csv_reader) fix trim_double_quotes behavior change (#27882 )	2023-12-03 22:57:55 +08:00
Qi Chen	fc8b32be7a	[Opt](multi-catalog) Opt parquet orc reader numeric copy by `memcpy()` and `memset()`. (#27545 ) Opt parquet orc reader null map decoding by memset().	2023-12-03 09:55:05 +08:00
HHoflittlefish777	54b5d04ff9	[improve](csv_reader) handle csv reader error (#27892 )	2023-12-02 10:05:02 +08:00
slothever	1706699e7e	[fix](multi-catalog)support the max compute partition prune (#27154 ) 1. MaxCompute partition pruning: previously only filtering MC partitions by '=' was supported, which can select just one partition; partition pruning is needed to support multi-partition filters and range operators ('>', '<', '>=', ..). 2. Add MaxCompute row count cache and partitionValues cache. 3. Add MaxCompute regression case.	2023-12-01 22:28:26 +08:00
Mryange	68525fc112	[feature](profile) add RuntimeFilterInfo in merge profile #27869	2023-12-01 21:42:25 +08:00
HHoflittlefish777	3e910e2978	[refactor](simd_json_reader) refactor simd json reader to adapt to parse multi json (#27272 )	2023-11-30 15:01:06 +08:00
Qi Chen	e4149c6e4c	[Fix](parquet-reader) Fix null map issue in parquet reader. (#27777 ) Fix a null map issue in the parquet reader that caused incorrect results for functions such as `min()` and `max()`. To avoid copying, the null map is shared between the parquet converted src column and the dst column. This is tricky because it calls the mutable function `doris_nullable_column->get_null_map_column_ptr()`, which sets `_need_update_has_null = true`, while some operations such as agg call `has_null()`, which sets `_need_update_has_null = false`.	2023-11-30 13:55:37 +08:00
HHoflittlefish777	498d27c905	[improve](json_reader) add prompt when all fields are null (#27630 )	2023-11-29 18:26:42 +08:00
Mryange	d9d5468621	[feature](audit-log) add audit-log in insert into (#27641 )	2023-11-29 15:01:57 +08:00
lihangyu	7398c3daf1	[Feature-Variant](Variant Type) support variant type query and index (#27676 )	2023-11-29 10:37:28 +08:00
Pxl	d969047b50	[Refactor](join) refactor of hash join (#27557 ) Improve the performance under the tpch data set by reconstructing the join related code and the use of hash table Co-authored-by: HappenLee <happenlee@hotmail.com> Co-authored-by: BiteTheDDDDt <pxl290@qq.com>	2023-11-28 19:46:00 +08:00
Pxl	91b0edfaa2	[Bug](join) try to fix _has_null_in_build_side being set incorrectly (#27684 )	2023-11-28 17:42:14 +08:00
ShowCode	f565f60bc3	[refactor](standard)BE:Initialize pointer variables in the class to nullptr by default (#27587 )	2023-11-28 13:02:30 +08:00
zclllyybb	fe7ff6f113	[Opt](functions) Opt tvf number for performance regression framework (#27582 )	2023-11-28 10:43:51 +08:00
zy-kkk	d10a708fa2	[improve](jdbc catalog) add profile for jdbc scan (#27447 )	2023-11-27 10:33:39 +08:00
zhangdong	dfe3a2dd01	[feature](mtmv)(3)Implementing multi table materialized views (#26146 ) Introduction to Main Classes: - MTMVService: MTMV services for other modules to call - MTMVHookService: All operations that affect the MTMV - MTMVJobManager: All operations that affect the MTMV job - MTMVCacheManager: All operations that affect the MTMV Cache - MTMVTask & MTMVJob: Inherit from job framework	2023-11-24 12:34:38 +08:00
Ashin Gau	dd65cc1d14	[opt](MergedIO) no need to merge large columns (#27315 ) 1. Fix a profile bug of `MergeRangeFileReader`, and add a profile `ApplyBytes` to show the total bytes of ranges. 2. There's no need to merge large columns, because `MergeRangeFileReader` will increase the copy time.	2023-11-23 19:15:47 +08:00
zclllyybb	2ea33518b0	[Opt](load) use batching to optimize auto partition (#26915 )	2023-11-23 19:12:28 +08:00
walter	b457856bd2	[chore](be) remove bthread scanner related codes (#27417 )	2023-11-23 15:18:49 +08:00
Pxl	301bfe4d5d	[Bug](mark-join) fix mark join reporting an error when the probe block has columns that are not output (#27360 )	2023-11-23 11:16:02 +08:00
Gabriel	5442e8d1fc	[pipelineX](dependency) split different dependencies (#27366 )	2023-11-22 12:50:39 +08:00
TengJianPing	1ebb54afdc	[fix](null equal) fix coredump of pushing eq_for_null (#27341 )	2023-11-21 18:36:33 +08:00
Gabriel	459f75073f	[pipelineX](dependency) remove OrDependency (#27242 )	2023-11-20 13:05:34 +08:00
Jerry Hu	febd60c75f	[fix](join) incorrect result of left join with other conjuncts (#27238 )	2023-11-19 15:36:39 +08:00
Jerry Hu	b42828cf69	[fix](window_function) min/max/sum/avg should be always nullable (#27104 ) Co-authored-by: starocean999 <40539150+starocean999@users.noreply.github.com>	2023-11-18 18:41:42 +08:00
Gabriel	b1eef30b49	[pipelineX](dependency) Wake up task by dependencies (#26879 ) Co-authored-by: Mryange <2319153948@qq.com>	2023-11-18 03:20:24 +08:00
huanghaibin	5d548935e0	[improvement](insert) support schema change and decommission for group commit (#26359 )	2023-11-17 21:41:38 +08:00
Mingyu Chen	c459408580	[fix](jni) avoid BE crash and NPE when close paimon reader (#27129 ) 1. Do not use FATAL log when JNI encounters an error, to avoid a crash. 2. Fix NPE when closing PaimonReader; the reader may not be assigned if opening PaimonReader failed.	2023-11-17 20:01:08 +08:00
Ashin Gau	52995c528e	[fix](iceberg) iceberg uses custom method to encode special characters of field name (#27108 ) Fix two bugs: 1. Missing column is case sensitive; change the column name to lower case in FE for hive/iceberg/hudi. 2. Iceberg uses a custom method to encode special characters in column names. Decode the column name to match the right column in parquet reader.	2023-11-17 18:38:55 +08:00
Qi Chen	a0661ed9d2	[Fix](multi-catalog) Fix complex type crash when using dict filter facility in the parquet-reader. (#27151 ) - Fix complex type crash when using the dict filter facility in the parquet-reader by turning off the dict filter facility in this case. - Add orc complex types regression test.	2023-11-17 13:43:58 +08:00
Qi Chen	0491437a86	[Opt](scanner-scheduler) Optimize `BlockingQueue`, `BlockingPriorityQueue` and change remote scan thread pool. (#26784 ) Proposed changes: - Optimize `BlockingQueue`, `BlockingPriorityQueue` by swapping `notify` and `unlock` to reduce lock contention. Ref: https://www.boost.org/doc/libs/1_54_0/boost/thread/sync_bounded_queue.hpp - Change remote scan thread pool to `PriorityQueue`. Test result: `select sum(lo_partkey) from lineorder;` returned 300021444265405 (1 row) in 1.11 sec before and in 0.80 sec after.	2023-11-15 18:24:36 +08:00
daidai	3585c7e216	[test](parquet)append parquet reader byte_array_decimal and rle_bool case (#26751 )	2023-11-14 15:05:10 +08:00
Ashin Gau	ec40603b93	[fix](parquet) compressed_page_size has the same meaning in page v1 and v2 (#26783 ) 1. Parquet files with page v2 were parsed incorrectly when using any codec other than snappy. Because `compressed_page_size` has the same meaning in page v1 and v2, it always contains the bytes of definition level, repetition level and compressed data. 2. Add regression test for `fix_length_byte_array` stored decimal type, and dictionary encoded date/datetime type.	2023-11-14 08:30:42 +08:00
Yongqiang YANG	5ad49dceaa	[fix](scanner_schedule) scanner hangs due to negative num_running_scanners (#26816 ) Before the patch, num_running_scanners was increased after submitting, so it could be decreased before being increased; get_block_from_queue could then see a negative value and an expected submit did not happen. Co-authored-by: Mingyu Chen <morningman.cmy@gmail.com>	2023-11-13 23:03:49 +08:00
TengJianPing	504ec324bb	Revert "[refactor](scan) delete bloom_filter_predicate (#26499 )" (#26851 ) This reverts commit 2bb3ef198144954583aea106591959ee09932cba.	2023-11-13 16:27:23 +08:00
zy-kkk	2f32a721ee	[refactor](jni) unified jni framework for jdbc catalog (#26317 ) This commit overhauls the JDBC connector logic within our project, transitioning from the previous mechanism of fetching data through JNI calls for individual ResultSet items to a more efficient and unified approach using the VectorTable data structure.	2023-11-13 14:28:15 +08:00
Yongqiang YANG	d9e0a9fa2e	[enhancement](230) print max version and spec version when -230 happens (#26643 ) More information is provided.	2023-11-13 09:57:22 +08:00
Mingyu Chen	66054a5c78	[opt](scanner) increase the connection num of s3 client (#26795 )	2023-11-12 00:29:11 -06:00
Siyang Tang	196fadc044	[enhancement](metrics) enhance visibility of flush thread pool (#26544 )	2023-11-11 19:53:24 +08:00
Qi Chen	c07a70e22a	[Fix](orc-reader) Add missing `break` introduced by #26548 . (#26633 ) Add missing break introduced by #26548. Sorry for this mistake.	2023-11-09 18:29:44 +08:00
zhiqiang	a5565f68b2	[Refactor](opentelemetry) Remove opentelemetry (#26605 )	2023-11-09 18:05:34 +08:00
wudongliang	22bf2889e5	[feature](tvf)(jni-avro)jni-avro scanner add complex data types (#26236 ) Support avro's enum, record, union data types	2023-11-09 13:58:49 +08:00

1144 Commits