Commit Graph

7069 Commits

Author SHA1 Message Date
52c45e38af [Refactor](RF) refactor the profile of rf and pipeline-x support local ignore (#31287)
* [Refactor](RF) refactor the profile of rf and pipeline-x support local ignore

* fix local merge filter
2024-02-23 19:05:06 +08:00
8f77e6363a [Feature](function) Support xxhash function like murmur hash function (#31193) 2024-02-23 19:03:28 +08:00
1456785aa1 [fix](join) incorrect result of mark join in nested loop join (#31280) 2024-02-23 19:03:28 +08:00
f65876d803 [Feature](explode) support explode map type (#30151) 2024-02-22 20:08:44 +08:00
4c3a96e7df [fix](memory) Fix LRU cache frequent prune (#31220) 2024-02-22 19:51:20 +08:00
90ab5ec2d9 [fix](invert index) fix the error issue in the unit test remove_element_only_in_table (#31238) 2024-02-22 13:01:49 +08:00
ad07dec0ed [Improve](InPredict) enhance in predicate with struct type (#30840) 2024-02-22 13:01:49 +08:00
90a7f04349 [refine](pipelinex) get sink local state does not require an id. #31195 2024-02-22 13:01:49 +08:00
52b9af06fb [pipelineX](refactor) Delete subclasses inherited from Dependency (#31216) 2024-02-22 13:01:48 +08:00
b66583551c [fix](group_commit)Fix bound checking problem when reading wal block (#31112) 2024-02-22 13:01:48 +08:00
f2a38e6345 [chore](columns) remove update_hashes_with_value for SipHash (#31224) 2024-02-22 13:01:48 +08:00
c56cb0ac3e [Exec](RF) Support merge remote rf local first (#31067) 2024-02-22 13:01:48 +08:00
9c096710fa [Enhancement](group commit) Add bvar and log for group commit (#31017) 2024-02-22 13:01:32 +08:00
e183e9ac46 [fix](stream-load) print stream load profile for pipeline and pipelinex (#31198) 2024-02-22 13:01:32 +08:00
49dd411f87 [fix](datetime) fix datetime round on BE (#31205)
with tmp as (
    select CONCAT(
        YEAR('2024-02-06 03:37:07.157'), '-',
        LPAD(MONTH('2024-02-06 03:37:07.157'), 2, '0'), '-',
        LPAD(DAY('2024-02-06 03:37:07.157'), 2, '0'), ' ',
        LPAD(HOUR('2024-02-06 03:37:07.157'), 2, '0'), ':',
        LPAD(MINUTE('2024-02-06 03:37:07.157'), 2, '0'), ':',
        LPAD(SECOND('2024-02-06 03:37:07.157'), 2, '0'), '.', "123456789")
    AS generated_string)
select generated_string, cast(generated_string as DateTime(6)) from tmp
Before (incorrect rounding):

+-------------------------------+-----------------------------------------+
| generated_string              | cast(generated_string as DATETIMEV2(6)) |
+-------------------------------+-----------------------------------------+
| 2024-02-06 03:37:07.123456789 | 2024-02-06 03:37:07.123456              |
+-------------------------------+-----------------------------------------+
After (rounds up, consistent with MySQL):

+-------------------------------+-----------------------------------------+
| generated_string              | cast(generated_string as DATETIMEV2(6)) |
+-------------------------------+-----------------------------------------+
| 2024-02-06 03:37:07.123456789 | 2024-02-06 03:37:07.123457              |
+-------------------------------+-----------------------------------------+
1 row in set (0.03 sec)
Same change as #30744, but implemented on BE.
2024-02-21 19:18:45 +08:00
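The rounding behavior described in the commit above can be illustrated with a small Python sketch. This is not the BE implementation; the function name `cast_to_datetime` and its `scale` parameter are hypothetical, and the sketch assumes the fraction contains only digits and that digits beyond the target scale are rounded half up, as the "after" output suggests.

```python
from datetime import datetime, timedelta


def cast_to_datetime(text: str, scale: int = 6) -> datetime:
    """Parse 'YYYY-MM-DD HH:MM:SS[.fraction]' and round the fraction to `scale` digits.

    Hypothetical helper for illustration only; assumes half-up rounding.
    """
    if not 0 <= scale <= 6:
        raise ValueError("scale must be between 0 and 6")
    base, _, frac = text.partition(".")
    value = datetime.strptime(base, "%Y-%m-%d %H:%M:%S")
    if not frac:
        return value
    # Digits that survive the cast, padded to exactly `scale` digits.
    kept = frac[:scale].ljust(scale, "0")
    micros = int(kept) * 10 ** (6 - scale) if scale else 0
    # Round up when the first dropped digit is 5 or greater.
    if len(frac) > scale and frac[scale] >= "5":
        micros += 10 ** (6 - scale)
    # timedelta carries an overflow into seconds, minutes, and so on.
    return value + timedelta(microseconds=micros)


print(cast_to_datetime("2024-02-06 03:37:07.123456789"))
```

Adding the fraction through `timedelta` means a value such as `.9999995` carries into the next second instead of overflowing the microsecond field.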
8d889e434b [fix](topn) Fix key topn block reverse is missed in some cases (#31199)
* move reverse block row order operation after _next_batch_internal

* add testcase
2024-02-21 19:18:45 +08:00
8fc9d80479 [compatibility](MySQL) update charset to utf8mb4, collation to utf8mb4_0900_bin (#31046)
Doris behaves more like utf8mb4 with utf8mb4_0900_bin than like utf8 with utf8_general_ci.
2024-02-21 17:01:39 +08:00
1abe9d4384 [fix](memory) Fix LRU cache stale sweep (#31122)
Remove LRUCacheValueBase, move last_visit_time into LRUHandle, and update last_visit_time automatically on cache insert and lookup.

Do not rely on external modification of last_visit_time, which is often forgotten.
2024-02-21 17:01:29 +08:00
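The idea in the commit above, keeping the visit timestamp inside the cache and refreshing it on every insert and lookup, can be sketched in Python as follows. This is a toy model, not the Doris LRU cache; the class `LRUCache`, the method `prune_stale`, and the injectable `clock` parameter are all hypothetical names chosen for illustration.

```python
import time
from collections import OrderedDict


class LRUCache:
    """Toy LRU cache that owns each entry's last_visit_time (illustrative only)."""

    def __init__(self, capacity, clock=time.monotonic):
        self._capacity = capacity
        self._clock = clock
        self._entries = OrderedDict()  # key -> [value, last_visit_time]

    def insert(self, key, value):
        # The cache stamps the entry itself; callers never touch the timestamp.
        self._entries[key] = [value, self._clock()]
        self._entries.move_to_end(key)
        while len(self._entries) > self._capacity:
            self._entries.popitem(last=False)  # evict the least recently used

    def lookup(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        entry[1] = self._clock()  # refresh on every hit
        self._entries.move_to_end(key)
        return entry[0]

    def prune_stale(self, max_idle):
        """Drop entries not visited within max_idle seconds; return the count."""
        now = self._clock()
        stale = [k for k, (_, seen) in self._entries.items() if now - seen > max_idle]
        for key in stale:
            del self._entries[key]
        return len(stale)
```

Because the timestamp is written only inside `insert` and `lookup`, a stale sweep cannot be defeated by a caller that forgets to update it.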
a8d8c6a271 [fix](file-writer) opt s3 file writer and fix empty file related issue #28983 #30703 #31169 (#31213)
* (feature)(cloud) Use dynamic allocator instead of static buffer pool for better elasticity. (#28983)

* [fix](outfile) Fix unable to export empty data (#30703)

Issue Number: close #30600
Fix the inability to export empty data to HDFS/S3. This behavior is inconsistent with version 1.2.7,
which can export empty data to HDFS/S3 and leaves the exported files there.

* [fix](file-writer) avoid empty file for segment writer (#31169)

---------

Co-authored-by: AlexYue <yj976240184@gmail.com>
Co-authored-by: zxealous <zhouchangyue@baidu.com>
2024-02-21 16:48:54 +08:00
f121ee907f [fix](invert index) fix the inaccurate rows_inverted_index_filtered in the profile (#31158) 2024-02-21 13:53:39 +08:00
278b232e76 [Bug](json reader) object should stop processing when encounter error (#31159)
If DATA_QUALITY_ERROR is encountered, we should stop processing this document. Otherwise there will be UB in simdjson.
2024-02-21 13:53:32 +08:00
8b6d6d0165 [pipelineX](refactor) refactor streaming agg structure (#31151) 2024-02-21 13:53:32 +08:00
1e3968fe7e [fix](group_commit) Need to wait wal to be deleted when creating MaterializedView (#30956) 2024-02-21 13:53:19 +08:00
69c7750db2 [Improvement](executor) Create workload thread pool without cgroup #31170 2024-02-21 13:53:18 +08:00
872e1f6687 [fix](backup) fix concurrent upload and release snapshot crash (#31144)
In upload implementation, filesystem::size throws an exception
if the specified file, which is removed by the release snapshot
task, does not exist.
2024-02-21 13:53:18 +08:00
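The race described in the commit above (a size query on a file that another task has just removed) has a close analog in Python, where `os.path.getsize` raises `FileNotFoundError` for a missing path. The sketch below only illustrates the general pattern of tolerating the missing file; the helper name `size_if_present` is hypothetical, and the actual fix in the BE may be structured differently.

```python
import os
from typing import Optional


def size_if_present(path: str) -> Optional[int]:
    """Return the size of `path` in bytes, or None if it no longer exists.

    Checking for existence first and then asking for the size would still
    race with a concurrent delete, so the error is handled at the call itself.
    """
    try:
        return os.path.getsize(path)
    except FileNotFoundError:
        return None
```

A caller can then skip the upload of a vanished file instead of crashing on an unhandled exception.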
99cdbc7e12 [fix](memory) Fix jemalloc lib name for thirdparty arrow (#28843)
after #28429
2024-02-21 13:52:10 +08:00
97c9d75af3 [Feature](executor)Add scan_thread_num property for workload group (#31106) 2024-02-20 16:24:05 +08:00
Pxl
493385c2c7 [Bug](AggState) fix not match function when agg combinator function has alias (#31101) 2024-02-20 16:23:53 +08:00
6778b4ed93 [pipelineX](refactor) make non-virtual function in Dependency (#31109)
* [pipelineX](refactor) make non-virtual function in Dependency

* update
2024-02-20 09:18:33 +08:00
7ca3be6d51 [fix](parquet) return error if schema changed in complex types (#31128)
Check the column types of complex types to prevent a core dump in BE. ColumnReader hits a segmentation fault in the following case, where a complex type is changed in Hive:

hive> create table struct_test(
           id int,
           sf struct<f1: int, f2: map<string, string>>) stored as parquet;

hive> insert into struct_test values
          (1, named_struct('f1', 1, 'f2', str_to_map('1:s2,2:s2'))),
          (2, named_struct('f1', 2, 'f2', str_to_map('k1:s3,k2:s4'))),
          (3, named_struct('f1', 3, 'f2', str_to_map('k1:s5,k2:s6')));

hive> alter table struct_test change sf sf struct<f1:int, f2: string>;
2024-02-20 09:12:38 +08:00
7a1bd6abb0 [improvement](group_commit) Refactor scan wal function (#30939)
Co-authored-by: Yongqiang YANG <98214048+dataroaring@users.noreply.github.com>
2024-02-20 09:12:38 +08:00
066d674358 [Fix](inverted index) fix inverted index read data opt not work on MOW (#31075) 2024-02-20 09:12:38 +08:00
9a708806e0 [fix](segcompaction) enable segcompaction by default (#30810) 2024-02-19 19:04:22 +08:00
7607bfc78d [bugfix](performance) fix performance problem (#31093) 2024-02-19 17:48:29 +08:00
8a3e6644d4 [fix](udf) fix java-udf coredump as get env return nullptr (#30986) 2024-02-19 17:23:24 +08:00
2f9bd3e3bb (enhance)(S3) Change s3 metric from bvar adder to latency recorder (#28861) 2024-02-19 17:22:03 +08:00
Pxl
bb4575a392 [Improvement](join) optimization for build_side_output_column (#30826)
2024-02-19 17:22:03 +08:00
2f960c49f5 [Fix](executor)Fix query runtime statistics report failed #31064 2024-02-19 17:20:21 +08:00
5ea46f210c [pipelineX](bug) Fix use-after-free when BE exits (#31042) 2024-02-18 14:45:25 +08:00
870a9342b7 [fix](function) fix extract_url_parameter's bug when getting the last key (#30929)
2024-02-18 14:45:25 +08:00
6cf7468073 [enhancement](function) change some function nullable mode (#30991)
2024-02-18 14:45:25 +08:00
68102fd531 [Fix](auto-partition) fix a concurrent bug of extremely long values (#31005) 2024-02-18 14:45:25 +08:00
a3c78dd21a [chore](refactor) refactor some rf code and delete rpc file (#31031)
---------

Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-02-18 11:50:17 +08:00
d70776af55 [feature](agg-func) support covar and covar_samp function (#30983) 2024-02-18 11:50:17 +08:00
7b79b77cc9 [Optimize](Variant) make tablet schema more well-organized (#99) (#30922) 2024-02-18 11:50:17 +08:00
b0d2ecbf52 [Improve](Tablet Schema) Use deterministic way to serialize protobuf (#101) (#30906) 2024-02-18 11:50:17 +08:00
b5012dc55a [Enhancement](group commit) optimize pre allocated calculation (#30893) 2024-02-18 11:50:17 +08:00
fc53c7210b [fix](chmod) change chmod to filesystem::permission to avoid race condition (#31032) 2024-02-18 11:50:16 +08:00
45b4189bb6 [Refactor](opt) Opt rf and remove useless code (#30900)
2024-02-18 11:50:16 +08:00
acdc9575ad [fix](function) incorrect result of 'equal_for_null' (#30990) 2024-02-18 11:50:16 +08:00