Commit Graph

7205 Commits

Author SHA1 Message Date
49dd411f87 [fix](datetime) fix datetime round on BE (#31205)
with tmp as (
            select CONCAT(
                YEAR('2024-02-06 03:37:07.157'), '-', 
                LPAD(MONTH('2024-02-06 03:37:07.157'), 2, '0'), '-',
                LPAD(DAY('2024-02-06 03:37:07.157'), 2, '0'), ' ',
                LPAD(HOUR('2024-02-06 03:37:07.157'), 2, '0'), ':',
                LPAD(MINUTE('2024-02-06 03:37:07.157'), 2, '0'), ':',
                LPAD(SECOND('2024-02-06 03:37:07.157'), 2, '0'), '.', "123456789" )
            AS generated_string)
            select generated_string, cast(generated_string as DateTime(6)) from tmp
before (incorrect rounding; the extra digits are truncated):

+-------------------------------+-----------------------------------------+
| generated_string              | cast(generated_string as DATETIMEV2(6)) |
+-------------------------------+-----------------------------------------+
| 2024-02-06 03:37:07.123456789 | 2024-02-06 03:37:07.123456              |
+-------------------------------+-----------------------------------------+
after (rounds up, consistent with MySQL):

+-------------------------------+-----------------------------------------+
| generated_string              | cast(generated_string as DATETIMEV2(6)) |
+-------------------------------+-----------------------------------------+
| 2024-02-06 03:37:07.123456789 | 2024-02-06 03:37:07.123457              |
+-------------------------------+-----------------------------------------+
1 row in set (0.03 sec)
Same fix as #30744, but implemented on BE.
2024-02-21 19:18:45 +08:00
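The before/after tables in the datetime commit above differ only in how nine fractional digits are reduced to six. A minimal Python sketch of that step, assuming round-half-up on a digit string; `round_fraction` is a hypothetical helper for illustration, not the Doris BE code:

```python
def round_fraction(digits: str, scale: int = 6) -> tuple[str, bool]:
    """Round a fractional-second digit string to `scale` digits (half up).

    Returns the rounded digits plus a flag that tells the caller whether
    the rounding overflowed into the whole-second part.
    Hypothetical helper, not taken from Doris.
    """
    if len(digits) <= scale:
        # Nothing to drop: pad with zeros on the right.
        return digits.ljust(scale, "0"), False
    kept = int(digits[:scale]) if scale > 0 else 0
    if digits[scale] >= "5":
        # The first dropped digit decides the rounding direction.
        kept += 1
    carry = kept >= 10 ** scale
    if carry:
        kept -= 10 ** scale
    return (str(kept).zfill(scale) if scale > 0 else ""), carry


print(round_fraction("123456789"))  # fraction from the example query
```

The carry flag matters for inputs such as `.9999995`, where rounding has to propagate into the seconds field.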
8d889e434b [fix](topn) Fix key topn block reverse is missed in some cases (#31199)
* move reverse block row order operation after _next_batch_internal

* add testcase
2024-02-21 19:18:45 +08:00
8fc9d80479 [compatibility](MySQL) update charset to utf8mb4, collation to utf8mb4_0900_bin (#31046)
Doris's behaviour is more like utf8mb4 and utf8mb4_0900_bin than utf8 and utf8_general_ci
2024-02-21 17:01:39 +08:00
1abe9d4384 [fix](memory) Fix LRU cache stale sweep (#31122)
Remove LRUCacheValueBase, put last_visit_time into LRUHandle, and automatically update timestamp to last_visit_time during cache insert and lookup.

Do not rely on external modification of last_visit_time, which is often forgotten.
2024-02-21 17:01:29 +08:00
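The LRU cache commit above moves the timestamp into the cache's own handle so that callers cannot forget to update it. A rough Python sketch of that idea, assuming a simple ordered-dict cache; the class and method names are hypothetical and do not mirror the Doris `LRUHandle` implementation:

```python
import time
from collections import OrderedDict


class LRUCache:
    """Toy LRU cache that owns each entry's last_visit_time (illustrative only)."""

    def __init__(self, capacity, clock=time.monotonic):
        self._capacity = capacity
        self._clock = clock
        self._entries = OrderedDict()  # key -> [value, last_visit_time]

    def insert(self, key, value):
        # The cache stamps the entry itself; callers never touch the timestamp.
        self._entries[key] = [value, self._clock()]
        self._entries.move_to_end(key)
        while len(self._entries) > self._capacity:
            self._entries.popitem(last=False)  # evict least recently used

    def lookup(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        entry[1] = self._clock()  # refreshed automatically on every hit
        self._entries.move_to_end(key)
        return entry[0]

    def prune_stale(self, max_age):
        """Drop entries not visited within max_age; return how many were dropped."""
        now = self._clock()
        stale = [k for k, (_, visited) in self._entries.items() if now - visited > max_age]
        for k in stale:
            del self._entries[k]
        return len(stale)
```

Because the timestamp is refreshed inside `insert` and `lookup`, a stale sweep sees accurate ages without relying on external bookkeeping.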
a8d8c6a271 [fix](file-writer) opt s3 file writer and fix empty file related issue #28983 #30703 #31169 (#31213)
* (feature)(cloud) Use dynamic allocator instead of static buffer pool for better elasticity. (#28983)

* [fix](outfile) Fix unable to export empty data (#30703)

Issue Number: close #30600
Fix being unable to export empty data to HDFS/S3. This behavior is inconsistent with version 1.2.7,
which can export empty data to HDFS/S3 and leaves the exported files there.

* [fix](file-writer) avoid empty file for segment writer (#31169)

---------

Co-authored-by: AlexYue <yj976240184@gmail.com>
Co-authored-by: zxealous <zhouchangyue@baidu.com>
2024-02-21 16:48:54 +08:00
f121ee907f [fix](invert index) fix the inaccurate rows_inverted_index_filtered in the profile (#31158) 2024-02-21 13:53:39 +08:00
278b232e76 [Bug](json reader) object should stop processing when encounter error (#31159)
If a DATA_QUALITY_ERROR is encountered, we should stop processing this document. Otherwise there will be UB in simdjson.
2024-02-21 13:53:32 +08:00
8b6d6d0165 [pipelineX](refactor) refactor streaming agg structure (#31151) 2024-02-21 13:53:32 +08:00
1e3968fe7e [fix](group_commit) Need to wait wal to be deleted when creating MaterializedView (#30956) 2024-02-21 13:53:19 +08:00
69c7750db2 [Improvement](executor) Create workload thread pool without cgroup #31170 2024-02-21 13:53:18 +08:00
872e1f6687 [fix](backup) fix concurrent upload and release snapshot crash (#31144)
In upload implementation, filesystem::size throws an exception
if the specified file, which is removed by the release snapshot
task, does not exist.
2024-02-21 13:53:18 +08:00
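The backup commit above describes a size query that throws when a concurrent task has already deleted the file. The same race exists in most languages; a small Python sketch of one way to tolerate it, assuming that "file is gone" should be reported rather than crash the upload (`file_size_or_none` is a hypothetical helper, and this is not necessarily how the Doris fix works):

```python
import os


def file_size_or_none(path):
    """Return the file size, or None if the file was removed concurrently."""
    try:
        # Checking os.path.exists() first would still race with the deleter,
        # so ask for the size directly and handle the failure.
        return os.path.getsize(path)
    except FileNotFoundError:
        return None
```

The caller can then treat `None` as "snapshot already released" and stop the upload cleanly.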
99cdbc7e12 [fix](memory) Fix jemalloc lib name for thirdparty arrow (#28843)
after #28429
2024-02-21 13:52:10 +08:00
97c9d75af3 [Feature](executor)Add scan_thread_num property for workload group (#31106) 2024-02-20 16:24:05 +08:00
Pxl
493385c2c7 [Bug](AggState) fix not match function when agg combinator function has alias (#31101) 2024-02-20 16:23:53 +08:00
6778b4ed93 [pipelineX](refactor) make non-virtual function in Dependency (#31109)
* [pipelineX](refactor) make non-virtual function in Dependency

* update
2024-02-20 09:18:33 +08:00
7ca3be6d51 [fix](parquet) return error if schema changed in complex types (#31128)
Check the column types of complex types to prevent a core dump in BE. ColumnReader hits a segmentation fault in the following case,
where a complex type is changed in Hive after data has been written:

hive> create table struct_test(
           id int,
           sf struct<f1: int, f2: map<string, string>>) stored as parquet;

hive> insert into struct_test values
          (1, named_struct('f1', 1, 'f2', str_to_map('1:s2,2:s2'))),
          (2, named_struct('f1', 2, 'f2', str_to_map('k1:s3,k2:s4'))),
          (3, named_struct('f1', 3, 'f2', str_to_map('k1:s5,k2:s6')));

hive> alter table struct_test change sf sf struct<f1:int, f2: string>;
2024-02-20 09:12:38 +08:00
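The parquet commit above returns an error when the table schema and the file schema disagree inside a complex type. A minimal Python sketch of such a recursive check, assuming primitive types are strings and complex types are `(kind, {child_name: type})` tuples; this representation and the function names are hypothetical, not the Doris data structures:

```python
def kind_of(t):
    """Name of a type: the string itself for primitives, else the complex kind."""
    return t if isinstance(t, str) else t[0]


def find_type_mismatch(table_type, file_type, path):
    """Return an error message for the first mismatch, or None if compatible."""
    mismatch = f"{path}: table expects {kind_of(table_type)}, file has {kind_of(file_type)}"
    if isinstance(table_type, str) or isinstance(file_type, str):
        # At least one side is primitive: they must be identical.
        return None if table_type == file_type else mismatch
    if table_type[0] != file_type[0]:
        return mismatch
    file_children = file_type[1]
    for name, child in table_type[1].items():
        if name in file_children:
            error = find_type_mismatch(child, file_children[name], f"{path}.{name}")
            if error is not None:
                return error
    return None


# Schemas from the Hive example: the file was written before the ALTER TABLE.
file_schema = ("struct", {"f1": "int", "f2": ("map", {"key": "string", "value": "string"})})
table_schema = ("struct", {"f1": "int", "f2": "string"})
print(find_type_mismatch(table_schema, file_schema, "sf"))
```

Reporting the path of the first mismatch gives the user an actionable error instead of a crash in the reader.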
7a1bd6abb0 [improvement](group_commit) Refactor scan wal function (#30939)
Co-authored-by: Yongqiang YANG <98214048+dataroaring@users.noreply.github.com>
2024-02-20 09:12:38 +08:00
066d674358 [Fix](inverted index) fix inverted index read data opt not work on MOW (#31075) 2024-02-20 09:12:38 +08:00
9a708806e0 [fix](segcompaction) enable segcompaction by default (#30810) 2024-02-19 19:04:22 +08:00
7607bfc78d [bugfix](performance) fix performance problem (#31093) 2024-02-19 17:48:29 +08:00
8a3e6644d4 [fix](udf) fix java-udf coredump as get env return nullptr (#30986) 2024-02-19 17:23:24 +08:00
2f9bd3e3bb (enhance)(S3) Change s3 metric from bvar adder to latency recorder (#28861) 2024-02-19 17:22:03 +08:00
Pxl
bb4575a392 [Improvement](join) optimization for build_side_output_column (#30826)
optimization for build_side_output_column
2024-02-19 17:22:03 +08:00
2f960c49f5 [Fix](executor)Fix query runtime statistics report failed #31064 2024-02-19 17:20:21 +08:00
5ea46f210c [pipelineX](bug) Fix use-after-free when BE exits (#31042) 2024-02-18 14:45:25 +08:00
870a9342b7 [fix](function) fix extract_url_parameter's bug when getting the last key (#30929)
fix extract_url_parameter's bug when getting the last key
2024-02-18 14:45:25 +08:00
6cf7468073 [enhancement](function) change some function nullable mode (#30991)
change some function nullable mode
2024-02-18 14:45:25 +08:00
68102fd531 [Fix](auto-partition) fix a concurrent bug of extremely long values (#31005) 2024-02-18 14:45:25 +08:00
a3c78dd21a [chore](refactor) refactor some rf code and delete rpc file (#31031)
---------

Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-02-18 11:50:17 +08:00
d70776af55 [feature](agg-func) support covar and covar_samp function (#30983) 2024-02-18 11:50:17 +08:00
7b79b77cc9 [Optimize](Variant) make tablet schema more well-organized (#99) (#30922) 2024-02-18 11:50:17 +08:00
b0d2ecbf52 [Improve](Tablet Schema) Use deterministic way to serialize protobuf (#101) (#30906) 2024-02-18 11:50:17 +08:00
b5012dc55a [Enhancement](group commit) optimize pre allocated calculation (#30893) 2024-02-18 11:50:17 +08:00
fc53c7210b [fix](chmod) change chmod to filesystem::permission to avoid race condition (#31032) 2024-02-18 11:50:16 +08:00
45b4189bb6 [Refactor](opt) Opt rf and remove useless code (#30900)
Opt rf and remove useless code
2024-02-18 11:50:16 +08:00
acdc9575ad [fix](function) incorrect result of 'equal_for_null' (#30990) 2024-02-18 11:50:16 +08:00
0d4b8386a2 [bugfix][be][cppcheck] Possible NULL pointer access (#31025) (#31026) 2024-02-16 10:16:40 +08:00
e68019c10a [Function](Exec) Support window function cume_dist (#30997) 2024-02-16 10:16:40 +08:00
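For reference, `cume_dist` in standard SQL is the number of rows with a value less than or equal to the current row's value, divided by the partition size. A small Python sketch of that definition over a single partition (an illustration of the formula, not the Doris implementation):

```python
from bisect import bisect_right


def cume_dist(values):
    """cume_dist for each value, treating the whole list as one partition."""
    ordered = sorted(values)
    n = len(ordered)
    # bisect_right counts the rows whose value is <= v, so ties share a result.
    return [bisect_right(ordered, v) / n for v in values]


print(cume_dist([10, 20, 20, 30]))
```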
f65844fae4 [Enhancement](Outfile/Export) Export data to csv file format with BOM (#30533)
UTF-8 files on Windows systems typically carry a BOM.

We add a new user property to `Outfile/Export`. With it, users exporting Doris data can choose whether to include a BOM at the beginning of the CSV file.

**Usage:**
```sql
-- outfile:
select * from demo.student
into outfile "file:///xxx/export/exp_"
format as csv
properties(
    "column_separator" = ",",
    "with_bom" = "true"
);

-- Export:
EXPORT TABLE student TO "file:///xx/tmpdata/export/exp_"
PROPERTIES(
    "format" = "csv",
    "with_bom" = "true"
);
```
2024-02-16 10:16:40 +08:00
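The `with_bom` property in the commit above controls whether the exported CSV starts with a UTF-8 byte order mark. A short Python sketch of what that means at the byte level; `export_csv` is a hypothetical stand-in whose parameter names merely echo the properties in the commit message:

```python
import csv
import io

UTF8_BOM = b"\xef\xbb\xbf"  # the three bytes of the UTF-8 byte order mark


def export_csv(rows, with_bom=False, column_separator=","):
    """Serialize rows to CSV bytes, optionally prefixed with a UTF-8 BOM."""
    buf = io.StringIO(newline="")
    writer = csv.writer(buf, delimiter=column_separator, lineterminator="\n")
    writer.writerows(rows)
    payload = buf.getvalue().encode("utf-8")
    return UTF8_BOM + payload if with_bom else payload
```

A consumer that does not expect the mark can decode with Python's `utf-8-sig` codec, which strips a leading BOM when one is present.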
eaaab33f0a [Fix](Top-N opt) evicting querying rowsets in prior to correct use_count (#102) (#30904)
This addresses the scenario where a rowset cannot be removed.
2024-02-16 10:16:40 +08:00
2573150f6d [refactor](runtime filter) do not wait runtime filter rpc finished when hash node or pipeline finished (#30970)
Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-02-16 10:16:40 +08:00
7f50998406 fix compile 2024-02-16 10:12:25 +08:00
40e1326bc9 [feature](window-func) support percent_rank window function (#30926) 2024-02-16 10:12:25 +08:00
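For reference, `percent_rank` in standard SQL is `(rank - 1) / (rows - 1)`, and 0 for a single-row partition. A small Python sketch of that definition over one partition (an illustration of the formula, not the Doris implementation):

```python
from bisect import bisect_left


def percent_rank(values):
    """percent_rank for each value, treating the whole list as one partition."""
    ordered = sorted(values)
    n = len(ordered)
    if n <= 1:
        return [0.0] * n
    # bisect_left counts the rows strictly below v, which equals rank - 1.
    return [bisect_left(ordered, v) / (n - 1) for v in values]


print(percent_rank([10, 20, 20, 30]))
```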
5cfd7c2a1c [improvement](memtracker) should count memory usage to query when exchange sink buffer rpc (#30964)
* [improvement](memtracker) should count memory usage to query when rpc callback

* f

---------

Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-02-16 10:12:25 +08:00
d60ecdba6f [fix](regex) fix wrong escape of function LIKE (#30557)
fix wrong escape of function LIKE
2024-02-16 10:12:25 +08:00
ff82e2ab59 [improvement](group_commit) Add bvar to monitor the count of replaying wal fail on group commit (#30941) 2024-02-16 10:12:24 +08:00
bbbe3e666a [improvement](group_commit) Rename fail wal to tmp should only use in test P0 scenario (#30959) 2024-02-16 10:12:24 +08:00
e8f614791e [fix](pipeline) Set the flag of short circuit only when hash join sink finished (#30977) 2024-02-16 10:12:24 +08:00
1437348040 [fix](group_commit) Wal file should be removed from _wal_path_map when renaming it to tmp directory (#30974) 2024-02-16 10:12:24 +08:00
02c37b8ead opt the rf code and remove rf useless code (#30861) 2024-02-16 10:12:24 +08:00