Commit Graph

234 Commits

Author SHA1 Message Date
0b9b1be1f1 [fix](function) Fix from_second functions overflow and wrong result (#28685) 2023-12-22 10:22:49 +08:00
e8d0569d8b [refine](pipelineX)Make the 'set ready' logic of SenderQueue in pipelineX the same as that in the pipeline (#28488) 2023-12-20 19:26:00 +08:00
c00dca70e6 [pipelineX](local shuffle) Support parallel execution despite of tablet number (#28266) 2023-12-14 12:53:54 +08:00
78b0fec33a [Fix](Outfile) Support export nested complex type data to orc file format (#28182) 2023-12-13 11:55:27 +08:00
ea275e687a [pipelineX](minor) remove unused code (#28016) 2023-12-05 19:41:40 +08:00
10483ea12c [fix](profile) fix error set with peak_memory_usage in pipeline #27749 2023-12-02 14:12:38 +08:00
Pxl
d969047b50 [Refactor](join) refactor of hash join (#27557)
Improve the performance under the tpch data set by reconstructing the join related code and the use of hash table

Co-authored-by: HappenLee <happenlee@hotmail.com>
Co-authored-by: BiteTheDDDDt <pxl290@qq.com>
2023-11-28 19:46:00 +08:00
f565f60bc3 [refactor](standard)BE:Initialize pointer variables in the class to nullptr by default (#27587) 2023-11-28 13:02:30 +08:00
ea7eca9345 [pipelineX](bug) Add some logs (#27596) 2023-11-28 10:02:13 +08:00
baadc14e60 [Enhancement](function) support unix_timestamp with float (#26827)
---------

Co-authored-by: YangWithU <plzw8@outlook.com>
2023-11-27 09:58:53 +08:00
1b3512d942 [pipelineX](bug) Fix cancel timeout (#27396) 2023-11-22 22:31:34 +08:00
5442e8d1fc [pipelineX](dependency) split different dependencies (#27366) 2023-11-22 12:50:39 +08:00
b1eef30b49 [pipelineX](dependency) Wake up task by dependencies (#26879)
---------

Co-authored-by: Mryange <2319153948@qq.com>
2023-11-18 03:20:24 +08:00
d988193d39 [pipelineX](shuffle) block exchange sink by memory usage (#26595) 2023-11-09 21:28:22 +08:00
5d80e7dc2f [Improvement](pipelineX) Improve local exchange on pipelineX engine (#26464) 2023-11-07 22:11:44 +08:00
99b45e1938 [fix](Outfile) Export DateTimev2 type of doris to ORC's TimeStamp type (#25470)
Previously,doris's `DateTimev2` was exported to orc as a `String` type.
Now, export doris's `DateTimev2` to orc timestamp type.
2023-10-29 15:59:38 +08:00
c1d64a7128 [Feature](datatype) Add IPv4/v6 data type for doris (#24965) 2023-10-26 17:33:28 +08:00
d6c64d305f [chore](log) Add log to trace query execution #25739 2023-10-26 14:09:25 +08:00
31d2a9a4f5 [Enhancement](function) support fractions for convert_tz(datetimev2) (#25915)
mysql> select convert_tz('2019-08-01 01:01:02.123' , '+00:00', '+07:00');
+----------------------------------------------------------------------------------+
| convert_tz(cast('2019-08-01 01:01:02.123' as DATETIMEV2(3)), '+00:00', '+07:00') |
+----------------------------------------------------------------------------------+
| 2019-08-01 08:01:02.123                                                          |
+----------------------------------------------------------------------------------+
1 row in set (0.18 sec)
2023-10-26 10:46:47 +08:00
7f66be84d5 [fix](Outfile) Infer the column name if the column is expression in select into outfile (#25854)
This pr do two things:
1. Infer the column name if the column is expression in `select into outfile`. The rule for column name generation can be refered in pr: #24990 
2. fix bug that it will core dump if the `_schema` fails to build in the open phase in vorc_transformer.cpp


TODO:
1. Support infer the column name if the column is expression in `select into outfile` in new optimizer(Nereids).
2023-10-25 22:49:04 +08:00
e8f479882d [pipelineX](local exchange) Add local exchange operator (#25846) 2023-10-25 18:45:02 +08:00
6b2eed779c [feature](AuditLog) add scanRows scanBytes in auditlog (#25435) 2023-10-25 10:00:35 +08:00
08832d9f3a [Fix](exec) Fix date dict dead loop. (#25570) 2023-10-24 02:51:43 +08:00
Pxl
642c149e6a remove datetime_value and move vecdatetime_value to doris namespace (#25695)
remove datetime_value and move vecdatetime_value to doris namespace
2023-10-20 22:08:17 +08:00
dc47087560 [fix](function) fix str_to_date default return type scale for nereids (#24932)
fix str_to_date default return type scale for nereids
2023-10-20 12:55:49 +08:00
b964ab76b3 [refactor](shuffle) Simplify hash partitioning strategy (#25596) 2023-10-19 19:28:22 +08:00
e77b98be88 [fix](months_diff) fix wrong result of months_diff (#25577) 2023-10-19 14:29:47 +08:00
08f305dd79 [chore](build) Fix compilation errors reported by GCC-13 (#25439)
1. Fix lots of compilation errors reported by GCC-13.
2. Fix the workflow BE UT (macOS).
2023-10-15 07:57:36 -05:00
37dbda6209 [pipelineX](refactor) Use class template to simplify join (#25369) 2023-10-13 16:51:55 +08:00
6f9a084d99 [Fix](Outfile) Use data_type_serde to export data to parquet file format (#24998) 2023-10-13 13:58:34 +08:00
f960b8c989 [bugfix](stream receiver) be will core during stop because receiver is not closed (#25298)
Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-10-11 19:49:40 +08:00
b58010c48e [fix](export) BufferWritable must be committed before deconstruct (#25185)
F20231009 16:03:47.659968 3342535 string_buffer.hpp:48] Check failed: _now_offset == 0
*** Check failure stack trace: ***
@ 0x561a6f8e21e6 google::LogMessage::SendToLog()
@ 0x561a6f8de7b0 google::LogMessage::Flush()
@ 0x561a6f8e2a29 google::LogMessageFatal::~LogMessageFatal()
@ 0x561a4a409233 doris::vectorized::BufferWritable::~BufferWritable()
@ 0x561a6e202853 doris::vectorized::VCSVTransformer::write()
@ 0x561a6e1f19ba doris::vectorized::VFileResultWriter::_write_file()
@ 0x561a6e1f1522 doris::vectorized::VFileResultWriter::append_block()
@ 0x561a6e121bed

The error will occur in DEBUG mode, and doing export will invalid data.
It has been covered by baidu case.
2023-10-09 22:39:45 +08:00
0df32c8e3e [Fix](Outfile) Use data_type_serde to export data to csv file format (#24721)
Modify the outfile logic, use the data type serde framework.
2023-10-07 22:50:44 +08:00
93eedaff62 [opt](function) Use Dict to opt the function of time_round (#25029)
Before:

select hour_floor(`@timestamp`, 7) as t, count() as cnt from httplogs_date group by t order by t limit 10;
+---------------------+--------+
| t                   | cnt    |
+---------------------+--------+
| 1998-04-30 21:00:00 |    324 |
| 1998-05-01 04:00:00 | 286156 |
| 1998-05-01 11:00:00 | 266130 |
| 1998-05-01 18:00:00 | 483765 |
| 1998-05-02 01:00:00 | 276706 |
| 1998-05-02 08:00:00 | 169945 |
| 1998-05-02 15:00:00 | 223593 |
| 1998-05-02 22:00:00 | 272616 |
| 1998-05-03 05:00:00 | 188689 |
| 1998-05-03 12:00:00 | 184405 |
+---------------------+--------+
10 rows in set (3.39 sec)
after:

select hour_floor(`@timestamp`, 7) as t, count() as cnt from httplogs_date group by t order by t limit 10;
+---------------------+--------+
| t                   | cnt    |
+---------------------+--------+
| 1998-04-30 21:00:00 |    324 |
| 1998-05-01 04:00:00 | 286156 |
| 1998-05-01 11:00:00 | 266130 |
| 1998-05-01 18:00:00 | 483765 |
| 1998-05-02 01:00:00 | 276706 |
| 1998-05-02 08:00:00 | 169945 |
| 1998-05-02 15:00:00 | 223593 |
| 1998-05-02 22:00:00 | 272616 |
| 1998-05-03 05:00:00 | 188689 |
| 1998-05-03 12:00:00 | 184405 |
+---------------------+--------+
10 rows in set (2.19 sec)
2023-10-04 23:34:24 +08:00
f8a3034dca [Opt](performance) refactor and opt time round floor function (#25026)
refactor and opt time round floor function
2023-10-01 11:51:26 +08:00
642e5cdb69 [Fix](Status) Make Status [[nodiscard]] and handle returned Status correctly (#23395) 2023-09-29 22:38:52 +08:00
864a0f9bcb [opt](pipeline) Make pipeline fragment context send_report asynchronized (#23142) 2023-09-28 17:55:53 +08:00
aa4dbbedc7 [pipelineX](bug) Fix dead lock in exchange sink operator (#24947) 2023-09-27 15:40:25 +08:00
28869b0f82 [fix](Outfile) Use data_type_serde to export data to orc file format (#24812) 2023-09-26 19:46:42 +08:00
a3427cb822 [pipelineX](fix) Fix nested loop join operator (#24885) 2023-09-26 13:27:34 +08:00
a48b19ceb6 [feature](Outfile) select into outfile supports to export struct/map/array type data to orc file format (#24350)
We do not support nested complex type in this pr.
2023-09-21 20:15:18 +08:00
85fb46bb71 [refactor](cache) Refactor preloaded timezone global cache (#24694)
Refactor preloaded timezone global cache
2023-09-21 17:26:41 +08:00
e59aa49f28 [feature](datetime-func)support milliseconds_add/sub/diff and microseconds_diff (#24114) 2023-09-20 10:38:56 +08:00
e54c4ef258 [pipelineX](dependency) refactor write dependency (#24555) 2023-09-19 18:01:42 +08:00
d24f3efd4a [pipelineX](profile) Phase 1: refactor pipelineX detailed profile (#24322) 2023-09-15 16:14:05 +08:00
c5e7f55b63 [performance](executor) optimize time_round function (#23058)
optimize time_round function
2023-09-15 10:49:22 +08:00
4fbb25bc55 [Enhancement](function) Support date_trunc(date) and use it in auto partition (#24341)
Support date_trunc(date) and use it in auto partition
2023-09-14 16:53:09 +08:00
0896aefce3 [fix](local exchange) fix bug of accesssing released counter of local data stream receiver (#24148) 2023-09-11 09:52:31 +08:00
e140938d81 [Perfomance][export] Opt the export of CSV tranformer (#24003) 2023-09-08 20:26:54 +08:00
1a8913f8f4 [fix](shared hash table) fix p0 test failure (#23907) 2023-09-05 14:48:46 +08:00