72c20d3ccc
[branch-2.1](function) fix date_format and from_unixtime core when meet long format string ( #35883 ) ( #36158 )
...
pick #35883
2024-07-01 20:35:31 +08:00
cb80ae906f
[Bug](runtime-filter) disable sync filter when pipeline engine is off ( #36994 )
...
## Proposed changes
1. Disable the sync filter when the pipeline engine is off.
2. Reduce some warning logs.
2024-06-28 16:59:26 +08:00
c84b56140c
[Fix](outfile) Add a configuration for exporting data in Parquet format using select into outfile ( #36143 )
...
backport: #36142
2024-06-13 11:49:46 +08:00
1715bae26f
[opt](parquet-writer) Specify the row group size when writing data to Parquet files. ( #35081 ) ( #36042 )
...
bp #35081
Co-authored-by: Tiewei Fang <43782773+BePPPower@users.noreply.github.com>
2024-06-07 17:57:11 +08:00
b91d2caab8
[Feature](iceberg-writer) Implements iceberg sink basic functionality for inserting into table. ( #35587 )
...
backport #34929
2024-05-29 16:40:54 +08:00
b143f0dfe2
[Improvement](date) shortcut for str to date parse ( #35288 )
...
shortcut for str to date parse
2024-05-25 17:47:20 +08:00
a6f7747d29
[feature](datatype) add BE config to allow zero date ( #34961 )
...
Co-authored-by: Gabriel <gabrielleebuaa@gmail.com>
2024-05-23 19:12:39 +08:00
a8c24d7698
[Fix](function) fix overflow of date_add function ( #35080 )
...
fix overflow of date_add function
2024-05-22 10:02:59 +08:00
b96148c9cd
[Fix](function) fix days/weeks_diff result wrong on BE #35104
...
select days_diff('2024-01-01 00:00:00', '2023-12-31 23:59:59');
This should return 0, but BE returned 1.
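The expected result above can be illustrated with a minimal Python sketch. It assumes that `days_diff` counts whole elapsed 24-hour periods and truncates toward zero; the commit only states the expected value for this one query, so the helper below is a hypothetical illustration and not the BE implementation.

```python
from datetime import datetime

SECONDS_PER_DAY = 86400


def days_diff(end: str, start: str) -> int:
    """Hypothetical model: whole days between two datetimes, truncated toward zero."""
    fmt = "%Y-%m-%d %H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    # Work in integer seconds so no floating-point rounding is involved.
    seconds = delta.days * SECONDS_PER_DAY + delta.seconds
    sign = -1 if seconds < 0 else 1
    # Truncate toward zero: a 1-second gap is 0 whole days, not 1.
    return sign * (abs(seconds) // SECONDS_PER_DAY)


print(days_diff("2024-01-01 00:00:00", "2023-12-31 23:59:59"))
```

Under this assumption, comparing only the calendar dates (which differ by one day) would give the wrong answer, because only one second has elapsed.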
2024-05-22 10:00:26 +08:00
c7134faea9
[Fix](outfile) Fix the timing of setting the _is_closed flag in Parquet/ORC writer ( #34668 )
2024-05-15 10:28:22 +08:00
4dd5379951
[bugfix](hive)fix error for writing to hive for 2.1 ( #34518 )
...
mirror #34520
2024-05-14 23:27:29 +08:00
520774a24b
[fix](serde) fix ipv4/v6 serde functions for arrow, orc, parquet format ( #34042 )
...
This PR is based on @sjyango's work in #32326.
That PR was meant to be merged into the master branch, but it is still a draft and has not been maintained for a long time, so this new PR replaces it.
Co-authored-by: sjyango <sjyang2022@zju.edu.cn>
2024-05-10 14:37:04 +08:00
804586b342
[Improvement](sort) insert data by batch on VSortedRunMerger::get_next ( #34363 )
...
insert data by batch on VSortedRunMerger::get_next
2024-05-10 14:36:53 +08:00
a173513e27
[fix](pipelinex) exchange sink not set ready when source limit #34241
2024-04-29 20:58:50 +08:00
946d28646a
[fix](outfile)Fixed orcOutputStream.close() throwing an exception during destruction causing the program to hang. ( #34254 )
...
bp #34243
2024-04-28 19:54:34 +08:00
30a68c1240
[fix](spill) use different algorithm to avoid partition data skew ( #34162 )
2024-04-27 11:20:36 +08:00
60e20a3afe
[fix](pipeline_x) Crc32HashPartitioner should use ShuffleChannelIds ( #34147 )
2024-04-26 15:03:11 +08:00
25358564ca
[Fix](compile) Fix gcc compile on master ( #33864 )
...
This was introduced by #33511, which wrongly used
ColumnStr<T> ();
This violates the C++20 standard (see https://wg21.cmeerw.net/cwg/issue2237 ) but is still accepted by clang so far (see llvm/llvm-project#58112).
2024-04-19 23:41:37 +08:00
657a29fd9e
[refactor](partitioner) refine get channel id logics ( #33765 )
2024-04-18 19:05:24 +08:00
4863167f90
[refactor](pipelineX) Reduce prepare overhead (PART I) ( #33550 )
2024-04-17 23:42:12 +08:00
341cb40693
[Chore](log) adjust output order on PrintInstanceStandardInfo and reduce warning log when rpc finished ( #33652 )
...
adjust output order on PrintInstanceStandardInfo and reduce warning log when rpc finished
2024-04-17 23:42:12 +08:00
48880c3e1a
[Fix](timezone) fix miss of expected rounding of Date type with timezone #33553
2024-04-17 23:42:11 +08:00
3c9c6c18a8
[Enhancement](hive-writer) Write only regular fields to file in the hive-writer. ( #33000 )
2024-04-12 10:29:08 +08:00
3081fc584d
[Improvement](runtime-filter) support sync join node build side's size to init bloom runtime filter ( #32180 )
...
support sync join node build side's size to init bloom runtime filter
2024-04-11 09:31:50 +08:00
4963d60a07
[Fix](multi-catalog)Fix the issue of not initializing the writer caused by refactoring and add hive writing regression test. ( #32721 ) ( #33446 )
...
backport #32721 .
2024-04-10 11:42:22 +08:00
59aa923bce
[bug](function) fix milliseconds_diff function return wrong result ( #32897 )
...
* [bug](function) fix milliseconds_diff function return wrong result
2024-04-10 11:34:30 +08:00
2a0644f442
[Fix](function) Fix unix_timestamp core for string input ( #32871 )
2024-04-09 12:48:35 +08:00
d7a3ff1ddf
[Fix](Outfile) Fix the column type mapping in the orc/parquet file format ( #32281 )
...
| Doris Type | Orc Type                       | Parquet Type          |
|------------|--------------------------------|-----------------------|
| Date       | Long (logical: DATE)           | int32 (logical: DATE) |
| DateTime   | TIMESTAMP (logical: TIMESTAMP) | int96                 |
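As a rough illustration of the `Date` row, the Parquet DATE logical type stores the number of days since the Unix epoch (1970-01-01) in an int32. The sketch below shows only that conversion; the function name is hypothetical and this is not the Doris writer code.

```python
from datetime import date

UNIX_EPOCH = date(1970, 1, 1)


def date_to_parquet_days(d: date) -> int:
    """Hypothetical helper: days since 1970-01-01, as stored by Parquet's DATE type."""
    days = (d - UNIX_EPOCH).days
    # Parquet stores this value as a signed 32-bit integer.
    if not -(2**31) <= days < 2**31:
        raise OverflowError("date does not fit in int32 days")
    return days


print(date_to_parquet_days(date(1971, 1, 1)))
```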
2024-03-22 08:52:16 +08:00
0990014e94
[fix](datetime) fix datetime rounding on BE ( #32075 )
2024-03-21 14:07:19 +08:00
ef2151ae66
[Feature-WIP](multi-catalog) Add Hive sink on BE side. ( #32306 ) ( #32364 )
...
bp #32306
Co-authored-by: Qi Chen <kaka11.chen@gmail.com>
2024-03-18 11:23:01 +08:00
20d6698c27
[bugfix](arm compile) could not compile on arm because -Werror=maybe-uninitialized
2024-03-14 12:11:25 +08:00
0159a75ced
[bugfix](becore) be will core when stop because the map is modified during iterator ( #32105 )
...
Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-03-12 18:50:26 +08:00
4268634115
[fix](memory) Fix Allocator cancel pipelinex query #32048
2024-03-12 14:20:18 +08:00
68a5319da3
[fix](pipelineX) _local_channel_dependency is null in non pipelineX ( #32054 )
2024-03-12 14:19:04 +08:00
c0f2d0188b
[feature](pipelineX) add mem control in local exchange sink ( #31982 )
2024-03-12 14:17:48 +08:00
808563470f
[pipelineX](debug) Refactor code and complete debug string ( #31733 )
2024-03-06 13:07:49 +08:00
3451cd6c23
[fix](datetime) fix hour 24 on be ( #31304 )
2024-02-26 19:07:10 +08:00
52b9af06fb
[pipelineX](refactor) Delete subclasses inherited from Dependency ( #31216 )
2024-02-22 13:01:48 +08:00
49dd411f87
[fix](datetime) fix datetime round on BE ( #31205 )
...
with tmp as (
select CONCAT(
YEAR('2024-02-06 03:37:07.157'), '-',
LPAD(MONTH('2024-02-06 03:37:07.157'), 2, '0'), '-',
LPAD(DAY('2024-02-06 03:37:07.157'), 2, '0'), ' ',
LPAD(HOUR('2024-02-06 03:37:07.157'), 2, '0'), ':',
LPAD(MINUTE('2024-02-06 03:37:07.157'), 2, '0'), ':',
LPAD(SECOND('2024-02-06 03:37:07.157'), 2, '0'), '.', "123456789" )
AS generated_string)
select generated_string, cast(generated_string as DateTime(6)) from tmp
Before (incorrect rounding):
+-------------------------------+-----------------------------------------+
| generated_string | cast(generated_string as DATETIMEV2(6)) |
+-------------------------------+-----------------------------------------+
| 2024-02-06 03:37:07.123456789 | 2024-02-06 03:37:07.123456 |
+-------------------------------+-----------------------------------------+
After (rounds up, consistent with MySQL):
+-------------------------------+-----------------------------------------+
| generated_string | cast(generated_string as DATETIMEV2(6)) |
+-------------------------------+-----------------------------------------+
| 2024-02-06 03:37:07.123456789 | 2024-02-06 03:37:07.123457 |
+-------------------------------+-----------------------------------------+
1 row in set (0.03 sec)
Same work as #30744, but implemented on BE.
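The before/after behavior can be sketched in Python. This is a minimal model that assumes round-half-up on the seventh fractional digit, with any carry propagating into the seconds field; the function name is hypothetical and the real BE parser may differ in details.

```python
from datetime import datetime, timedelta


def cast_to_datetime6(text: str) -> str:
    """Hypothetical model of casting a string to DateTime(6) with rounding."""
    whole, _, frac = text.partition(".")
    base = datetime.strptime(whole, "%Y-%m-%d %H:%M:%S")
    # Keep six digits plus one extra digit that decides the rounding.
    digits = (frac + "0" * 7)[:7]
    micros = int(digits[:6])
    if digits[6] >= "5":
        micros += 1  # round up instead of truncating
    # timedelta handles the carry when micros reaches 1_000_000.
    result = base + timedelta(microseconds=micros)
    return result.strftime("%Y-%m-%d %H:%M:%S.%f")


print(cast_to_datetime6("2024-02-06 03:37:07.123456789"))
```

Truncation would keep `.123456`, which matches the "before" output; rounding on the seventh digit (`7`) gives `.123457`, which matches the "after" output.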
2024-02-21 19:18:45 +08:00
f65844fae4
[Enhencement](Outfile/Export) Export data to csv file format with BOM ( #30533 )
...
UTF-8 files on Windows typically start with a BOM.
This PR adds a new user property, `with_bom`, to `Outfile/Export`, so that when exporting Doris data, users can choose whether to write a BOM at the beginning of the CSV file.
**Usage:**
```sql
-- outfile:
select * from demo.student
into outfile "file:///xxx/export/exp_"
format as csv
properties(
"column_separator" = ",",
"with_bom" = "true"
);
-- Export:
EXPORT TABLE student TO "file:///xx/tmpdata/export/exp_"
PROPERTIES(
"format" = "csv",
"with_bom" = "true"
);
```
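To show what the `with_bom` property changes in the output file, here is a minimal Python sketch. It assumes the BOM is the three-byte UTF-8 signature `EF BB BF` written once at the start of the file; `export_csv` is a hypothetical helper and not part of Doris.

```python
import csv
import io


def export_csv(rows, with_bom: bool) -> bytes:
    """Hypothetical helper: serialize rows to CSV bytes, optionally with a UTF-8 BOM."""
    buf = io.StringIO(newline="")
    csv.writer(buf, lineterminator="\n").writerows(rows)
    # Python's "utf-8-sig" codec prepends the BOM (EF BB BF) when encoding.
    encoding = "utf-8-sig" if with_bom else "utf-8"
    return buf.getvalue().encode(encoding)


print(export_csv([["1", "alice"]], with_bom=True)[:3])
```

Some Windows tools use this signature to detect UTF-8, which is the motivation given above for making it optional on export.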
2024-02-16 10:16:40 +08:00
be31b8dc61
[Refactor](exchange) remove unless code in exchange and opt some code ( #30813 )
2024-02-05 21:59:52 +08:00
8ff8d94697
[fix](ip) change IPv6 to little-endian byte order storage (like IPv4) ( #30730 )
2024-02-05 21:56:57 +08:00
3315c16383
[enhance](function) refactor from_format_str and support more format ( #30452 )
2024-02-01 19:08:37 +08:00
713798d549
[feature](nereids)support mark join ( #30133 )
...
Co-authored-by: Jerry Hu <mrhhsg@gmail.com>
2024-01-27 09:09:53 +08:00
24ed3e4103
[Fix](Expr&code-style) check prepare&open before every VExpr execute ( #26673 )
2024-01-23 10:09:54 +08:00
4d97f8ea75
[enhance](function) support two special format for str_to_date ( #29823 )
2024-01-12 12:00:32 +08:00
3cf95d0fdf
[Improvement](execute) optimize for ColumnNullable's serialize_vec/deserialize_vec ( #28788 )
...
optimize for ColumnNullable's serialize_vec/deserialize_vec
2024-01-12 11:59:52 +08:00
0d691c638b
[Feature](profile)Support report runtime workload statistics #29591
2024-01-12 11:59:27 +08:00
fc4ca712ed
[bugfix](core) using weak ptr in data stream receiver to avoid runtime state is deconstructed ( #29410 )
2024-01-12 11:48:39 +08:00
7287c0ca15
[Opt](exec)(multi-catalog) Opt date type reading. ( #29571 )
2024-01-12 11:48:39 +08:00