946d28646a
[fix](outfile)Fixed orcOutputStream.close() throwing an exception during destruction causing the program to hang. ( #34254 )
...
bp #34243
2024-04-28 19:54:34 +08:00
30a68c1240
[fix](spill) use different algorithm to avoid partition data skew ( #34162 )
2024-04-27 11:20:36 +08:00
60e20a3afe
[fix](pipeline_x) Crc32HashPartitioner should use ShuffleChannelIds ( #34147 )
2024-04-26 15:03:11 +08:00
25358564ca
[Fix](compile) Fix gcc compile on master ( #33864 )
...
This was introduced by #33511, which wrongly used
ColumnStr<T> ();
This violates the C++20 standard (see https://wg21.cmeerw.net/cwg/issue2237) but is still accepted by clang for now (see llvm/llvm-project#58112).
2024-04-19 23:41:37 +08:00
657a29fd9e
[refactor](partitioner) refine get channel id logics ( #33765 )
2024-04-18 19:05:24 +08:00
4863167f90
[refactor](pipelineX) Reduce prepare overhead (PART I) ( #33550 )
2024-04-17 23:42:12 +08:00
341cb40693
[Chore](log) adjust output order on PrintInstanceStandardInfo and reduce warning log when rpc finished ( #33652 )
2024-04-17 23:42:12 +08:00
48880c3e1a
[Fix](timezone) fix miss of expected rounding of Date type with timezone #33553
2024-04-17 23:42:11 +08:00
3c9c6c18a8
[Enhancement](hive-writer) Write only regular fields to file in the hive-writer. ( #33000 )
2024-04-12 10:29:08 +08:00
3081fc584d
[Improvement](runtime-filter) support sync join node build side's size to init bloom runtime filter ( #32180 )
2024-04-11 09:31:50 +08:00
4963d60a07
[Fix](multi-catalog)Fix the issue of not initializing the writer caused by refactoring and add hive writing regression test. ( #32721 ) ( #33446 )
...
backport #32721 .
2024-04-10 11:42:22 +08:00
59aa923bce
[bug](function) fix milliseconds_diff function return wrong result ( #32897 )
2024-04-10 11:34:30 +08:00
2a0644f442
[Fix](function) Fix unix_timestamp core for string input ( #32871 )
2024-04-09 12:48:35 +08:00
d7a3ff1ddf
[Fix](Outfile) Fix the column type mapping in the orc/parquet file format ( #32281 )
...
| Doris Type | Orc Type                       | Parquet Type          |
|------------|--------------------------------|-----------------------|
| Date       | Long (logical: DATE)           | int32 (Logical: Date) |
| DateTime   | TIMESTAMP (logical: TIMESTAMP) | int96                 |
2024-03-22 08:52:16 +08:00
0990014e94
[fix](datetime) fix datetime rounding on BE ( #32075 )
2024-03-21 14:07:19 +08:00
ef2151ae66
[Feature-WIP](multi-catalog) Add Hive sink on BE side. ( #32306 ) ( #32364 )
...
bp #32306
Co-authored-by: Qi Chen <kaka11.chen@gmail.com>
2024-03-18 11:23:01 +08:00
20d6698c27
[bugfix](arm compile) could not compile on arm because -Werror=maybe-uninitialized
2024-03-14 12:11:25 +08:00
0159a75ced
[bugfix](becore) be will core when stop because the map is modified during iterator ( #32105 )
...
Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-03-12 18:50:26 +08:00
4268634115
[fix](memory) Fix Allocator cancel pipelinex query #32048
2024-03-12 14:20:18 +08:00
68a5319da3
[fix](pipelineX) _local_channel_dependency is null in non pipelineX ( #32054 )
2024-03-12 14:19:04 +08:00
c0f2d0188b
[feature](pipelineX) add mem control in local exchange sink ( #31982 )
2024-03-12 14:17:48 +08:00
808563470f
[pipelineX](debug) Refactor code and complete debug string ( #31733 )
2024-03-06 13:07:49 +08:00
3451cd6c23
[fix](datetime) fix hour 24 on be ( #31304 )
2024-02-26 19:07:10 +08:00
52b9af06fb
[pipelineX](refactor) Delete subclasses inherited from Dependency ( #31216 )
2024-02-22 13:01:48 +08:00
49dd411f87
[fix](datetime) fix datetime round on BE ( #31205 )
...
with tmp as (
    select CONCAT(
        YEAR('2024-02-06 03:37:07.157'), '-',
        LPAD(MONTH('2024-02-06 03:37:07.157'), 2, '0'), '-',
        LPAD(DAY('2024-02-06 03:37:07.157'), 2, '0'), ' ',
        LPAD(HOUR('2024-02-06 03:37:07.157'), 2, '0'), ':',
        LPAD(MINUTE('2024-02-06 03:37:07.157'), 2, '0'), ':',
        LPAD(SECOND('2024-02-06 03:37:07.157'), 2, '0'), '.', "123456789")
    AS generated_string)
select generated_string, cast(generated_string as DateTime(6)) from tmp
Before (incorrect rounding):
+-------------------------------+-----------------------------------------+
| generated_string              | cast(generated_string as DATETIMEV2(6)) |
+-------------------------------+-----------------------------------------+
| 2024-02-06 03:37:07.123456789 | 2024-02-06 03:37:07.123456              |
+-------------------------------+-----------------------------------------+
After (rounded up, consistent with MySQL):
+-------------------------------+-----------------------------------------+
| generated_string              | cast(generated_string as DATETIMEV2(6)) |
+-------------------------------+-----------------------------------------+
| 2024-02-06 03:37:07.123456789 | 2024-02-06 03:37:07.123457              |
+-------------------------------+-----------------------------------------+
1 row in set (0.03 sec)
Same work as #30744, but implemented on the BE.
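The before/after output above can be reproduced outside Doris with a small sketch. This is not the BE implementation: the helper name `round_datetime_string` is hypothetical, and half-up rounding on the first dropped digit is an assumption, since the commit message only shows one example (`...123456789` becoming `...123457`).

```python
from datetime import datetime, timedelta


def round_datetime_string(text: str, scale: int = 6) -> str:
    """Round the fractional seconds of 'YYYY-MM-DD HH:MM:SS.fff...' to `scale` digits (0..6).

    Hypothetical helper for illustration; half-up rounding is an assumption.
    """
    whole, _, frac = text.partition(".")
    base = datetime.strptime(whole, "%Y-%m-%d %H:%M:%S")
    kept = frac[:scale].ljust(scale, "0")   # digits that survive the cast
    dropped = frac[scale:]                  # digits beyond the target scale
    micros = int(kept or "0") * 10 ** (6 - scale)
    if dropped and dropped[0] >= "5":
        # Round up by one unit in the last kept digit.
        micros += 10 ** (6 - scale)
    # timedelta carries any overflow into seconds, minutes, days, and so on.
    result = base + timedelta(microseconds=micros)
    formatted = result.strftime("%Y-%m-%d %H:%M:%S")
    if scale > 0:
        formatted += "." + f"{result.microsecond:06d}"[:scale]
    return formatted


print(round_datetime_string("2024-02-06 03:37:07.123456789"))
```

Truncating instead (dropping the `if dropped ...` branch) yields the "before" value `2024-02-06 03:37:07.123456`.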
2024-02-21 19:18:45 +08:00
f65844fae4
[Enhencement](Outfile/Export) Export data to csv file format with BOM ( #30533 )
...
UTF-8 files on Windows commonly start with a BOM.
This adds a new user property, `with_bom`, to `Outfile/Export`, so that when exporting Doris data, users can choose whether to write a BOM at the beginning of the CSV file.
**Usage:**
```sql
-- outfile:
select * from demo.student
into outfile "file:///xxx/export/exp_"
format as csv
properties(
"column_separator" = ",",
"with_bom" = "true"
);
-- Export:
EXPORT TABLE student TO "file:///xx/tmpdata/export/exp_"
PROPERTIES(
"format" = "csv",
"with_bom" = "true"
);
```
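To make the effect of `with_bom` concrete, here is a minimal sketch of what a BOM-prefixed CSV looks like at the byte level. It is not Doris code: the function `export_csv` and its parameters are hypothetical and only mirror the property names used above.

```python
import codecs
import csv
import io


def export_csv(rows, with_bom: bool = False, column_separator: str = ",") -> bytes:
    """Serialize rows to UTF-8 CSV bytes, optionally prefixed with the UTF-8 BOM.

    Hypothetical helper for illustration only.
    """
    text = io.StringIO()
    writer = csv.writer(text, delimiter=column_separator, lineterminator="\n")
    writer.writerows(rows)
    payload = text.getvalue().encode("utf-8")
    # The UTF-8 BOM is the three bytes EF BB BF at the very start of the file.
    return codecs.BOM_UTF8 + payload if with_bom else payload


data = export_csv([["1", "Ada"], ["2", "Bob"]], with_bom=True)
print(data[:3].hex())  # first three bytes of the export
```

A reader that is not BOM-aware sees the BOM as part of the first field, so consumers typically decode such files with a BOM-stripping codec (`utf-8-sig` in Python).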
2024-02-16 10:16:40 +08:00
be31b8dc61
[Refactor](exchange) remove unless code in exchange and opt some code ( #30813 )
2024-02-05 21:59:52 +08:00
8ff8d94697
[fix](ip) change IPv6 to little-endian byte order storage (like IPv4) ( #30730 )
2024-02-05 21:56:57 +08:00
3315c16383
[enhance](function) refactor from_format_str and support more format ( #30452 )
2024-02-01 19:08:37 +08:00
713798d549
[feature](nereids)support mark join ( #30133 )
...
Co-authored-by: Jerry Hu <mrhhsg@gmail.com>
2024-01-27 09:09:53 +08:00
24ed3e4103
[Fix](Expr&code-style) check prepare&open before every VExpr execute ( #26673 )
2024-01-23 10:09:54 +08:00
4d97f8ea75
[enhance](function) support two special format for str_to_date ( #29823 )
2024-01-12 12:00:32 +08:00
3cf95d0fdf
[Improvement](execute) optimize for ColumnNullable's serialize_vec/deserialize_vec ( #28788 )
2024-01-12 11:59:52 +08:00
0d691c638b
[Feature](profile)Support report runtime workload statistics #29591
2024-01-12 11:59:27 +08:00
fc4ca712ed
[bugfix](core) using weak ptr in data stream receiver to avoid runtime state is deconstructed ( #29410 )
2024-01-12 11:48:39 +08:00
7287c0ca15
[Opt](exec)(multi-catalog) Opt date type reading. ( #29571 )
2024-01-12 11:48:39 +08:00
be56bf06cf
[feature](function) support ip function named is_ip_address_in_range(addr, cidr) ( #29681 )
2024-01-12 11:44:21 +08:00
767de7afe8
Revert "[feature](pipelineX) control exchange sink by memory usage ( #28814 )" ( #29652 )
...
This reverts commit e326ebb63e4e07d8ee6595561ab19dc5d411f592.
2024-01-08 21:48:51 +08:00
eb4c389b0b
[feature](function) support ip functions isipv4string and isipv6string ( #28556 )
2024-01-07 13:03:11 +08:00
f54f79515c
[Bug](fix) str_to_date "" should be null ( #29402 )
2024-01-03 08:25:22 +08:00
3dc3e81734
[Improvement](datatype) Update Parser for IPv4/v6 data types ( #29044 )
...
Switch from parsing `std::string` to parsing `char*` to speed up parsing of the IPv4/IPv6 data types.
2023-12-28 11:00:38 +08:00
6d26aca4ca
[fix](pipeline) sort_merge should throw exception in has_next_block if got failed status ( #29076 )
...
The test at regression-test/suites/datatype_p0/decimalv3/test_decimalv3_overflow.groovy::249 sometimes fails when there are multiple BEs and the FE processes status reports slowly for some reason.
explain select k1, k2, k1 * k2 from test_decimal128_overflow2 order by 1,2,3
--------------
+----------------------------------------------------------------------------------------------------------------------------+
| Explain String(Nereids Planner) |
+----------------------------------------------------------------------------------------------------------------------------+
| PLAN FRAGMENT 0 |
| OUTPUT EXPRS: |
| k1[#5] |
| k2[#6] |
| (k1 * k2)[#7] |
| PARTITION: UNPARTITIONED |
| |
| HAS_COLO_PLAN_NODE: false |
| |
| VRESULT SINK |
| MYSQL_PROTOCAL |
| |
| 111:VMERGING-EXCHANGE |
| offset: 0 |
| |
| PLAN FRAGMENT 1 |
| |
| PARTITION: HASH_PARTITIONED: k1[#0], k2[#1] |
| |
| HAS_COLO_PLAN_NODE: false |
| |
| STREAM DATA SINK |
| EXCHANGE ID: 111 |
| UNPARTITIONED |
| |
| 108:VSORT |
| | order by: k1[#5] ASC, k2[#6] ASC, (k1 * k2)[#7] ASC |
| | offset: 0 |
| | |
| 102:VOlapScanNode |
| TABLE: regression_test_datatype_p0_decimalv3.test_decimal128_overflow2(test_decimal128_overflow2), PREAGGREGATION: ON |
| partitions=1/1 (test_decimal128_overflow2), tablets=8/8, tabletList=22841,22843,22845 ... |
| cardinality=6, avgRowSize=0.0, numNodes=1 |
| pushAggOp=NONE |
| projections: k1[#0], k2[#1], (k1[#0] * k2[#1]) |
| project output tuple id: 1 |
+----------------------------------------------------------------------------------------------------------------------------+
36 rows in set (0.03 sec)
Why it failed:
There are multiple BEs.
Fragments 0 and 1 must be on different BEs.
The pipeline task of VOlapScanNode, which executes k1*k2, fails and sets the query status to cancelled.
The pipeline task of VSort calls try close and sends a Cancelled status to VMergeExchange.
sort_cursor did not throw an exception when it met the error.
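The last point can be sketched as follows. This is an illustration of the pattern named in the commit title (throwing from `has_next_block` on a failed status), not the Doris BE code, which is C++. `Status`, `QueryCancelled`, `BlockCursor` and the supplier signature are all hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Status:
    ok: bool
    message: str = ""


class QueryCancelled(Exception):
    """Raised when the upstream block supplier reports a failed status."""


# A supplier returns (status, block); block is None at end of stream.
BlockSupplier = Callable[[], Tuple[Status, Optional[List[int]]]]


class BlockCursor:
    def __init__(self, supplier: BlockSupplier) -> None:
        self._supplier = supplier
        self.block: Optional[List[int]] = None

    def has_next_block(self) -> bool:
        status, block = self._supplier()
        if not status.ok:
            # Raising keeps a failure distinguishable from a clean end of stream;
            # returning False here would hide the error from the caller.
            raise QueryCancelled(status.message)
        self.block = block
        return block is not None and len(block) > 0
```

With this shape, a merging consumer that loops on `has_next_block()` stops with an error when the upstream was cancelled, rather than finishing as if the input had simply ended.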
2023-12-27 10:06:01 +08:00
7081139bdc
[fix](block) fix be core while mutable block merge may cause different row size between columns in origin block ( #27943 )
2023-12-25 20:35:22 +08:00
e326ebb63e
[feature](pipelineX) control exchange sink by memory usage ( #28814 )
2023-12-25 10:31:50 +08:00
0b9b1be1f1
[fix](function) Fix from_second functions overflow and wrong result ( #28685 )
2023-12-22 10:22:49 +08:00
e8d0569d8b
[refine](pipelineX)Make the 'set ready' logic of SenderQueue in pipelineX the same as that in the pipeline ( #28488 )
2023-12-20 19:26:00 +08:00
c00dca70e6
[pipelineX](local shuffle) Support parallel execution despite of tablet number ( #28266 )
2023-12-14 12:53:54 +08:00
78b0fec33a
[Fix](Outfile) Support export nested complex type data to orc file format ( #28182 )
2023-12-13 11:55:27 +08:00
ea275e687a
[pipelineX](minor) remove unused code ( #28016 )
2023-12-05 19:41:40 +08:00
10483ea12c
[fix](profile) fix error set with peak_memory_usage in pipeline #27749
2023-12-02 14:12:38 +08:00