f65844fae4
[Enhancement](Outfile/Export) Export data to CSV file format with BOM (#30533)
...
The UTF-8 format used on Windows systems carries a BOM.
We add a new user property, `with_bom`, to `Outfile/Export`. When exporting Doris data, users can therefore choose whether to write a BOM at the beginning of the CSV file.
**Usage:**
```sql
-- outfile:
select * from demo.student
into outfile "file:///xxx/export/exp_"
format as csv
properties(
"column_separator" = ",",
"with_bom" = "true"
);
-- Export:
EXPORT TABLE student TO "file:///xx/tmpdata/export/exp_"
PROPERTIES(
"format" = "csv",
"with_bom" = "true"
);
```
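To make the effect of `with_bom` concrete, here is a minimal Python sketch of what "a BOM at the beginning of the CSV file" means at the byte level. It is not Doris code; the helper `write_csv` and the sample rows are hypothetical, and the only assumption taken from the commit is that the BOM is the standard 3-byte UTF-8 signature placed before the first row.

```python
import codecs
import csv
import io


def write_csv(rows, with_bom=False):
    """Return CSV bytes, optionally prefixed with the UTF-8 BOM (EF BB BF)."""
    text = io.StringIO()
    csv.writer(text, lineterminator="\n").writerows(rows)
    payload = text.getvalue().encode("utf-8")
    # The BOM is written once, before the first row, never per line.
    return codecs.BOM_UTF8 + payload if with_bom else payload


rows = [["id", "name"], [1, "Alice"]]  # hypothetical sample data
plain = write_csv(rows)
bommed = write_csv(rows, with_bom=True)
```

A reader that decodes with `utf-8-sig` strips the signature, so both variants yield the same text; tools that rely on the BOM to detect the encoding only do so for the second variant.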
2024-02-16 10:16:40 +08:00
be31b8dc61
[Refactor](exchange) remove useless code in exchange and opt some code (#30813)
2024-02-05 21:59:52 +08:00
8ff8d94697
[fix](ip) change IPv6 to little-endian byte order storage (like IPv4) ( #30730 )
2024-02-05 21:56:57 +08:00
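As an illustration of the byte-order change named in the commit title above, the following Python sketch converts an IPv6 address between network (big-endian) order and a little-endian 16-byte layout. The function names are hypothetical and the exact on-disk layout Doris uses is not shown in this log, so treat this as one plausible reading rather than the actual storage code.

```python
import ipaddress


def ipv6_to_little_endian(text):
    """Return the 16 address bytes with the least significant byte first."""
    packed = ipaddress.IPv6Address(text).packed  # 16 bytes, network order
    return packed[::-1]


def ipv6_from_little_endian(raw):
    """Inverse of ipv6_to_little_endian: rebuild the textual address."""
    return str(ipaddress.IPv6Address(raw[::-1]))
```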
3315c16383
[enhance](function) refactor from_format_str and support more format ( #30452 )
2024-02-01 19:08:37 +08:00
713798d549
[feature](nereids)support mark join ( #30133 )
...
Co-authored-by: Jerry Hu <mrhhsg@gmail.com>
2024-01-27 09:09:53 +08:00
24ed3e4103
[Fix](Expr&code-style) check prepare&open before every VExpr execute ( #26673 )
2024-01-23 10:09:54 +08:00
4d97f8ea75
[enhance](function) support two special format for str_to_date ( #29823 )
2024-01-12 12:00:32 +08:00
3cf95d0fdf
[Improvement](execute) optimize for ColumnNullable's serialize_vec/deserialize_vec ( #28788 )
2024-01-12 11:59:52 +08:00
0d691c638b
[Feature](profile)Support report runtime workload statistics #29591
2024-01-12 11:59:27 +08:00
fc4ca712ed
[bugfix](core) using weak ptr in data stream receiver to avoid runtime state is deconstructed ( #29410 )
2024-01-12 11:48:39 +08:00
7287c0ca15
[Opt](exec)(multi-catalog) Opt date type reading. ( #29571 )
2024-01-12 11:48:39 +08:00
be56bf06cf
[feature](function) support ip function named is_ip_address_in_range(addr, cidr) ( #29681 )
2024-01-12 11:44:21 +08:00
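The semantics suggested by the signature `is_ip_address_in_range(addr, cidr)` can be sketched with Python's standard `ipaddress` module. This is only an approximation of the idea: how the Doris function treats invalid input, NULLs, or mixed IPv4/IPv6 arguments is not stated in this log, and the choices below (non-strict network parsing, `False` for mixed versions) are assumptions.

```python
import ipaddress


def is_ip_address_in_range(addr, cidr):
    """Return True if addr falls inside the CIDR block cidr."""
    # strict=False tolerates host bits in the prefix, e.g. "127.0.0.1/8".
    network = ipaddress.ip_network(cidr, strict=False)
    # Membership is False when the address and network versions differ.
    return ipaddress.ip_address(addr) in network
```

For example, `is_ip_address_in_range("127.0.0.1", "127.0.0.0/8")` is true, while `"128.0.0.1"` against the same block is false.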
767de7afe8
Revert "[feature](pipelineX) control exchange sink by memory usage ( #28814 )" ( #29652 )
...
This reverts commit e326ebb63e4e07d8ee6595561ab19dc5d411f592.
2024-01-08 21:48:51 +08:00
eb4c389b0b
[feature](function) support ip functions isipv4string and isipv6string ( #28556 )
2024-01-07 13:03:11 +08:00
f54f79515c
[Bug](fix) str_to_date "" should be null ( #29402 )
2024-01-03 08:25:22 +08:00
3dc3e81734
[Improvement](datatype) Update Parser for IPv4/v6 data types ( #29044 )
...
Switch from parsing `std::string` to parsing `char*` to accelerate the parsing of IPv4/IPv6 data types.
2023-12-28 11:00:38 +08:00
6d26aca4ca
[fix](pipeline) sort_merge should throw exception in has_next_block if got failed status ( #29076 )
...
The test at regression-test/suites/datatype_p0/decimalv3/test_decimalv3_overflow.groovy::249 sometimes fails when there are multiple BEs and the FE processes status reports slowly for some reason.
explain select k1, k2, k1 * k2 from test_decimal128_overflow2 order by 1,2,3
--------------
+----------------------------------------------------------------------------------------------------------------------------+
| Explain String(Nereids Planner) |
+----------------------------------------------------------------------------------------------------------------------------+
| PLAN FRAGMENT 0 |
| OUTPUT EXPRS: |
| k1[#5] |
| k2[#6] |
| (k1 * k2)[#7] |
| PARTITION: UNPARTITIONED |
| |
| HAS_COLO_PLAN_NODE: false |
| |
| VRESULT SINK |
| MYSQL_PROTOCAL |
| |
| 111:VMERGING-EXCHANGE |
| offset: 0 |
| |
| PLAN FRAGMENT 1 |
| |
| PARTITION: HASH_PARTITIONED: k1[#0], k2[#1] |
| |
| HAS_COLO_PLAN_NODE: false |
| |
| STREAM DATA SINK |
| EXCHANGE ID: 111 |
| UNPARTITIONED |
| |
| 108:VSORT |
| | order by: k1[#5] ASC, k2[#6] ASC, (k1 * k2)[#7] ASC |
| | offset: 0 |
| | |
| 102:VOlapScanNode |
| TABLE: regression_test_datatype_p0_decimalv3.test_decimal128_overflow2(test_decimal128_overflow2), PREAGGREGATION: ON |
| partitions=1/1 (test_decimal128_overflow2), tablets=8/8, tabletList=22841,22843,22845 ... |
| cardinality=6, avgRowSize=0.0, numNodes=1 |
| pushAggOp=NONE |
| projections: k1[#0], k2[#1], (k1[#0] * k2[#1]) |
| project output tuple id: 1 |
+----------------------------------------------------------------------------------------------------------------------------+
36 rows in set (0.03 sec)
Why it failed:
1. There are multiple BEs.
2. Fragments 0 and 1 must be on different BEs.
3. The pipeline task of VOlapScanNode, which executes k1*k2, fails and sets the query status to cancelled.
4. The pipeline task of VSort calls try close and sends a Cancelled status to VMergeExchange.
5. sort_cursor did not throw an exception when it met the error.
2023-12-27 10:06:01 +08:00
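The failure chain described in the commit above comes down to a merge step that treats a failed upstream status like a normal end of stream. The Python sketch below illustrates the intended behaviour under simplified assumptions: `make_supplier`, `MergeCursor`, and `drain` are hypothetical stand-ins, not the Doris classes, and the `(ok, block)` pair is an invented stand-in for a status object.

```python
def make_supplier(blocks, fail_after=None):
    """Hypothetical upstream: returns (ok, block); block None means end of stream."""
    pending = list(blocks)
    served = 0

    def supplier():
        nonlocal served
        if fail_after is not None and served >= fail_after:
            return False, None  # upstream was cancelled or failed
        if not pending:
            return True, None  # clean end of stream
        served += 1
        return True, pending.pop(0)

    return supplier


class MergeCursor:
    """Simplified cursor: has_next_block raises instead of hiding a failed status."""

    def __init__(self, supplier):
        self._supplier = supplier
        self.block = None

    def has_next_block(self):
        ok, block = self._supplier()
        if not ok:
            # Without this raise, a failure would look like "no more data".
            raise RuntimeError("upstream returned a failed status")
        self.block = block
        return block is not None


def drain(cursor):
    """Collect every block until the cursor reports end of stream."""
    out = []
    while cursor.has_next_block():
        out.append(cursor.block)
    return out
```

With the raise in place, a cancelled upstream surfaces as an error to the caller rather than as a silently truncated result.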
7081139bdc
[fix](block) fix be core while mutable block merge may cause different row size between columns in origin block ( #27943 )
2023-12-25 20:35:22 +08:00
e326ebb63e
[feature](pipelineX) control exchange sink by memory usage ( #28814 )
2023-12-25 10:31:50 +08:00
0b9b1be1f1
[fix](function) Fix from_second functions overflow and wrong result ( #28685 )
2023-12-22 10:22:49 +08:00
e8d0569d8b
[refine](pipelineX)Make the 'set ready' logic of SenderQueue in pipelineX the same as that in the pipeline ( #28488 )
2023-12-20 19:26:00 +08:00
c00dca70e6
[pipelineX](local shuffle) Support parallel execution despite of tablet number ( #28266 )
2023-12-14 12:53:54 +08:00
78b0fec33a
[Fix](Outfile) Support export nested complex type data to orc file format ( #28182 )
2023-12-13 11:55:27 +08:00
ea275e687a
[pipelineX](minor) remove unused code ( #28016 )
2023-12-05 19:41:40 +08:00
10483ea12c
[fix](profile) fix error set with peak_memory_usage in pipeline #27749
2023-12-02 14:12:38 +08:00
d969047b50
[Refactor](join) refactor of hash join ( #27557 )
...
Improve performance on the TPC-H data set by restructuring the join-related code and the use of the hash table.
Co-authored-by: HappenLee <happenlee@hotmail.com>
Co-authored-by: BiteTheDDDDt <pxl290@qq.com>
2023-11-28 19:46:00 +08:00
f565f60bc3
[refactor](standard)BE:Initialize pointer variables in the class to nullptr by default ( #27587 )
2023-11-28 13:02:30 +08:00
ea7eca9345
[pipelineX](bug) Add some logs ( #27596 )
2023-11-28 10:02:13 +08:00
baadc14e60
[Enhancement](function) support unix_timestamp with float ( #26827 )
...
---------
Co-authored-by: YangWithU <plzw8@outlook.com>
2023-11-27 09:58:53 +08:00
1b3512d942
[pipelineX](bug) Fix cancel timeout ( #27396 )
2023-11-22 22:31:34 +08:00
5442e8d1fc
[pipelineX](dependency) split different dependencies ( #27366 )
2023-11-22 12:50:39 +08:00
b1eef30b49
[pipelineX](dependency) Wake up task by dependencies ( #26879 )
...
---------
Co-authored-by: Mryange <2319153948@qq.com>
2023-11-18 03:20:24 +08:00
d988193d39
[pipelineX](shuffle) block exchange sink by memory usage ( #26595 )
2023-11-09 21:28:22 +08:00
5d80e7dc2f
[Improvement](pipelineX) Improve local exchange on pipelineX engine ( #26464 )
2023-11-07 22:11:44 +08:00
99b45e1938
[fix](Outfile) Export DateTimev2 type of doris to ORC's TimeStamp type ( #25470 )
...
Previously, Doris's `DateTimev2` was exported to ORC as a `String` type.
Now, Doris's `DateTimev2` is exported to the ORC timestamp type.
2023-10-29 15:59:38 +08:00
c1d64a7128
[Feature](datatype) Add IPv4/v6 data type for doris ( #24965 )
2023-10-26 17:33:28 +08:00
d6c64d305f
[chore](log) Add log to trace query execution #25739
2023-10-26 14:09:25 +08:00
31d2a9a4f5
[Enhancement](function) support fractions for convert_tz(datetimev2) ( #25915 )
...
mysql> select convert_tz('2019-08-01 01:01:02.123' , '+00:00', '+07:00');
+----------------------------------------------------------------------------------+
| convert_tz(cast('2019-08-01 01:01:02.123' as DATETIMEV2(3)), '+00:00', '+07:00') |
+----------------------------------------------------------------------------------+
| 2019-08-01 08:01:02.123 |
+----------------------------------------------------------------------------------+
1 row in set (0.18 sec)
2023-10-26 10:46:47 +08:00
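The result shown in the commit above can be reproduced outside Doris with a short Python sketch that shifts a timestamp between fixed UTC offsets while keeping the fractional seconds. The helpers `parse_offset` and `convert_tz` are hypothetical, handle only `+HH:MM`/`-HH:MM` offsets and millisecond-scale input, and say nothing about how Doris implements the function.

```python
from datetime import datetime, timedelta, timezone


def parse_offset(text):
    """Turn an offset such as '+07:00' or '-05:30' into a fixed timezone."""
    sign = -1 if text[0] == "-" else 1
    hours, minutes = text[1:].split(":")
    return timezone(sign * timedelta(hours=int(hours), minutes=int(minutes)))


def convert_tz(value, from_tz, to_tz):
    """Shift value between offsets, preserving the millisecond fraction."""
    naive = datetime.strptime(value, "%Y-%m-%d %H:%M:%S.%f")
    moved = naive.replace(tzinfo=parse_offset(from_tz)).astimezone(parse_offset(to_tz))
    # %f prints six digits; drop the last three to keep a scale of 3.
    return moved.strftime("%Y-%m-%d %H:%M:%S.%f")[:-3]


result = convert_tz("2019-08-01 01:01:02.123", "+00:00", "+07:00")
```

Here `result` is `2019-08-01 08:01:02.123`, matching the query output in the commit message.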
7f66be84d5
[fix](Outfile) Infer the column name if the column is expression in select into outfile ( #25854 )
...
This PR does two things:
1. Infers the column name if the column is an expression in `select into outfile`. The rule for column name generation is described in PR #24990.
2. Fixes a core dump that occurred if `_schema` failed to build in the open phase in vorc_transformer.cpp.
TODO:
1. Support inferring the column name if the column is an expression in `select into outfile` in the new optimizer (Nereids).
2023-10-25 22:49:04 +08:00
e8f479882d
[pipelineX](local exchange) Add local exchange operator ( #25846 )
2023-10-25 18:45:02 +08:00
6b2eed779c
[feature](AuditLog) add scanRows scanBytes in auditlog ( #25435 )
2023-10-25 10:00:35 +08:00
08832d9f3a
[Fix](exec) Fix date dict dead loop. ( #25570 )
2023-10-24 02:51:43 +08:00
642c149e6a
remove datetime_value and move vecdatetime_value to doris namespace ( #25695 )
2023-10-20 22:08:17 +08:00
dc47087560
[fix](function) fix str_to_date default return type scale for nereids ( #24932 )
2023-10-20 12:55:49 +08:00
b964ab76b3
[refactor](shuffle) Simplify hash partitioning strategy ( #25596 )
2023-10-19 19:28:22 +08:00
e77b98be88
[fix](months_diff) fix wrong result of months_diff ( #25577 )
2023-10-19 14:29:47 +08:00
08f305dd79
[chore](build) Fix compilation errors reported by GCC-13 ( #25439 )
...
1. Fix lots of compilation errors reported by GCC-13.
2. Fix the workflow BE UT (macOS).
2023-10-15 07:57:36 -05:00
37dbda6209
[pipelineX](refactor) Use class template to simplify join ( #25369 )
2023-10-13 16:51:55 +08:00
6f9a084d99
[Fix](Outfile) Use data_type_serde to export data to parquet file format ( #24998 )
2023-10-13 13:58:34 +08:00
f960b8c989
[bugfix](stream receiver) be will core during stop because receiver is not closed ( #25298 )
...
Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-10-11 19:49:40 +08:00