16 Commits

Author SHA1 Message Date
a7156ee775 [fix](parquet)Fix the be core issue when reading parquet unsigned types. (#39926) (#40123)
bp #39926
2024-08-29 21:52:52 +08:00
3c535e80dd [fix](compatibility) type toSql should return lowercase string (#38012) (#38517)
pick from master #38012

revert #25951
2024-08-09 11:35:42 +08:00
b21b906306 [Fix](outfile) FE check the hdfs URI of outfile (#38602)
bp: #38203

1. Previously, if the root path of the HDFS URI started with two
slashes, the outfile was exported without errors, but
the exported path was not the expected path.
FE now removes the repeated '/' characters specified by the user.

2. Move the test case for outfile HDFS from p2 to p0.
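
The slash cleanup described in item 1 can be sketched as follows. This is a minimal Python illustration, not the FE code: the function name `normalize_hdfs_uri` and the example host are assumptions, and the real check may handle more cases.

```python
import re
from urllib.parse import urlsplit, urlunsplit


def normalize_hdfs_uri(uri: str) -> str:
    """Collapse runs of '/' in the path, leaving 'scheme://authority' untouched.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(uri)
    # Only the path is rewritten, so the '//' after the scheme is preserved.
    clean_path = re.sub(r"/{2,}", "/", parts.path)
    return urlunsplit(
        (parts.scheme, parts.netloc, clean_path, parts.query, parts.fragment)
    )


print(normalize_hdfs_uri("hdfs://namenode:8020//user//doris/out_"))
```

Splitting the URI first avoids touching the `//` that separates the scheme from the authority, which a plain string replace would also collapse.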
2024-07-31 22:46:37 +08:00
ceef9ee123 [feature](serde) support presto compatible output format (#37039) (#37253)
bp #37039
2024-07-04 13:56:05 +08:00
cbaff8a700 [fix](nereids)change the decimal's precision and scale for cast(xx as decimal) (#36540)
pick from master #36316

The data type of the expression cast(xx as decimal) may be decimalv3 or
decimalv2, depending on the value of enable_decimal_conversion in the FE
conf file. If enable_decimal_conversion is true, the data type is
decimalv3(9, 0), but it was decimalv3(38, 9) in 2.0 releases. This PR
changes the data type to match the 2.0 releases and keep the behavior consistent.
2024-06-20 17:46:11 +08:00
3ef5ed1ad0 [opt](Nereids) normalize column name of output file (#34650)
When exporting to an output file, normalize the column names.
For example:

> SELECT 1 > 2 INTO OUTFILE "..."

The column name of `1 > 2` will be `__greater_than_0`.
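
A rough Python sketch of this kind of naming scheme follows. Only the `>` to `__greater_than_0` result comes from the message above. The other table entries, the function name, and the reading of the trailing `0` as the column's zero-based position are assumptions made for illustration.

```python
# Hypothetical operator-to-name table; only the ">" entry is taken from the
# commit message, the rest are assumed for illustration.
OPERATOR_NAMES = {
    ">": "greater_than",
    "<": "less_than",
    "=": "equal_to",
}


def normalized_column_name(operator: str, position: int) -> str:
    """Build a file-safe column name such as '__greater_than_0'.

    `position` is assumed to be the zero-based index of the column in the
    select list; the actual rule used by Nereids may differ.
    """
    return f"__{OPERATOR_NAMES[operator]}_{position}"


print(normalized_column_name(">", 0))
```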
2024-05-13 22:12:46 +08:00
d7a3ff1ddf [Fix](Outfile) Fix the column type mapping in the orc/parquet file format (#32281)
| Doris Type | Orc Type                       | Parquet Type          |
|------------|--------------------------------|-----------------------|
| Date       | Long (logical: DATE)           | int32 (logical: Date) |
| DateTime   | TIMESTAMP (logical: TIMESTAMP) | int96                 |
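
As a reading aid, the mapping in the table can be expressed as a lookup. This is a minimal Python sketch: the dictionary and function names are hypothetical, and only the two rows listed above are covered.

```python
# Doris type -> (ORC type, Parquet type), copied from the table above.
OUTFILE_TYPE_MAPPING = {
    "Date": ("Long (logical: DATE)", "int32 (logical: Date)"),
    "DateTime": ("TIMESTAMP (logical: TIMESTAMP)", "int96"),
}


def target_type(doris_type: str, file_format: str) -> str:
    """Return the exported column type for the given file format."""
    orc_type, parquet_type = OUTFILE_TYPE_MAPPING[doris_type]
    if file_format == "orc":
        return orc_type
    if file_format == "parquet":
        return parquet_type
    raise ValueError(f"unsupported file format: {file_format}")


print(target_type("DateTime", "parquet"))
```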
2024-03-22 08:52:16 +08:00
8bd101129a [behavior change](output) change float output format (#32049) 2024-03-21 14:07:22 +08:00
a8d8c6a271 [fix](file-writer) opt s3 file writer and fix empty file related issue #28983 #30703 #31169 (#31213)
* (feature)(cloud) Use dynamic allocator instead of static buffer pool for better elasticity. (#28983)

* [fix](outfile) Fix unable to export empty data (#30703)

Issue Number: close #30600
Fix the inability to export empty data to HDFS/S3. This behavior is inconsistent with version 1.2.7,
which can export empty data to HDFS/S3 and leaves exported files on S3/HDFS.

* [fix](file-writer) avoid empty file for segment writer (#31169)

---------

Co-authored-by: AlexYue <yj976240184@gmail.com>
Co-authored-by: zxealous <zhouchangyue@baidu.com>
2024-02-21 16:48:54 +08:00
78b0fec33a [Fix](Outfile) Support export nested complex type data to orc file format (#28182) 2023-12-13 11:55:27 +08:00
3dcbf16404 [Fix](Outfile) The Struct type data exported from select outfile to the csv file format should contain a column name #28068
If the original data is:
```sql
+-----------------------------------------------------+
| s_info                                              |
+-----------------------------------------------------+
| {"s_id": 2, "s_name": "nereids", "s_address": "20"} |
| {"s_id": 1, "s_name": "doris", "s_address": "18"}   |
+-----------------------------------------------------+
```

In the original logic, struct type data exported to the csv file format did not contain column names, like:
```
{2, "nereids", "20"} 
{1, "doris", "18"}
```

This PR does not need to be merged into branch-2.0.
2023-12-07 18:23:36 +08:00
3e10e5af39 [Fix](Serde) Fix content displayed by complex types in MySQL Client (#25946)
This PR makes three changes to the display of complex types:
1. NULL values in complex types are displayed as `null`, not `NULL`.
2. Struct types are displayed as "column_name": column_value.
3. Time types such as `datetime` and `date` are displayed with double quotes in complex types, like
    `{1, "2023-10-26 12:12:12"}`

This PR also does a code refactor:
1. nesting_level becomes a member variable of `DataTypeSerDe`, rather than a method parameter.

This PR also fixes a bug where fileSize is incorrect, introduced by PR #25854.
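
The display rules above can be sketched in a few lines of Python. This is an illustration, not the `DataTypeSerDe` code: the `render` function is hypothetical, a `dict` stands in for a struct, a `list` for an array, and showing a top-level NULL as `NULL` is an assumption. Quoting nested strings follows the regression-test commit (#25115).

```python
from datetime import date, datetime


def render(value, nested: bool = False) -> str:
    """Render a value the way the rules above describe (hypothetical helper)."""
    if value is None:
        # Rule 1: lowercase 'null' inside complex types.
        return "null" if nested else "NULL"
    if isinstance(value, datetime):
        text = value.isoformat(sep=" ")
        # Rule 3: time types are double-quoted inside complex types.
        return f'"{text}"' if nested else text
    if isinstance(value, date):
        text = value.isoformat()
        return f'"{text}"' if nested else text
    if isinstance(value, str):
        return f'"{value}"' if nested else value
    if isinstance(value, dict):
        # Rule 2: struct fields are shown as "column_name": column_value.
        fields = ", ".join(f'"{k}": {render(v, True)}' for k, v in value.items())
        return "{" + fields + "}"
    if isinstance(value, list):
        return "[" + ", ".join(render(v, True) for v in value) + "]"
    return str(value)


print(render({"s_id": 1, "s_name": "doris", "s_address": None}))
```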
2023-11-01 23:48:55 +08:00
99b45e1938 [fix](Outfile) Export DateTimev2 type of doris to ORC's TimeStamp type (#25470)
Previously, Doris's `DateTimev2` was exported to ORC as a `String` type.
Now, Doris's `DateTimev2` is exported to the ORC timestamp type.
2023-10-29 15:59:38 +08:00
7f66be84d5 [fix](Outfile) Infer the column name if the column is expression in select into outfile (#25854)
This PR does two things:
1. Infer the column name if the column is an expression in `select into outfile`. The rule for column name generation is described in PR #24990.
2. Fix a bug where a core dump occurs if `_schema` fails to build in the open phase in vorc_transformer.cpp.

TODO:
1. Support inferring the column name if the column is an expression in `select into outfile` in the new optimizer (Nereids).
2023-10-25 22:49:04 +08:00
6f9a084d99 [Fix](Outfile) Use data_type_serde to export data to parquet file format (#24998) 2023-10-13 13:58:34 +08:00
c6b1c903e4 [fix](Regression-test) fix that the String type in a nested type should contain double quotes and add regression-test (#25115) 2023-10-11 18:30:26 +08:00