doris

Author	SHA1	Message	Date
walter	263135c193	[fix](case) fix export data consistency case (#32005 )	2024-03-09 19:45:50 +08:00
walter	1f825ee2d6	[improve](export) Support partition data consistency (#31290 )	2024-03-01 04:25:43 +08:00
Mingyu Chen	a8d8c6a271	[fix](file-writer) opt s3 file writer and fix empty file related issue #28983 #30703 #31169 (#31213 ) * (feature)(cloud) Use dynamic allocator instead of static buffer pool for better elasticity. (#28983) * [fix](outfile) Fix unable to export empty data (#30703) Issue Number: close #30600 Fix being unable to export empty data to HDFS/S3. This behavior is inconsistent with version 1.2.7, which can export empty data to HDFS/S3 and leaves exported files there. * [fix](file-writer) avoid empty file for segment writer (#31169) --------- Co-authored-by: AlexYue <yj976240184@gmail.com> Co-authored-by: zxealous <zhouchangyue@baidu.com>	2024-02-21 16:48:54 +08:00
Tiewei Fang	f65844fae4	[Enhencement](Outfile/Export) Export data to csv file format with BOM (#30533 ) UTF-8 files on Windows commonly begin with a BOM. We add a new user property, `with_bom`, to `Outfile/Export`, so that when exporting Doris data users can choose whether to write a BOM at the beginning of the CSV file. Usage: ```sql -- outfile: select * from demo.student into outfile "file:///xxx/export/exp_" format as csv properties( "column_separator" = ",", "with_bom" = "true" ); -- Export: EXPORT TABLE student TO "file:///xx/tmpdata/export/exp_" PROPERTIES( "format" = "csv", "with_bom" = "true" ); ```	2024-02-16 10:16:40 +08:00
xueweizhang	203daba19d	[fix](outfile) fix outfile csv did not write json column with string (#29067 )	2024-02-01 19:01:08 +08:00
Tiewei Fang	78b0fec33a	[Fix](Outfile) Support export nested complex type data to orc file format (#28182 )	2023-12-13 11:55:27 +08:00
Tiewei Fang	3dcbf16404	[Fix](Outfile) The Struct type data exported from select outfile to the csv file format should contain a column name #28068 If the original data is: ```sql +-----------------------------------------------------+ \| s_info \| +-----------------------------------------------------+ \| {"s_id": 2, "s_name": "nereids", "s_address": "20"} \| \| {"s_id": 1, "s_name": "doris", "s_address": "18"} \| +-----------------------------------------------------+ ``` In the original logic, struct type data exported to the csv file format did not contain column names, like ``` {2, "nereids", "20"} {1, "doris", "18"} ``` This PR does not need to be merged into branch-2.0	2023-12-07 18:23:36 +08:00
Tiewei Fang	9b59bc14b5	[test](Export) add `show export` regression testes (#27140 )	2023-11-22 00:13:30 +08:00
Tiewei Fang	3e10e5af39	[Fix](Serde) Fix content displayed by complex types in MySQL Client (#25946 ) This PR makes three changes to the display of complex types: 1. NULL values in complex types are displayed as `null`, not `NULL` 2. struct type is displayed as "column_name": column_value 3. Time types such as `datetime` and `date` are displayed with double quotes in complex types, like `{1, "2023-10-26 12:12:12"}` This PR also refactors the code: 1. nesting_level is now a member variable of `DataTypeSerDe`, rather than a parameter in methods. In addition, this PR fixes a bug where fileSize is not correct, introduced by #25854	2023-11-01 23:48:55 +08:00
Tiewei Fang	99b45e1938	[fix](Outfile) Export `DateTimev2` type of doris to ORC's `TimeStamp` type (#25470 ) Previously, Doris's `DateTimev2` was exported to ORC as a `String` type. Now, Doris's `DateTimev2` is exported to the ORC timestamp type.	2023-10-29 15:59:38 +08:00
Tiewei Fang	7f66be84d5	[fix](Outfile) Infer the column name if the column is expression in `select into outfile` (#25854 ) This PR does two things: 1. Infer the column name if the column is an expression in `select into outfile`. The rule for column name generation is described in PR #24990 2. Fix a core dump that occurs if `_schema` fails to build in the open phase in vorc_transformer.cpp TODO: 1. Support inferring the column name if the column is an expression in `select into outfile` in the new optimizer (Nereids).	2023-10-25 22:49:04 +08:00
lsy3993	ade475a52b	[regression](outfile)add regression for select outfile with underscore prefix #25797	2023-10-24 17:58:38 +08:00
Tiewei Fang	6f9a084d99	[Fix](Outfile) Use data_type_serde to export data to `parquet` file format (#24998 )	2023-10-13 13:58:34 +08:00
Tiewei Fang	c6b1c903e4	[fix](Regression-test) fix that the String type in a nested type should contain double quotes and add regression-test (#25115 )	2023-10-11 18:30:26 +08:00
Tiewei Fang	a48b19ceb6	[feature](Outfile) `select into outfile` supports to export struct/map/array type data to orc file format (#24350 ) Nested complex types are not supported in this PR.	2023-09-21 20:15:18 +08:00
Tiewei Fang	a946f99b8c	[Fix](regression-test) fix regression-test of export parquet file format (#24450 )	2023-09-20 15:41:49 +08:00
wudi	29fe87982f	[improve](outfile) add file_suffix options for outfile (#24334 )	2023-09-15 12:58:41 +08:00
Tiewei Fang	9847f7789f	[Feature](Export) `Export` sql supports to export data of `view` and `exrernal table` (#24070 ) Previously, EXPORT only supported exporting OLAP tables. This PR adds support for exporting views and external tables.	2023-09-13 22:55:19 +08:00
Tiewei Fang	a27349c83a	[fix](Export) Concatenation the outfile sql for Export (#23635 ) In the original logic, the `Export` statement generates a `SelectStmt` for execution, but there is no way to make the `SelectStmt` use the new optimizer. Now, the `Export` statement generates the `outfile SQL` instead, and the new optimizer parses that SQL so that outfile can use the new optimizer.	2023-09-08 10:20:18 +08:00
Tiewei Fang	103fa4eb55	[feature](Export) support export with nereids (#23319 )	2023-08-29 19:36:19 +08:00
Tiewei Fang	f32efe5758	[Fix](Outfile) Fix that it does not report error when export table to S3 with an incorrect ak/sk/bucket (#23441 ) Problem: A result is returned even though a wrong ak/sk/bucket name is used, such as: ```sql mysql> select * from demo.student -> into outfile "s3://xxxx/exp_" -> format as csv -> properties( -> "s3.endpoint" = "https://cos.ap-beijing.myqcloud.com", -> "s3.region" = "ap-beijing", -> "s3.access_key"= "xxx", -> "s3.secret_key" = "yyyy" -> ); +------------+-----------+----------+----------------------------------------------------------------------------------------------------+ \| FileNumber \| TotalRows \| FileSize \| URL \| +------------+-----------+----------+----------------------------------------------------------------------------------------------------+ \| 1 \| 3 \| 26 \| s3://xxxx/exp_2ae166e2981d4c08-b577290f93aa82ba_ \| +------------+-----------+----------+----------------------------------------------------------------------------------------------------+ 1 row in set (0.15 sec) ``` The reason is that the error returned in the `close()` phase was not caught.	2023-08-26 00:19:30 +08:00
Tiewei Fang	18094511e7	[fix](Outfile/Nereids) fix that `csv_with_names` and `csv_with_names_and_types` file format could not be exported on nereids (#23387 ) This problem is caused by #21197. Fixed an issue where the `csv_with_names` and `csv_with_names_and_types` file formats could not be exported on the Nereids optimizer when using `select...into outfile`.	2023-08-25 11:12:04 +08:00
mch_ucchi	1d05feea1b	[Feature](Nereids) add executable function to support fold constant for functions (#18209 ) 1. Add date-time functions for fold constant for Nereids. This is the list of executable date-time functions Nereids supports so far: - now() - now(int) - current_timestamp() - current_timestamp(int) - localtime() - localtimestamp() - curdate() - current_date() - curtime() - current_time() - date_{add/sub}(),{years/months/days/hours/minutes/seconds}_{add/sub}() - datediff() - {date/datev2}() - {year/quarter/month/day/hour/minute/second}() - dayof{year/month/week}() - date_format() - date_trunc() - from_days() - last_day() - to_monday() - from_unixtime() - unix_timestamp() - utc_timestamp() - to_date() - to_days() - str_to_date() - makedate() 2. Problems solved: - enable datev2/datetimev2 by default. - refactor Nereids foldConstantOnFE and support folding nested expressions. - split the executable functions into multiple files for easier reading and adding new functions	2023-05-17 21:26:31 +08:00
Tiewei Fang	91cdb79d89	[Bugfix](Outfile) fix that export data to parquet and orc file format (#19436 ) 1. Support exporting the `LARGEINT` data type to parquet/orc file format. 2. Export the Doris `DATE/DATETIME` types to the `Date/Timestamp` logical types of the parquet file format. 3. Fix incorrect data when DATE type data is exported to ORC.	2023-05-13 22:39:24 +08:00
Tiewei Fang	45d0f53529	[Regression-test](Export) add regression test for export #18897	2023-04-23 19:43:22 +08:00
Gabriel	c2fae109c3	[Improvement](outfile) Support output null in parquet writer (#12970 )	2022-09-29 13:36:30 +08:00
Gabriel	1f9eec5462	[Regression](datev2) Add test cases for datev2/datetimev2 (#11831 )	2022-08-19 10:57:55 +08:00
Yongqiang YANG	ff1971f916	[improvement](test) add dryRun option and group all cases into either p0 or p1 (#11576 ) 1. add dryRun option to list tests 2. group all cases into p0 p1 p2	2022-08-17 22:45:53 +08:00

28 Commits