Commit Graph

25 Commits

Author SHA1 Message Date
bf71943605 [feature](load) stream load trim double quotes for csv (#15241) 2022-12-26 11:45:54 +08:00
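The one-line message does not spell out the trimming semantics; a minimal Python sketch of the likely behavior (strip one pair of enclosing double quotes per CSV field when the option is enabled — the exact option name and defaults are assumptions, not taken from the commit) is:

```python
def trim_double_quotes(field: str) -> str:
    """Strip one pair of enclosing double quotes from a CSV field, if present."""
    if len(field) >= 2 and field.startswith('"') and field.endswith('"'):
        return field[1:-1]
    return field

# With the (assumed) option enabled: '"a"|"b"|c' -> ['a', 'b', 'c']
row = '"a"|"b"|c'
print([trim_double_quotes(f) for f in row.split('|')])
```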
ada091b5d2 [fix](array-type) forbid implicit cast of array type during load (#15325)
* forbid array cast during load

* add regression test

Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
2022-12-26 09:24:44 +08:00
ec055e1acb [feature](new file reader) Integrate new file reader (#15175) 2022-12-26 08:55:52 +08:00
4dbe30d37b [regression](vectorized) delete vectorized config in regression tests (#15126) 2022-12-16 17:08:29 +08:00
0e1e5a802b [config](load) enable new load scan node by default (#14808)
Set FE config `enable_new_load_scan_node` to true by default, so that all load tasks
(broker load, stream load, routine load, insert into) use FileScanNode instead of BrokerScanNode
to read data.

1. Support loading parquet files in stream load with the new load scan node.
2. Fix a bug where the new parquet reader could not read columns without a logical or converted type.
3. Change the jsonb parser function to "jsonb_parse_error_to_null",
    so that if the input string is not valid JSON, NULL is returned for the jsonb column during the load task (see the sketch after this entry).
2022-12-16 09:41:43 +08:00
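A rough Python analogue of the "parse error to null" behavior described in point 3 above: invalid JSON becomes NULL for the jsonb column instead of failing the load. The function name comes from the commit; the sketch itself is only an assumption about the semantics.

```python
import json
from typing import Optional

def jsonb_parse_error_to_null(raw: str) -> Optional[str]:
    """Return the canonicalized JSON string, or None (NULL) if parsing fails."""
    try:
        return json.dumps(json.loads(raw))
    except (ValueError, TypeError):
        return None

print(jsonb_parse_error_to_null('{"k": 1}'))    # '{"k": 1}'
print(jsonb_parse_error_to_null('not a json'))  # None -> stored as NULL
```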
dd7ec8f4ca [improvement](test) add tpch1 orc for hive catalog and refactor some test dir (#14669)
Add a TPC-H 1 GB ORC test case in the Hive docker environment.

Refactor some suite directories of the catalog test cases.

Add "-internal" to the DLF endpoint to support accessing OSS through an Aliyun VPC.
2022-11-30 10:03:58 +08:00
f7a827c06b [fix](new-scan) fix some bugs about new scan node and readers (#14504)
Fix a JSON reader DCHECK failure caused by a missing TYPE_STRING.

Fix a bug where the table-valued function threw an NPE when no file was found.

Predicate conjuncts cannot be pushed down to the parquet reader for a load task,
because the predicate applies to columns of the destination table, not to columns of the source file (see the sketch after this entry).

Add a temporary broker load property "use_new_load_scan_node" to keep the regression tests happy,
so that the new load scan node can be used for a specific job without changing the global FE config.
2022-11-29 10:21:41 +08:00
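Why the pushdown is unsafe for loads, in a small illustrative Python sketch: the predicate is defined on the destination column, which may be an expression over a source column, so evaluating it against the raw file would filter the wrong rows. The column names and the doubling expression here are made up for illustration.

```python
# Hypothetical load mapping: dest column `price` = source column `raw_price` * 2.
source_rows = [{"raw_price": 40}, {"raw_price": 60}]
dest_predicate = lambda row: row["price"] > 100          # defined on the dest table

transform = lambda src: {"price": src["raw_price"] * 2}  # load-time expression

# Correct: transform first, then filter -> keeps raw_price=60 (price=120).
correct = [r for r in map(transform, source_rows) if dest_predicate(r)]

# A naive "pushdown" would apply dest_predicate to the raw file rows,
# where a `price` column does not even exist.
print(correct)
```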
5badd70db2 [fix](csv-reader) Fix core dump when loading text into Doris with a special delimiter (#14196) 2022-11-15 16:06:59 +08:00
c418bbd2d1 [feature-wip](new-scan) support Json reader (#13546)
Issue Number: close #12574
This PR adds `NewJsonReader`, which implements the GenericReader interface to support reading JSON-format files (see the sketch after this entry).

TODO:
1. Modify `_scann_eof` later.
2. Rename `NewJsonReader` to `JsonReader` once the old `JsonReader` is deleted.
2022-10-26 12:52:21 +08:00
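A minimal Python sketch (names and shapes assumed, not the actual C++ interfaces) of the reader-behind-an-interface idea: `NewJsonReader` plugs into the same `GenericReader` contract the scan node consumes, so swapping file formats does not change the scan code.

```python
from abc import ABC, abstractmethod
import json

class GenericReader(ABC):
    """Assumed shape of the reader interface: yield one row (dict) at a time."""
    @abstractmethod
    def read_rows(self):
        ...

class NewJsonReader(GenericReader):
    def __init__(self, lines):
        self._lines = lines

    def read_rows(self):
        for line in self._lines:
            yield json.loads(line)

# The scan node only sees GenericReader, regardless of file format.
reader: GenericReader = NewJsonReader(['{"id": 1}', '{"id": 2}'])
for row in reader.read_rows():
    print(row)
```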
b85c78ee00 [fix](regression) add 'if not exists' to 'create table' to support parallel test (#13576) (#13578)
Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>
2022-10-25 16:37:07 +08:00
de5bc6a8a5 [fix](regression-test) set label for stream load (#13620) 2022-10-25 14:13:24 +08:00
32b1456b28 [feature-wip](array) remove array config and check array nested depth (#13428)
1. Remove the FE config `enable_array_type`.
2. Limit the nested depth of arrays on the FE side.
3. Fix a bug where, when loading arrays from parquet, the decimal type was treated as bigint.
4. Fix loading arrays from CSV (vec engine), handling both null and "null".
5. Change the CSV array loading behavior: if the array string format in CSV is invalid, it is converted to NULL (see the sketch after this entry).
6. Remove `check_array_format()`, because its logic is wrong and meaningless.
7. Add stream load CSV test cases and more parquet broker load tests.
2022-10-20 15:52:31 +08:00
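Point 5 above (invalid array text in CSV becomes NULL rather than a load error) can be illustrated with a hedged Python sketch; the accepted `[1,2,3]` literal syntax and the handling of the literal string "null" are assumptions about the CSV format.

```python
from typing import List, Optional

def parse_csv_array(text: str) -> Optional[List[str]]:
    """Parse a CSV array literal like "[1,2,3]"; return None (NULL) if malformed."""
    text = text.strip()
    if text.lower() == "null":
        return None
    if not (text.startswith("[") and text.endswith("]")):
        return None          # invalid format -> NULL instead of a load error
    inner = text[1:-1].strip()
    return [] if not inner else [e.strip() for e in inner.split(",")]

print(parse_csv_array("[1, 2, 3]"))  # ['1', '2', '3']
print(parse_csv_array("1, 2, 3"))    # None -> NULL (missing brackets)
print(parse_csv_array("null"))       # None -> NULL
```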
dbf71ed3be [feature-wip](new-scan) Support stream load with csv in new scan framework (#13354)
1. Refactor file reader creation in FileFactory for simplicity.
    Previously, FileFactory had too many `create_file_reader` interfaces.
    They are now unified into two categories: the interface used by the previous BrokerScanNode,
    and the interface used by the new FileScanNode.
    The creation paths for readers that read a `StreamLoadPipe` and readers that read files are also separated (see the sketch after this entry).

2. Modify the StreamLoadPlanner on the FE side to support using ExternalFileScanNode.

3. For a generic reader, the file reader is now created inside the reader instead of being passed in from outside.

4. Add test cases for CSV stream load; the behavior is the same as the old broker scanner.
2022-10-17 23:33:41 +08:00
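A toy Python sketch of the split described in point 1: one creation path for stream-load pipes, one for files. All names and signatures here are illustrative stand-ins, not the actual FileFactory API.

```python
import io
from typing import BinaryIO

class FileFactory:
    """Illustrative split: one creation path for stream-load pipes, one for files."""

    @staticmethod
    def create_pipe_reader(pipe_buffer: bytes) -> BinaryIO:
        # Readers that consume a StreamLoadPipe-like in-memory stream.
        return io.BytesIO(pipe_buffer)

    @staticmethod
    def create_file_reader(path: str) -> BinaryIO:
        # Readers that open a file by path.
        return open(path, "rb")

reader = FileFactory.create_pipe_reader(b"1,foo\n2,bar\n")
print(reader.read().decode())
```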
e0cff02c1a add sync for stream load test (#13185) 2022-10-09 11:36:01 +08:00
8b03977689 fix bug that the last line of data is lost in stream load when the line delimiter is more than one character (#13066) 2022-10-07 16:12:05 +08:00
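The bug class in #13066 (the last line dropped when the line delimiter is longer than one character) typically comes from discarding the trailing record that is not followed by a delimiter. A hedged Python sketch of the careful split:

```python
def split_lines(data: bytes, delimiter: bytes):
    """Split a byte buffer on a multi-byte line delimiter without losing the tail."""
    lines = data.split(delimiter)
    # If the data does not end with the delimiter, the final element is the last
    # record and must still be emitted, not discarded as "incomplete".
    if lines and lines[-1] == b"":
        lines.pop()
    return lines

print(split_lines(b"a||b||c", b"||"))    # [b'a', b'b', b'c'] - 'c' must not be lost
print(split_lines(b"a||b||c||", b"||"))  # [b'a', b'b', b'c']
```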
d10ab474f4 [fix](test) try to let cases run in parallel (#13114) 2022-10-04 20:56:22 +08:00
6fb9337095 [fix](test) add sync for some cases and adjust data path for tpch_unique_sql_zstd_p0 (#13102) 2022-10-01 21:26:50 +08:00
fb9e48a34a [fix](vstream load) Fix bug when loading json with jsonpath (#12660) 2022-09-19 10:13:18 +08:00
3030a3606a [fix](load) fix stream load failure when strict mode is set (#12684) 2022-09-17 17:02:11 +08:00
2a063355ad [fix](vstream load) Fix the default value insertion problem when importing json (#12601)
* [fix](vstream load) Fix the default value insertion problem when importing json

* update
2022-09-16 09:54:45 +08:00
353f9e3782 [regression](json) add a nullable case for stream load with json format (#12505) 2022-09-13 10:45:01 +08:00
a536030979 [FOLLOWUP](load) fix nullable and add regression (#12375)
* [FOLLOWUP](load) fix nullable and add regression
2022-09-08 00:05:04 +08:00
f3f17eb222 [Bugfix](load) fix BE coredump when parsing a malformed JSON file using simdjson (#12062)
* [Bugfix](load) fix BE coredump when parsing a malformed JSON file using simdjson
2022-08-26 18:01:19 +08:00
cfb90b39c7 (vec-stream-load-json) simdjson throws an exception leading to a core dump (#11880)
When config::enable_simdjson_parser=true in vectorized stream load, a core dump may occur when the JSON input is an invalid string like '{ "a', or when all fields are null like '{}'. These cases can make the simdjson library throw unhandled exceptions such as `Objects and arrays can only be iterated when they are first encountered`. We should handle these cases.

Signed-off-by: eldenmoon <15605149486@163.com>
2022-08-18 10:27:34 +08:00
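The defensive pattern the commit describes is to catch the parser's error instead of letting it escape and crash the BE. A small Python analogue under assumed semantics (Python's json instead of simdjson, and empty '{}' reported as a per-row error rather than crashing the process):

```python
import json

def parse_or_report(raw: str):
    """Parse one JSON record; on bad input return (None, error) instead of raising."""
    try:
        doc = json.loads(raw)
        if not isinstance(doc, dict) or not doc:
            return None, "empty or non-object JSON record"
        return doc, None
    except json.JSONDecodeError as e:
        return None, f"malformed JSON: {e}"

for raw in ['{ "a', "{}", '{"a": 1}']:
    print(parse_or_report(raw))
```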
ff1971f916 [improvement](test) add dryRun option and group all cases into either p0 or p1 (#11576)
1. add dryRun option to list tests
2. group all cases into p0 p1 p2
2022-08-17 22:45:53 +08:00