doris

Author	SHA1	Message	Date
Zhengguo Yang	7297b275f1	[Optimize] Optimize cpu consumption when importing parquet files (#6782) Remove some dynamic_cast calls to reduce the overhead of type conversion, which probably reduces the cpu consumption of parquet file import by about 10%.	2021-10-03 12:14:35 +08:00
Mingyu Chen	6b0521032d	[Bug] Fix the problem of floating point precision when importing parquet data (#5360) The double value "4206.9" in parquet is converted to the decimal value "4206.8999" in Doris, which is incorrect.	2021-02-07 22:40:51 +08:00
Zhengguo Yang	93a4c7efc1	[LOG] Standardize the use of VLOG in code (#5264) At present, the use of VLOG in the code is inconsistent. Some calls use the VLOG_XX format inherited from impala, while others use the VLOG(number) format, which has no unified specification. This PR standardizes the use of VLOG.	2021-01-21 12:09:09 +08:00
sduzh	6fedf5881b	[CodeFormat] Clang-format cpp sources (#4965) Clang-format all c++ source files.	2020-11-28 18:36:49 +08:00
xinghuayu007	2331ce10f1	[Bug] Parquet map/list/struct structure recognize (#4968) When a parquet file contains a `Map/List/Struct` structure, Doris cannot recognize the column correctly and throws the exception 'Invalid column: xxxx', which means Doris cannot find the column. The `Map` structure is recognized as two columns: `key` and `value`. This patch tries to solve this problem.	2020-11-28 09:56:29 +08:00
sduzh	10e1e29711	Remove header file common/names.h (#4945)	2020-11-26 17:00:48 +08:00
Zhengguo Yang	da921928d0	[Code] Fix some spelling problems (#4066)	2020-07-13 20:54:31 +08:00
xy720	c50a310f8f	[optimize] Optimize spark load/broker load reading parquet format file (#3878) Add BufferedReader for reading parquet file via broker	2020-06-23 13:42:22 +08:00
HangyuanLiu	d01b58bff6	Support 64 bit timestamp in from_unixtime (#3069)	2020-03-17 17:30:42 +08:00
HangyuanLiu	64e99f29e6	Fix parquet arrow read batch bug (#2812) Fix parquet arrow read batch bug #2811 The original code determined the number of rows in a batch from the number of rows in the parquet RowGroup. But a batch now takes 65535 lines, so when the parquet row count is greater than 65535, the number of batches does not match the number of row groups. The code uses the field "_current_line_of_group" as an array position, which can index past the end of the array and cause a crash.	2020-01-21 10:57:56 +08:00
ZHAO Chun	87a50070c4	Fix bug: parquet scanner doesn't seek (#2661)	2020-01-06 13:55:40 +08:00
ZHAO Chun	1648226927	Adapt arrow 0.15 API (#2657) This CL supports arrow's zero copy read interface, which makes the code comply with arrow 0.15. The schema change unit test has some problems, so I disabled it in run-ut.sh.	2020-01-04 15:54:29 +08:00
yuanli	ba6d728f26	Enable parsing columns from file path for Broker Load (#1582) (#1635) Currently, we do not support parsing columns encoded in the file path, e.g. extracting column k1 from the file path /path/to/dir/k1=1/xxx.csv. This patch parses columns from the file path as Spark does (Partition Discovery). It parses the partition columns in BrokerScanNode.java and saves the parsing result for each file path as a property of TBrokerRangeDesc, so that the broker reader of BE can read the value of the specified partition column.	2019-08-19 09:39:21 +08:00
worker24h	a6d3099a68	Fix bug: localtime is not thread-safe, so changed to localtime_r. (#1614)	2019-08-08 22:00:43 +08:00
worker24h	dc4a5e6c10	Support Decimal Type when load Parquet File (#1595)	2019-08-07 19:52:23 +08:00
HangyuanLiu	9402456f5b	Fix parquet directory have empty file (#1593)	2019-08-07 15:08:22 +08:00
worker24h	aff1559c4d	FixBug: if the doris table has fewer columns than the parquet file, BE will crash (#1464)	2019-07-12 15:23:13 +08:00
HangyuanLiu	b9c79d4b1b	Fix importing non-parquet format file causing be crash (#1454)	2019-07-11 16:04:36 +08:00
worker24h	7eab12a40e	Support reading Parquet file when loading data (#1173)	2019-07-01 18:39:27 +08:00

19 Commits