Commit Graph

10 Commits

Author SHA1 Message Date
64e99f29e6 Fix parquet arrow read batch bug (#2812)
Fix parquet arrow read batch bug
#2811

The original code was to determine the number of rows in the batch based on the number of rows in the parquet RowGroup.But now it's a batch take 65535 lines. So when parquet row greater than 65535,the number of batch don't match the number of rowgroup. The code using the field "_current_line_of_group" as a position of array can cause the data to be out of array cause be crash
2020-01-21 10:57:56 +08:00
87a50070c4 Fix bug: parquet scanner don't seek (#2661) 2020-01-06 13:55:40 +08:00
1648226927 Adapt arrow 0.15 API (#2657)
This CL supports arrow's zero copy read interface, which can make code
comply with arrow 0.15.
And the schema change unit test has some problem, I disable it in run-ut.sh
2020-01-04 15:54:29 +08:00
ba6d728f26 Enable parsing columns from file path for Broker Load (#1582) (#1635)
Currently, we do not support parsing encoded/compressed columns in file path, eg: extract column k1 from file path /path/to/dir/k1=1/xxx.csv

This patch is able to parse columns from file path like in Spark(Partition Discovery).

This patch parse partition columns at BrokerScanNode.java and save parsing result of each file path as a property of TBrokerRangeDesc, then the broker reader of BE can read the value of specified partition column.
2019-08-19 09:39:21 +08:00
a6d3099a68 Fix bug: localtime is not thread-safe,then changed to localtime_r. (#1614) 2019-08-08 22:00:43 +08:00
dc4a5e6c10 Support Decimal Type when load Parquet File (#1595) 2019-08-07 19:52:23 +08:00
9402456f5b Fix parquet directory have empty file (#1593) 2019-08-07 15:08:22 +08:00
aff1559c4d FixBug: if columns of doris table less than parquet file columns , BE will be crash (#1464) 2019-07-12 15:23:13 +08:00
b9c79d4b1b Fix importing non-parquet format file causing be crash (#1454) 2019-07-11 16:04:36 +08:00
7eab12a40e Support reading Parquet file when loading data (#1173) 2019-07-01 18:39:27 +08:00