15 Commits

Author SHA1 Message Date
6eb8ac0ebf [feature-wip][multi-catalog]Support caseSensitive field name in file scan node (#11310)
* Implement case-sensitive field names in file scan node
2022-08-05 08:03:16 +08:00
0ac5228c05 [feature-wip][multi-catalog]Support prefetch for orc file format (#11292)
Refactor the prefetch code in parquet and support prefetch for orc file format
2022-08-02 11:01:15 +08:00
54f878b781 [feature-wip](multi-catalog) Support orc format file split for file scan node (#11046) 2022-07-25 11:41:46 +08:00
56e036e68b [feature-wip](multi-catalog) Support runtime filter for file scan node (#11000)
* [feature-wip](multi-catalog) Support runtime filter for file scan node

Co-authored-by: morningman <morningman@apache.org>
2022-07-20 12:36:57 +08:00
989e6d1cf9 [chore]fix clang compile error (#11021) 2022-07-20 08:28:47 +08:00
8a366c9ba2 [feature](multi-catalog) read parquet file by start/offset (#10843)
To avoid reading the same row group more than once, we should align offsets
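
The commit message does not say how the offsets are aligned. As an illustration only, one common convention is to assign a row group to the split that contains its midpoint, so adjacent splits never read the same row group. The `RowGroup` struct and `row_groups_for_split` function below are hypothetical names, not taken from the Doris code.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical byte range of one row group inside a parquet file.
struct RowGroup {
    int64_t start;
    int64_t length;
};

// Returns the indices of the row groups whose midpoint lies in
// [split_start, split_start + split_size). Because every midpoint falls in
// exactly one split, no row group is read twice (assumed rule, see above).
std::vector<int> row_groups_for_split(const std::vector<RowGroup>& groups,
                                      int64_t split_start, int64_t split_size) {
    std::vector<int> picked;
    const int64_t split_end = split_start + split_size;
    for (int i = 0; i < static_cast<int>(groups.size()); ++i) {
        const int64_t mid = groups[i].start + groups[i].length / 2;
        if (mid >= split_start && mid < split_end) {
            picked.push_back(i);
        }
    }
    return picked;
}
```

With three 100-byte row groups and splits of 120, 120, and 60 bytes, each split picks up exactly one row group even though the split boundaries cut through row groups.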
2022-07-18 20:51:08 +08:00
f21ce35059 [refactor]remove unused private field _profile (#10732) 2022-07-11 14:04:09 +08:00
639f1cd26c [improvement](parquet-reader) Add some profile for parquet reader (#10740) 2022-07-11 12:19:06 +08:00
c358a43f35 [feature-wip] support parquet predicate push down (#10512) 2022-07-08 23:11:25 +08:00
7c330e38d9 [fix](multi-catalog)Fix coredump when reading the parquet file for multi-thread (#10635)
Two issues are fixed in this PR:

**The first issue** violates the C++ rule `do not call virtual functions in a constructor or destructor`.
The destructor of `ArrowReaderWrap` calls the virtual function `close()`.
During destruction this never dispatches to `ParquetReaderWrap::close()`; only `ArrowReaderWrap::close()` runs.
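
A minimal sketch of this dispatch behaviour, using hypothetical stand-in classes (`BaseReader`, `LeakyReader`, `FixedReader`) rather than the real Doris readers. The fix shown in `FixedReader`, calling `close()` from the derived destructor, is an assumption for illustration; the commit message does not describe the actual change.

```cpp
#include <memory>
#include <string>
#include <vector>

// Records which close() implementations ran, so the dispatch is observable.
std::vector<std::string> g_calls;

// Hypothetical stand-in for the base reader.
class BaseReader {
public:
    virtual ~BaseReader() {
        // The derived part is already destroyed here, so this call always
        // resolves to BaseReader::close(), never to an override.
        close();
    }
    virtual void close() { g_calls.push_back("BaseReader::close"); }
};

// Relies on the base destructor to call its override: the override never runs.
class LeakyReader : public BaseReader {
public:
    void close() override { g_calls.push_back("LeakyReader::close"); }
};

// Assumed fix: release derived resources in the derived destructor itself.
class FixedReader : public BaseReader {
public:
    ~FixedReader() override { close(); }
    void close() override { g_calls.push_back("FixedReader::close"); }
};

// Destroys a T through a base pointer and returns the order of close() calls.
template <typename T>
std::vector<std::string> destroy_and_trace() {
    g_calls.clear();
    std::unique_ptr<BaseReader> reader = std::make_unique<T>();
    reader.reset();
    return g_calls;
}
```

Here `destroy_and_trace<LeakyReader>()` records only the base `close()`, while `destroy_and_trace<FixedReader>()` records the derived `close()` followed by the base one.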

**The second issue** is the concurrent destruction of `ParquetReaderWrap` and `prefetch_batch`.
`prefetch_batch` uses `thread.detach()` to run independently of `ParquetReaderWrap`, but it relies on variables owned by `ParquetReaderWrap` such as **`_closed`, `_total_groups`, and `_reader`**.

In this case, the `ParquetReaderWrap` destructor may run before `prefetch_batch` finishes, which causes the core dump.
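
A minimal sketch of one way to avoid this kind of race, assuming the worker thread is joined in the destructor instead of detached. `PrefetchingReader`, `prefetch_loop`, and `prefetch_all` are hypothetical names; only `_closed` and `_total_groups` echo the fields mentioned above, and the real fix in the PR may differ.

```cpp
#include <atomic>
#include <thread>

// Hypothetical reader: the background thread touches members of this object,
// so the thread must not outlive it.
class PrefetchingReader {
public:
    explicit PrefetchingReader(int total_groups)
            : _total_groups(total_groups), _worker([this] { prefetch_loop(); }) {}

    ~PrefetchingReader() {
        _closed.store(true);  // ask the worker to stop
        if (_worker.joinable()) {
            _worker.join();  // wait until it no longer touches our members
        }
    }

    int fetched() const { return _fetched.load(); }

private:
    void prefetch_loop() {
        while (!_closed.load() && _fetched.load() < _total_groups) {
            _fetched.fetch_add(1);  // stand-in for prefetching one row group
        }
    }

    std::atomic<bool> _closed{false};
    const int _total_groups;
    std::atomic<int> _fetched{0};
    std::thread _worker;  // declared last so it starts after the other members exist
};

// Waits for the worker to prefetch every group, then destroys the reader.
int prefetch_all(int total_groups) {
    PrefetchingReader reader(total_groups);
    while (reader.fetched() < total_groups) {
        std::this_thread::yield();
    }
    return reader.fetched();
}
```

Joining trades a possible wait in the destructor for the guarantee that the worker never reads freed members; with `detach()` no such guarantee exists.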
2022-07-08 14:54:10 +08:00
Pxl
fd0bd395ac [Enhancement] Remove some unused include (#10035) 2022-06-17 10:47:25 +08:00
19bc14cf8d [feature-wip](array-type) Add array type support for vectorized parquet-orc scanner (#9856)
Only one level of array nesting is supported for now.
For example:
- nullable(array(nullable(tinyint))) is **supported**.
- nullable(array(nullable(array(xx)))) is **not supported**.
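
The one-level restriction can be pictured as a depth check on the column type. The sketch below uses a hypothetical `TypeDesc` structure, not the scanner's real type descriptor, and assumes nullability wrappers do not count toward nesting depth.

```cpp
#include <memory>
#include <utility>

// Hypothetical, simplified type descriptor: either a scalar or an array
// of some element type. Nullability is deliberately ignored.
struct TypeDesc {
    bool is_array = false;
    std::shared_ptr<TypeDesc> element;  // set only when is_array is true
};

TypeDesc scalar() {
    return TypeDesc{};
}

TypeDesc array_of(TypeDesc elem) {
    TypeDesc t;
    t.is_array = true;
    t.element = std::make_shared<TypeDesc>(std::move(elem));
    return t;
}

// Number of array levels wrapped around the innermost scalar.
int array_depth(const TypeDesc& t) {
    if (!t.is_array) {
        return 0;
    }
    return 1 + (t.element ? array_depth(*t.element) : 0);
}

// Mirrors the rule stated above: at most one level of array nesting.
bool is_supported(const TypeDesc& t) {
    return array_depth(t) <= 1;
}
```

Under these assumptions, `is_supported(array_of(scalar()))` is true and `is_supported(array_of(array_of(scalar())))` is false.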
2022-06-09 12:11:47 +08:00
94089b9192 [Refactor] Use file factory to replace create file reader/writer (#9505)
1. Simplify code logic and improve abstraction
2. Fix the memory leak caused by a raw pointer

Co-authored-by: lihaopeng <lihaopeng@baidu.com>
2022-06-08 15:07:39 +08:00
f377c26bf7 [refactor][be] Optimize headers (#9708) 2022-05-30 16:12:10 +08:00
cbbda7857b [feature-wip](parquet-orc) Support orc scanner in vectorized engine (#9541) 2022-05-26 21:39:12 +08:00