Commit Graph

14 Commits

Author SHA1 Message Date
dc80a993bc [feature-wip](new-scan) New load scanner. (#12275)
Related pr:
https://github.com/apache/doris/pull/11582
https://github.com/apache/doris/pull/12048

Using new file scan node and new scheduling framework to do the load job, replace the old broker scan node.
The load part (Be part) is work in progress. Query part (Fe) has been tested using tpch benchmark.

Please review only the FE code in this pr, BE code has been disabled by enable_new_load_scan_node configuration. Will send another pr soon to fix be side code.
2022-09-13 13:36:34 +08:00
8c8078ad28 [fix](projections) get error row_descriptor when have projections on ExecNode (#12232)
When ExecNode's projections is not empty, it use output row descriptor to initialize the block before doing projection. But we should use original row descriptor. This PR fix it.
2022-09-01 10:48:10 +08:00
dec576a991 [feature-wip](parquet-reader) generate null values and NullMap for parquet column (#12115)
Generate null values and NullMap for the nullable column by analyzing the definition levels.
2022-08-29 09:30:32 +08:00
5eb5444476 [fix](memtracker) Remove useless memory exceed check #11939 2022-08-22 08:40:19 +08:00
f39f57636b [feature-wip](parquet-reader) update column read model and add page index (#11601) 2022-08-16 15:04:07 +08:00
54f878b781 [feature-wip](multi-catalog) Support orc format file split for file scan node (#11046) 2022-07-25 11:41:46 +08:00
4960043f5e [enhancement] Refactor to improve the usability of MemTracker (step2) (#10823) 2022-07-21 17:11:28 +08:00
56e036e68b [feature-wip](multi-catalog) Support runtime filter for file scan node (#11000)
* [feature-wip](multi-catalog) Support runtime filter for file scan node

Co-authored-by: morningman <morningman@apache.org>
2022-07-20 12:36:57 +08:00
60dd322aba [feature-wip](multi-catalog) Optimize threads and thrift interface of FileScanNode (#10942)
FileScanNode in be will launch as many threads as the number of splits.
The thrift interface of FileScanNode is excessive redundant.
2022-07-18 20:50:34 +08:00
24d824a783 [improvement](multi-catalog) Impl parallel for file scanner to improve the scanner performance (#10620)
Add multi-thread support in FileScanNode on be and impl the file spilt logic in fe.
2022-07-09 15:52:53 +08:00
c358a43f35 [feature-wip] support parquet predicate push down (#10512) 2022-07-08 23:11:25 +08:00
cff9ffa0e1 fix the inaccurate comments (#10617)
Co-authored-by: hucheng01 <hucheng01@baidu.com>
2022-07-06 17:54:43 +08:00
4ec6e3ee81 [refactor] Remove debug action since it is never used. (#10484)
Co-authored-by: yiguolei <yiguolei@gmail.com>
2022-06-29 20:37:51 +08:00
abd10f0f3e [feature-wip](multi-catalog) Impl FileScanNode in be (#10402)
Define a new file scanner node for hms table in be.
This file scanner node is different from broker scan node as blow:
1. Broker scan node will define src slot and dest slot, there is two memory copy in it: first is from file to src slot
    and second from src to dest slot. Otherwise FileScanNode only have one stemp memory copy just from file to dest slot.
2. Broker scan node will read all the filed in the file to src slot and FileScanNode only read the need filed.
3. Broker scan node will convert type into string type for src slot and then use cast to convert to dest slot type,
    but FileScanNode will have the final type.

Now FileScanNode is a standalone code, but we will uniform the file scan and broker scan in the feature.
2022-06-29 11:04:01 +08:00