doris

Author	SHA1	Message	Date
xy720	370b92a2ea	[Bug](compile) fix clang build for be (#12183 )	2022-08-30 17:58:54 +08:00
Kikyou1997	9a74ad1702	[feature](Nereids)add the ability of projection on each ExecNode and add column prune on OlapScan (#11842 ) We have added logical project before, but to actually finish the prune to reduce the data IO, we need to add related supports in translator and BE. This PR: - add projections on each ExecNode in BE - translate PhysicalProject into projections on PlanNode in FE - do column prune on ScanNode in FE Co-authored-by: HappenLee <happenlee@hotmail.com>	2022-08-30 16:17:10 +08:00
Mingyu Chen	a16cf0e2c8	[feature-wip](scan) add profile for new olap scan node (#12042 ) Copy most of the profiles from VOlapScanNode and VOlapScanner to NewOlapScanNode and NewOlapScanner. Fix some blocking bugs in the new scan framework. TODO: Memtracker, OpenTelemetry span. The new framework is still disabled by default, so it will not affect other features.	2022-08-30 10:55:48 +08:00
Xinyi Zou	8370115cf6	[enhancement](memtracker) Improve performance of tracking real physical memory of PODArray #12168	2022-08-30 10:22:12 +08:00
yiguolei	2f192019d3	[bugfix](delete handler) delete predicate is merged and could not find schema, causing a core dump (#12161 ) Co-authored-by: yiguolei <yiguolei@gmail.com>	2022-08-30 09:18:21 +08:00
Pxl	67e94d2aea	[Enhancement](compaction) add compaction use time count (#12141 )	2022-08-30 09:18:02 +08:00
Xinyi Zou	09b8d32421	[fix](memtracker) Fix mem limit exceed return wrong format (#12139 )	2022-08-29 21:07:02 +08:00
Zhengguo Yang	ed131b8eb0	[Bugfix](coredump) fix coredump caused by malformed fmt::format param (#12138 )	2022-08-29 12:45:22 +08:00
Gabriel	af09c1f4eb	[Improvement](window funnel) restrict timestamp to datetime type in window funnel (#12123 )	2022-08-29 12:14:04 +08:00
Gabriel	5f7d6e8f2b	[Refactor](predicate) Unify Conditions and ColumnPredicate (#11985 )	2022-08-29 12:11:22 +08:00
minghong	62e3bd338e	[refactor](BE) return error status when vslot_ref contains invalid slot_id (#12106 ) In the current implementation, we detect invalid slots at the execute phase, where it is hard to get useful information for further debugging. This PR moves error detection ahead to the prepare phase, so that we can log the related tuple descriptors.	2022-08-29 12:07:08 +08:00
plat1ko	db07e51cd3	[refactor](status) Refactor status handling in agent task (#11940 ) Refactor TaggableLogger. Refactor status handling in agent task: unify the log format in TaskWorkerPool; pass Status to the top caller, and replace some OLAPInternalError with a Status carrying a more detailed error message; return early on the opposite condition to reduce indentation.	2022-08-29 12:06:01 +08:00
pengxiangyu	ac425d4bf3	[fix](remote)Fix bug for cache reader (#12104 )	2022-08-29 11:28:17 +08:00
carlvinhust2012	44c4a45f72	[fix](array-type) fix the wrong data when use stream load to import '\N' (#12102 ) Co-authored-by: hucheng01 <hucheng01@baidu.com>	2022-08-29 09:53:37 +08:00
Ashin Gau	dec576a991	[feature-wip](parquet-reader) generate null values and NullMap for parquet column (#12115 ) Generate null values and NullMap for the nullable column by analyzing the definition levels.	2022-08-29 09:30:32 +08:00
Gabriel	6e6269c682	[Improvement](load) accelerate streamload and compaction (#12119 )	2022-08-28 23:10:47 +08:00
pengxiangyu	a6e2e2f3bc	[feature](remote)Add cache files cleaner for remote olap files (#11959 )	2022-08-26 23:59:36 +08:00
Ashin Gau	0b5bb565a7	[feature-wip](parquet-reader) parquet dictionary decoder (#11981 ) Parse parquet data with dictionary encoding. Using the PLAIN_DICTIONARY enum value is deprecated in the Parquet 2.0 specification. Prefer using RLE_DICTIONARY in a data page and PLAIN in a dictionary page for Parquet 2.0+ files. refer: https://github.com/apache/parquet-format/blob/master/Encodings.md	2022-08-26 19:24:37 +08:00
Zhengguo Yang	f3f17eb222	[Bugfix](load) fix be will coredump when parsing malformed json file using simdjson (#12062 )	2022-08-26 18:01:19 +08:00
carlvinhust2012	fba2658a1d	[fix](array-type) fix the be core dump when use collect_list result to insert (#12045 ) Co-authored-by: hucheng01 <hucheng01@baidu.com>	2022-08-26 18:00:43 +08:00
Pxl	3af0745c8f	[Bug](function) fix aggFnParams set not correct (#12006 )	2022-08-26 14:29:56 +08:00
Xinyi Zou	22157077e9	[fix](memtracker) Optimize the return msg of process memory limit exceed #12086 Return the real process memory information when the process exceeds mem limit. Optimize the memory exceed limit log printing logic. Process tracker does not participate in process memory limit.	2022-08-26 14:28:46 +08:00
Xinyi Zou	9caaa4bfbd	[fix](memory) fix set disable_chunk_allocator_in_vec=false performance #12092	2022-08-26 14:28:12 +08:00
yiguolei	ccff3f5711	[bugfix](light weight schema change) support delete condition in schema change (#11869 ) Co-authored-by: yiguolei <yiguolei@gmail.com>	2022-08-26 11:45:55 +08:00
Xinyi Zou	82ca62dfcc	[fix](memory) Fix disable_mem_pools to disable cache #12087	2022-08-26 11:43:19 +08:00
camby	0f4a1e811b	[Enhancement](table_function) table function node enhancement (#12038 ) * table function node enhancement * also avoid copy for non-vec table function node * fix table function node output slots calculation while lateral view involves subquery Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>	2022-08-26 10:37:15 +08:00
zhannngchen	ba11d8dc67	[feature-wip](unique-key-merge-on-write) fix bugs on tablet clone #12067	2022-08-26 10:37:00 +08:00
slothever	0c16740f5c	[feature-wip](parquet-reader) parquet scanner can read data (#11970 ) Co-authored-by: jinzhe <jinzhe@selectdb.com>	2022-08-26 09:43:46 +08:00
Xin Liao	721d418a2f	[feature-wip](unique-key-merge-on-write) fix that version is always 0 when update delete bitmap (#12044 )	2022-08-26 09:41:55 +08:00
zhannngchen	e5bfbbe761	[feature-wip](unique-key-merge-on-write) support alter table column for MoW (#12052 )	2022-08-26 09:40:11 +08:00
Gabriel	17b809210a	[Bug](runtime filter) fix bug for late-arrival runtime filters (#12049 )	2022-08-26 09:13:10 +08:00
Jerry Hu	e3ab2caef8	[improvement](sink) Support local exchange for multi fragment instances (#12017 )	2022-08-25 19:28:23 +08:00
plat1ko	588dc5f12a	[feature](cold_on_s3) Show remote data usage via SHOW BACKENDS and SHOW TABLETS statements (#11450 )	2022-08-25 15:36:15 +08:00
Mingyu Chen	003fdf2b36	[fix](scan) use serial scan thread token only for scan node (#12058 ) Only when the scan node's limit is less than 1024 can we use the serial thread token to submit scanners; otherwise it will slow down the query.	2022-08-25 14:54:02 +08:00
Pxl	620d33a763	[Enhancement](optimize) set result_size_hint to filter_block (#11972 )	2022-08-25 11:42:52 +08:00
Gabriel	73a3471fbd	[minor](conjuncts) remove row-based conjuncts from vectorized engine (#12053 )	2022-08-25 10:13:20 +08:00
zxealous	54fc038dc5	[Fix](remote) Fix thread safety issue in cache (#11984 )	2022-08-24 18:14:14 +08:00
Jerry Hu	f875684345	[fix](agg) Crashing caused by serialization in streaming aggregation (#12027 )	2022-08-24 14:38:25 +08:00
Xinyi Zou	1304a17600	[fix](memtracker) Improve performance of tracking real physical memory of PodArray #12021	2022-08-24 14:24:14 +08:00
Userwhite	fb3c00c943	[Improvement](storage) reuse schema and rowblockv2 on single scanner_thread (#11392 ) * support reuse rowblockv2 on single thread	2022-08-24 13:42:10 +08:00
Xin Liao	ba85c06a68	[feature-wip](unique-key-merge-on-write) fix that IndexedColumnIterator next batch may return empty result (#11928 )	2022-08-24 08:53:44 +08:00
HappenLee	3abc4f357f	[Bug](bitmap) intersect_count function use in string cause ASAN error (#11936 )	2022-08-24 08:51:53 +08:00
carlvinhust2012	5d627e41a4	[fix](array-type) fix the be core dump when import number larger than uint64 (#11853 ) Co-authored-by: hucheng01 <hucheng01@baidu.com>	2022-08-24 08:51:12 +08:00
Xinyi Zou	1fc5515a78	[enhancement](memory) Remove unused reservation tracker (#11969 )	2022-08-24 08:49:34 +08:00
Mingyu Chen	d06edd4b8b	[minor](runtime-filter) add DCHECK for runtimefilter bug (#11996 ) Not a fix, just adds debug info to try to find the root cause of #11995	2022-08-24 07:53:30 +08:00
carlvinhust2012	cbbf4e10ff	[fix](array-type) fix be occasional coredump when use stream load (#11997 ) Co-authored-by: hucheng01 <hucheng01@baidu.com>	2022-08-23 21:54:00 +08:00
HappenLee	1056a6d8c7	[bug](compaction) fix bug of coredump of filter delete chose wrong filter column (#12002 ) * clang format	2022-08-23 21:52:11 +08:00
TengJianPing	55fdb555be	[bugfix](dict) fix coredump of dict column range predicate when there is null value (#11967 )	2022-08-23 16:07:48 +08:00
yixiutt	60fddd56e7	[feature-wip](unique-key-merge-on-write) opt lock and only save valid delete_bitmap (#11953 ) 1. use rlock in most logic instead of wrlock 2. filter stale rowset's delete bitmap in save meta 3. add a delete_bitmap lock to handle compaction and publish_txn conflict Co-authored-by: yixiutt <yixiu@selectdb.com>	2022-08-23 14:43:40 +08:00
Mingyu Chen	05da3d947f	[feature-wip](new-scan) add scanner scheduling framework (#11582 ) There are currently many types of ScanNodes in Doris, and most of their logic is the same, including: runtime filters, predicate pushdown, and Scanner generation and scheduling. So I intend to unify the common logic of all ScanNodes. Different data sources only need to implement different Scanners for data access, so that future scan optimizations can be applied to all data sources while also reducing code duplication. This PR mainly adds 4 new classes: VScanner: the parent class of all Scanners. Subclasses inherit from it to implement specific data access methods. VScanNode: the unified ScanNode, responsible for common logic including RuntimeFilter, predicate pushdown, and Scanner generation and scheduling. ScannerContext: responsible for recording the execution status of the group of Scanners corresponding to a ScanNode, including how many scanners are being scheduled, and for maintaining a producer-consumer block queue between scanners and scan nodes. ScannerContext is also the scheduling unit of ScannerScheduler. ScannerScheduler: responsible for all Scanner scheduling tasks. It schedules one ScannerContext at a time and submits the Scanners to the scanner thread pool for data scanning. Test: This work is still in progress and is disabled by default. I tested it with JMeter at a concurrency of 50, but currently the scanner just returns without data. The QPS can reach about 9000. I can't compare it to the original implementation because no data is read for now. I will test it when the new olap scanner is ready. Co-authored-by: morningman <morningman@apache.org>	2022-08-23 08:45:18 +08:00

2684 Commits