This PR optimizes TopN queries like `SELECT * FROM tableX ORDER BY columnA ASC/DESC LIMIT N`.
A TopN plan is composed of a SortNode and a ScanNode. When the user table is wide (100+ columns), the ORDER BY clause usually touches only a few columns, but the ScanNode still has to scan all columns from the storage engine even when the limit is very small, which causes heavy read amplification. So in this PR I divide the TopN query into two phases:
1. In the first phase we read only `columnA` from the storage engine, along with an extra RowId column called `__DORIS_ROWID_COL__`. All other columns are pruned from the ScanNode.
2. The second phase runs in the ExchangeNode, because it is the central node for the TopN nodes in the cluster. The ExchangeNode issues RPCs to the other nodes with the RowIds read in the first phase (sorted and limited by the SortNode) and reads the matching rows from the storage engine row by row.
After the second-phase read, the Block contains all the data needed for the query.
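A minimal sketch of the two-phase idea (the struct and function names here are illustrative, not the actual Doris classes):

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Phase 1 reads only the ORDER BY column plus the hidden row id
// (__DORIS_ROWID_COL__); phase 2 fetches the remaining columns for the
// N survivors via RPC to the nodes that own the rows.
struct RowId { uint32_t segment; uint32_t ordinal; };  // illustrative layout
struct Phase1Row { int64_t sort_key; RowId rid; };     // columnA + row id

std::vector<RowId> phase1_sort_limit(std::vector<Phase1Row> rows, size_t n) {
    n = std::min(n, rows.size());
    std::partial_sort(rows.begin(), rows.begin() + n, rows.end(),
                      [](const Phase1Row& a, const Phase1Row& b) {
                          return a.sort_key < b.sort_key;  // ASC; flip for DESC
                      });
    std::vector<RowId> winners;
    winners.reserve(n);
    for (size_t i = 0; i < n; ++i) winners.push_back(rows[i].rid);
    return winners;  // shipped to the owning nodes in phase 2
}
```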
* [feature-wip](inverted index) Inverted index API: reader
* [feature-wip](inverted index) Fulltext query syntax with MATCH/MATCH_ANY/MATCH_ALL
* [feature-wip](inverted index) Adapt to index meta
* [enhance] add more metrics
* [enhance] add fulltext match query check for column type and index parser
* [feature-wip](inverted index) Support applying the inverted index to compound predicates, except for leaf nodes of AND nodes
When calling SELECT on remote files, download them to the local disk as cache files.
When calling ALTER TABLE on remote files, read them directly from remote storage. So even if the tablet is very large, it will not take up too much local disk space for cache files.
Reuse compression contexts and buffers.
Use one global instance per compression algorithm, and a thread-safe
buffer pool to reuse compression buffers. The pool size equals the maximum
number of parallel compression threads, so it will not grow too large.
Tests show this feature speeds up data import and compaction by about 5%.
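A minimal sketch of the buffer-pool idea, assuming simplified names (`CompressionBufferPool` is illustrative, not the exact class in this PR):

```cpp
#include <memory>
#include <mutex>
#include <string>
#include <vector>

// A global, thread-safe pool hands out compression buffers and takes them
// back, so parallel compression threads reuse memory instead of
// reallocating a scratch buffer per block.
class CompressionBufferPool {
public:
    std::unique_ptr<std::string> acquire() {
        std::lock_guard<std::mutex> l(_mu);
        if (_free.empty()) return std::make_unique<std::string>();
        auto buf = std::move(_free.back());
        _free.pop_back();
        return buf;
    }
    void release(std::unique_ptr<std::string> buf) {
        std::lock_guard<std::mutex> l(_mu);
        // Bounded by the max compression parallelism; excess buffers are dropped.
        if (_free.size() < kMaxPooled) _free.push_back(std::move(buf));
    }

private:
    static constexpr size_t kMaxPooled = 64;  // illustrative bound
    std::mutex _mu;
    std::vector<std::unique_ptr<std::string>> _free;
};
```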
Co-authored-by: yixiutt <yixiu@selectdb.com>
Store offsets rather than lengths in the file for array-typed data. The new file format improves seek performance; please refer to #12246 for the performance report.
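A small illustration of why offsets seek faster than lengths (hypothetical helper functions, not the actual file-format code):

```cpp
#include <cstdint>
#include <vector>

// With per-element lengths, locating element i requires a prefix sum over
// all earlier lengths; with absolute end offsets, it is one array lookup.
size_t seek_with_lengths(const std::vector<uint32_t>& lens, size_t i) {
    size_t pos = 0;
    for (size_t k = 0; k < i; ++k) pos += lens[k];  // O(i)
    return pos;
}

size_t seek_with_offsets(const std::vector<uint32_t>& offs, size_t i) {
    return i == 0 ? 0 : offs[i - 1];  // O(1): offs[k] = end offset of element k
}
```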
Co-authored-by: xy720 <22125576+xy720@users.noreply.github.com>
This PR supports rowset-level data upload on the BE side, so that a tablet can contain both cold data and hot data,
and there is no need to prohibit loading new data into cooled tablets.
Each rowset is bound to a `FileSystem`, so that the storage layer can read and write rowsets without
being aware of the underlying filesystem.
The abstracted `RemoteFileSystem` can try local caching strategies at different granularities,
instead of caching segment files as before.
To avoid conflicts with the code in be/src/io, we temporarily put the file system related code in the be/src/io/fs directory.
In the future, `FileReader`s and `FileWriter`s should be unified.
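A minimal sketch of the abstraction, with illustrative names and signatures rather than the actual interfaces in be/src/io/fs:

```cpp
#include <cstddef>
#include <string>

// Readers and writers hold a FileSystem and never care whether the bytes
// live on local disk or remote storage.
class FileSystem {
public:
    virtual ~FileSystem() = default;
    // Read `len` bytes at `offset` from `path` into `out`; returns success.
    virtual bool read_at(const std::string& path, size_t offset, size_t len,
                         char* out) = 0;
};

class LocalFileSystem : public FileSystem {
public:
    bool read_at(const std::string&, size_t, size_t, char*) override {
        return true;  // read from local disk (elided)
    }
};

class RemoteFileSystem : public FileSystem {
public:
    bool read_at(const std::string&, size_t, size_t, char*) override {
        // Could first consult a local cache at a chosen granularity
        // (block / file / rowset) before falling back to remote storage.
        return true;  // remote read (elided)
    }
};
```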
Currently, a column of type `Array<T>` consists of an `offsets` column and a `data` column, and the type of `offsets` is UInt32.
If we call array_union to merge arrays repeatedly, the cumulative size may overflow UInt32.
So we need to widen it before the `Array Data Type` release.
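A quick arithmetic sketch of the overflow (the row counts are hypothetical, chosen only to show the bound):

```cpp
#include <cstdint>
#include <iostream>

// The offsets column stores cumulative element counts across the whole
// column, so the running total can exceed UINT32_MAX (~4.29e9) long before
// any single array does.
int main() {
    const uint64_t rows = 5'000'000;       // hypothetical row count
    const uint64_t elems_per_row = 1'000;  // hypothetical array size
    const uint64_t total = rows * elems_per_row;  // 5e9 > UINT32_MAX
    std::cout << (total > UINT32_MAX) << '\n';    // prints 1: UInt64 is needed
}
```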
SQL with JOINs on tables containing ARRAY columns calls ColumnArray::replicate. In this PR,
we implement replicate for the ARRAY type, to support SQL like this:
`SELECT count(lo_array),count(d_array),SUM(lo_extendedprice*lo_discount) AS REVENUE FROM lineorder, date WHERE lo_orderdate = d_datekey AND d_year = 1993 AND lo_discount BETWEEN 1 AND 3 AND lo_quantity < 25;`
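A simplified sketch of the replicate semantics (this is not the real `ColumnArray::replicate` signature; the sketch uses per-row counts rather than cumulative offsets for clarity):

```cpp
#include <cstdint>
#include <vector>

// counts[i] says how many times row i must be duplicated, e.g. when a JOIN
// matches one left row against several right rows.
struct SimpleArrayColumn {
    std::vector<int64_t> data;      // flattened elements
    std::vector<uint32_t> offsets;  // offsets[i] = end of row i in data
};

SimpleArrayColumn replicate(const SimpleArrayColumn& src,
                            const std::vector<uint32_t>& counts) {
    SimpleArrayColumn dst;
    for (size_t i = 0; i < counts.size(); ++i) {
        const size_t begin = i == 0 ? 0 : src.offsets[i - 1];
        const size_t end = src.offsets[i];
        for (uint32_t c = 0; c < counts[i]; ++c) {
            dst.data.insert(dst.data.end(), src.data.begin() + begin,
                            src.data.begin() + end);
            dst.offsets.push_back(static_cast<uint32_t>(dst.data.size()));
        }
    }
    return dst;
}
```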
## Design:
For now, there are two categories of types in Doris: scalar types (such as int, char, etc.) and composite types (array, etc.). For the sake of performance, we can cache the type info of scalar types globally (as unique objects) because the number of scalar types is limited. For composite types, the type info is normally generated at runtime (a caching strategy could also be used to speed this up), so the memory allocated for composite type info must be reclaimed.
There are a lot of interfaces for getting the type info of a specific type. I reorganized them as follows.
1. `const TypeInfo* get_scalar_type_info(FieldType field_type)`
The function is used to get the type info of scalar types. Because the result is cached, the caller can use it **WITHOUT** worrying about memory reclamation.
2. `const TypeInfo* get_collection_type_info(FieldType sub_type)`
The function is used to get the type info of array types with a nesting depth of exactly **ONE**. Because the result is cached, the caller can use it **WITHOUT** worrying about memory reclamation.
3. `TypeInfoPtr get_type_info(segment_v2::ColumnMetaPB* column_meta_pb)`
4. `TypeInfoPtr get_type_info(const TabletColumn* col)`
These functions are used to get the type info of **BOTH** scalar types and composite types. The caller is responsible for managing the returned resources.
#### About the new type `TypeInfoPtr`
`TypeInfoPtr` is an alias for a `unique_ptr` with a custom deleter.
1. For scalar types, the deleter does nothing.
2. For composite types, the deleter reclaims the memory.
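A minimal sketch of the idea (simplified, with illustrative factory names):

```cpp
#include <memory>

// A unique_ptr whose deleter is a no-op for globally cached scalar TypeInfo
// and a real delete for dynamically created composite TypeInfo.
struct TypeInfo { virtual ~TypeInfo() = default; };

using TypeInfoPtr = std::unique_ptr<const TypeInfo, void (*)(const TypeInfo*)>;

TypeInfoPtr create_static_type_info_ptr(const TypeInfo* ti) {
    return TypeInfoPtr(ti, [](const TypeInfo*) { /* cached globally, never freed */ });
}

TypeInfoPtr create_dynamic_type_info_ptr(const TypeInfo* ti) {
    return TypeInfoPtr(ti, [](const TypeInfo* p) { delete p; });
}
```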
By analyzing the callers of `get_type_info`, these classes should hold TypeInfoPtr:
1. `Field`
2. `ColumnReader`
3. `DefaultValueColumnIterator`
Other classes are either constructed by the foregoing classes or hold them, so they can use the raw `TypeInfo` pointer directly for the sake of performance.
1. `ScalarColumnWriter` - holds `Field`
1. `ZoneMapIndexWriter` - created by `ScalarColumnWriter`, uses `type_info` from the field in `ScalarColumnWriter`
1. `IndexedColumnWriter` - created by `ZoneMapIndexWriter`, only uses scalar types.
2. `BitmapIndexWriter` - created by `ScalarColumnWriter`, uses `type_info` from the field in `ScalarColumnWriter`
1. `IndexedColumnWriter` - created by `BitmapIndexWriter`, uses `type_info` in `BitmapIndexWriter` and `BitmapIndexWriter` doesn't support `ArrayType`.
3. `BloomFilterIndexWriter` - created by `ScalarColumnWriter`, uses `type_info` from the field in `ScalarColumnWriter`
1. `IndexedColumnWriter` - created by `BloomFilterIndexWriter`, only uses scalar types.
2. `IndexedColumnReader` initializes `type_info` by the field type in meta (only scalar types).
3. `ColumnVectorBatch`
1. `ZoneMapIndexReader` creates `ColumnVectorBatch`, `ColumnVectorBatch` uses `type_info` in `IndexedColumnReader`
2. `BitmapIndexReader` supports scalar types only and it creates `ColumnVectorBatch`, `ColumnVectorBatch` uses `type_info` in `BitmapIndexReader`
3. `BloomFilterIndexReader` supports scalar types only and it creates `ColumnVectorBatch`, which uses the `type_info` in `BloomFilterIndexReader`
1. Member functions defined inside a class are implicitly inline by default, so the `inline` keyword does not need to be added.
2. `inline` is a keyword that applies to the implementation; it has no effect when placed before the function declaration.
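For example (a minimal illustration):

```cpp
// Both functions below end up inline: the in-class definition is implicitly
// inline; for the out-of-class one, the keyword matters on the definition,
// not the declaration.
struct Counter {
    int get() const { return _v; }  // implicitly inline: defined in the class
    void bump();                    // writing "inline" here would have no effect
private:
    int _v = 0;
};

inline void Counter::bump() { ++_v; }  // inline belongs on the definition
```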
Early Design Documentation: https://shimo.im/docs/DT6JXDRkdTvdyV3G
Implement a new way of collecting memory statistics based on the TCMalloc New/Delete hooks,
MemTracker, and TLS; it is expected that all memory new/delete/malloc/free calls
of the BE process can be counted.
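A minimal sketch of the hook-based accounting, using a single atomic counter in place of the real TLS-attached MemTracker (a simplification, not the code in this PR):

```cpp
#include <gperftools/malloc_hook.h>
#include <gperftools/tcmalloc.h>
#include <atomic>
#include <cstdint>

// Every allocation/deallocation in the process is observed by the hooks and
// charged to a tracker; here one global counter stands in for the
// thread-local MemTracker chain.
static std::atomic<int64_t> g_tracked_bytes{0};

static void new_hook(const void* /*ptr*/, size_t size) {
    g_tracked_bytes.fetch_add(static_cast<int64_t>(size),
                              std::memory_order_relaxed);
}

static void delete_hook(const void* ptr) {
    if (ptr == nullptr) return;
    // tc_malloc_size recovers the size tcmalloc actually reserved for ptr.
    g_tracked_bytes.fetch_sub(
            static_cast<int64_t>(tc_malloc_size(const_cast<void*>(ptr))),
            std::memory_order_relaxed);
}

void install_hooks() {
    MallocHook::AddNewHook(&new_hook);
    MallocHook::AddDeleteHook(&delete_hook);
}
```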
Modify the implementation of MemTracker:
1. Simplify away a lot of useless logic;
2. Add MemTrackerTaskPool as the ancestor of all query and import trackers; it is used to track the local memory usage of all executing tasks;
3. Add a consume/release cache: a consume/release is triggered only when the accumulated memory exceeds the parameter mem_tracker_consume_min_size_bytes (see the sketch after this list);
4. Add a new memory leak detection mode (experimental feature): when a MemTracker is destructed, throw an exception if the remaining tracked value is greater than the specified range, and print the accurate tracked value over HTTP; controlled by the parameter memory_leak_detection;
5. Add a virtual MemTracker whose consume/release is not synced to the parent; it will be used later, when the TCMalloc hooks are introduced to record memory, to track specified memory independently;
6. Modify the GC logic: register the buffers cached in DiskIoMgr as a GC function, with more GC functions to be added later;
7. Change the global root node from Root MemTracker to Process MemTracker, and remove the Process MemTracker from exec_env;
8. Modify the macro that detects whether the memory has reached the upper limit, modify the parameters and default behavior of creating MemTracker, modify the error message format in mem_limit_exceeded, extend and apply transfer_to, remove Metric in MemTracker, etc.;
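A simplified sketch of the consume/release cache mentioned in item 3 (illustrative class, with an atomic counter standing in for the real MemTracker):

```cpp
#include <atomic>
#include <cstdint>

// Small allocations accumulate in an untracked local counter and are
// flushed to the shared tracker only when they exceed the threshold
// (mem_tracker_consume_min_size_bytes), cutting atomic contention.
class CachingConsumer {
public:
    CachingConsumer(std::atomic<int64_t>* tracker, int64_t flush_threshold)
            : _tracker(tracker), _threshold(flush_threshold) {}

    void consume(int64_t bytes) {  // negative bytes == release
        _untracked += bytes;
        if (_untracked >= _threshold || _untracked <= -_threshold) flush();
    }

    void flush() {
        _tracker->fetch_add(_untracked, std::memory_order_relaxed);
        _untracked = 0;
    }

private:
    std::atomic<int64_t>* _tracker;  // stands in for the real MemTracker
    int64_t _threshold;
    int64_t _untracked = 0;
};
```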
Modify where MemTracker is used:
1. MemPool adds a constructor to create a temporary tracker to avoid a lot of redundant code;
2. Added trackers for global objects such as ChunkAllocator and StorageEngine;
3. Added more fine-grained trackers such as ExprContext;
4. RuntimeState removes FragmentMemTracker (that is, the PlanFragmentExecutor mem_tracker), which was previously used to independently track scan memory, and replaces it with _scanner_mem_tracker in OlapScanNode;
5. MemTracker is no longer recorded in ReservationTracker, and ReservationTracker will be removed later;
# Proposed changes
Issue Number: close #6238
Co-authored-by: HappenLee <happenlee@hotmail.com>
Co-authored-by: stdpain <34912776+stdpain@users.noreply.github.com>
Co-authored-by: Zhengguo Yang <yangzhgg@gmail.com>
Co-authored-by: wangbo <506340561@qq.com>
Co-authored-by: emmymiao87 <522274284@qq.com>
Co-authored-by: Pxl <952130278@qq.com>
Co-authored-by: zhangstar333 <87313068+zhangstar333@users.noreply.github.com>
Co-authored-by: thinker <zchw100@qq.com>
Co-authored-by: Zeno Yang <1521564989@qq.com>
Co-authored-by: Wang Shuo <wangshuo128@gmail.com>
Co-authored-by: zhoubintao <35688959+zbtzbtzbt@users.noreply.github.com>
Co-authored-by: Gabriel <gabrielleebuaa@gmail.com>
Co-authored-by: xinghuayu007 <1450306854@qq.com>
Co-authored-by: weizuo93 <weizuo@apache.org>
Co-authored-by: yiguolei <guoleiyi@tencent.com>
Co-authored-by: anneji-dev <85534151+anneji-dev@users.noreply.github.com>
Co-authored-by: awakeljw <993007281@qq.com>
Co-authored-by: taberylyang <95272637+taberylyang@users.noreply.github.com>
Co-authored-by: Cui Kaifeng <48012748+azurenake@users.noreply.github.com>
## Problem Summary:
### 1. Some code from ClickHouse
**ClickHouse is an excellent vectorized execution engine database,
and we have referenced and learned a lot from its excellent implementation in terms of
data structures and function implementations.
Our work is based on ClickHouse v19.16.2.2, and we would like to thank the ClickHouse community and developers.**
The following comment has been added to code taken from ClickHouse, e.g.:
// This file is copied from
// https://github.com/ClickHouse/ClickHouse/blob/master/src/Interpreters/AggregationCommon.h
// and modified by Doris
### 2. Supported exec nodes and queries:
* vaggregation_node
* vanalytic_eval_node
* vassert_num_rows_node
* vblocking_join_node
* vcross_join_node
* vempty_set_node
* ves_http_scan_node
* vexcept_node
* vexchange_node
* vintersect_node
* vmysql_scan_node
* vodbc_scan_node
* volap_scan_node
* vrepeat_node
* vschema_scan_node
* vselect_node
* vset_operation_node
* vsort_node
* vunion_node
* vhash_join_node
You can run the SSB/TPC-H query sets and about 70% of the TPC-DS standard query set on the exec engine.
### 3. Data Model
The vectorized exec engine supports **Dup/Agg/Unq** tables, and the Block Reader is vectorized.
Vectorization at the segment level is a work in progress.
### 4. How to use
1. Set the session variable `set enable_vectorized_engine = true;` (required)
2. Set the session variable `set batch_size = 4096;` (recommended)
### 5. Some differences from the original exec engine
https://github.com/doris-vectorized/doris-vectorized/issues/294
## Checklist(Required)
1. Does it affect the original behavior: (No)
2. Has unit tests been added: (Yes)
3. Has document been added or modified: (No)
4. Does it need to update dependencies: (No)
5. Are there any changes that cannot be rolled back: (Yes)
First, we need to add a parameter to describe whether the data is local or remote.
Then, we need to support some basic functions for operating on remote storage.
* [Enhance] Make MemTracker more accurate (#5515)
This PR is mainly about:
1. Improve the readability of MemTrackers' name
2. Add MemTrackers for:
* Load
* Compaction
* SchemaChange
* StoragePageCache
* TabletManager
3. Change SchemaChange to a Singleton
* revise some code for Code Review
* change the name of mem_tracker
* keep reader_context alive for the same lifetime as rowset_reader in schema change.
* change vlog notice to log(warning) in schema change
#4995
**Implementation of Separated Page Cache**
- Add config "index_page_cache_ratio" to set the capacity ratio of the index page cache
- Change the members of StoragePageCache to maintain two types of cache
- Change the interface of StoragePageCache to select the type of cache
- Change the usage of the page cache in read_and_decompress_page in page_io.cpp
  - add the page type as an argument
  - check whether the current page type is available in StoragePageCache (covering the cases ratio == 0 and ratio == 1)
- Add the type as an argument in callers of read_and_decompress_page
- Change Unit Test
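A sketch of how the ratio could carve up the capacity (assumed arithmetic, not the exact StoragePageCache code):

```cpp
#include <cstddef>

// The configured ratio splits the total page-cache capacity into an
// index-page part and a data-page part; ratio == 0 or == 1 effectively
// disables one of the two caches.
struct PageCacheSizes {
    size_t index_bytes;
    size_t data_bytes;
};

PageCacheSizes split_capacity(size_t total_bytes, double index_page_cache_ratio) {
    PageCacheSizes s;
    s.index_bytes = static_cast<size_t>(total_bytes * index_page_cache_ratio);
    s.data_bytes = total_bytes - s.index_bytes;
    return s;  // a page type with zero capacity is simply not cached
}
```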
This CL includes:
* Change the column metadata to a tree structure.
* Refactor segment_v2.ColumnReader and segment_v2.ColumnWriter to support complex types.
* Implement the reading and writing of the array type.
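An illustrative sketch of tree-shaped column metadata (not the actual ColumnMetaPB schema):

```cpp
#include <memory>
#include <string>
#include <vector>

// An array column's meta owns child metas, e.g. an offsets child and an
// item child; the item may itself be an array, so nesting falls out of the
// tree structure naturally.
struct ColumnMetaNode {
    std::string type;  // e.g. "ARRAY", "BIGINT", "OFFSETS"
    std::vector<std::unique_ptr<ColumnMetaNode>> children;
};

std::unique_ptr<ColumnMetaNode> make_array_meta(std::unique_ptr<ColumnMetaNode> item) {
    auto node = std::make_unique<ColumnMetaNode>();
    node->type = "ARRAY";
    auto offsets = std::make_unique<ColumnMetaNode>();
    offsets->type = "OFFSETS";
    node->children.push_back(std::move(offsets));
    node->children.push_back(std::move(item));  // recursion handles Array<Array<T>>
    return node;
}
```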