doris

Author	SHA1	Message	Date
HuangWei	10f822eb43	[MemTracker] make all MemTrackers shared (#4135) We make all MemTrackers shared in order to show real-time MemTracker consumption on the web. The changes are as follows: 1. Nearly all MemTracker raw ptrs -> shared_ptr. 2. Use CreateTracker() to create a new MemTracker (so that it adds itself to its parent). 3. RowBatch & MemPool still use raw ptrs of MemTracker; it's easy to ensure that the RowBatch & MemPool destructors execute before the MemTracker's destructor, so we don't change this code. 4. MemTracker can use a RuntimeProfile counter to calculate consumption, so RuntimeProfile's counters need to be shared too. We add a shared counter pool to store the shared counters and don't change the other counters of RuntimeProfile. Note that this PR doesn't change the MemTracker tree structure, so there are still some orphan trackers, e.g. RowBlockV2's MemTracker. If you find that some shared MemTrackers track little memory but are too time-consuming, you can make them orphans; then it's fine to use the raw ptr.	2020-07-31 21:57:21 +08:00
Mingyu Chen	0224d49842	[Fix][Bug] Fix compile bug (#3888) Co-authored-by: chenmingyu <chenmingyu@baidu.com>	2020-06-16 18:42:04 +08:00
lichaoyong	3086790e06	Fix bug when using ZoneMap/BloomFilter on columns with REPLACE/REPLACE_IF_NOT_NULL (#3288) Currently, a column with REPLACE/REPLACE_IF_NOT_NULL can be filtered by ZoneMap/BloomFilter when the rowset is a base rowset (version starts with zero). We have always treated this as an optimization, but in some cases it causes a bug. create table test( k1 int, v1 int replace, v2 int sum ); Suppose I have two records on two different versions: 1 2 2 on version [0-10] and 1 3 1 on version 11. If I perform the query select * from test where k1 = 1 and v1 = 3; the result will be 1 3 1, which is wrong because the first record is filtered out. The right answer is 1 3 3: v2 should be summed. Removing this optimization is necessary to make the result correct.	2020-04-10 10:22:21 +08:00
LingBin	608917c04d	Use block layer to write files (#3064) This is the second patch following 58b8e3f574614433ea9e0c427961f2efb3476c2a. This patch uses the block layer to write files.	2020-03-11 12:11:25 +08:00
Dayue Gao	d2d95bfa84	[segment_v2] Switch to Unified and Extensible Page Format (#2953) Fixes #2892 IMPORTANT NOTICE: this CL makes incompatible changes to the V2 storage format; developers need to create new tables for testing. This CL refactors the metadata and page format for segment_v2 in order to * make it easy to extend existing page types * make it easy to add new page types while not sacrificing code reuse * make it possible to use SIMD to speed up page decoding Here we summarize the main code changes * Page and index metadata is redesigned, please see `segment_v2.proto` * The new class `PageIO` is the single place for reading and writing all pages. This removes lots of duplicated code. `PageCompressor` and `PageDecompressor` are now useless and removed. * The type of value ordinal is changed from `rowid_t` to 64-bit `ordinal_t`; this affects the ordinal index as well. * Column's ordinal index is now implemented by IndexPage, the same as IndexedColumn. * Zone map index is now implemented by IndexedColumn	2020-02-27 15:09:57 +08:00
yangzhg	c098178f7a	[Index] Implements create drop show index syntax for bitmap index [#2487] (#2573) ### create table with index ``` CREATE TABLE table1 ( siteid INT DEFAULT '10', citycode SMALLINT, username VARCHAR(32) DEFAULT '', pv BIGINT SUM DEFAULT '0', INDEX index_name [USING BITMAP] (siteid, citycode) COMMENT 'balabala' ) AGGREGATE KEY(siteid, citycode, username) DISTRIBUTED BY HASH(siteid) BUCKETS 10 PROPERTIES("replication_num" = "1"); ``` ### create index ``` CREATE INDEX index_name ON table1 (siteid, citycode) [USING BITMAP] COMMENT 'balabala'; or ALTER TABLE table1 ADD INDEX index_name [USING BITMAP] (siteid, citycode) COMMENT 'balabala'; ``` ### drop index ``` DROP INDEX index_name ON table1; or ALTER TABLE table1 DROP INDEX index_name ``` ### show index ``` SHOW INDEX[ES] FROM table1 ``` output ``` +---------+-------------+-----------------+------------+---------+ | Table | Index_name | Column_name | Index_type | Comment | +---------+-------------+-----------------+------------+---------+ | table1 | index_name | siteid,citycode | BITMAP | balabala| +---------+-------------+-----------------+------------+---------+ ```	2020-01-03 17:41:26 +08:00
Dayue Gao	da8c9b4429	[Segment V2] refactor SegmentReaderWriterTest and add UT for lazy materialization (#2614)	2019-12-30 21:07:58 +08:00
kangpinghuang	d31f774852	Add block split bloom filter (#2471) [STORAGE][SEGMENTV2] use block split bloom filter; build bloom filter against data page; add distinct value to bloom filter; add ordinal index to bloom filter index	2019-12-18 12:57:44 +08:00
kangkaisen	f828670245	Add Bitmap index reader (#2319) [STORAGE] [INDEX] For #2061 and #2062: add bitmap index reader; SegmentIterator supports bitmap index; add some metrics	2019-12-03 23:01:40 +08:00
kangpinghuang	068eed8eb0	Add delete state of row block v2 for performance (#2055)	2019-11-11 20:07:37 +08:00
kangpinghuang	9c2d149c36	add profile for segment v2 (#2015)	2019-10-22 09:43:16 +08:00
Dayue Gao	8aa2cbe12d	Load Rowset only once in a thread-safe manner (#2022) [Storage] This PR implements thread-safe `Rowset::load()` for both AlphaRowset and BetaRowset. The main changes are: 1. Introduce `DorisCallOnce<ReturnType>` as the replacement for `DorisInitOnce`. It works for both Status and OLAPStatus. 2. `segment_v2::ColumnReader::init()` is now implemented by DorisCallOnce. 3. `segment_v2::Segment` is now created by a factory open() method. This guarantees all Segment instances are in opened state. 4. `segment_v2::Segment::_load_index()` is now implemented by DorisCallOnce. 5. Implement thread-safe load() for AlphaRowset and BetaRowset	2019-10-21 16:05:12 +08:00
ZHAO Chun	05643dc403	Replace Arena with MemPool (#2012) After replacing Arena with MemPool, we can achieve one copy for string values read from segment v2. We can exchange MemPool chunks between RowBlockV2 and RowBlock. This change only replaces Arena; that work will be done in another change list.	2019-10-19 15:53:24 +08:00
Lijia Liu	d68b1b287c	Support segment-level zone map (#1931)	2019-10-13 22:06:09 +08:00
wangbo	80e9b21fb0	Make Segment v2 use string's real length (#1943) (#1944)	2019-10-13 13:23:43 +08:00
wangbo	8aa8e08f27	v2 segment support string encode (#1766) (#1816) major change: change data format of binary dict page, appending (dict page data) and (dict page offset) to binary dict page; add new decoding method for new binary dict page format; add ut for segment test; set the elements of initial array to 0 when calling arena.AllocateNewBlock; hard code way to choose dict coding for string. 0919 commit major change: change dict file format: when saving binary dict page, separate dict page from data page, one dict page may have multi data pages; when reading a binary dict page, one ColumnReader keeps one dict page loading dict when calling column_reader._read_page 3. rollback BinaryDictPage no longer using memset(0) to initial column_zonemap.max_value. 0926 17 commit major change: init column_zone_map min value column_zone_map slice's data array; set char/varchar column_zone_map's max value size to 0; add ut for char column zone map query hit/miss. 0929 10 commit major change: allocate mem for column_zone_map's max and min value; direct copy content to column_zone_map's max and min value	2019-09-30 16:25:31 +08:00
kangpinghuang	8d0fee7e64	Add default value column iterator #1834 (#1835)	2019-09-24 14:39:10 +08:00
kangpinghuang	fe27969978	add delete predicate filter (#1636) (#1745) Delete predicate can be used to prune data by zone map.	2019-09-24 14:38:19 +08:00
ZHAO Chun	11eafe524f	Add ChunkAllocator to accelerate chunk allocation (#1792) I add ChunkAllocator in this CL to put unused memory chunks into a chunk pool rather than return them to the system allocator. For now we only change MemPool's chunk allocation and free to use this. Two configurations are introduced too. 'chunk_reserved_bytes_limit' is the limit on how many bytes this chunk pool can reserve in total; its default value is 2147483648 (2GB). 'use_mmap_allocate_chunk' controls whether chunks are allocated via mmap; its default value is false. In my test case with the default configuration, a simple query like "select * from table limit 10", this improves throughput from 280 QPS to 650 QPS. When I set 'chunk_reserved_bytes_limit' to 0, which means this is disabled, the throughput is the same as the original's.	2019-09-13 08:27:24 +08:00
wubiao	dad4def708	Support estimate size for v2 segment writer (#1787)	2019-09-12 15:15:39 +08:00
Dayue Gao	5653822298	Write magic number in footer instead of header (#1771)	2019-09-10 09:54:13 +08:00
Dayue Gao	f76dad289e	Basic implementation for BetaRowsetReader (#1718)	2019-09-03 13:52:16 +08:00
Dayue Gao	ae22d5e682	Support multiple key ranges in RowwiseIterator and StorageReadOptions (#1704) support multiple key ranges in RowwiseIterator and StorageReadOptions; remove unused fields and member functions in RowBlock and ColumnData; read num_rows_per_block from short key index footer	2019-08-27 17:57:42 +08:00
kangpinghuang	6d040a33af	Add zone map page (#1390) (#1633)	2019-08-24 00:57:30 +08:00
Dayue Gao	af8256be2a	Implement BetaRowsetWriter (#1590) BetaRowsetWriter is used to write rowsets in V2 segment format. This PR contains several interface changes: 1. Rowset.make_snapshot() is renamed to `link_files_to` because hard links are also useful in copy task, linked schema change, etc. 2. Rowset.copy_files_to_path() is renamed to `copy_files_to` to be consistent with other names. 3. RowsetWriter.mem_pool() is removed because not all rowset writers use MemPool. 4. RowsetWriter.garbage_collection() is removed because it's not used by clients. 5. SegmentGroup's make_snapshot() is removed because link_segments_to_path() provides similar functionality.	2019-08-12 16:41:47 +08:00
ZHAO Chun	b2e678dfc1	Support Segment for BetaRowset (#1577) We create a new segment format for BetaRowset. The new format merges the data file and index file into one file. We also create a new format for the short key index. In the original code, the index is stored in a format like RowCursor, which is not efficient to compare. Now we encode multiple columns into binary, and we ensure that this binary sorts in the same order as the key columns.	2019-08-06 17:15:11 +08:00

26 Commits
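
The REPLACE/SUM scenario described in commit 3086790e06 can be reproduced with a small model. The following is only a sketch and not Doris code: the `Row` class and the `merge`, `query_with_early_filter`, and `query_with_late_filter` functions are hypothetical names, and the value predicate is applied per row, whereas the real ZoneMap/BloomFilter works on pages or segments. It assumes aggregation by key `k1` in version order, with `v1` using REPLACE and `v2` using SUM.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    k1: int  # key column
    v1: int  # value column, REPLACE aggregation
    v2: int  # value column, SUM aggregation


def merge(rows):
    """Aggregate rows by k1 in version order: v1 is replaced, v2 is summed."""
    merged = {}
    for row in rows:
        prev = merged.get(row.k1)
        if prev is None:
            merged[row.k1] = row
        else:
            merged[row.k1] = Row(row.k1, row.v1, prev.v2 + row.v2)
    return list(merged.values())


def query_with_early_filter(versions, k1, v1):
    """Models the removed optimization: the v1 predicate prunes rows before merging."""
    survivors = [r for rows in versions for r in rows if r.k1 == k1 and r.v1 == v1]
    return merge(survivors)


def query_with_late_filter(versions, k1, v1):
    """Models the fix: only the key predicate prunes early; v1 is checked after merging."""
    merged = merge(r for rows in versions for r in rows if r.k1 == k1)
    return [r for r in merged if r.v1 == v1]


# Two versions from the commit message: (1, 2, 2) on [0-10] and (1, 3, 1) on 11.
versions = [[Row(1, 2, 2)], [Row(1, 3, 1)]]
buggy = query_with_early_filter(versions, k1=1, v1=3)
fixed = query_with_late_filter(versions, k1=1, v1=3)
print(buggy)  # the older row was dropped, so v2 is not summed
print(fixed)  # v2 includes the contribution of the older row
```

Filtering on the key column before merging is safe because rows with different keys never aggregate together; filtering on a REPLACE column is not, because a row that fails the predicate may still contribute to other aggregated columns.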