Commit Graph

37 Commits

SHA1 Message Date
9c9992e0aa [Refactor] Refactor DeleteHandler and Cond module (#4925)
This patch mainly does the following refactors:
- Use int64_t instead of int32_t for 'version' in DeleteHandler
- Move some comments from .cpp to .h files, add some new comments in .h files, and remove some meaningless comments
- Use switch...case instead of multiple if...else in DeleteConditionHandler::is_condition_value_valid
- Use range loops to simplify code
- Reduce some compare operations in Cond::del_eval
- Improve some branch predictions in Reader
- Fix and improve some unit tests
2020-12-04 12:13:30 +08:00
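
For illustration, a minimal C++ sketch of the if...else to switch...case rewrite mentioned in the commit above; the CondOp enum, the operator set, and the validity rules are assumptions, not the actual DeleteConditionHandler::is_condition_value_valid implementation.

```cpp
#include <string>

// Hypothetical condition-op enum; the real handler works on string operators
// ("=", "!=", ">>", "<<", ">=", "<=", "IS"), so this only illustrates the
// if...else -> switch...case refactor, not the real validation rules.
enum class CondOp { EQ, NE, GT, LT, GE, LE, IS };

bool is_condition_value_valid(CondOp op, const std::string& value) {
    switch (op) {
    case CondOp::IS:
        // "IS" only accepts NULL / NOT NULL.
        return value == "NULL" || value == "NOT NULL";
    case CondOp::EQ:
    case CondOp::NE:
    case CondOp::GT:
    case CondOp::LT:
    case CondOp::GE:
    case CondOp::LE:
        // Comparison operators require a non-empty literal.
        return !value.empty();
    }
    return false;  // unreachable for well-formed enum values
}
```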
1f236a5339 [BUG] Fix core when schema change (#5018) 2020-12-04 09:53:19 +08:00
df1f06e60b Optimized the read performance of a table with multiple versions (#4958)
* Optimized the read performance of a table with multiple versions by
changing the merge method of the unique table:
the cumulative version data is merged first, and the result is then merged with the base version.
Data with only one base version is read directly without merging.
2020-12-01 12:25:11 +08:00
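
A heavily simplified sketch of the two-phase merge order described above: cumulative versions are merged first, the result is then merged with the base version, and a single base version is read without merging. The types and the merge() helper are stand-ins, not the real rowset reader API.

```cpp
#include <algorithm>
#include <vector>

// Illustrative stand-ins; the real code works on rowset readers and keys,
// not plain int vectors.
struct Version { bool is_base = false; std::vector<int> rows; };

// Trivial stand-in for the real sorted, key-deduplicating merge iterator.
static std::vector<int> merge(std::vector<std::vector<int>> inputs) {
    std::vector<int> out;
    for (auto& in : inputs) out.insert(out.end(), in.begin(), in.end());
    std::sort(out.begin(), out.end());
    return out;
}

std::vector<int> read_unique_table(const std::vector<Version>& versions) {
    if (versions.size() == 1 && versions.front().is_base) {
        return versions.front().rows;  // only one base version: no merge needed
    }
    std::vector<std::vector<int>> cumulative;
    std::vector<int> base;
    for (const auto& v : versions) {
        if (v.is_base) base = v.rows;
        else cumulative.push_back(v.rows);
    }
    std::vector<int> merged_cumulative = merge(std::move(cumulative));  // phase 1: cumulative versions
    return merge({merged_cumulative, base});                            // phase 2: merge with base
}
```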
6fedf5881b [CodeFormat] Clang-format cpp sources (#4965)
Clang-format all c++ source files.
2020-11-28 18:36:49 +08:00
d1c2b3ed0d [Optimize] Add an unordered_map for TabletSchema to speed up column name lookup (#4779)
Reduce column name lookup for TabletSchema and Tablet from O(N) to O(1).
2020-11-03 19:53:44 +08:00
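
A minimal sketch of the lookup-map idea from the commit above; TabletSchemaSketch, its members, and field_index() are illustrative assumptions rather than the actual TabletSchema code.

```cpp
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

struct TabletColumn { std::string name; /* ... */ };

class TabletSchemaSketch {
public:
    void init(std::vector<TabletColumn> cols) {
        _cols = std::move(cols);
        _field_name_to_index.clear();
        // Build the name -> ordinal map once, when the schema is initialized.
        for (size_t i = 0; i < _cols.size(); ++i) {
            _field_name_to_index.emplace(_cols[i].name, static_cast<int>(i));
        }
    }

    // O(1) average lookup instead of scanning all columns (O(N)).
    int field_index(const std::string& name) const {
        auto it = _field_name_to_index.find(name);
        return it == _field_name_to_index.end() ? -1 : it->second;
    }

private:
    std::vector<TabletColumn> _cols;
    std::unordered_map<std::string, int> _field_name_to_index;
};
```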
09f97f8a05 [Refactor] Fixes some be typo part 2 (#4747) 2020-10-20 09:28:57 +08:00
068707484d Support sequence column for UNIQUE_KEYS Table (#4256)
* add sequence col

Co-authored-by: yangwenbo6 <yangwenbo3@jd.com>
2020-09-04 10:10:17 +08:00
c201cf6e4f Support batch delete[part 2] (#4425)
support batch delete for read compaction
2020-08-25 14:05:04 +08:00
75ebe2b363 [Bug] Compaction row number cannot be matched between input rowsets and output rowsets. (#4139)
A Unique Key table may load duplicate rows across different loads.
If duplicate rows exist between loads, compaction will merge these rows,
so the statistics should take the merged row count into consideration.
Previously, we missed the merged count, so compaction reported an error.
2020-07-23 10:28:56 +08:00
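
A hedged sketch of the kind of row-count check this fix concerns: once duplicate-key rows can be merged, the merged count has to enter the bookkeeping. The MergeStats struct and field names are assumptions, not the actual compaction statistics.

```cpp
#include <cstdint>

struct MergeStats {
    int64_t input_rows = 0;    // rows in all input rowsets
    int64_t output_rows = 0;   // rows written to the output rowset
    int64_t merged_rows = 0;   // duplicate-key rows merged away (unique key table)
    int64_t filtered_rows = 0; // rows dropped by delete predicates
};

// Without counting merged_rows, input_rows != output_rows + filtered_rows
// whenever duplicate keys were merged, and the sanity check would fail.
bool check_row_num(const MergeStats& s) {
    return s.input_rows == s.output_rows + s.merged_rows + s.filtered_rows;
}
```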
d3d835844f [Performance] Improve performance of unique table read (#3974)
Implements #3971.
The test table is as follows:
```
mysql> desc test;
+------------+---------+------+-------+---------+---------+
| Field      | Type    | Null | Key   | Default | Extra   |
+------------+---------+------+-------+---------+---------+
| rid        | BIGINT  | No   | true  | 0       |         |
| qid        | BIGINT  | No   | true  | 0       |         |
| qidDeleted | TINYINT | No   | false | 0       | REPLACE |
| type       | TINYINT | No   | false | 0       | REPLACE |
| uid        | BIGINT  | No   | false | 0       | REPLACE |
| toUid      | BIGINT  | No   | false | 0       | REPLACE |
| status     | INT     | No   | false | 0       | REPLACE |
| createTime | INT     | No   | false | 0       | REPLACE |
| source     | INT     | No   | false | 0       | REPLACE |
| misFlag    | INT     | No   | false | 0       | REPLACE |
| anonymous  | TINYINT | No   | false | 0       | REPLACE |
| uv         | TINYINT | No   | false | 1       | REPLACE |
+------------+---------+------+-------+---------+---------+
12 rows in set (0.00 sec)

mysql> select count(*) from test;
+----------+
| count(*) |
+----------+
|  1093760 |
+----------+
1 row in set (1.00 sec)
```
There are 29 versions at present.
![image](https://user-images.githubusercontent.com/9098473/85992244-2aa26c80-ba27-11ea-918a-04701a58dbdf.png)
I ran the query `select sum(uv) from test` 10 times;
the average ScanTime was reduced from `9s277ms` to `8s206ms`.
2020-07-02 13:56:08 +08:00
73c3de4313 [refactor] Simple refactor on class Reader (#3691)
This is a simple refactor patch on class Reader without any functional changes.
Main refactor points:
- Remove some useless return values
- Use range loops
- Use empty() instead of size() to check whether some STL containers are empty
- Use in-class initialization instead of initializing members in the constructor
- Some other small refactors
2020-06-03 19:55:53 +08:00
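
A toy example of the idioms listed above (in-class initialization, empty() checks, range loops); it is not taken from the real Reader class.

```cpp
#include <vector>

class ReaderSketch {
public:
    void process(const std::vector<int>& keys) {
        if (keys.empty()) {        // empty() instead of size() == 0
            _eof = true;
            return;
        }
        for (int key : keys) {     // range loop instead of an index loop
            if (key < 0) ++_filtered_rows;
        }
    }
    bool eof() const { return _eof; }
    int filtered_rows() const { return _filtered_rows; }

private:
    // In-class initialization instead of assigning in the constructor.
    bool _eof = false;
    int _filtered_rows = 0;
};
```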
6c33f80544 Add disable_storage_page_cache config (#2890)
1. When reading a column data page:
    for compaction, schema_change, and check_sum, we don't use the page cache;
    for queries, if config::disable_storage_page_cache is false, we use the page cache.
2. When reading a column index page:
    if config::disable_storage_page_cache is false, we use the page cache.
2020-02-16 19:13:30 +08:00
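
A sketch of the page-cache rule stated above. Only the config name config::disable_storage_page_cache comes from the commit; the enum and helper functions are assumptions.

```cpp
namespace config { bool disable_storage_page_cache = false; }

enum class ReaderType { QUERY, COMPACTION, SCHEMA_CHANGE, CHECKSUM };

// Data pages: only queries may use the page cache, and only when the
// config switch is off.
bool use_page_cache_for_data_page(ReaderType type) {
    if (config::disable_storage_page_cache) return false;
    return type == ReaderType::QUERY;
}

// Index pages: cached whenever the config switch is off.
bool use_page_cache_for_index_page() {
    return !config::disable_storage_page_cache;
}
```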
99ad56d1bf Support bitmap index for more types (#2630)
For #2589

1. date(uint24_t)/datetime(int64_t)/largeint(int128_t) use frame-of-reference encoding for the dictionary.
2. decimal(decimal12_t) also uses frame-of-reference encoding for the dictionary.
3. float/double use bitshuffle encoding for the dictionary.
2020-01-31 21:09:29 +08:00
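
A sketch of the per-type dictionary encoding choice listed above; the enums and the function are illustrative, not the real encoding registry.

```cpp
enum class FieldType { DATE, DATETIME, LARGEINT, DECIMAL, FLOAT, DOUBLE };
enum class EncodingType { FRAME_OF_REFERENCE, BITSHUFFLE };

EncodingType bitmap_index_dict_encoding(FieldType type) {
    switch (type) {
    case FieldType::DATE:      // uint24_t
    case FieldType::DATETIME:  // int64_t
    case FieldType::LARGEINT:  // int128_t
    case FieldType::DECIMAL:   // decimal12_t
        return EncodingType::FRAME_OF_REFERENCE;
    case FieldType::FLOAT:
    case FieldType::DOUBLE:
        return EncodingType::BITSHUFFLE;
    }
    return EncodingType::FRAME_OF_REFERENCE;  // fallback, unreachable
}
```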
13e5fdd512 [AlphaRowset] set num_segments field in rowset meta if missing (#2658)
The number of segments should be read from the rowset meta PB,
but a previous code error caused this value not to be set in some cases.
So when initializing the rowset meta, if we find that num_segments is 0 (not set),
we try to calculate the number of segments from AlphaRowsetExtraMetaPB
and then set the num_segments field.
This should only happen for rowsets converted from an old version;
for all newly created rowsets, the num_segments field must be set.
2020-01-07 21:46:02 +08:00
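
A sketch of the backfill logic described above, using simplified stand-ins for RowsetMetaPB and AlphaRowsetExtraMetaPB; summing per-segment-group counts is an assumption about how the value is recomputed.

```cpp
#include <cstdint>
#include <vector>

struct SegmentGroupPB { int64_t num_segments = 0; };
struct AlphaRowsetExtraMetaPB { std::vector<SegmentGroupPB> segment_groups; };
struct RowsetMetaPB {
    int64_t num_segments = 0;  // 0 means "not set" in rowsets converted from an old version
    AlphaRowsetExtraMetaPB alpha_rowset_extra_meta_pb;
};

void fill_num_segments_if_missing(RowsetMetaPB* meta) {
    if (meta->num_segments != 0) return;  // newly created rowsets always set it
    int64_t total = 0;
    for (const auto& sg : meta->alpha_rowset_extra_meta_pb.segment_groups) {
        total += sg.num_segments;
    }
    meta->num_segments = total;
}
```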
f14cdacfd1 Fix single column read bug (#2122) 2019-11-07 10:24:02 +08:00
4e8d728e75 Remove unused code and unnecessary check (#1918) 2019-09-30 18:35:30 +08:00
cafb9f1e62 Replace Arena with MemPool first step (#1899) 2019-09-28 01:12:22 +08:00
b246d93128 Avoid SerDe for aggregation query with object pool (#1854) 2019-09-26 13:51:13 +08:00
a349409838 Move compare from RowCursor to row (#1764) 2019-09-09 14:51:13 +08:00
1e4dd77d2a Add bitmap agg type and udaf (#1610) 2019-08-26 14:24:42 +08:00
da8b9aad9a Remove preaggregation and index stream cache stuff out of RowsetReaderContext (#1698) 2019-08-26 14:19:03 +08:00
c5edf9dae0 Unify Field and ColumnSchema in Storage (#1561)
Currently, we have Field and ColumnSchema to access column data in a
row. These two classes are mostly the same, so we should unify them into
one class. Field has offset information, which is a row attribute,
so we remove the offset from Field.

RowCursor has some logic which belongs to Schema, so in this patch I
add a Schema attribute to RowCursor to keep RowCursor simple. After this
change, only Schema will handle Field/ColumnSchema.

I extract some logic from RowCursor into be/src/olap/row.h, so we can
use the same logic to handle different types of rows. Each type of row has
the same function to get a Cell of the row. A Cell represents a column's
content together with a null indicator.
2019-07-30 14:01:57 +08:00
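
A minimal sketch of the Cell idea: a cell couples a pointer to a column's value with the row's null indicator for that column, so every row type can expose the same "get cell" accessor. Member and method names are assumptions, not the exact be/src/olap/row.h definitions.

```cpp
struct CellSketch {
    bool* null_sign;   // points at the null flag stored in the row
    void* data;        // points at the column's value stored in the row

    bool is_null() const { return *null_sign; }
    void set_is_null(bool v) const { *null_sign = v; }
    const void* cell_ptr() const { return data; }
    void* mutable_cell_ptr() const { return data; }
};
```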
0d48a3961c Refactor Storage Engine (#1478)
NOTE: This patch modifies all Backends' data,
and restarting the BE will take a very long time.
So if you don't want to disrupt your production environment,
you should upgrade the Backends one by one.

1. Refactor the BE to clarify the structure of the code.
2. Use a unique id to identify a rowset.
   Naming a rowset with tablet_id and version leads to
   many conflicts among compaction, clone, and restore.
3. Extract a rowset interface to encapsulate rowsets
   with different formats.
2019-07-15 21:18:22 +08:00
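
A brief sketch of the two design points above: a rowset identified by a globally unique id rather than by (tablet_id, version), behind an interface that hides format differences. Names and methods are illustrative, not the actual Doris classes.

```cpp
#include <cstdint>

// Globally unique rowset id, not derived from tablet_id + version,
// so compaction, clone, and restore cannot collide on the same name.
struct RowsetId { int64_t hi = 0, lo = 0; };

class RowsetInterface {
public:
    virtual ~RowsetInterface() = default;
    virtual RowsetId rowset_id() const = 0;
    virtual int64_t num_rows() const = 0;
    // Rowsets with different on-disk formats implement load() differently.
    virtual bool load() = 0;
};
```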
8d87e36ff8 Place _init_seek_columns() in the right place (#1302) 2019-06-13 20:54:45 +08:00
6ce8087916 Fix bug that RowCursor does NOT match RowBlock's layout (#1249) 2019-06-04 22:20:10 +08:00
ff95f23615 Remove OLAP_LOG_DEBUG AND OLAP_LOG_TRACE log format (#378)
Use VLOG(3) and VLOG(10) instead
2018-12-03 10:08:21 +08:00
85d0996b35 Rename Rowset to SegmentGroup (#364)
* Rename Rowset to SegmentGroup

* Modify protobuf related rowset to SegmentGroup
2018-11-29 17:30:41 +08:00
1ba8a4ee4e Transform row-oriented table to columnar-oriented table (#311) 2018-11-16 16:03:56 +08:00
37b4cafe87 Change variable and namespace name in BE (#268)
Change 'palo' to 'doris'
2018-11-02 10:22:32 +08:00
2868793b6b Change license to Apache License 2.0 (#262) 2018-11-01 09:06:01 +08:00
051aced48d Missing many files in last commit
In the last commit, a lot of files were missed.
2018-10-31 16:19:21 +08:00
65fe7f65c1 Fixed: privilege logic errors:
    1. No one can set the root password except the root user itself.
    2. NODE_PRIV cannot be granted.
    3. ADMIN_PRIV and GRANT_PRIV can only be granted or revoked on *.*.
    4. No one can modify the privileges of the default roles 'operator' and 'admin'.
    5. No user can be granted the role 'operator'.
Fixed: the running load limit should not be applied to replay logic; it would cause replaying or loading the image to fail.
Changed: optimize the problem of too many directories under the mini load directory.
Fixed: missing password and auth check when handling mini load requests in the Frontend.
Fixed: DomainResolver should start after Frontends transfer to a certain ROLE, not in the Catalog constructor.
Fixed: a stupid bug that no one could set the password for the root user... fixed it: only the root user can set the password for root.
Fixed: read null data twice.
    When reading data with a null value, in some cases the same data will be read twice by the storage engine,
    resulting in a wrong result. The reason for this problem is that when splitting,
    if the start key is the minimum value, the data with null is read again.
Fixed: add a flag to prevent the DomainResolver thread from starting twice.
Fixed: a memory leak of using ByteBuf when parsing auth info of an HTTP request.
Fixed: add a new config 'disable_hadoop_load', default is false; set it to true to disable hadoop load.
Changed: add a detailed error msg for submitting a hadoop load job to the show load result.
Fixed: the Backend process should crash if it fails to save the header.
Added: expose Backend info to the user when an error is encountered on the Backend, to make debugging more convenient.
Fixed: remove the fd from the map when the input stream or output stream is closed in the Broker process.
Fixed: change all files' line endings to Unix format.

Internal commit id: merge from dfcd0aca18eed9ff99d188eb3d01c60d419be1b8
2018-10-01 19:58:41 +08:00
bea10e4f06 1. Hide passwords and other sensitive information in the log and audit log.
2. Add 2 new procs, '/current_queries' and '/current_backend_instances', to monitor the currently running queries.
3. Add a manual compaction API on the Backend to trigger cumulative or base compaction manually.
4. Add Frontend config 'max_bytes_per_broker_scanner' to limit the bytes per broker scanner. This limits the memory cost of a single broker load job.
5. Add Frontend config 'max_unfinished_load_job' to limit the number of load jobs: if the number of running load jobs exceeds the limit, no more load jobs are allowed to be submitted.
6. A lot of bugs fixed.
2018-09-19 20:04:01 +08:00
cc74efb3c5 merge to ddb65b69f9c788e359e191889cb31f15279c41ec (#224)
1. The Apache HDFS broker supports HDFS HA and Hadoop Kerberos authentication.
2. New Backup and Restore function. Use the FS Broker to back up your data to HDFS or restore it from HDFS.
3. Table-level privileges. Grant fine-grained privileges on the table level to a specified user.
4. A lot of bugs fixed.
5. Performance improvements.
2018-08-24 17:12:26 +08:00
2419384e8a push 3.3.19 to github (#193)
* push 3.3.19 to github

* merge to 20ed420122a8283200aa37b0a6179b6a571d2837
2018-05-15 20:38:22 +08:00
6486be64c3 fix license statement (#29)
* change picture to word

* change picture to word

* Fix: the SQL `SHOW FULL TABLES WHERE Table_type != VIEW` could not be executed

* change license description
2017-08-18 19:16:23 +08:00
e2311f656e baidu palo 2017-08-11 17:51:21 +08:00