Two changes in this CL:
## Support multiple statements in one request like:
```
select 10; select 20; select 30;
```
ISSUE: #3049
To test this CL quickly, you can use the mysql client command-line tool:
```
mysql> delimiter //
mysql> select 1; select 2; //
+------+
| 1    |
+------+
|    1 |
+------+
1 row in set (0.01 sec)

+------+
| 2    |
+------+
|    2 |
+------+
1 row in set (0.02 sec)

Query OK, 0 rows affected (0.02 sec)
```
I added a new class, `OriginStatement.java`, which saves the original statement string together with an index. This class mainly handles the following case:
1. A user sends a multi-statement request to a non-master FE: `DDL1; DDL2; DDL3`.
2. Currently we cannot separate the original string of a single statement from the multi-statement string, so we have to forward the entire statement to the master FE. I therefore added an index to the forward request: `DDL1`'s index is 0, `DDL2`'s index is 1, and so on.
3. When the master FE handles the forwarded request, it parses the entire statement, gets the 3 DDL statements, and uses the `index` to pick out the specified one.
## Optimized the display of syntax errors
I also improved the display of syntax errors so that long error messages can be shown in full.
In a large-scale cluster, we may rolling-upgrade BEs. This patch adds a
column named 'Version' to the 'show backends;' command, as well as to the web page
'/system?path=//backends', to provide a way to check whether any
BE is missing the upgrade.
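For example, after a rolling upgrade you can scan for stragglers like this (a minimal sketch; the exact format of the new `Version` column depends on the build):
```
show backends;
```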
Fix a `maybe-uninitialized` compile warning:
```
be/src/olap/rowset/segment_v2/ordinal_page_index.cpp:103:22: warning: ‘ordinal’ may be used
uninitialized in this function [-Wmaybe-uninitialized]
_ordinals[i] = ordinal;
```
`DROP MATERIALIZE VIEW [ IF EXISTS ] <mv_name> ON [db_name].<table_name>`
Parameters:
* `IF EXISTS`: Do not throw an error if the materialized view does not exist; a notice is issued in this case.
* `mv_name`: The name of the materialized view to remove.
* `db_name`: The name of the database to which the materialized view belongs.
* `table_name`: The name of the table to which the materialized view belongs.
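For example (hypothetical names, assuming a materialized view `mv1` was created on table `db1.tbl1`):
```
DROP MATERIALIZED VIEW IF EXISTS mv1 ON db1.tbl1;
```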
This bug occurred when the BE made a snapshot: the version required by the FE had already been merged into the cumulative version, so the snapshot task could not complete even if it retried. To solve this problem, the BackupJob can now be set to CANCELLED, and the user can retry the job.
Fix #3057
The index name in MaterializedViewMeta still has the `__doris_shadow` prefix
after a schema change finishes.
In this CL, I simply remove the index name field from MaterializedViewMeta,
which makes managing name changes less error-prone.
If delete predicates exist in the meta in Doris-0.10, all of these predicates
should be retained. There is a confusing point in Doris-0.10: the delete predicate
only exists in OLAPHeaderMessage and PPendingDelta, not in PDelta.
This quirk caused the bug.
The timestamp value loaded from an ORC file was wrong: the value had an offset compared with Hive and Spark.
Because the time zone of an ORC timestamp is stored inside the ORC stripe information, the timestamp obtained here is an offset timestamp, so parsing the timestamp as UTC yields the actual datetime literal.
Firstly, add the materialized index meta to the OLAP table.
The materialized index meta includes the index name, schema, schema hash, keys type, etc.
This information, previously scattered across several maps, is now encapsulated in MaterializedIndexMeta.
Also, the keys type of an index meta may differ from the keys type of the base index once materialized views are enabled.
Secondly, support the deduplicate MV.
If there is a group by or an aggregate function in the create MV statement, the keys type of the MV is AGG,
while the keys type of the base table is DUPLICATE.
For example:
Duplicate table (k1, k2, v1)
MV (k1, k2) group by k1, k2
The data should be aggregated when the MV is built.
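A minimal sketch of that example in SQL (hypothetical names; table properties such as replication are omitted):
```
CREATE TABLE dup_tbl (k1 INT, k2 INT, v1 INT)
DUPLICATE KEY(k1, k2)
DISTRIBUTED BY HASH(k1) BUCKETS 3;

-- GROUP BY makes the keys type of the MV AGG,
-- while the base table stays DUPLICATE
CREATE MATERIALIZED VIEW dedup_mv AS
SELECT k1, k2 FROM dup_tbl GROUP BY k1, k2;
```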
The default replication number of an OLAP table may not be set.
Every time we call `getReplicationNum()`, we have to check whether it returns null,
which is inconvenient and may cause problems.
So in this PR, I set a default value for the table's replication number.
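For reference, this is the property the default applies to; it is normally set explicitly at table creation (a sketch with hypothetical names):
```
CREATE TABLE t (k1 INT)
DISTRIBUTED BY HASH(k1) BUCKETS 1
PROPERTIES ("replication_num" = "3");
```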
This bug was introduced by #2958. For example:
select str_to_date('2014-12-21 12%3A34%3A56', '%Y-%m-%d %H%%3A%i%%3A%s');
select unix_timestamp('2007-11-30 10:30%3A19', '%Y-%m-%d %H:%i%%3A%s');
This also enables us to extract column fields from HDFS file paths that contain '%'.
Normalize the setting of the memory limit to avoid unexpected exceptions.
For example, the user may not set the query memory limit in the query plan,
which may cause a BE crash.
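For reference, a user would normally set the per-query memory limit through the session variable, e.g. (the value here is arbitrary):
```
SET exec_mem_limit = 2147483648; -- 2 GB per query
```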
If there is no aggregated column in an aggregate index, the index is effectively a deduplicate table.
For example:
aggregate table (k1, k2, v1 sum)
mv index (k1, k2)
This kind of index is SPJG, the same as `select k1, k2 from aggregate_table group by k1, k2`,
so its grouping columns also need to be checked, using the following steps.
If there is no aggregated column in a duplicate index, the index is SPJ and passes the grouping verification directly.
Also, after an index is supplemented, the output columns of the new candidate index should be checked as well.
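A minimal sketch of that example (hypothetical names):
```
CREATE TABLE agg_tbl (k1 INT, k2 INT, v1 INT SUM)
AGGREGATE KEY(k1, k2)
DISTRIBUTED BY HASH(k1) BUCKETS 3;

-- no aggregated column, so this index acts as a deduplicate table
CREATE MATERIALIZED VIEW dedup_idx AS
SELECT k1, k2 FROM agg_tbl GROUP BY k1, k2;
```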
The disks_total_capacity metric is currently a user-specified capacity, while
disks_avail_capacity is the disk's actual available capacity, so
disks_total_capacity may be less than disks_avail_capacity, and
UsedPct on the FE may be negative as a result.
We'd better use the disk's actual capacity for the disks_total_capacity metric.
The format of some docs is incorrect for building the doc website.
* fix a bug where the `gensrc` dir cannot be built with -j
* fix a UT bug in CreateFunctionTest
This CL implements 3 new operations:
```
ALTER TABLE tbl ADD TEMPORARY PARTITION ...;
ALTER TABLE tbl DROP TEMPORARY PARTITION ...;
ALTER TABLE tbl REPLACE TEMPORARY PARTITION (p1, p2, ...);
```
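For instance, on a range-partitioned table the new operations might look like this (a hedged sketch with hypothetical partition names and ranges; see the manual linked below for the exact grammar):
```
ALTER TABLE tbl ADD TEMPORARY PARTITION tp1 VALUES LESS THAN ("2020-01-01");
ALTER TABLE tbl DROP TEMPORARY PARTITION tp1;
```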
The user manual can be found in:
`docs/documentation/cn/administrator-guide/alter-table/alter-table-temp-partition.md`
I did not update the grammar manual `alter-table.md`; it is too confusing and too big, and I will reorganize it later.
This is the first part of the "overwrite load" feature mentioned in issue #2663.
I will implement the "load to temp partition" feature in the next PR.
This CL also adds GSON serialization methods for the following classes (not used yet):
```
Partition.java
MaterializedIndex.java
Tablet.java
Replica.java
```
The abstraction of the Block layer, inspired by Kudu, lies between the "business
layer" and the "underlying file storage layer" (`Env`), making them no longer
strongly coupled.
In this way, the business layer (such as `SegmentWriter`) no longer needs to
perform file operations directly, which brings better encapsulation. An ideal
future situation is: when we need to support a new file storage system, we only
need to add a corresponding type of BlockManager, without modifying the business
code (such as `SegmentWriter`).
The Block layer brings several benefits:
1. First and foremost, the mapping between data and `Env` is more flexible.
For example, in the storage engine, the data of a tablet can be placed in
multiple file systems (`Env`) at the same time; that is, one-to-many
relationships can be supported, e.g. one copy on local storage and one on
remote storage.
2. The mapping between blocks and files can be adjusted; it need not be
one-to-one. For example, the data of multiple blocks can be stored in one
physical file, which reduces the number of files that need to be opened during
a query. This is like the `LogBlockManager` in Kudu.
3. We can move the opened-file cache under the Block layer, which can automatically
close and reopen the files used by the upper layer, so that the upper business
level does not need to be aware of file-handle limits at all (a problem we
often encounter in production today).
4. Better automatic cleanup logic when there are exceptions. For example, a block
that is not closed explicitly can automatically clean up its corresponding file,
thereby avoiding most garbage files.
5. More convenient batch file creation and deletion. Some business operations,
such as compaction, create multiple files. At present, these files go through
the processing flow one by one: 1) creation; 2) writing data; 3) fsync to disk.
In fact this is not necessary; we only need to fsync the whole batch of files at
the end. The advantage is that this gives the operating system more opportunities
to merge IO, thereby improving performance. However, this handling is relatively
tedious and should not be coupled into the business code; the Block layer is an
ideal place for it.
This is the first patch; it just adds the related classes, laying the groundwork
for later switching of the read and write logic.
The issue is #3011.
Reset the tablet and scan range info before computing it.
The old rollup selector has already computed the tablet and scan range info,
and the new MV selector may sometimes compute it again,
so we need to reset that info here.
Before this commit, the result was doubled for the query `select k1, k2 from aggregate_table`.
Fixes #2892
IMPORTANT NOTICE: this CL makes incompatible changes to the V2 storage format; developers need to create new tables for testing.
This CL refactors the metadata and page format for segment_v2 in order to
* make it easy to extend existing page type
* make it easy to add new page type while not sacrificing code reuse
* make it possible to use SIMD to speed up page decoding
Here is a summary of the main code changes:
* Page and index metadata is redesigned; please see `segment_v2.proto`.
* The new class `PageIO` is the single place for reading and writing all pages. This removes lots of duplicated code. `PageCompressor` and `PageDecompressor` are now useless and have been removed.
* The type of a value's ordinal is changed from `rowid_t` to the 64-bit `ordinal_t`; this affects the ordinal index as well.
* A column's ordinal index is now implemented with IndexPage, the same as IndexedColumn.
* The zone map index is now implemented with IndexedColumn.