Currently we have implemented the plugin framework in FE.
This CL makes the original audit log logic pluggable.
It mainly implements the following classes:
1. AuditPlugin
The interface that audit plugins implement.
2. AuditEvent
An AuditEvent contains all information about an audit event, such as a query or a connection.
3. AuditEventProcessor
The audit event processor receives all audit events and delivers them to all installed audit plugins.
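Roughly, the dispatch works like this (a minimal sketch; names follow the description above, not the exact FE code):
```java
import java.util.List;

// Minimal sketch only; the real classes live in the Doris FE sources.
interface AuditPlugin {
    boolean eventFilter(AuditEvent event); // is this plugin interested in the event?
    void exec(AuditEvent event);           // handle one audit event
}

// Carries all information about one audit event (query, connection, ...).
class AuditEvent {
    String queryText;
    String user;
    String clientIp;
}

class AuditEventProcessor {
    private final List<AuditPlugin> plugins;

    AuditEventProcessor(List<AuditPlugin> plugins) {
        this.plugins = plugins;
    }

    // Receive one event and deliver it to every installed plugin that wants it.
    void handleEvent(AuditEvent event) {
        for (AuditPlugin plugin : plugins) {
            if (plugin.eventFilter(event)) {
                plugin.exec(event);
            }
        }
    }
}
```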
This CL implements two audit module plugins:
1. The builtin plugin `AuditLogBuilder`, which behaves the same as the previous logic, saving the
audit log to `fe.audit.log`.
2. An optional plugin `AuditLoader`, which periodically inserts the audit log into a Doris table
specified by the user. This way, users can conveniently use SQL to query and analyze the
audit log table.
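The periodic-flush idea behind `AuditLoader` can be pictured roughly like this (a sketch under assumed names; `loadBatch` is a hypothetical stand-in for however the plugin actually writes rows into the audit table):
```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Rough sketch: buffer audit rows and flush them to the audit table on a timer.
class AuditBatcher {
    private final List<String> buffer = new ArrayList<>();
    private final ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();

    AuditBatcher(long flushIntervalSec) {
        timer.scheduleAtFixedRate(this::flush, flushIntervalSec, flushIntervalSec, TimeUnit.SECONDS);
    }

    synchronized void add(String auditRow) {
        buffer.add(auditRow);
    }

    private synchronized void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        loadBatch(new ArrayList<>(buffer)); // hypothetical: write rows into the Doris table
        buffer.clear();
    }

    private void loadBatch(List<String> rows) {
        // e.g. submit the rows to the user-specified table; omitted here.
    }
}
```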
Some documents are added:
1. HELP docs for install/uninstall/show plugin.
2. The `README.md` in the `fe_plugins/` dir is renamed to `plugin-development-manual.md` and moved
to the `docs/` dir.
3. `audit-plugin.md` to introduce the usage of the `AuditLoader` plugin.
ISSUE: #3226
SchemaChangeJobV2 will use too much memory in FE, which may cause FullGC. But this data is useless after the job is done, so we need to clean it up.
NOTICE: update FE meta version to 80
When creating a schema change job, we will create a corresponding shadow replica for each replica.
Here we should check the state of the replica and only create shadow replicas for replicas in the normal state.
The process here may need to be modified later. We should fully allow users to submit alter jobs
under any circumstances, and then, in the job scheduling process, dynamically detect changes in the replicas
and do replica repairs, instead of forcing a check on submission.
This CL fixes the bug described in issue #3224 by:
1. Forbidding UDFs in the broker load process.
2. Improving the function checking logic to avoid an NPE when trying to
get the default database from ConnectionContext.
1. Change the word palo to doris in the conf files.
2. Set the default meta_dir to ${DORIS_HOME}/doris-meta.
3. Comment out FE meta_dir, leaving it as ${DORIS_HOME}/doris-meta, as existing in FE Config.java.
4. Comment out BE storage_root_path, leaving it as ${DORIS_HOME}/storage, as existing in BE config.h.
NOTICE: default config is changed.
After Doris supported aggregation materialized views on duplicate tables,
the metadata shown by the desc stmt is sometimes confusing. The reason is that
there is no grouping information in the metadata of the desc stmt.
For example:
There are two materialized views as follows:
1. create materialized view k1_k2 as select k1, k2 from table;
2. create materialized view deduplicated_k1_k2 as select k1, k2 from table group by k1, k2;
Before this commit, the metadata in the desc stmt was the same:
```
+-----------------------+-------+----------+------+-------+---------+-------+
| IndexName | Field | Type | Null | Key | Default | Extra |
+-----------------------+-------+----------+------+-------+---------+-------+
| k1_k2 | k1 | TINYINT | Yes | true | N/A | |
| | k2 | SMALLINT | Yes | true | N/A | |
| deduplicated_k1_k2 | k1 | TINYINT | Yes | true | N/A | |
| | k2 | SMALLINT | Yes | true | N/A | |
+-----------------------+-------+----------+------+-------+---------+-------+
```
So we need to show the KeysType of each materialized view in the desc stmt.
Now the desc stmt of all MVs is changed as follows:
```
+-----------------------+---------------+-------+----------+------+-------+---------+-------+
| IndexName | IndexKeysType | Field | Type | Null | Key | Default | Extra |
+-----------------------+---------------+-------+----------+------+-------+---------+-------+
| k1_k2 | DUP_KEYS | k1 | TINYINT | Yes | true | N/A | |
| | | k2 | SMALLINT | Yes | true | N/A | |
| deduplicated_k1_k2 | AGG_KEYS | k1 | TINYINT | Yes | true | N/A | |
| | | k2 | SMALLINT | Yes | true | N/A | |
+-----------------------+---------------+-------+----------+------+-------+---------+-------+
```
NOTICE: this modifies the columns of the `desc` stmt.
The bug is described in issue #3200.
This CL solves the problem by:
1. Refactoring the alter operation conflict checking logic by introducing new classes `AlterOperations` and `AlterOpType`.
2. Allowing add/drop of temporary partitions when the dynamic partition feature is enabled.
3. Allowing modification of a table's properties when there are temporary partitions in the table.
4. Making the property `dynamic_partition.enable` optional, defaulting to true.
Doris supports choosing a storage medium when creating a table, and the cluster balance strategy
depends on the storage medium. Most users will not specify the storage medium when creating a table;
even if they know that they should choose one, they have no idea about the
cluster's storage mediums. So I think we should make storage_medium and storage_cooldown_time
configurable, and this should be the admin's responsibility.
For example, if the cluster's storage medium is HDD but we need to change part of the machines to SSD,
the tablets created before the change are stored on HDD and cannot find a destination path
to migrate to. Users will keep creating tables as usual, so all tablets end up stored on the old machines and
the new machines store only a few tablets. Without this config, the only way is for the admin
to traverse all partitions in the cluster and change the storage_medium property, which increases
operational and maintenance costs.
So I add a FE config `default_storage_medium` (e.g. `default_storage_medium = SSD` in fe.conf)
so that the admin can set the default storage medium.
This PR reduces the time spent waiting for transactions in the same db to complete, by filtering the running transactions at the table level.
NOTICE: Update FE meta version to 79
The subquery in the having clause should be rewritten too.
Otherwise, ExprRewriteRule will not be applied to the subquery.
For example:
select k1, sum(k2) from table group by k1 having sum(k2) > (select t1 from table2 where t2 between 1 and 2);
`t2 between 1 and 2` should be rewritten to `t2 >= 1 and t2 <= 2`.
Fixes #3205. TPC-DS query 14 passes after this commit.
issue #2344
* Add install/uninstall plugin statement
* Add show plugin statement
* Support installing a plugin in two ways:
  * Built-in plugin: use PluginMgr's register method.
  * Dynamic plugin: installed by SQL statement, with the following process:
    1. Check whether the plugin is already installed.
    2. Download the plugin file from a remote source, or copy it from a local source.
    3. Extract the plugin's .zip.
    4. Read the plugin's plugin.properties and check the plugin's values.
    5. Dynamically load the .jar and init the plugin's main class (see the sketch below).
    6. Invoke the plugin's init method.
    7. Register the plugin in PluginMgr.
    8. Update the meta.
* Support the FE plugin dynamic uninstall process:
    1. Check whether the plugin is installed.
    2. Invoke the plugin's close method.
    3. Delete the plugin from PluginMgr.
    4. Update the meta.
* Add audit plugin interface
* Add plugin enable flags in Config
* Add plugin install path in Config; by default, plugins are installed in ${DORIS_FE_PATH}/plugins
* Add FE plugins project
* Add audit plugin demo
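Step 5 of the install process above (dynamically loading the .jar and initializing the plugin's main class) can be sketched as follows; class and method names here are illustrative, not the actual FE code:
```java
import java.io.File;
import java.net.URL;
import java.net.URLClassLoader;

// Sketch of dynamic plugin loading. The jar path and main class name would
// come from the extracted plugin.properties.
public class PluginLoaderDemo {
    public static Object loadPlugin(File jarFile, String mainClassName) throws Exception {
        URLClassLoader loader = new URLClassLoader(
                new URL[] { jarFile.toURI().toURL() },
                PluginLoaderDemo.class.getClassLoader());
        Class<?> mainClass = Class.forName(mainClassName, true, loader);
        // Instantiate via the no-arg constructor; the real framework would then
        // call the plugin's init method and register it in PluginMgr.
        return mainClass.getDeclaredConstructor().newInstance();
    }
}
```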
The usage:
```
// install plugin and show plugins;
mysql>
mysql> install plugin from "/home/users/seaven/auditplugin.zip";
Query OK, 0 rows affected (0.05 sec)
mysql>
mysql> show plugins;
+-------------------+-------+---------------+---------+-------------+------------------------+--------+---------------------------------------+
| Name | Type | Description | Version | JavaVersion | ClassName | SoName | Sources |
+-------------------+-------+---------------+---------+-------------+------------------------+--------+---------------------------------------+
| audit_plugin_demo | AUDIT | just for test | 0.11.0 | 1.8.31 | plugin.AuditPluginDemo | NULL | /home/users/hekai/auditplugindemo.zip |
+-------------------+-------+---------------+---------+-------------+------------------------+--------+---------------------------------------+
1 row in set (0.00 sec)
mysql> uninstall plugin audit_plugin_demo;
Query OK, 0 rows affected (0.04 sec)
mysql> show plugins;
Empty set (0.00 sec)
```
TODO:
* Config.plugin_dir should be created if missing
All columns belonging to the top tupleIds of a query should be considered in the MV selector.
For example:
`select k1 from table group by k1 having sum(v1) > 1;`
The candidate index should contain both the k1 and v1 columns instead of only k1.
A rollup which only has the k1 column should not be selected.
Issue #3174 describes this in detail.
Generates partition names based on the granularity, e.g.:
* Year: prefix2020
* Day: prefix20200325
* Week: prefix2020_#, where # is the week of the year.
At the same time, for all granularities, align the partition range to 00:00:00.
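A minimal sketch of the naming rule using `java.time` (illustrative only; the real logic lives in the dynamic partition scheduler):
```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.temporal.WeekFields;

// Derive a partition name from a prefix, a date, and the granularity.
public class PartitionNameDemo {
    static String partitionName(String prefix, LocalDate date, String granularity) {
        switch (granularity.toUpperCase()) {
            case "YEAR":
                return prefix + date.format(DateTimeFormatter.ofPattern("yyyy"));
            case "DAY":
                return prefix + date.format(DateTimeFormatter.ofPattern("yyyyMMdd"));
            case "WEEK":
                int week = date.get(WeekFields.ISO.weekOfWeekBasedYear());
                return prefix + date.getYear() + "_" + String.format("%02d", week);
            default:
                throw new IllegalArgumentException("unknown granularity: " + granularity);
        }
    }

    public static void main(String[] args) {
        LocalDate d = LocalDate.of(2020, 3, 25);
        System.out.println(partitionName("p", d, "DAY"));  // p20200325
        System.out.println(partitionName("p", d, "WEEK")); // p2020_13
    }
}
```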
#3153
Implement support for subqueries in CASE WHEN statements, like:
```
SELECT CASE
WHEN (
SELECT COUNT(*) / 2
FROM t
) > k4 THEN (
SELECT AVG(k4)
FROM t
)
ELSE (
SELECT SUM(k4)
FROM t
)
END AS kk4
FROM t;
```
This statement will be rewritten to:
```
SELECT CASE
WHEN t1.a > k4 THEN t2.a
ELSE t3.a
END AS kk4
FROM t, (
SELECT COUNT(*) / 2 AS a
FROM t
) t1, (
SELECT AVG(k4) AS a
FROM t
) t2, (
SELECT SUM(k4) AS a
FROM t
) t3;
```
This commit supports non-correlated subqueries in the having clause.
For example:
select k1, sum(k2) from table group by k1 having sum(k2) > (select avg(k1) from table);
Non-scalar subqueries are also supported in Doris.
For example:
select k1, sum(k2) from table group by k1 having sum(k2) > (select avg(k1) from table group by k2);
Doris will check the number of rows returned by the subquery during execution.
If the subquery returns more than one row, the query will throw an exception.
The implementation method:
The entire outer query is regarded as an inline view of a new query, and the subquery in
the having clause becomes a where predicate of this new query. For example, the first query
above is roughly rewritten to
`select v.k1, v.s from (select k1, sum(k2) as s from table group by k1) v where v.s > (select avg(k1) from table);`.
After this commit, TPC-DS queries 23, 24 and 44 are supported.
This commit also supports subqueries in ArithmeticExpr.
For example:
select k1 from table where k1 = 0.9 * (select k1 from t);
This resolves ISSUE: #3139.
When a user executes a query through some client library such as python MysqlDb, and the statement looks like:
"select * from tbl1;" (with a semicolon at the end of the statement)
the sql parser will produce 2 statements: `SelectStmt` and `EmptyStmt`.
Here we discard the `EmptyStmt` to make it act like one single statement.
This is for compatibility. In python MysqlDb, if the first `SelectStmt` results in
some warnings, it will try to execute a `SHOW WARNINGS` statement right after the
SelectStmt, but before the execution of `EmptyStmt`. So there will be an exception:
`(2014, "Commands out of sync; you can't run this command now")`
I think this is a flaw of python MysqlDb.
However, in order to keep user behavior consistent, here we remove all `EmptyStmt`s
at the end to prevent errors (leaving at least one statement).
But if a user executes statements like:
`"select * from tbl1;;select 2"`
and the first `select * from tbl1` has warnings, python MysqlDb will still throw the exception.
Fix a bug with const union queries like `select null union select null`. This is because the type of the SlotDescriptor is null when the clause is `select null`, which causes a BE core dump and makes the FE find the wrong cast function.
Related issue: #2663, #2828.
This CL supports loading data into specified temporary partitions.
```
INSERT INTO tbl TEMPORARY PARTITIONS(tp1, tp2, ..) ....;
curl .... -H "temporary_partition: tp1, tp, .. " ....
LOAD LABEL db1.label1 (
DATA INFILE("xxxx")
INTO TABLE `tbl2`
TEMPORARY PARTITION(tp1, tp2, ...)
...
```
NOTICE: this CL changes the FE meta version to 77.
There are 3 major changes in this CL.
## Syntax reorganization
Reorganized the syntax related to `specify-partitions`: removed some redundant syntax
definitions and unified the related syntax under one syntax entry.
## Meta refactor
In order to be able to support specifying temporary partitions,
I made some changes to the way the partition information in the table is stored.
Partition information is now organized as follows:
The following two maps are reserved in OlapTable for storing formal partitions:
```
idToPartition
nameToPartition
```
Use the `TempPartitions` class for storing temporary partitions.
All the partition attributes of formal and temporary partitions,
such as the range, the number of replicas, and the storage medium, are stored
in the `partitionInfo` of the OlapTable.
In `partitionInfo`, we use two maps to store the ranges of formal partitions
and temporary partitions:
```
idToRange
idToTempRange
```
We use separate maps because the partition ranges of formal partitions and
temporary partitions may overlap, and separate maps make it easier to check the partition ranges.
All partition attributes except the partition range are stored in the same map,
with the partition id used as the map key.
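A minimal sketch of this layout (placeholder types and names, not the actual `PartitionInfo` code):
```java
import java.util.HashMap;
import java.util.Map;

// Sketch: formal and temporary partition ranges live in separate maps, while
// all other partition attributes share one map keyed by partition id.
public class PartitionInfoSketch {
    // Placeholder for Doris' partition range type.
    static class Range {}

    private final Map<Long, Range> idToRange = new HashMap<>();      // formal partitions
    private final Map<Long, Range> idToTempRange = new HashMap<>();  // temporary partitions
    private final Map<Long, Short> idToReplicaNum = new HashMap<>(); // example shared attribute

    void addPartition(long id, Range range, short replicaNum, boolean isTemp) {
        (isTemp ? idToTempRange : idToRange).put(id, range);
        idToReplicaNum.put(id, replicaNum); // attributes other than range are not separated
    }

    Range getRange(long id, boolean isTemp) {
        return isTemp ? idToTempRange.get(id) : idToRange.get(id);
    }
}
```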
## Methods to get a partition
A table may contain both formal and temporary partitions.
There are several methods to get a partition of a table,
typically divided into two categories:
1. Get partition by id
2. Get partition by name
Depending on the requirements, the caller may want to obtain
a formal partition or a temporary partition. The methods are
described below so that the partition can be obtained using the correct method.
1. Get by name
This type of request usually comes from a user with partition names, such as
`select * from tbl partition(p1);`.
This type of request has clear information to indicate whether to obtain a
formal or temporary partition.
Therefore, we need to get the partition through this method:
`getPartition(String partitionName, boolean isTemp)`
To avoid modifying too much code, we keep `getPartition(String
partitionName)`, which is the same as:
`getPartition(partitionName, false)`
2. Get by id
This type of request usually means that the previous step has obtained
certain partition ids in some way,
so we only need to get the corresponding partition through this method:
`getPartition(long partitionId)`.
This method will try to get both formal partitions and temporary partitions.
3. Get all partition instances
Depending on the requirements, the caller may want to obtain all formal partitions,
all temporary partitions, or all partitions. Therefore we provide 3 methods,
and the caller chooses according to need.
`getPartitions()`
`getTempPartitions()`
`getAllPartitions()`
Too many AlterJobsV2 objects may consume too much memory, which may cause FullGC. Clear some data of finished or cancelled alter jobs, and remove them when expired.
Sometimes a replica is broken on BE, but FE does not notice it.
In this case, we have to manually delete that replica on BE.
If there are hundreds of replicas that need to be handled, this is a disaster.
So I add a new stmt:
ADMIN SET REPLICA STATUS
which supports setting the replica of a tablet on a specified BE as BAD or OK,
for example: `ADMIN SET REPLICA STATUS PROPERTIES("tablet_id" = "10003", "backend_id" = "10001", "status" = "bad");`
This CL solves issue #3105.
I add a new temporary table state WAITING_STABLE.
When an alter job is ready to start, it checks whether the table is stable. If it is not stable,
the table state is set to WAITING_STABLE. In this state, the tablet repair logic will continue to
repair the tablets until the table becomes stable.
After that, the table state will be reset to SCHEMA_CHANGE/ROLLUP and alter operations will begin.
This is just a temporary state; it does not need to be persisted, and only the master FE can see it.
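A self-contained sketch of this flow (state and method names are illustrative, not the FE code):
```java
// Illustrative sketch of the WAITING_STABLE flow described above.
enum TableState { NORMAL, WAITING_STABLE, SCHEMA_CHANGE }

class AlterScheduler {
    TableState state = TableState.NORMAL;

    // Called periodically by the job scheduler (hypothetical method name).
    void trySchedule(boolean tableStable) {
        if (!tableStable) {
            // Not persisted; only the master FE sees this state. Tablet repair
            // continues until the table becomes stable.
            state = TableState.WAITING_STABLE;
            return;
        }
        state = TableState.SCHEMA_CHANGE; // or ROLLUP; alter operations begin
    }
}
```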
The backup job in BE only backs up indexes which are visible, but the backup meta in FE contains the shadow index. After restoring from such a snapshot, the shadow index is visible to the load process, but its tablets do not exist in BE, so the load process would be cancelled. We fix this bug by removing the useless shadow index during the backup process.
There are 2 changes in this CL:
## Support multiple statements in one request like:
```
select 10; select 20; select 30;
```
ISSUE: #3049
To quickly test this CL, you can use the mysql-client shell command tool:
```
mysql> delimiter //
mysql> select 1; select 2; //
+------+
| 1 |
+------+
| 1 |
+------+
1 row in set (0.01 sec)
+------+
| 2 |
+------+
| 2 |
+------+
1 row in set (0.02 sec)
Query OK, 0 rows affected (0.02 sec)
```
I add a new class called `OriginStatement.java` to save the original statement as a string together with an index. This class mainly handles the following case:
1. A user sends a multi-statement to a non-master FE:
`DDL1; DDL2; DDL3`
2. Currently we cannot separate the original string of a single statement from the multiple statements, so we have to forward the entire statement to the master FE. Therefore I add an index in the forward request: `DDL1`'s index is 0, `DDL2`'s index is 1, and so on.
3. When the master FE handles the forwarded request, it will parse the entire statement, get the 3 DDL statements, and use the `index` to pick the specified statement.
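A minimal sketch of the idea behind `OriginStatement` (field names follow the description above; the real class is in the FE code base):
```java
// Sketch: pairs the full original text with the index of the statement it
// refers to within that text.
public class OriginStatement {
    // The full original text sent by the client, possibly containing
    // several statements such as "DDL1; DDL2; DDL3".
    public final String originStmt;
    // Which statement of the multi-statement text this object refers to:
    // 0 for DDL1, 1 for DDL2, and so on.
    public final int idx;

    public OriginStatement(String originStmt, int idx) {
        this.originStmt = originStmt;
        this.idx = idx;
    }
}
```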
## Optimized the display of syntax errors
I have also optimized the display of syntax errors so that longer syntax errors can be fully displayed.
In a large-scale cluster, we may rolling-upgrade BEs. This patch adds a
column named 'Version' to the 'show backends;' command, as well as to the web page
'/system?path=//backends', to provide a way to check whether any
BE has not been upgraded.