doris

Author	SHA1	Message	Date
Mingyu Chen	a511042397	[Export] Forgot to set timeout for export job (#2516 )	2019-12-23 18:14:41 +08:00
HangyuanLiu	11b78008cd	Timezone variable supports single-digit hours (#2513 ) Support time zone variables like "-8:00", "+8:00", and "8:00". A value like "-8:00" is illegal as a time-zone ID, so we must convert it to the standard format.	2019-12-20 07:45:29 +08:00
Mingyu Chen	5111f8cfe8	[Export] Fix bug that NPE may be thrown when executing "show export;" (#2509 ) Some export jobs from old versions of Doris may not have the timeout property, which will cause an NPE. 2 more changes: 1. Change the default BE config "max_runnings_transactions" to 2000. 2. Add a new metric to FE to show the master ip:port.	2019-12-19 19:09:25 +08:00
EmmyMiao87	4220e3b3dc	Merge pull request #2486 from EmmyMiao87/assert_node Only specified function could be supported in correlated subquery	2019-12-19 10:21:06 +08:00
emmymiao87	53132b4199	Change the name of specified agg function	2019-12-18 19:35:49 +08:00
WingC	e1ff744a99	[Alter Job] Cancel the alter job after a task failed for 3 times (#2447 ) To avoid waiting for a timeout when it is an invalid alter job.	2019-12-18 19:17:34 +08:00
emmymiao87	8342eb0b02	Only UDA functions are supported in correlated subquery The queries in the following issues cannot be supported: #2483 #2493 These queries are forbidden: query1: select * from t1 where k1=(select k1 from t2 where t1.k2=t2.k2); query2: select * from t1 where k1=(select distinct k1 from t2 where t1.k2=t2.k2); Only the sum, max, min, avg and count functions can appear in the select clause of a correlated subquery. #2420 This query is legal: query1: select * from t1 where k1=(select avg(k1) from t2 where t1.k2=t2.k2);	2019-12-18 18:56:48 +08:00
kangpinghuang	63ea05f9c7	Add convert tablet rowset type (#2294 ) to solve issue #2246. The scheme is as follows: Add an optional preferred_rowset_type in TabletMeta for V2 format rollup index tablets. Add a boolean session variable use_v2_rollup; if set to true, the query will use the V2 storage format rollup index to process the query. Test queries will be sent to the online service to verify the correctness of segment-v2 by sending the same queries to FE with and without use_v2_rollup set and checking whether the returned results are the same.	2019-12-18 18:49:47 +08:00
WingC	c81b1db406	Support convert VARCHAR type to DATE type (#2489 )	2019-12-18 12:58:47 +08:00
EmmyMiao87	efd32f7a85	Remove unused import package (#2492 )	2019-12-18 10:55:56 +08:00
WingC	89003b774b	Support Convert Varchar to INT (#2481 )	2019-12-17 22:02:28 +08:00
EmmyMiao87	b1bac4d0cd	Support to create materialized view (#2431 ) This commit supports creating materialized views. The syntax of the stmt is as follows: CREATE MATERIALIZED VIEW [MV name] AS SELECT select_expr[, select_expr ...] FROM [Base table name] GROUP BY column_name[, column_name ...] ORDER BY column_name[, column_name ...] The CreateMaterializedViewClause is used to check the semantics of the stmt in the first step. For now, the WHERE, HAVING and LIMIT clauses are forbidden in CREATE MATERIALIZED VIEW. Also, the aggregation functions are restricted to SUM/MIN/MAX. The second step is to validate the stmt according to the metadata of the base table. For example, the aggregate type of an mv column must be the same as the aggregate type of the base column in an aggregate table. The last step is to prepare the index of the mv and add this new mvJob to the Handler. The handler will process this new mvJob asynchronously.	2019-12-17 21:12:24 +08:00
emmymiao87	3e58e2d543	Forbid the distinct function of subquery in binary predicate	2019-12-17 19:38:15 +08:00
EmmyMiao87	2c90915362	Support correlated non-scalar subquery (#2468 ) The first item of a non-scalar subquery can be a non-aggregate item such as column k1. This commit removes the prohibition.	2019-12-16 18:52:05 +08:00
xy720	c8c32658a7	Fix PIPE operator priority (#2459 ) This commit raises the priority of the \|\| operator above the + - * / mod operators. It solves problem 2.1 mentioned in issue #2396 . Problem 2.2 in issue #2396 is actually the same problem mentioned in issue #2142 . As noted in pr #2398 , modifying that logic would cause semantic errors in insert and load, so this commit leaves that bug unsolved for now. Appendix: In MySQL 5.7.27 \|\| and \| select 23\|1\|\|7; 23 select (23\|1)\|\|7 237 select 23\|(1\|\|7) 23 Priority : \|\| > \| \|\| and & select 10&1\|\|7; 0 select (10&1)\|\|7 7 select 10&(1\|\|7) 0 Priority : \|\| > & \|\| and ^ select 10^1\|\|7 27 select (10^1)\|\|7 117 select 10^(1\|\|7) 27 Priority : \|\| > ^ \|\| and ~ select ~1\|\|7 184467440737095516147 select ~(1\|\|7) 18446744073709551598 priority : \|\| < ~	2019-12-16 13:44:49 +08:00
Mingyu Chen	e65a645138	Add classes related to "tag". (#2343 ) [Tag System] This CL includes 2 parts: Add classes related to "tag" Resource: is the collective name of the nodes that provide various service capabilities in Doris cluster. Tag: A Tag consists of type and name. TagSet: TagSet represents a set of tags. TagManager: maintains 2 indexes: one is from tag to resource. one is from resource to tags ISSUE #1723 Using JSON as serialization methods of metadata Introduce GSON library to serialize the new classes mentioned above. ISSUE #2415 #2389 GSON's version is updated to 2.8.6	2019-12-15 20:13:29 +08:00
Seaven	e4cc17599f	Add plugin definition (#2351 )	2019-12-13 21:38:17 +08:00
kangkaisen	02c4edb98e	Add more HTTP log (#2458 )	2019-12-13 21:31:48 +08:00
Yunfeng,Wu	a17b28ccc1	Modify FE QueryPlan UT test failure by accident (#2455 )	2019-12-13 21:28:54 +08:00
kangkaisen	cf6d705df9	Add intersect_count UDAF (#2418 ) 1. Because we don't currently support the array type, I use variable arguments instead. 2. intersect_count directly returns the final count, not a bitmap like bitmap_union, because returning a bitmap from intersect_count is more complex and needs more serialization. If we really need the bitmap format from intersect_count, we could do that in another PR, which won't have compatibility problems.	2019-12-13 16:12:05 +08:00
Mingyu Chen	8ba3c9d777	[Tag System] Forbid cluster related operations (#2429 ) The multi cluster feature will be deprecated soon. Add a FE config "disable_cluster_feature", and default is true, to forbid any cluster related operations, include: * create/drop cluster * add free backend/add backend to cluster/decommission cluster balance * change the backends num of cluster * link/migration db * fix ut	2019-12-13 10:11:30 +08:00
caiconghui	59f5851c29	Fix bug for show tables from unknown database doesn't throw error (#2445 )	2019-12-12 23:18:52 +08:00
xy720	3af03d6283	Fix sql mode Bug (#2374 ) This commit fixes the following bug: FE throws an unexpected exception when encountering a query like: Set sql_mode = '0,PIPES_AS_CONCAT'. It also changes the sql mode analysis process: the analysis is no longer done in SetVar.class, but in VariableMgr.class.	2019-12-12 17:50:35 +08:00
Mingyu Chen	c39d35df4c	Add tablet compaction score metrics (#2427 ) [Metric] Add tablet compaction score metrics Backend: Add metric "tablet_max_compaction_score" to monitor the current max compaction score of tablets on this Backend. This metric will be updated each time the compaction thread picks tablets to compact. Frontend: Add metric "tablet_max_compaction_score" for each Backend. These metrics will be updated when backends report tablets. Also add a calculated metric "max_tablet_compaction_score" to monitor the max compaction score of tablets on all Backends.	2019-12-12 17:46:59 +08:00
kangkaisen	a5f52f80df	Add bitmap_hash function (#2439 ) Add a bitmap_hash function. Add a murmur_hash3_32 hash function.	2019-12-12 16:55:07 +08:00
Mingyu Chen	ded247f001	[Bug][Privilege] Missing current user identity when forwarding request to Master FE (#2443 ) The current user identity should be passed to Master FE in forward request.	2019-12-12 16:27:48 +08:00
yangzhg	bf31bd238b	Change default storage model from aggregate to duplicate(#2318 ) (#2412 ) change default storage model from aggregate to duplicate for sql `create table t (k1 int) DISTRIBUTED BY HASH(k1) BUCKETS 10 PROPERTIES("replication_num" = "1");` before: ``` CREATE TABLE `t` ( `k1` int(11) NULL COMMENT "" ) ENGINE=OLAP AGGREGATE KEY(`k1`) COMMENT "OLAP" DISTRIBUTED BY HASH(`k1`) BUCKETS 10 PROPERTIES ( "storage_type" = "COLUMN" ); ``` after: ``` CREATE TABLE `t` ( `k1` int(11) NULL COMMENT "" ) ENGINE=OLAP DUPLICATE KEY(`k1`) COMMENT "OLAP" DISTRIBUTED BY HASH(`k1`) BUCKETS 10 PROPERTIES ( "storage_type" = "COLUMN" ); ``` #2318	2019-12-12 14:30:30 +08:00
kangpinghuang	c07f37d78c	[Segment V2] Add a control framework between FE and BE through heartbeat #2247 (#2364 ) The control framework is implemented through heartbeat message. Use uint64_t as flags to control different functions. Now add a flag to set the default rowset type to beta.	2019-12-12 12:18:32 +08:00
xy720	7f2144e7e5	Upgrade JMockit from version 1.13 to 1.48 (#2423 )	2019-12-12 12:03:17 +08:00
kangkaisen	72cbf6f800	Add bitmap_union_count function (#2425 )	2019-12-11 22:28:20 +08:00
kangkaisen	036d7da290	Improve publish version performance (#2382 ) 1. Reduce the publish version interval. 2. Move the visible version check from `getReadyToPublishTransactions` to `finishTransaction`, and make the publish version task concurrent instead of serial. 3. In `getReadyToPublishTransactions`, sort the transactionState by CommitTime so that lower-version transactions are published first, reducing the wait time in `finishTransaction`.	2019-12-10 22:34:58 +08:00
Mingyu Chen	8e6535053c	[Tag System] Remove the 'isRestore' flag when creating table or partition (#2363 ) 'isRestore' flag is for the old version of backup and restore process, which is deprecated long time ago. Remove it. This commit is also for making a further step to ISSUE #1723.	2019-12-10 16:37:44 +08:00
WingC	af3d901a06	Convert INT type to DATE type (#2393 )	2019-12-07 21:56:52 +08:00
Mingyu Chen	a3b7cf484b	Set the load channel's timeout to be the same as the load job's timeout (#2405 ) [Load] When performing a long-running load job, the following error may occur and cause the load to fail: load channel manager add batch with unknown load id: xxx One cause of this error is that Doris opened an unrelated channel during the load process. This channel will not receive any data during the entire load process. Therefore, after a fixed timeout, the channel will be released. After the entire load job is completed, it will try to close all open channels. When it tries to close this channel, it finds that the channel no longer exists and reports an error. This CL passes the timeout of the load job to the load channel, so that the timeout of load channels is the same as the load job's.	2019-12-06 21:51:00 +08:00
Mingyu Chen	55d64e3be8	Remove the readFields() method in Writable interface (#2394 ) All classes that implement the Writable interface need only implement the write() method. The read() method should be implemented by each class itself according to its own situation.	2019-12-06 21:46:21 +08:00
Mingyu Chen	a46bf1ada3	[Authorization] Modify the authorization checking logic (#2372 ) Authorization checking logic: There are some problems with the current password and permission checking logic. For example: First, we create a user by: `create user cmy@"%" identified by "12345";` Then 'cmy' can log in with password '12345' from any host. Second, we create another user by: `create user cmy@"192.168.%" identified by "abcde";` Because "192.168.%" has a higher priority in the permission table than "%", when "cmy" tries to log in with password "12345" from host "192.168.1.1", the login should match the second permission entry and be rejected because of an invalid password. But in the current implementation, Doris continues to check the password against the first entry, then lets it pass. So we should change it. Permission checking logic: After a user logs in, the user should have a unique identity which is taken from the permission table. For example, when "cmy" logs in from host "192.168.1.1", its identity should be `cmy@"192.168.%"`. Doris should use this identity to check other permissions, not the user's real identity, which is `cmy@"192.168.1.1"`. Black list: Functionally speaking, Doris only supports adding a WHITE LIST, which allows a user to log in from the hosts in the white list. But in some cases, we do need a BLACK LIST function. Fortunately, by changing the logic described above, we can simulate the effect of a BLACK LIST. For example, first we add a user by: `create user cmy@'%' identified by '12345';` Now user 'cmy' can log in from any host. If we don't want 'cmy' to log in from host A, we can add a new user by: `create user cmy@'A' identified by 'other_passwd';` Because "A" has a higher priority in the permission table than "%", if 'cmy' tries to log in from A using password '12345', the login will be rejected.	2019-12-06 17:45:56 +08:00
HaiBo Li	9fbc1c7ee6	Support where/orderby/limit after "SHOW ALTER TABLE COLUMN" syntax (#2380 ) Features: 1. Support WHERE/ORDER BY/LIMIT 2. Columns: TableName, CreateTime, FinishTime, State 3. Only "AND" between conditions 4. TableName and State columns only support the "=" operator 5. CreateTime and FinishTime columns support the "=", ">=", "<=", ">", "<", "!=" operators 6. CreateTime and FinishTime columns support Date and DateTime strings, e.g. "2019-12-04" or "2019-12-04 17:18:00" TestCase: MySQL [haibotest]> show alter table column where State='FINISHED' and CreateTime > '2019-12-03' order by FinishTime desc limit 0,2; +-------+---------------+---------------------+---------------------+---------------+---------+---------------+---------------+---------------+----------+------+----------+---------+ \| JobId \| TableName \| CreateTime \| FinishTime \| IndexName \| IndexId \| OriginIndexId \| SchemaVersion \| TransactionId \| State \| Msg \| Progress \| Timeout \| +-------+---------------+---------------------+---------------------+---------------+---------+---------------+---------------+---------------+----------+------+----------+---------+ \| 11134 \| test_schema_2 \| 2019-12-03 19:21:42 \| 2019-12-03 19:22:11 \| test_schema_2 \| 11135 \| 11059 \| 1:192010000 \| 3 \| FINISHED \| \| N/A \| 86400 \| \| 11096 \| test_schema_3 \| 2019-12-03 19:21:31 \| 2019-12-03 19:21:51 \| test_schema_3 \| 11097 \| 11018 \| 1:2063361382 \| 2 \| FINISHED \| \| N/A \| 86400 \| +-------+---------------+---------------------+---------------------+---------------+---------+---------------+---------------+---------------+----------+------+----------+---------+ 2 rows in set (0.00 sec)	2019-12-06 16:24:44 +08:00
yangzhg	8e2277d997	Fix group by inf and nan duplicated (#2142 #2145 ) (#2401 )	2019-12-06 16:19:08 +08:00
yangzhg	597a8b2146	Revert "Fix arithmetic operation between numeric and non-numeric (#2362 )" (#2398 ) This reverts commit 6857ffe1c5976ef06003aa479279368bafc581f1.	2019-12-06 14:58:38 +08:00
yangzhg	6857ffe1c5	Fix arithmetic operation between numeric and non-numeric (#2362 ) Arithmetic operations between numeric and non-numeric operands could produce unexpected values. After this patch you will get: mysql> select 1 + "kks"; +-----------+ \| 1 + 'kks' \| +-----------+ \| 1 \| +-----------+ 1 row in set (0.02 sec) mysql> select 1 - "kks"; +-----------+ \| 1 - 'kks' \| +-----------+ \| 1 \| +-----------+ 1 row in set (0.01 sec)	2019-12-06 10:33:06 +08:00
EmmyMiao87	27d6794b81	Support subquery with non-scalar result in Binary predicate and Between-and predicate (#2360 ) This commit adds a new plan node named AssertNumRowsNode, which is used to determine whether the number of rows exceeds the limit. A subquery in a Binary predicate or Between-and predicate gets an AssertNumRowsNode that checks whether the subquery returns more than 1 row. If it does, the query will be cancelled. For example: There are 4 rows in table t1. Query: select c1 from t1 where c1=(select c2 from t1); Result: ERROR 1064 (HY000): Expected no more than 1 to be returned by expression select c2 from t1 ISSUE-2270 TPC-DS 6,54,58	2019-12-05 21:27:33 +08:00
Mingyu Chen	4f39d405ee	Fix some load bugs (#2384 ) For #2383 1. Limit the concurrent transactions of routine load job 2. Create new routine load task when txn is VISIBLE, not after COMMITTED. For #2267 1. All non-master daemon thread should also be started after catalog is ready. For #2354 1. `fixLoadJobMetaError()` should be called after all meta data is read, including image and edit logs. 2. Mini load job should set to CANCELLED when corresponding transaction is not found, instead of UNKNOWN.	2019-12-05 13:41:04 +08:00
WingC	102a845131	Support convert date to datetime through alter table (#2385 )	2019-12-05 07:37:45 +08:00
Mingyu Chen	92536272d3	Fixed bdbje heartbeat timeout config format bug (#2369 ) The heartbeat config format should be like "30 s", not "30" This CL is related to commit 261072ecdda7e8eb3ce685c557c6dab15488d1f3	2019-12-04 13:28:08 +08:00
Yunfeng,Wu	0f00febd21	Optimize Doris On Elasticsearch performance (#2237 ) Pure DocValue optimization for doris-on-es. Future todo: currently we check whether pure_docvalue is enabled for every tuple scanned, which is not reasonable; the check should be done once for the whole scan. I will add this in the future.	2019-12-04 12:57:45 +08:00
Mingyu Chen	f0c0a715d1	Add bdbje heartbeat timeout as a configuration of FE (#2366 ) The timeline for this issue is as follows: 1. For some reason, the master lost contact with the other two followers. Judging from the logs of the master, the master did not print any logs for almost 40 seconds. It is suspected that it was stuck due to full gc or other reasons, causing the other two followers to think that the master had been disconnected. 2. After the other two followers re-elected a master, they continued to provide services. 3. The master node was manually restarted afterwards. When restarted for the first time, it needed to roll back some committed logs, so it had to be closed and restarted again. After restarting again, it returned to normal. The main reason is that the master got stuck for 40 seconds for some reason. This issue requires further observation. In the meantime, to alleviate this problem, we decided to make bdbje's heartbeat timeout a configurable value. The default is 30 seconds. It can be configured to 1 minute to try to avoid this problem for now.	2019-12-04 08:56:37 +08:00
Mingyu Chen	c8cff85c94	Fixed a bug that HttpServer in unit test does not start correctly. (#2361 ) Because the http client in the unit test tries to connect to the server when the server is not ready yet.	2019-12-03 20:34:16 +08:00
Mingyu Chen	086bb82fd2	Fixed a bug that Load job's state is incorrect when upgrading from 0.10.x to 0.11.x (#2356 ) There is a bug in Doris version 0.10.x. When a load job in PENDING or LOADING state was replayed from image (not through the edit log), we forgot to add the corresponding callback id in the CallbackFactory. As a result, the subsequent finish txn edit logs cannot properly finish the job during the replay process. Consequently, when the FE restarts, these load jobs that should have been completed re-enter the pending state, resulting in repeated submission of load tasks. Those wrong images are unrecoverable, so we have to reset all load jobs in PENDING or LOADING state when restarting FE, depending on the corresponding txn's status, to avoid submitting jobs repeatedly. If the corresponding txn exists, set the load job's state depending on the txn's status. If the txn does not exist, the txn may have been removed due to label expiration, so we don't know whether the txn was aborted or visible. In that case we have to set the job's state to UNKNOWN, which needs to be handled manually.	2019-12-03 16:02:50 +08:00
lichaoyong	875790eb13	Remove VersionHash used for comparison in FE (#2335 )	2019-12-02 19:59:13 +08:00
Mingyu Chen	d90995c410	Make node info metrics available on all FE nodes (#2353 ) Previously, only the Master FE had node info metrics to indicate which node is alive. But this info should be available on every FE, so that the monitor system can get all metrics from any FE.	2019-12-02 17:31:32 +08:00
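
The login rule described in commit a46bf1ada3 (match the highest-priority host entry first, check the password only against that entry, and use the entry as the session identity) can be illustrated with a minimal Python sketch. This is not Doris code: the names `Entry`, `host_matches`, `specificity` and `login` are hypothetical, and ranking entries by their count of literal characters is an assumption standing in for the permission table's real priority order.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Entry:
    """One hypothetical permission-table row: user@host_pattern plus password."""
    user: str
    host_pattern: str  # MySQL-style pattern; "%" matches any run of characters
    password: str


def host_matches(pattern: str, host: str) -> bool:
    # Translate "%" into ".*" and escape everything else literally.
    regex = ".*".join(re.escape(part) for part in pattern.split("%"))
    return re.fullmatch(regex, host) is not None


def specificity(pattern: str) -> int:
    # Assumption: more literal characters means a higher-priority entry.
    return len(pattern.replace("%", ""))


def login(entries: List[Entry], user: str, host: str, password: str) -> Optional[str]:
    """Return the matched identity, or None if the login is rejected."""
    candidates = [e for e in entries
                  if e.user == user and host_matches(e.host_pattern, host)]
    if not candidates:
        return None
    best = max(candidates, key=lambda e: specificity(e.host_pattern))
    if best.password != password:
        return None  # rejected: no fall-through to lower-priority entries
    return f'{best.user}@"{best.host_pattern}"'


entries = [
    Entry("cmy", "%", "12345"),
    Entry("cmy", "192.168.%", "abcde"),
    Entry("cmy", "A", "other_passwd"),  # simulates a black list for host A
]
```

With these entries, `login(entries, "cmy", "192.168.1.1", "12345")` is rejected even though the password matches the `"%"` entry, and a login from host `A` with `'12345'` is rejected as well, which is the black-list effect the commit message describes.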

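The time zone conversion described in commit 11b78008cd can be sketched in the same way. The sketch assumes that the "standard format" is a signed, zero-padded `+HH:MM` offset; the commit message does not spell the format out, and `normalize_offset` is a hypothetical name, not a Doris function.

```python
import re

# Optional sign, one or two hour digits, a colon, two minute digits.
_OFFSET = re.compile(r"^([+-]?)(\d{1,2}):(\d{2})$")


def normalize_offset(value: str) -> str:
    """Pad a short offset such as "8:00" or "-8:00" to "+08:00" / "-08:00".

    Values that do not look like an offset (for example a named zone)
    are returned unchanged. No range validation is done in this sketch.
    """
    match = _OFFSET.match(value.strip())
    if match is None:
        return value
    sign, hours, minutes = match.groups()
    return f"{sign or '+'}{int(hours):02d}:{minutes}"
```

For example, `normalize_offset("8:00")` yields `"+08:00"`, while a named zone such as `"Asia/Shanghai"` passes through untouched.
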

646 Commits