doris

Author	SHA1	Message	Date
yangzhg	3b5a0b6060	[TPCDS] Implement the planner for set operation (#2957 ) Implement intersect and except planner. This CL does not implement intersect and except node in execution level.	2020-02-27 16:03:31 +08:00
EmmyMiao87	a3e588f39c	[MaterializedView] Implement new materialized view selector (#2821 ) This commit mainly implements the new materialized view selector which supports SPJ<->SPJG. Two parameters are currently used to regulate this function. 1. test_materialized_view: When this parameter is set to true, the user can create a materialized view for the duplicate table by using 'CREATE MATERIALIZED VIEW' command. At the same time, if the result of the new materialized views is different from the old version during the query, an error will be reported. This parameter is false by default, which means that the new version of the materialized view function cannot be enabled. 2. use_old_mv_selector: When this parameter is set to true, the result of the old version selector will be selected. If set to false, the result of the new version selector will be selected. This parameter is true by default, which means that the old selector is used. If the default values of the above two parameters do not change, there will be no behavior changes in the current version. The main steps for the new selector are as follows: 1. Predicates stage: This stage will mainly filter out all materialized views that do not meet the current query requirements. 2. Priorities stage: This stage will sort the results of the first stage and choose the best materialized view. The predicates phase is divided into 6 steps: 1. Calculate the predicate gap between the current query and view. 2. Whether the columns in the view can meet the needs of the compensating predicates. 3. Determine whether the group by columns of view match the group by columns of query. 4. Determine whether the aggregate columns of view match the aggregate columns of query. 5. Determine whether the output columns of view match the output columns of query. 6. Add partial materialized views. The priorities phase is divided into two steps: 1. Find the materialized view that matches the best prefix index. 2. Find the materialized view with the least amount of data. The biggest difference between the current materialized view selector and the previous one is that it supports SPJ <-> SPJG.	2020-02-27 09:14:32 +08:00
Mingyu Chen	8f71b1025a	[Bug][Broker] Fix bug that Broker's alive status is inconsistent in different FEs In this CL, the isAlive field in the FsBroker class is persisted in metadata, to solve the problem described in ISSUE: #2989 Notice: this CL updates FeMetaVersion to 73	2020-02-25 22:27:27 +08:00
kangkaisen	fb5b58b75a	Add more constraints for bitmap column (#2966 )	2020-02-24 10:41:18 +08:00
Mingyu Chen	35b09ecd66	[JDK] Support OpenJDK (#2804 ) Support compiling and running the Frontend process and the Broker process with OpenJDK. OpenJDK 13 is tested.	2020-02-20 23:47:02 +08:00
kangkaisen	ece8740c1b	Fix some function DATE type priority (#2952 ) 1. Fix the bug introduced by https://github.com/apache/incubator-doris/pull/2947. The following sql result is 0000, which is wrong. The result should be 1601 ``` select date_format('2020-02-19 16:01:12','%H%i'); ``` 2. Add a constant expression plan test to ensure that the FE computes constant expressions correctly. 3. Remove the `castToInt` function in `FEFunctions`, which duplicates `CastExpr::getResultValue` 4. Implement the `getNodeExplainString` method for `UnionNode`	2020-02-20 20:45:45 +08:00
Mingyu Chen	a015cd0c8b	[Alter] Change table's state right after all rollup jobs being cancelled	2020-02-19 19:45:35 +08:00
kangkaisen	a76f2b8211	bitmap_union_count support window function (#2902 )	2020-02-19 14:33:05 +08:00
yangzhg	7be2871c36	[GroupingSet] Disable column both in select list and aggregate functions when using GROUPING SETS/CUBE/ROLLUP (#2921 )	2020-02-18 13:56:56 +08:00
kangkaisen	625411bd28	Doris support in memory olap table (#2847 )	2020-02-18 10:45:54 +08:00
Mingyu Chen	0fb52c514b	[UDF] Fix bug that UDF can't handle constant null value (#2914 ) This CL modifies the `evalExpr()` of ExpressionFunctions, so that it won't change the `FunctionCallExpr` to `NullLiteral` when there is a null parameter in a UDF. This fixes the problem described in ISSUE: #2913	2020-02-17 22:13:50 +08:00
Mingyu Chen	1e3b0d31ea	[Rollup] Change table's state right after all rollup jobs are done (#2904 ) In the current implementation, the state of the table will not be set until the next round of job scheduling. So there may be tens of seconds between job completion and the table state changing to NORMAL. Also, I made the synchronized range smaller by replacing the synchronized methods with synchronized blocks, which may solve the problem described in #2903	2020-02-14 21:28:51 +08:00
yangzhg	ed95352ecd	support intersect and except syntax (#2882 )	2020-02-13 16:48:46 +08:00
yangzhg	3e160aeb66	[GroupingSet] fix a bug when using grouping set without all column in a grouping set item (#2877 ) Fix a bug where using grouping sets without all columns in a grouping set item produces wrong values. Fix the grouping function check not working in the group by clause.	2020-02-12 21:50:12 +08:00
wangbo	1f001481ae	Support batch add and drop rollup indexes #2671 (#2781 )	2020-02-11 12:58:01 +08:00
Mingyu Chen	bb4a7381ae	[UnitTest] Support starting mocked FE and BE process in unit test (#2826 ) This CL implements a simulated FE process and a simulated BE service. You can view their specific usage in `fe/src/test/java/org/apache/doris/utframe/DemoTest.java` At the same time, I modified the configuration of the maven-surefire-plugin, so that each unit test runs in a separate JVM, which avoids conflicts caused by various singleton classes in FE. Starting a separate JVM for each unit test brings about 30% extra time overhead. However, you can control the concurrency of unit tests by setting the `forkCount` configuration of the maven-surefire-plugin in `fe/pom.xml`. The default configuration is still 1 for easy viewing of the output log. If set to 3, the entire FE unit test run time is about 4 minutes.	2020-02-03 21:17:57 +08:00
Mingyu Chen	bb00f7e656	[Load] Fix bug of wrong file group aggregation when handling broker load job (#2824 ) Describe the bug: First, in the broker load, we allow users to add multiple data descriptions. Each data description represents a description of a file (or set of files), including file path, delimiter, table and partitions to be loaded, and other information. When the user specifies multiple data descriptions, Doris currently aggregates the data descriptions belonging to the same table and generates a unified load task. The problem here is that although different data descriptions point to the same table, they may specify different partitions. Therefore, the aggregation of data descriptions should consider not only the table level, but also the partition level. Examples are as follows: data description 1 is: ``` DATA INFILE("hdfs://hdfs_host:hdfs_port/input/file1") INTO TABLE `tbl1` PARTITION (p1, p2) ``` data description 2 is: ``` DATA INFILE("hdfs://hdfs_host:hdfs_port/input/file2") INTO TABLE `tbl1` PARTITION (p3, p4) ``` What the user expects is to load file1 into partitions p1 and p2 of tbl1, and load file2 into partitions p3 and p4 of the same table. But currently, they will be aggregated together, which results in loading file1 and file2 into all partitions p1, p2, p3 and p4. Second, the following 2 data descriptions are not allowed: ``` DATA INFILE("hdfs://hdfs_host:hdfs_port/input/file1") INTO TABLE `tbl1` PARTITION (p1, p2) DATA INFILE("hdfs://hdfs_host:hdfs_port/input/file2") INTO TABLE `tbl1` PARTITION (p2, p3) ``` They have an overlapping partition (p2), which is not supported yet. We should throw an Exception to cancel this load job. Third, there is a problem with the code implementation. In the constructor of `OlapTableSink.java`, we pass in a string of partition names separated by commas. But at the `OlapTableSink` level, we should be able to pass in a list of partition ids directly, instead of names. ISSUE: #2823	2020-02-03 20:15:13 +08:00
xy720	2a30ac2ba5	[SQL] Return NullLiteral in castTo method instead of throwing an exception (#2799 )	2020-01-21 10:20:31 +08:00
caiconghui	9dc9051930	Remove unused code for ShowPartitionsStmtTest and add apache license header (#2808 )	2020-01-20 22:51:26 +08:00
caiconghui	58ff952837	[Stmt] Support new show functions syntax so that users can search functions more conveniently (#2800 ) SHOW [FULL] [BUILTIN] FUNCTIONS [IN|FROM db] [LIKE 'function_pattern'];	2020-01-20 14:14:42 +08:00
WingC	92d8f6ae78	[Alter] Allow submitting alter jobs when table is unstable Alter job will wait table to be stable before running.	2020-01-18 22:56:37 +08:00
caiconghui	ae018043b0	[Alter] Support replication_num setting for table level (#2737 ) Support setting replication_num at table level, so there is no need for users to set replication_num in every alter table add partition statement. e.g.: `alter table tbl set ("default.replication_num" = "2");`	2020-01-18 21:17:22 +08:00
yangzhg	fc55423032	[SQL] Support Grouping Sets, Rollup and Cube to extend group by statement Support the GROUPING SETS syntax: ``` SELECT a, b, SUM( c ) FROM tab1 GROUP BY GROUPING SETS ( (a, b), (a), (b), ( ) ); ``` and CUBE or ROLLUP, like: ``` SELECT a, b, c, SUM( d ) FROM tab1 GROUP BY ROLLUP|CUBE(a,b,c) ``` [ADD] Support grouping functions in expressions, like grouping(a) + grouping(b) (#2039) [FIX] Fix analyzer error in window function (#2039)	2020-01-17 16:24:02 +08:00
xy720	463c0e87ec	Replace PowerMock/EasyMock by Jmockit (4/4) (#2784 ) This commit replaces the PowerMock/EasyMock in our unit tests. (All)	2020-01-17 14:09:00 +08:00
xy720	753a7dd73a	Replace PowerMock/EasyMock by Jmockit (3/4)	2020-01-16 13:24:43 +08:00
xy720	9bc306d17c	Replace PowerMock/EasyMock by Jmockit (2/4) (#2749 )	2020-01-15 20:31:30 +08:00
yangzhg	ef6cd9ae25	Add files to gitignore (#2753 )	2020-01-14 22:29:56 +08:00
xy720	273edced77	Replace PowerMock/EasyMock by Jmockit (1/3) (#2732 ) This commit replaces the PowerMock/EasyMock in our unit tests, But not all. PS.(The tests relevant to DescribeStmt are ignored until I find a way to fix it)	2020-01-13 21:28:18 +08:00
yangzhg	425b1cf29b	Fix port already in use (#2716 )	2020-01-09 16:01:17 +08:00
xy720	f7cea6dda5	CreateViewStmt/AlterViewStmt support cte and fix bug (#2641 ) This commit contains the following changes: 1. Let create/alter view statement support cte sql. (Issue #2625 ) e.g. ``` Alter view test_tbl_view (h1, h2) as with testTbl_cte (w1, w2) as ( select col1, col2 from testDb.testTbl ) select w1 as c1, sum(w2) as c2 from testTbl_cte where w1 > 10 group by w1 order by w1 ``` 2. Fix the bug that view's schema remains unchanged after replaying alter view. (Issue #2624 )	2020-01-08 23:11:38 +08:00
Mingyu Chen	d4a3b34319	[Meta Serialization] Support GSON serialization for class "Type" (#2709 ) "Type" is an abstract class with 4 subclasses: 1. ScalarType 2. ArrayType 3. MapType 4. StructType This CL only supports ScalarType. Other types can be added later.	2020-01-08 19:56:56 +08:00
WingC	af9529a207	[Dynamic Partition] Support for automatically adding partitions In some scenarios, when a user creates an olap table that is range partition by time, the user needs to periodically add and remove partitions to ensure that the data is valid. As a result, adding and removing partitions dynamically can be very useful for users.	2020-01-03 23:45:04 +08:00
caiconghui	42dfe1369b	Add filter conditions for 'show partitions from table' syntax (#2553 ) Add filter conditions for show partitions from table syntax, to filter partitions needed	2020-01-03 19:52:25 +08:00
yangzhg	c098178f7a	[Index] Implements create drop show index syntax for bitmap index [#2487 ] (#2573 ) ### create table with index ``` CREATE TABLE table1 ( siteid INT DEFAULT '10', citycode SMALLINT, username VARCHAR(32) DEFAULT '', pv BIGINT SUM DEFAULT '0', INDEX index_name [USING BITMAP] (siteid, citycode) COMMENT 'balabala' ) AGGREGATE KEY(siteid, citycode, username) DISTRIBUTED BY HASH(siteid) BUCKETS 10 PROPERTIES("replication_num" = "1"); ``` ### create index ``` CREATE INDEX index_name ON table1 (siteid, citycode) [USING BITMAP] COMMENT 'balabala'; or ALTER TABLE table1 ADD INDEX index_name [USING BITMAP] (siteid, citycode) COMMENT 'balabala'; ``` ### drop index ``` DROP INDEX index_name ON table1; or ALTER TABLE table1 DROP INDEX index_name ``` ### show index ``` SHOW INDEX[ES] FROM table1 ``` output ``` +---------+-------------+-----------------+------------+---------+ | Table | Index_name | Column_name | Index_type | Comment | +---------+-------------+-----------------+------------+---------+ | table1 | index_name | siteid,citycode | BITMAP | balabala| +---------+-------------+-----------------+------------+---------+ ```	2020-01-03 17:41:26 +08:00
xy720	1113f951c3	Alter view stmt (#2522 ) This commit adds a new statement named alter view, like ALTER VIEW view_name ( col_1, col_2, col_3, ) AS SELECT k1, k2, SUM(v1) FROM exampleDb.testTbl GROUP BY k1,k2	2019-12-27 14:02:56 +08:00
Mingyu Chen	6f3c50a95c	[Document] Add example for using CTE in INSERT operation (#2572 )	2019-12-26 10:00:34 +08:00
HangyuanLiu	11b78008cd	Timezone variable support one digital time (#2513 ) Support time zone variables like "-8:00", "+8:00", "8:00". A time zone variable like "-8:00" is not a legal time-zone ID, so we must convert it to the standard format.	2019-12-20 07:45:29 +08:00
EmmyMiao87	b1bac4d0cd	Support to create materialized view (#2431 ) This commit supports creating a materialized view. The syntax of the stmt is as follows: CREATE MATERIALIZED VIEW [MV name] AS SELECT select_expr[, select_expr ...] FROM [Base table name] GROUP BY column_name[, column_name ...] ORDER BY column_name[, column_name ...] In the first step, the CreateMaterializedViewClause is used to check the semantics of the stmt. For now, the where, having and limit clauses are forbidden in CREATE MATERIALIZED VIEW. The aggregation function is also restricted to SUM/MIN/MAX. The second step is to validate the stmt according to the metadata of the base table. For example, the aggregate type of an mv column must be the same as the aggregate type of the base column in an aggregate table. The last step is to prepare the index of the mv and add the new mvJob to the Handler. The handler will process this new mvJob asynchronously.	2019-12-17 21:12:24 +08:00
Mingyu Chen	e65a645138	Add classes related to "tag". (#2343 ) [Tag System] This CL includes 2 parts: Add classes related to "tag" Resource: is the collective name of the nodes that provide various service capabilities in Doris cluster. Tag: A Tag consists of type and name. TagSet: TagSet represents a set of tags. TagManager: maintains 2 indexes: one is from tag to resource. one is from resource to tags ISSUE #1723 Using JSON as serialization methods of metadata Introduce GSON library to serialize the new classes mentioned above. ISSUE #2415 #2389 GSON's version is updated to 2.8.6	2019-12-15 20:13:29 +08:00
Yunfeng,Wu	a17b28ccc1	Modify FE QueryPlan UT test failure by accident (#2455 )	2019-12-13 21:28:54 +08:00
Mingyu Chen	8ba3c9d777	[Tag System] Forbid cluster related operations (#2429 ) The multi cluster feature will be deprecated soon. Add a FE config "disable_cluster_feature", and default is true, to forbid any cluster related operations, include: * create/drop cluster * add free backend/add backend to cluster/decommission cluster balance * change the backends num of cluster * link/migration db * fix ut	2019-12-13 10:11:30 +08:00
caiconghui	59f5851c29	Fix bug for show tables from unknown database doesn't throw error (#2445 )	2019-12-12 23:18:52 +08:00
xy720	3af03d6283	Fix sql mode Bug (#2374 ) This commit fixes the following bug: FE throws an unexpected exception when it encounters a query like: Set sql_mode = '0,PIPES_AS_CONCAT'. It also changes the sql mode analysis process, which now lives in VariableMgr.class instead of SetVar.class.	2019-12-12 17:50:35 +08:00
Mingyu Chen	c39d35df4c	Add tablet compaction score metrics (#2427 ) [Metric] Add tablet compaction score metrics Backend: Add metric "tablet_max_compaction_score" to monitor the current max compaction score of tablets on this Backend. This metric will be updated each time the compaction thread picks tablets to compact. Frontend: Add metric "tablet_max_compaction_score" for each Backend. These metrics will be updated when backends report tablets. Also add a calculated metric "max_tablet_compaction_core" to monitor the max compaction score of tablets on all Backends.	2019-12-12 17:46:59 +08:00
xy720	7f2144e7e5	Upgrade JMockit from version 1.13 to 1.48 (#2423 )	2019-12-12 12:03:17 +08:00
Mingyu Chen	8e6535053c	[Tag System] Remove the 'isRestore' flag when creating table or partition (#2363 ) The 'isRestore' flag is for the old version of the backup and restore process, which was deprecated a long time ago. Remove it. This commit is also a further step towards ISSUE #1723.	2019-12-10 16:37:44 +08:00
Mingyu Chen	a3b7cf484b	Set the load channel's timeout to be the same as the load job's timeout (#2405 ) [Load] When performing a long-running load job, the following error may occur and cause the load to fail: load channel manager add batch with unknown load id: xxx One cause of this error is that Doris opened an unrelated channel during the load process. This channel will not receive any data during the entire load process. Therefore, after a fixed timeout, the channel will be released. After the entire load job is completed, it will try to close all open channels. When it tries to close this channel, it will find that the channel no longer exists and an error is reported. This CL passes the timeout of the load job to the load channel, so that the timeout of load channels will be the same as the load job's.	2019-12-06 21:51:00 +08:00
Mingyu Chen	55d64e3be8	Remove the readFields() method in Writable interface (#2394 ) All classes that implement the Writable interface need only implement the write() method. The read() method should be implemented by each class itself according to its own situation.	2019-12-06 21:46:21 +08:00
Mingyu Chen	a46bf1ada3	[Authorization] Modify the authorization checking logic (#2372 ) Authorization checking logic: There are some problems with the current password and permission checking logic. For example: First, we create a user by: `create user cmy@"%" identified by "12345";` Then 'cmy' can log in with password '12345' from any host. Second, we create another user by: `create user cmy@"192.168.%" identified by "abcde";` "192.168.%" has a higher priority in the permission table than "%". So when "cmy" tries to log in with password "12345" from host "192.168.1.1", the login should match the second permission entry and be rejected because of the invalid password. But in the current implementation, Doris will continue to check the password against the first entry and then let it pass. So we should change it. Permission checking logic: After a user logs in, it should have a unique identity which is obtained from the permission table. For example, when "cmy" logs in from host "192.168.1.1", its identity should be `cmy@"192.168.%"`. Doris should use this identity to check other permissions, instead of using the user's real identity, which is `cmy@"192.168.1.1"`. Black list: Functionally speaking, Doris only supports adding a WHITE LIST, which allows users to log in from the hosts in the white list. But in some cases, we do need a BLACK LIST function. Fortunately, by changing the logic described above, we can simulate the effect of a BLACK LIST. For example, first we add a user by: `create user cmy@'%' identified by '12345';` Now user 'cmy' can log in from any host. If we don't want 'cmy' to log in from host A, we can add a new user by: `create user cmy@'A' identified by 'other_passwd';` Because "A" has a higher priority in the permission table than "%", if 'cmy' tries to log in from A using password '12345', the login will be rejected.	2019-12-06 17:45:56 +08:00
HaiBo Li	9fbc1c7ee6	Support where/orderby/limit after "SHOW ALTER TABLE COLUMN" syntax (#2380 ) Features: 1. Support WHERE/ORDER BY/LIMIT 2. Columns: TableName, CreateTime, FinishTime, State 3. Only "AND" is allowed between conditions 4. The TableName and State columns only support the "=" operator 5. The CreateTime and FinishTime columns support the "=", ">=", "<=", ">", "<", "!=" operators 6. The CreateTime and FinishTime columns support Date and DateTime strings, e.g. "2019-12-04" or "2019-12-04 17:18:00" TestCase: MySQL [haibotest]> show alter table column where State='FINISHED' and CreateTime > '2019-12-03' order by FinishTime desc limit 0,2; +-------+---------------+---------------------+---------------------+---------------+---------+---------------+---------------+---------------+----------+------+----------+---------+ | JobId | TableName | CreateTime | FinishTime | IndexName | IndexId | OriginIndexId | SchemaVersion | TransactionId | State | Msg | Progress | Timeout | +-------+---------------+---------------------+---------------------+---------------+---------+---------------+---------------+---------------+----------+------+----------+---------+ | 11134 | test_schema_2 | 2019-12-03 19:21:42 | 2019-12-03 19:22:11 | test_schema_2 | 11135 | 11059 | 1:192010000 | 3 | FINISHED | | N/A | 86400 | | 11096 | test_schema_3 | 2019-12-03 19:21:31 | 2019-12-03 19:21:51 | test_schema_3 | 11097 | 11018 | 1:2063361382 | 2 | FINISHED | | N/A | 86400 | +-------+---------------+---------------------+---------------------+---------------+---------+---------------+---------------+---------------+----------+------+----------+---------+ 2 rows in set (0.00 sec)	2019-12-06 16:24:44 +08:00
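
The login check described in commit a46bf1ada3 can be sketched roughly as follows. This is a minimal illustration, not the Doris implementation: `AuthSketch`, `Entry`, `addUser` and `authenticate` are hypothetical names, and host patterns are assumed to support only a trailing `%` wildcard. Entries are kept sorted so that more specific host patterns come first, and the password is checked only against the first matching entry, with no fall-through to lower-priority entries.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

public class AuthSketch {
    // Hypothetical permission entry: user name, host pattern, password.
    static final class Entry {
        final String user;
        final String hostPattern;
        final String password;

        Entry(String user, String hostPattern, String password) {
            this.user = user;
            this.hostPattern = hostPattern;
            this.password = password;
        }

        // Assumption: only a trailing '%' is treated as a wildcard.
        boolean isWildcard() {
            return hostPattern.endsWith("%");
        }

        // Length of the literal part of the pattern, used as its priority.
        int literalLength() {
            return isWildcard() ? hostPattern.length() - 1 : hostPattern.length();
        }

        boolean matchesHost(String host) {
            if (isWildcard()) {
                return host.startsWith(hostPattern.substring(0, literalLength()));
            }
            return hostPattern.equals(host);
        }

        // The identity used for later permission checks, e.g. cmy@'192.168.%'.
        String identity() {
            return user + "@'" + hostPattern + "'";
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    void addUser(String user, String hostPattern, String password) {
        entries.add(new Entry(user, hostPattern, password));
        // Exact hosts first, then wildcard patterns with the longest literal prefix.
        entries.sort((a, b) -> {
            if (a.isWildcard() != b.isWildcard()) {
                return a.isWildcard() ? 1 : -1;
            }
            return Integer.compare(b.literalLength(), a.literalLength());
        });
    }

    // Returns the matched identity on success, or empty if the login is rejected.
    Optional<String> authenticate(String user, String host, String password) {
        for (Entry e : entries) {
            if (e.user.equals(user) && e.matchesHost(host)) {
                // Only the highest-priority matching entry decides the outcome.
                return e.password.equals(password)
                        ? Optional.of(e.identity())
                        : Optional.empty();
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        AuthSketch auth = new AuthSketch();
        auth.addUser("cmy", "%", "12345");
        auth.addUser("cmy", "192.168.%", "abcde");
        System.out.println(auth.authenticate("cmy", "10.0.0.1", "12345"));
        System.out.println(auth.authenticate("cmy", "192.168.1.1", "12345"));
        System.out.println(auth.authenticate("cmy", "192.168.1.1", "abcde"));
    }
}
```

With the two users from the commit message, a login from "192.168.1.1" with password "12345" is rejected because the `192.168.%` entry takes priority, which is also how the black list behaviour is simulated.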

210 Commits