The format of some docs is incorrect for building the doc website.
* Fix a bug where the `gensrc` dir cannot be built with `-j`.
* Fix a unit test bug in CreateFunctionTest.
This CL implements 3 new operations:
```
ALTER TABLE tbl ADD TEMPORARY PARTITION ...;
ALTER TABLE tbl DROP TEMPORARY PARTITION ...;
ALTER TABLE tbl REPLACE TEMPORARY PARTITION (p1, p2, ...);
```
The user manual can be found in:
`docs/documentation/cn/administrator-guide/alter-table/alter-table-temp-partition.md`
I did not update the grammar manual `alter-table.md`; it is too big and confusing, and I will reorganize it later.
This is the first part of the "overwrite load" feature mentioned in issue #2663.
I will implement the "load to temp partition" feature in the next PR.
This CL also adds GSON serialization methods for the following classes (not used yet):
```
Partition.java
MaterializedIndex.java
Tablet.java
Replica.java
```
The issue is #3011.
Reset the tablet and scan range info before computing it.
The old rollup selector has already computed the tablet and scan range info, and the new MV selector may sometimes compute it again. So we need to reset that info here.
Before this commit, the result was doubled for the query `select k1, k2 from aggregate_table`.
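The shape of the fix, as a sketch with hypothetical field and method names (not the actual scan node code):
```
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: clear state computed by an earlier selector pass
// before recomputing, so tablets are not counted twice.
class ScanNodeSketch {
    private final List<Long> scanTabletIds = new ArrayList<>();
    private final List<String> scanRangeLocations = new ArrayList<>();

    void computeTabletAndScanRangeInfo() {
        // The old rollup selector may already have filled these lists;
        // reset them so a second pass starts from a clean state.
        scanTabletIds.clear();
        scanRangeLocations.clear();
        // ... recompute tablets and scan ranges here ...
    }
}
```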
This PR removes some unused logging for unauthorized exceptions. Some unauthorized accesses, such as LVS probe requests, may cause connection exceptions that we should ignore.
There is a case where the META upload succeeded but the INFO upload failed. In that case the UPLOAD_INFO task will retry, but the META file has already succeeded and `filename.part` has been renamed to `filename.md5sum`. The retry keeps failing on the rename, and the backup job can never complete. Therefore, the `file.md5sum` file needs to be deleted in advance.
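A generic illustration of the retry-safe rename (class, method, and variable names are hypothetical; the actual change lives in the upload task):
```
import java.io.File;
import java.io.IOException;

// Hypothetical sketch: make the final rename idempotent so a retried
// UPLOAD_INFO task does not fail on a leftover .md5sum file.
class UploadTaskSketch {
    void finishFile(File partFile, File md5sumFile) throws IOException {
        // A previous attempt may already have renamed filename.part to
        // filename.md5sum; delete the stale target so the rename can succeed.
        if (md5sumFile.exists() && !md5sumFile.delete()) {
            throw new IOException("failed to delete stale file: " + md5sumFile);
        }
        if (!partFile.renameTo(md5sumFile)) {
            throw new IOException("failed to rename " + partFile + " to " + md5sumFile);
        }
    }
}
```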
Fix #3001
This commit mainly implements the new materialized view selector, which supports SPJ<->SPJG.
Two parameters currently control this feature:
1. test_materialized_view: When this parameter is set to true, the user can create a materialized view on a duplicate table using the `CREATE MATERIALIZED VIEW` command.
At the same time, if the result of the new materialized view selector differs from the old version during a query, an error will be reported. This parameter is false by default, which means the new materialized view function is disabled.
2. use_old_mv_selector: When this parameter is set to true, the result of the old selector is used. If set to false, the result of the new selector is used. This parameter is true by default, which means the old selector is used.
If the default values of these two parameters are unchanged, there are no behavior changes in the current version.
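A condensed sketch of how the two flags could interact (all names here are assumptions for illustration, not the actual code):
```
// Hypothetical gating logic for the two parameters.
class SelectorGateSketch {
    boolean testMaterializedView = false; // default: new MV function disabled
    boolean useOldMvSelector = true;      // default: old selector's result is used

    long selectBestIndexId(long oldChoice, long newChoice) {
        if (testMaterializedView && oldChoice != newChoice) {
            // Report an error when the two selectors disagree during a query.
            throw new IllegalStateException("new MV selector result differs from old selector");
        }
        return useOldMvSelector ? oldChoice : newChoice;
    }
}
```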
The main steps for the new selector are as follows:
1. Predicates stage: This stage will mainly filter out all materialized views that do not meet the current query requirements.
2. Priorities stage: This stage will sort the results of the first stage and choose the best materialized view.
The predicates phase is divided into 6 steps:
1. Calculate the predicate gap (the compensating predicates) between the current query and the view.
2. Check whether the columns in the view can satisfy the compensating predicates.
3. Check whether the group by columns of the view match the group by columns of the query.
4. Check whether the aggregate columns of the view match the aggregate columns of the query.
5. Check whether the output columns of the view match the output columns of the query.
6. Add partial materialized views.
The priorities phase is divided into two steps:
1. Find the materialized view that matches the best prefix index
2. Find the materialized view with the least amount of data
The biggest difference between the current materialized view selector and the previous one is that it supports SPJ <-> SPJG.
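For orientation, a condensed sketch of the two-stage flow described above; every name here is illustrative, not the actual selector API:
```
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch of the two-stage materialized view selection.
class MvSelectorSketch {
    static Mv selectBest(List<Mv> candidates, Query query) {
        // Predicates stage: filter out views that cannot answer the query.
        candidates.removeIf(mv -> !mv.coversCompensatingPredicates(query)
                || !mv.matchesGroupBy(query)
                || !mv.matchesAggregates(query)
                || !mv.coversOutput(query));
        // Priorities stage: best prefix-index match first, then least data.
        candidates.sort(Comparator
                .comparingInt((Mv mv) -> -mv.prefixIndexMatchLen(query))
                .thenComparingLong(mv -> mv.dataSize(query)));
        return candidates.isEmpty() ? null : candidates.get(0);
    }

    interface Query {}
    interface Mv {
        boolean coversCompensatingPredicates(Query q);
        boolean matchesGroupBy(Query q);
        boolean matchesAggregates(Query q);
        boolean coversOutput(Query q);
        int prefixIndexMatchLen(Query q);
        long dataSize(Query q);
    }
}
```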
In this CL, the isAlive field in the FsBroker class will be persisted in metadata, to solve the problem described in ISSUE: #2989.
Notice: this CL updates FeMetaVersion to 73.
This CL modifies 2 things:
1. When a routine load task fails to submit, it will not be put back into the task queue.
2. The rpc timeout for executing a routine load task in BE is set to the `query_timeout` of the task plan.
ISSUE: #2964
1. Fix the bug introduced by https://github.com/apache/incubator-doris/pull/2947.
The following SQL returns 0000, which is wrong; the result should be 1601:
```
select date_format('2020-02-19 16:01:12','%H%i');
```
2. Add constant expression plan tests to ensure the FE constant expression computation results are correct.
3. Remove the `castToInt` function in `FEFunctions`, which duplicates `CastExpr::getResultValue`.
4. Implement the `getNodeExplainString` method for `UnionNode`.
The chain of reasoning is as follows:
1. `date_format(if(, NULL, dt), '%Y%m%d')` is used as the HASH_PARTITIONED exprs, which is not right; we should use the Agg intermediate materialized slot.
2. We don't use the Agg intermediate materialized slot as the HASH_PARTITIONED exprs because, in
```
// the parent fragment is partitioned on the grouping exprs;
// substitute grouping exprs to reference the *output* of the agg, not the input
partitionExprs = Expr.substituteList(partitionExprs,
node.getAggInfo().getIntermediateSmap(), ctx_.getRootAnalyzer(), false);
parentPartition = DataPartition.hashPartitioned(partitionExprs);
```
the partitionExprs substitution fails.
3. The partitionExprs substitution fails because partitionExprs has a cast-to-date child, but the agg info's getIntermediateSmap has a cast-to-datetime child.
4. The cast-to-date or cast-to-datetime child exists because `TupleIsNullPredicate` inserts an `if` Expr. We don't have an `if` fn for the date type, so Doris uses the `if` fn for int.
5. The `date` in the cast-to-date depends on the slot `dt`'s date type; the `datetime` in the cast-to-datetime depends on the datetime argument type of the `date_format` function.
So we could fix this issue by making the `if` fn support the date type, or by making the `date_format` fn support the date type.
This CL modifies the `evalExpr()` of ExpressionFunctions so that it won't change the `FunctionCallExpr` to `NullLiteral` when a UDF has a null parameter. This fixes the problem described in ISSUE: #2913.
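Roughly, the guard looks like this; a sketch with stub types and assumed names, not the actual `ExpressionFunctions` code:
```
// Hypothetical sketch of the constant-folding guard.
class EvalExprSketch {
    interface Expr {}
    static class FunctionCallExpr implements Expr {
        boolean hasNullParameter;
        FunctionCallExpr(boolean hasNullParameter) { this.hasNullParameter = hasNullParameter; }
    }

    static Expr evalExpr(Expr expr) {
        // Keep a UDF call with a null parameter as-is instead of
        // folding the whole call into NullLiteral on the FE side.
        if (expr instanceof FunctionCallExpr && ((FunctionCallExpr) expr).hasNullParameter) {
            return expr;
        }
        // ... constant-fold other expressions as before ...
        return expr;
    }
}
```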
In the current implementation, the state of the table will not be set until the next round of job scheduling, so there may be tens of seconds between job completion and the table state changing to NORMAL.
Also, I made the synchronized scope smaller by replacing the synchronized methods with synchronized blocks, which may solve the problem described in #2903.
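A generic illustration of the locking change (not the actual Doris code; class and method names are made up):
```
// Hypothetical sketch: shrink the critical section from the whole
// method down to just the state mutation.
class AlterJobSketch {
    private final Object lock = new Object();
    private String tableState = "SCHEMA_CHANGE";

    // Before: public synchronized void finishJob() { ... } held the
    // monitor for the whole method, including slow work.
    public void finishJob() {
        doSlowCleanup(); // now runs outside the lock
        synchronized (lock) {
            tableState = "NORMAL"; // set immediately when the job completes
        }
    }

    private void doSlowCleanup() { /* ... */ }
}
```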
Fix a bug where using grouping sets with a grouping set item that does not include all columns produces wrong values.
Fix the grouping function check not working in the group by clause.
This CL implements a simulated FE process and a simulated BE service.
You can see example usage in
`fe/src/test/java/org/apache/doris/utframe/DemoTest.java`
At the same time, I modified the configuration of the maven-surefire-plugin so that
each unit test runs in a separate JVM, which avoids conflicts caused by the
various singleton classes in FE.
Starting a separate JVM for each unit test adds about 30% extra time overhead.
However, you can control the concurrency of the unit tests by setting the `forkCount`
configuration of the maven-surefire-plugin in `fe/pom.xml`. The default is still 1
for easy viewing of the output log. If set to 3, the entire FE unit test run takes about
4 minutes.
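For reference, the relevant fragment in `fe/pom.xml` might look like this; a sketch assuming the standard surefire options, so the actual values in the repo may differ:
```
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-surefire-plugin</artifactId>
  <configuration>
    <!-- run up to 3 test JVMs in parallel -->
    <forkCount>3</forkCount>
    <!-- give each test class its own fresh JVM -->
    <reuseForks>false</reuseForks>
  </configuration>
</plugin>
```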
**Describe the bug**
**First**, in broker load, we allow users to add multiple data descriptions. Each data description
describes a file (or set of files), including the file path, delimiter, the table and
partitions to be loaded, and other information.
When the user specifies multiple data descriptions, Doris currently aggregates the data
descriptions belonging to the same table and generates a unified load task.
The problem is that although different data descriptions point to the same table,
they may specify different partitions. Therefore, the aggregation of data descriptions
should consider not only the table level but also the partition level.
Examples are as follows:
data description 1 is:
```
DATA INFILE("hdfs://hdfs_host:hdfs_port/input/file1")
INTO TABLE `tbl1`
PARTITION (p1, p2)
```
data description 2 is:
```
DATA INFILE("hdfs://hdfs_host:hdfs_port/input/file2")
INTO TABLE `tbl1`
PARTITION (p3, p4)
```
What the user expects is to load file1 into partitions p1 and p2 of tbl1, and file2 into partitions
p3 and p4 of the same table. But currently they are aggregated together, which results in loading
file1 and file2 into all of partitions p1, p2, p3 and p4.
**Second**, the following 2 data descriptions are not allowed:
```
DATA INFILE("hdfs://hdfs_host:hdfs_port/input/file1")
INTO TABLE `tbl1`
PARTITION (p1, p2)
DATA INFILE("hdfs://hdfs_host:hdfs_port/input/file2")
INTO TABLE `tbl1`
PARTITION (p2, p3)
```
They have an overlapping partition (p2), which is not supported yet, so we should throw an exception
to cancel the load job.
**Third**, there is a problem with the code implementation. In the constructor of
`OlapTableSink.java`, we pass in a string of partition names separated by commas.
But at the `OlapTableSink` level, we should be able to pass in a list of partition ids directly,
instead of names, as sketched below.
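An illustrative before/after of the constructor change (simplified, hypothetical signatures; the real constructor takes more parameters):
```
import java.util.List;

// Simplified sketch of the interface change, not the actual code.
class OlapTableSinkSketch {
    // Before: partitions arrive as a comma-separated string of names,
    // which must be parsed and resolved back to partitions.
    OlapTableSinkSketch(long tableId, String partitionNames) { /* ... */ }

    // After: callers pass resolved partition ids directly.
    OlapTableSinkSketch(long tableId, List<Long> partitionIds) { /* ... */ }
}
```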
ISSUE: #2823
Support setting replication_num at the table level, so there is no need for users to set replication_num in every alter table add partition statement.
e.g.:
`alter table tbl set ("default.replication_num" = "2");`
If we connect to a non-master FE and execute `show routine load;`, it may sometimes
throw an Unknown Exception, because some fields in the thrift result are not set.
Since January 15th, 2020, requests to http://repo1.maven.org/maven2/ return a 501 HTTPS Required status.
So switch the central repository URL from http to https.
Support Grouping Sets, Rollup and Cube to extend the group by statement.
Support GROUPING SETS syntax:
```
SELECT a, b, SUM( c ) FROM tab1 GROUP BY GROUPING SETS ( (a, b), (a), (b), ( ) );
```
Semantically, GROUPING SETS computes the aggregation once per grouping set; the result is equivalent to the union of the per-set GROUP BY results, with NULL filling the columns absent from a set.
CUBE or ROLLUP are shorthand forms, like:
```
SELECT a, b,c, SUM( d ) FROM tab1 GROUP BY ROLLUP|CUBE(a,b,c)
```
[ADD] Support grouping functions in exprs like grouping(a) + grouping(b) (#2039)
[FIX] Fix analyzer error in window functions (#2039)