Commit Graph

764 Commits

Author SHA1 Message Date
4c98596283 [MysqlProtocol] Support MySQL multiple statements protocol (#3050)
2 Changes in this CL:

## Support multiple statements in one request like:

```
select 10; select 20; select 30;
```
ISSUE: #3049 

To test this CL quickly, you can use the mysql command-line client:

```
mysql> delimiter //
mysql> select 1; select 2; //
+------+
| 1    |
+------+
|    1 |
+------+
1 row in set (0.01 sec)

+------+
| 2    |
+------+
|    2 |
+------+
1 row in set (0.02 sec)

Query OK, 0 rows affected (0.02 sec)
```

I added a new class, `OriginStatement.java`, which stores the original statement string together with an index. This class mainly serves the following scenario:

1. A user sends a multi-statement request to a non-master FE:
      `DDL1; DDL2; DDL3`

2. Currently, we cannot separate the original string of a single statement from the multi-statement string, so we have to forward the entire string to the Master FE. I therefore added an index to the forward request: `DDL1`'s index is 0, `DDL2`'s index is 1, and so on.

3. When the Master FE handles the forwarded request, it parses the entire string, gets 3 DDL statements, and uses the `index` to pick the specified statement.
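The forwarding scheme above can be sketched roughly as follows. This is a minimal illustration in Python, not the actual Java class: the field names `origin_stmt` and `idx`, and the helper functions, are assumptions, and the naive split on `;` stands in for the real SQL parser.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class OriginStatement:
    """Hypothetical stand-in: the full request text plus a statement index."""
    origin_stmt: str  # the whole multi-statement string, forwarded unchanged
    idx: int          # zero-based position of the statement to execute


def split_statements(text: str) -> List[str]:
    # Naive split on ';' for illustration only. A real SQL parser must
    # handle semicolons inside string literals and comments.
    return [s.strip() for s in text.split(";") if s.strip()]


def select_statement(origin: OriginStatement) -> str:
    # Conceptually what the receiving side does: parse the whole string,
    # then pick the statement at the forwarded index.
    return split_statements(origin.origin_stmt)[origin.idx]


forwarded = OriginStatement("DDL1; DDL2; DDL3", 1)
print(select_statement(forwarded))  # DDL2
```

Forwarding the full string plus an index avoids having to reconstruct the exact source text of one statement from the parsed result.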

## Optimized the display of syntax errors
I have also optimized the display of syntax errors so that longer syntax errors can be fully displayed.
2020-03-13 22:21:40 +08:00
9832024995 [Insert] Fix bug that insert meet unexpected "label already exists" exception (#3103)
This CL aborts the transaction of an insert operation when an exception is thrown in the analysis phase.

ISSUE: #3102
2020-03-13 20:51:44 +08:00
aa540966c6 Output null for hll and bitmap column when select * (#2991) 2020-03-13 11:59:30 +08:00
d8c756260b Rewrite count distinct to bitmap and hll (#3096) 2020-03-13 11:44:40 +08:00
8276c6d7f8 Show BE version in 'show backends;' (#3074)
In a large-scale cluster, BEs may be upgraded in a rolling fashion. This
patch adds a column named 'Version' to the output of 'show backends;',
as well as to the web page '/system?path=//backends', so that users can
check whether any BE has not been upgraded yet.
2020-03-12 22:15:13 +08:00
c8705ccf12 [MaterializedView] Support dropping materialized view (#3068)
`DROP MATERIALIZED VIEW [ IF EXISTS ] <mv_name> ON [db_name].<table_name>`

Parameters:

  IF EXISTS: Do not throw an error if the materialized view does not exist. A notice is issued in this case.
  mv_name: The name of the materialized view to remove.
  db_name: The name of the database to which the materialized view belongs.
  table_name: The name of the table to which the materialized view belongs.
2020-03-11 18:16:24 +08:00
a77515fe03 [Backup] Fix backup job block at SNAPSHOTING phase (#3058)
This bug occurred when a BE made a snapshot after the version required by the FE had already been merged into the cumulative version, so the snapshot task could not complete even when retried. To solve this problem, the BackupJob is now set to CANCELLED in this case, so that the user can retry the job.

Fix #3057
2020-03-11 14:05:02 +08:00
cf219ddf18 [ConsistencyCheck] Support checking replica consistency of tablet manually (#3067) 2020-03-10 15:25:25 +08:00
dc07182bd4 [Intersect] Implements intersect node (#3034)
Implement the intersect node.
Statements like `select a from t intersect select b from t1 intersect select 1;` are now supported.
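As a rough illustration of what such a chained statement computes, the sketch below models each `select` as a list of values and applies set intersection. It assumes standard SQL `INTERSECT` semantics (duplicates removed); the sample values and the `intersect` helper are made up for the example.

```python
def intersect(*inputs):
    # Chain of INTERSECTs: keep only values present in every input,
    # with duplicates removed.
    result = set(inputs[0])
    for rows in inputs[1:]:
        result &= set(rows)
    return result


t = [1, 1, 2, 3]    # hypothetical values of column a in table t
t1 = [1, 3, 3, 4]   # hypothetical values of column b in table t1

print(sorted(intersect(t, t1, [1])))  # [1]
```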
2020-03-09 10:52:55 +08:00
172838175f [Bug] Fix bug that index name in MaterializedViewMeta is not changed after schema change (#3048)
The index name in MaterializedViewMeta still carries the `__doris_shadow` prefix
after the schema change has finished.

In this CL, I simply remove the index name field from MaterializedViewMeta,
which makes managing name changes less error-prone.
2020-03-09 10:11:16 +08:00
c8054ebe13 [Function] ifnull function supports new args (date,datetime) and (datetime, date) (#3043) 2020-03-09 09:37:26 +08:00
7b30bbea42 [MaterializedView] Support different keys type between MVs and base table (#3036)
First, add materialized index meta to the olap table.

The materialized index meta includes the index name, schema, schema hash, keys type, etc.
This information, previously scattered across separate maps, is now encapsulated in MaterializedIndexMeta.

Also, once materialized views are enabled, the keys type of an index meta may differ from the keys type of the base index.

Second, support deduplicating MVs.
If the create mv stmt contains a group by clause or an aggregation function, the keys type of the mv is agg,
while the keys type of the base table is duplicate.
For example:
Duplicate table (k1, k2, v1)
MV (k1, k2) group by k1, k2
The data should be aggregated when executing the mv.
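The deduplication in the example above can be pictured with a few rows. This is only a sketch of the resulting data, with invented sample rows and a hypothetical `build_mv` helper; it does not reflect how the storage layer performs the aggregation.

```python
# Rows of the duplicate base table as (k1, k2, v1). Duplicate keys are allowed.
base_rows = [(1, 10, 100), (1, 10, 200), (2, 20, 300)]


def build_mv(rows):
    # MV (k1, k2) group by k1, k2: one row per distinct (k1, k2) pair.
    return sorted({(k1, k2) for k1, k2, _ in rows})


print(build_mv(base_rows))  # [(1, 10), (2, 20)]
```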
2020-03-05 18:19:18 +08:00
c731c8b9bc [Bug] Fix bug of NPE when get replication number from olap table (#3029)
The default replication number of an olap table may not be set.
Every time we call `getReplicationNum()`, we have to check whether it returns null,
which is inconvenient and may cause problems.

So in this PR, I set a default value for the table's replication number.

This bug was introduced by #2958
2020-03-05 12:18:38 +08:00
c032d634f4 [FsBroker] Fix bug that broker cannot read file with %3A in name (#3028)
HDFS supports file names such as "2018-01-01 00%3A00%3A00",
so the broker should support them too.

Also change the default broker log level to INFO.
2020-03-04 11:03:01 +08:00
54aa0ed26b [SetOperation] Change set operation from random shuffle to hash shuffle (#3015)
Use hash shuffle instead of random shuffle in set operations, in preparation for the intersect and except operations.
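A small sketch of why the shuffle strategy matters for intersect: if each partition intersects only the rows it received, equal rows from both inputs must land in the same partition. The function names are hypothetical, CRC32 is just a convenient deterministic hash, and round-robin placement stands in for random placement so that the output is reproducible.

```python
import zlib


def hash_shuffle(rows, n):
    # Equal rows always map to the same partition.
    parts = [[] for _ in range(n)]
    for row in rows:
        parts[zlib.crc32(repr(row).encode()) % n].append(row)
    return parts


def round_robin_shuffle(rows, n):
    # Placement depends on arrival order, not on the row value.
    parts = [[] for _ in range(n)]
    for i, row in enumerate(rows):
        parts[i % n].append(row)
    return parts


def partitioned_intersect(left_parts, right_parts):
    # Each partition intersects only its own share of the two inputs.
    out = set()
    for left, right in zip(left_parts, right_parts):
        out |= set(left) & set(right)
    return out


left, right = [1, 2], [2, 1]
print(partitioned_intersect(hash_shuffle(left, 2), hash_shuffle(right, 2)))                # {1, 2}
print(partitioned_intersect(round_robin_shuffle(left, 2), round_robin_shuffle(right, 2)))  # set()
```

With value-independent placement, matching rows can end up in different partitions and the per-partition result misses them.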
2020-03-02 19:34:41 +08:00
d151718e98 [MaterializedView] Fix bug that preAggregation is different between old and new selector (#3018)
If there is no aggregated column in an aggregate index, the index is effectively a deduplicated table.
For example:

    aggregate table (k1, k2, v1 sum)
    mv index (k1, k2)

This kind of index is SPJG, which is the same as `select k1, k2 from aggregate_table group by k1, k2`.
Its grouping columns also need to be checked in the following steps.

If there is no aggregated column in a duplicate index, the index is SPJ and passes the grouping verification directly.

Also, after the candidate indexes are supplemented, the output columns of each newly added candidate index should be checked as well.
2020-03-02 19:11:10 +08:00
511c5eed50 [Doc] Modify format of some docs (#3021)
The format of some docs is incorrect for building the doc website.
* fix a bug that `gensrc` dir can not be built with -j.
* fix ut bug of CreateFunctionTest
2020-03-02 19:07:52 +08:00
21b87ee23a [Bug] Access follower FE's website got exception (#3020)
QualifiedUser field is not set in ConnectContext
2020-03-02 13:53:35 +08:00
ef4bb0c011 [RoutineLoad] Auto Resume RoutineLoadJob (#2958)
When all backends restart, the routine load job can be resumed.
2020-03-02 13:27:35 +08:00
df56588bb5 [Temp Partition] Support add/drop/replace temp partitions (#2828)
This CL implements 3 new operations:

```
ALTER TABLE tbl ADD TEMPORARY PARTITION ...;
ALTER TABLE tbl DROP TEMPORARY PARTITION ...;
ALTER TABLE tbl REPLACE TEMPORARY PARTITION (p1, p2, ...);
```

User manual can be found in document:
`docs/documentation/cn/administrator-guide/alter-table/alter-table-temp-partition.md`

I did not update the grammar manual `alter-table.md`.
That manual is too confusing and too big; I will reorganize it later.

This is the first part of implementing the "overwrite load" feature mentioned in issue #2663.
I will implement the "load to temp partition" feature in the next PR.

This CL also adds GSON serialization methods for the following classes (not used yet):

```
Partition.java
MaterializedIndex.java
Tablet.java
Replica.java
```
2020-03-01 21:30:34 +08:00
bd23f2cda2 [MaterializedView] Fix bug that result is double when new mv selector is enable (#3012)
The issue is #3011.
Reset the tablet and scan range info before computing it.
The old rollup selector has already computed the tablet and scan range info,
and the new mv selector may sometimes compute it again.
So we need to reset that info here.

Before this commit, the result was doubled for a query such as "select k1 ,k2 from aggregate_table ".
2020-02-27 18:19:34 +08:00
3b5a0b6060 [TPCDS] Implement the planner for set operation (#2957)
Implement intersect and except planner.
This CL does not implement the intersect and except nodes at the execution level.
2020-02-27 16:03:31 +08:00
fe086ab92c [Log] Change log level from warn to debug for unauthrorized exception (#2996)
This PR lowers the log level for unauthorized exceptions. Some unauthorized accesses, such as LVS probe requests, may cause connection exceptions that we should ignore.
2020-02-27 09:29:06 +08:00
e3d115af91 [Bug][Backup]Fix backup job block at UPLOAD_INFO phase (#3002)
There is a case where the META upload succeeds but the INFO upload fails. The UPLOAD_INFO task is then retried, but the META file has already been uploaded and `filename.part` has already been renamed to `filename.md5sum`. The retried task keeps failing at the rename step and cannot complete the backup job. Therefore, the `filename.md5sum` file needs to be deleted in advance.
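The retry problem can be sketched with a toy storage model. Everything here is hypothetical: `FakeRemoteStorage`, its methods, and the assumption that the remote rename refuses to overwrite an existing file are illustrations only, not the broker's actual API.

```python
class FakeRemoteStorage:
    """Hypothetical in-memory storage whose rename refuses to overwrite."""

    def __init__(self):
        self.files = {}

    def upload(self, name, data):
        self.files[name] = data

    def delete(self, name):
        self.files.pop(name, None)  # deleting a missing file is a no-op

    def rename(self, src, dst):
        if dst in self.files:
            raise FileExistsError(dst)
        self.files[dst] = self.files.pop(src)


def upload_with_checksum(storage, name, data, md5sum):
    final_name = f"{name}.{md5sum}"
    # Remove any leftover from an earlier attempt so the rename can succeed.
    storage.delete(final_name)
    storage.upload(f"{name}.part", data)
    storage.rename(f"{name}.part", final_name)


storage = FakeRemoteStorage()
upload_with_checksum(storage, "meta", b"...", "abc123")
upload_with_checksum(storage, "meta", b"...", "abc123")  # the retry succeeds too
print(list(storage.files))  # ['meta.abc123']
```

Deleting the target first makes the upload step idempotent, so a retry after a partial failure can complete.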

Fix #3001
2020-02-27 09:21:21 +08:00
a3e588f39c [MaterializedView] Implement new materialized view selector (#2821)
This commit mainly implements the new materialized view selector which supports SPJ<->SPJG.
Two parameters are currently used to regulate this function.
1. test_materialized_view: When this parameter is set to true, the user can create a materialized view for a duplicate table by using the 'CREATE MATERIALIZED VIEW' command.
In addition, if the new selector picks a different materialized view than the old version during a query, an error is reported. This parameter is false by default, which means that the new version of the materialized view function is not enabled.
2. use_old_mv_selector: When this parameter is set to true, the result of the old selector is used. When it is set to false, the result of the new selector is used. This parameter is true by default, which means that the old selector is used.
If the default values of these two parameters are not changed, there are no behavior changes in the current version.

The main steps for the new selector are as follows:
1. Predicates stage: This stage will mainly filter out all materialized views that do not meet the current query requirements.
2. Priorities stage: This stage will sort the results of the first stage and choose the best materialized view.

The predicates stage is divided into 6 steps:
1. Calculate the predicate gap between the current query and the view.
2. Check whether the columns in the view can satisfy the compensating predicates.
3. Determine whether the group by columns of the view match the group by columns of the query.
4. Determine whether the aggregate columns of the view match the aggregate columns of the query.
5. Determine whether the output columns of the view match the output columns of the query.
6. Add partial materialized views.

The priorities stage is divided into two steps:
1. Find the materialized views that match the best prefix index.
2. Among them, find the materialized view with the least amount of data.

The biggest difference between the current materialized view selector and the previous one is that it supports SPJ <-> SPJG.
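The two-stage flow described above can be sketched as a filter followed by a ranking. This is a heavily simplified illustration: the `View` type, its fields, and both helper functions are assumptions, and the six predicate steps are collapsed into a single column-coverage check.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class View:
    name: str
    columns: List[str]  # key columns in index order
    row_count: int


def prefix_match_len(view: View, predicate_columns: Set[str]) -> int:
    # Number of leading index columns that appear in the query predicates.
    n = 0
    for col in view.columns:
        if col not in predicate_columns:
            break
        n += 1
    return n


def select_view(views: List[View], query_columns: Set[str],
                predicate_columns: Set[str]) -> Optional[View]:
    # Predicates stage (simplified): drop views that lack a needed column.
    candidates = [v for v in views if query_columns <= set(v.columns)]
    if not candidates:
        return None
    # Priorities stage, step 1: keep the views with the best prefix match.
    best = max(prefix_match_len(v, predicate_columns) for v in candidates)
    candidates = [v for v in candidates
                  if prefix_match_len(v, predicate_columns) == best]
    # Priorities stage, step 2: choose the view with the least data.
    return min(candidates, key=lambda v: v.row_count)


views = [
    View("base", ["k1", "k2", "k3", "v1"], 1000),
    View("mv1", ["k1", "k2"], 100),
    View("mv2", ["k2", "k1"], 50),
    View("mv3", ["k3"], 10),
]
print(select_view(views, {"k1", "k2"}, {"k1"}).name)  # mv1
```

In this example `mv2` is smaller than `mv1`, but it loses in the first priority step because its leading column `k2` is not used in the predicates.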
2020-02-27 09:14:32 +08:00
8f71b1025a [Bug][Broker] Fix bug that Broker's alive status is inconsistent in different FEs
In this CL, the isAlive field of the FsBroker class is persisted in the metadata, to solve the
problem described in ISSUE: #2989

Notice: this CL updates FeMetaVersion to 73
2020-02-25 22:27:27 +08:00
fb5b58b75a Add more constraints for bitmap column (#2966) 2020-02-24 10:41:18 +08:00
8eb413fa69 [Bug][RoutineLoad] Fix bug that routine Load encounter "label already used" exception (#2959)
This CL modifies 2 things:

1. When a routine load task fails to be submitted, it is not put back into the task queue.
2. The rpc timeout for executing a routine load task in a BE is set to the `query_timeout` of the task plan.

ISSUE: #2964
2020-02-22 22:01:14 +08:00
35b09ecd66 [JDK] Support OpenJDK (#2804)
Support compiling and running the Frontend process and the Broker process with OpenJDK.
OpenJDK 13 has been tested.
2020-02-20 23:47:02 +08:00
ece8740c1b Fix some function DATE type priority (#2952)
1. Fix the bug introduced by https://github.com/apache/incubator-doris/pull/2947.  
The following SQL returns 0000, which is wrong. The result should be 1601:
```
select date_format('2020-02-19 16:01:12','%H%i');
```

2. Add constant expression plan tests to ensure that the FE computes constant expressions correctly.

3. Remove the `castToInt` function in `FEFunctions`, which duplicates `CastExpr::getResultValue`.

4. Implement `getNodeExplainString` method for `UnionNode`
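The expected value in item 1 can be cross-checked outside the database. The sketch below uses Python's standard library, where MySQL's `%H%i` (hour and minute) corresponds to `%H%M`; the helper name is made up for the example.

```python
from datetime import datetime


def mysql_hour_minute(ts: str) -> str:
    # Equivalent of date_format(ts, '%H%i'): two-digit hour, two-digit minute.
    return datetime.strptime(ts, "%Y-%m-%d %H:%M:%S").strftime("%H%M")


print(mysql_hour_minute("2020-02-19 16:01:12"))  # 1601
```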
2020-02-20 20:45:45 +08:00
180bf0251e [Bug] Missing in memory property for restore meta info (#2950) 2020-02-20 11:46:36 +08:00
cc0d41277c [Alter] Add more schema change to varchar type (#2777) 2020-02-19 23:14:43 +08:00
cfcc29fb21 [Bug] Missing in memory property for old version of partition info (#2948)
This bug was introduced by PR #2846
2020-02-19 20:19:00 +08:00
147953f09e Fix some function with date type bug (#2947)
The logic chain is as follows:
1. ``date_format(if(, NULL, `dt`), '%Y%m%d')`` is used as the HASH_PARTITIONED exprs, which is not right; we should use the Agg intermediate materialized slot.
2. The Agg intermediate materialized slot is not used as the HASH_PARTITIONED exprs because, in this code,
```
            // the parent fragment is partitioned on the grouping exprs;
            // substitute grouping exprs to reference the *output* of the agg, not the input
            partitionExprs = Expr.substituteList(partitionExprs,
                    node.getAggInfo().getIntermediateSmap(), ctx_.getRootAnalyzer(), false);
            parentPartition = DataPartition.hashPartitioned(partitionExprs);
```
the `partitionExprs` substitution fails.
3. The `partitionExprs` substitution fails because `partitionExprs` has a `casttodate` child, but the agg info's `getIntermediateSmap` has a `casttodatetime` child.
4. The cast-to-date and cast-to-datetime children exist because `TupleIsNullPredicate` inserts an `if` Expr. There is no `if date` fn, so Doris uses the `if int` Expr.
5. The `date` in `casttodate` comes from the date type of slot `dt`. The `datetime` in `casttodatetime` comes from the datetime argument type of the `date_format` function.

So we can fix this issue by making the `if` fn support the date type, or by making the `date_format` fn support the date type.
2020-02-19 20:16:44 +08:00
a015cd0c8b [Alter] Change table's state right after all rollup jobs being cancelled 2020-02-19 19:45:35 +08:00
ceaa790793 [Alter] Drop index when index column is dropped (#2941) 2020-02-19 17:57:27 +08:00
3994b52f34 [Alter] Change max create replicas timeout configurable (#2945) 2020-02-19 17:47:27 +08:00
a76f2b8211 bitmap_union_count support window function (#2902) 2020-02-19 14:33:05 +08:00
7be2871c36 [GroupingSet] Disable column both in select list and aggregate functions when using GROUPING SETS/CUBE/ROLLUP (#2921) 2020-02-18 13:56:56 +08:00
625411bd28 Doris support in memory olap table (#2847) 2020-02-18 10:45:54 +08:00
11b43700b9 [Alter] Fix pending AlterJobV2 replay bug (#2922)
Call the replayPending method when loading an AlterJobV2 in pending status,
so that the tablets and replicas are not missing from TabletInvertedIndex.
2020-02-17 23:02:18 +08:00
0fb52c514b [UDF] Fix bug that UDF can't handle constant null value (#2914)
This CL modifies `evalExpr()` of ExpressionFunctions so that it no longer changes the
`FunctionCallExpr` to a `NullLiteral` when a UDF has a null parameter. This fixes the
problem described in ISSUE: #2913
2020-02-17 22:13:50 +08:00
1089f09d26 [Syntax] Fix bug introduced by #2906 (#2917) 2020-02-17 21:41:03 +08:00
1e3b0d31ea [Rollup] Change table's state right after all rollup jobs are done (#2904)
In the current implementation, the state of the table is not set until the next round of job scheduling, so there may be tens of seconds between job completion and the table state changing to NORMAL.

I also narrowed the synchronized scope by replacing the synchronized methods with synchronized blocks, which may solve the problem described in #2903
2020-02-14 21:28:51 +08:00
1f7c03d998 [FIX] Fix a sqlparser conflict by KW_PROPERTIES (#2907)
Fix a sqlparser conflict caused by KW_PROPERTIES. KW_PROPERTIES's precedence is now changed to right, so it must be used in the form PROPERTIES().
2020-02-14 21:08:50 +08:00
5386c92383 [FIX] Fix a sqlparser conflict imported by PR #2725 (#2906)
Fix a sqlparser conflict introduced by PR #2725, which added some time units to the keyword list.
I have moved those to time_unit.
2020-02-14 21:06:01 +08:00
0e997a8798 Fix a sql_parser.cup conflict by a duplicated show index stmt (#2894) 2020-02-14 12:00:23 +08:00
83d33cec25 [Syntax] Fix alter rollup stmt Shift/Reduce conflict (#2897) 2020-02-14 11:49:14 +08:00
ed95352ecd support intersect and except syntax (#2882) 2020-02-13 16:48:46 +08:00
f2875ceb73 [Index] Add column type check when creating bitmap index (#2883) 2020-02-12 23:05:16 +08:00