Commit Graph

5755 Commits

Author SHA1 Message Date
db698978da Make from_unixtime and date_format function support grayscale upgrade (#2612) 2019-12-30 13:55:23 +08:00
f9cf8a1d65 Delete unused variable in the function of recordeTable (#2607) 2019-12-28 15:35:25 +08:00
1113f951c3 Alter view stmt (#2522)
This commit adds a new statement named alter view, like
ALTER VIEW view_name
(
col_1,
col_2,
col_3,
)
AS SELECT k1, k2, SUM(v1) FROM exampleDb.testTbl GROUP BY k1,k2
2019-12-27 14:02:56 +08:00
5fd7133e69 Fix bitmap, hll, segment v2 DefaultValue bug (#2570)
1. Change the default values of bitmap and HLL columns to an empty bitmap and an empty HLL
2. Fix DefaultValueColumnIterator bug
3. Fix uint24.h ostream bug
2019-12-27 14:01:45 +08:00
1421a9be41 [Compaction] Support compact only one rowset (#2558)
Support compaction operation to compact only one rowset.
After the modification, the last rowset of the tablet will
also be compacted.

At the same time, we added a `segments_overlap_pb` field to
the rowset meta, which describes whether the segment data
in the rowset overlaps. This field is set by `rowset_writer`
and is initially UNKNOWN for compatibility with existing data.

In addition, the version hash of the rowset generated after
compaction is directly set to the version hash of the last rowset
participating in compaction, to ensure that the tablet's
version hash remains unchanged after compaction.
2019-12-27 10:08:41 +08:00
043a9528f7 Support decompressing csv file with deflate format in hdfs broker load (#2583) 2019-12-27 08:06:22 +08:00
f7032b07f3 Support more schema change from VARCHAR type (#2501) 2019-12-26 22:38:53 +08:00
c43d0e2a75 [Tablet report] Fix bug that tablet report throw NPE. (#2578)
When processing tablet reports, some tablets carry transaction information.
This information is used by the FE to determine whether to publish these
transactions or clear these transactions.

During this process, Doris may try to obtain the commit information of some
deleted partitions, resulting in a null pointer exception.
2019-12-26 15:31:36 +08:00
6f3c50a95c [Document] Add example for using CTE in INSERT operation (#2572) 2019-12-26 10:00:34 +08:00
a511042397 [Export] Forget to set timeout for export job (#2516) 2019-12-23 18:14:41 +08:00
11b78008cd Time zone variable supports single-digit hours (#2513)
Support time zone variables like "-8:00", "+8:00", "8:00"
A time zone variable like "-8:00" is illegal as a time-zone ID, so we must convert it to the standard format
2019-12-20 07:45:29 +08:00
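The conversion described in 11b78008cd can be sketched as follows. This is a minimal illustration, not the code from the commit: the function name `normalize_offset` and the choice of `+HH:MM` as the "standard format" are assumptions, and the sketch does not range-check the hour or minute values.

```python
import re

# Accepts "8:00", "+8:00", "-8:00" and their two-digit-hour equivalents.
_OFFSET = re.compile(r"^([+-]?)(\d{1,2}):(\d{2})$")


def normalize_offset(value: str) -> str:
    """Return the offset in the fixed-width form '+HH:MM' or '-HH:MM'.

    Hypothetical helper for illustration only.
    """
    match = _OFFSET.match(value.strip())
    if match is None:
        raise ValueError(f"not a numeric time zone offset: {value!r}")
    sign, hours, minutes = match.groups()
    # A missing sign is treated as positive; the hour is zero-padded.
    return f"{sign or '+'}{int(hours):02d}:{minutes}"


if __name__ == "__main__":
    for raw in ("-8:00", "+8:00", "8:00"):
        print(raw, "->", normalize_offset(raw))
```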
5111f8cfe8 [Export] Fix bug that NPE may be thrown when executing "show export;" (#2509)
Some export jobs from old versions of Doris may not have the timeout property,
which will cause an NPE.

2 more changes:
1. Change the default BE config "max_runnings_transactions" to 2000.
2. Add a new metric to FE to show the master ip:port.
2019-12-19 19:09:25 +08:00
4220e3b3dc Merge pull request #2486 from EmmyMiao87/assert_node
Only specified functions are supported in correlated subqueries
2019-12-19 10:21:06 +08:00
53132b4199 Change the name of specified agg function 2019-12-18 19:35:49 +08:00
e1ff744a99 [Alter Job] Cancel the alter job after a task failed for 3 times (#2447)
This avoids waiting for the timeout when the alter job is invalid.
2019-12-18 19:17:34 +08:00
8342eb0b02 Only UDA functions are supported in correlated subqueries
The queries in issues #2483 and #2493 cannot be supported.
These queries are forbidden:
query1: select * from t1 where k1=(select k1 from t2 where t1.k2=t2.k2);
query2: select * from t1 where k1=(select distinct k1 from t2 where t1.k2=t2.k2);
Only the sum, max, min, avg and count functions may appear in the select clause of a correlated subquery. #2420
This query is legal:
query1: select * from t1 where k1=(select avg(k1) from t2 where t1.k2=t2.k2);
2019-12-18 18:56:48 +08:00
63ea05f9c7 Add convert tablet rowset type (#2294)
This solves issue #2246.

The scheme is as follows:

    add an optional preferred_rowset_type in TabletMeta for V2 format rollup index tablets
    add a boolean session variable use_v2_rollup; if set to true, the query will use the V2 storage format rollup index to process the query.
    test queries will be sent to the online service to verify the correctness of segment-v2, by sending the same queries to FE with and without use_v2_rollup set and checking whether the returned results are the same.
2019-12-18 18:49:47 +08:00
c81b1db406 Support convert VARCHAR type to DATE type (#2489) 2019-12-18 12:58:47 +08:00
efd32f7a85 Remove unused import package (#2492) 2019-12-18 10:55:56 +08:00
89003b774b Support Convert Varchar to INT (#2481) 2019-12-17 22:02:28 +08:00
b1bac4d0cd Support to create materialized view (#2431)
Support to create materialized view

This commit supports creating materialized views.
The syntax of the statement is as follows:
CREATE Materialized View [MV name] AS
  SELECT select_expr[, select_expr ...]
  FROM [Base table name]
  GROUP BY column_name[, column_name ...]
  ORDER BY column_name[, column_name ...]

In the first step, the CreateMaterializedViewClause is used to check the semantics of the statement.
For now, the where, having and limit clauses are forbidden in CREATE MATERIALIZED VIEW.
The aggregation functions are also restricted to SUM/MIN/MAX.

The second step is to validate the statement against the metadata of the base table.
For example, the aggregate type of an mv column must be the same as the aggregate type of the base column in an aggregate table.

The last step is to prepare the index of the mv and add the new mvJob to the Handler.
The handler will process this new mvJob asynchronously.
2019-12-17 21:12:24 +08:00
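The first-step semantic check described in b1bac4d0cd can be illustrated with a small sketch. Everything here is hypothetical: `MVStatement` is an invented, already-parsed stand-in for the statement, and `check_semantics` only models the two rules stated in the commit message (no where/having/limit, aggregates limited to SUM/MIN/MAX). It is not the `CreateMaterializedViewClause` implementation.

```python
from dataclasses import dataclass, field
from typing import List

# Aggregate functions the commit message says are allowed.
ALLOWED_AGGREGATES = {"SUM", "MIN", "MAX"}


@dataclass
class MVStatement:
    """Hypothetical, already-parsed view of a CREATE MATERIALIZED VIEW statement."""
    aggregates: List[str] = field(default_factory=list)
    has_where: bool = False
    has_having: bool = False
    has_limit: bool = False


def check_semantics(stmt: MVStatement) -> List[str]:
    """Return a list of rule violations; an empty list means the check passed."""
    errors = []
    forbidden = (
        ("WHERE", stmt.has_where),
        ("HAVING", stmt.has_having),
        ("LIMIT", stmt.has_limit),
    )
    for clause, present in forbidden:
        if present:
            errors.append(f"{clause} clause is not allowed")
    for func in stmt.aggregates:
        if func.upper() not in ALLOWED_AGGREGATES:
            errors.append(f"aggregate function {func} is not allowed")
    return errors


if __name__ == "__main__":
    print(check_semantics(MVStatement(aggregates=["sum", "count"], has_limit=True)))
```

The second step (validation against the base table's metadata) needs catalog information and is deliberately left out of this sketch.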
3e58e2d543 Forbid the distinct function in subqueries of binary predicates 2019-12-17 19:38:15 +08:00
2c90915362 Support correlated non-scalar subquery (#2468)
The first select item of a non-scalar subquery may be a non-aggregate expression, such as column k1.
This commit removes the prohibition on it.
2019-12-16 18:52:05 +08:00
c8c32658a7 Fix PIPE operator priority (#2459)
This commit raises the priority of the || operator above the + - * / mod operators.
It solves problem 2.1 mentioned in issue #2396.

Problem 2.2 in issue #2396 is actually the same problem mentioned in issue #2142. As noted in PR #2398, modifying that logic would cause semantic errors in insert and load, so this commit leaves that bug unsolved for now.

appendix:
In MySQL 5.7.27
|| and |
select 23|1||7;
23
select (23|1)||7
237
select 23|(1||7)
23
Priority : || > |

|| and &
select 10&1||7;
0
select (10&1)||7
7
select 10&(1||7)
0
Priority : || > &

|| and ^
select 10^1||7
27
select (10^1)||7
117
select 10^(1||7)
27
Priority : || > ^

|| and ~
select ~1||7
184467440737095516147
select ~(1||7)
18446744073709551598
Priority : || < ~
2019-12-16 13:44:49 +08:00
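The appendix of c8c32658a7 can be checked with a short sketch that models the operators by hand. It assumes that `||` acts as string concatenation whose result is read back as an integer (which the listed results imply) and that bit operations work on unsigned 64-bit integers. The helper names `concat` and `bit_not` are invented for the illustration.

```python
MASK64 = (1 << 64) - 1  # bit operations are modeled on unsigned 64-bit integers


def concat(a: int, b: int) -> int:
    """Model `a || b` as string concatenation, read back as an integer."""
    return int(f"{a}{b}")


def bit_not(a: int) -> int:
    """Model `~a` on an unsigned 64-bit integer."""
    return ~a & MASK64


# Unparenthesized forms, grouped the way the appendix results imply:
# || binds tighter than |, & and ^, but looser than ~.
results = {
    "23|1||7": 23 | concat(1, 7),
    "10&1||7": 10 & concat(1, 7),
    "10^1||7": 10 ^ concat(1, 7),
    "~1||7": concat(bit_not(1), 7),
}

if __name__ == "__main__":
    for expression, value in results.items():
        print(expression, "=", value)
```

Grouping the other way, for example `concat(23 | 1, 7)`, reproduces the parenthesized rows of the appendix instead.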
e65a645138 Add classes related to "tag". (#2343)
[Tag System] 
This CL includes 2 parts:

1. Add classes related to "tag"
    * Resource: the collective name of the nodes that provide various service capabilities in a Doris cluster.
    * Tag: a Tag consists of a type and a name.
    * TagSet: a TagSet represents a set of tags.
    * TagManager: maintains 2 indexes:
        * one from tag to resources
        * one from resource to tags

    ISSUE #1723

2. Use JSON as the serialization method for metadata

    Introduce the GSON library to serialize the new classes mentioned above.

    ISSUE #2415 #2389

GSON's version is updated to 2.8.6
2019-12-15 20:13:29 +08:00
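The two indexes that e65a645138 attributes to TagManager can be pictured with a minimal sketch. The class and method names below are chosen for the illustration and are not taken from the Doris source; resources are represented by plain integer ids, which is also an assumption.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, FrozenSet, Set


@dataclass(frozen=True)
class Tag:
    """A tag consists of a type and a name."""
    type: str
    name: str


class TagManager:
    """Keeps a tag -> resources index and a resource -> tags index in sync."""

    def __init__(self) -> None:
        self._tag_to_resources: Dict[Tag, Set[int]] = defaultdict(set)
        self._resource_to_tags: Dict[int, Set[Tag]] = defaultdict(set)

    def add_tag(self, resource_id: int, tag: Tag) -> None:
        # Both indexes are updated together so they never disagree.
        self._tag_to_resources[tag].add(resource_id)
        self._resource_to_tags[resource_id].add(tag)

    def remove_resource(self, resource_id: int) -> None:
        for tag in self._resource_to_tags.pop(resource_id, set()):
            self._tag_to_resources[tag].discard(resource_id)

    def resources_with(self, tag: Tag) -> FrozenSet[int]:
        return frozenset(self._tag_to_resources.get(tag, ()))

    def tags_of(self, resource_id: int) -> FrozenSet[Tag]:
        return frozenset(self._resource_to_tags.get(resource_id, ()))


if __name__ == "__main__":
    manager = TagManager()
    manager.add_tag(1, Tag("location", "zone_a"))
    manager.add_tag(2, Tag("location", "zone_a"))
    print(sorted(manager.resources_with(Tag("location", "zone_a"))))
```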
e4cc17599f Add plugin definition (#2351) 2019-12-13 21:38:17 +08:00
02c4edb98e Add more HTTP log (#2458) 2019-12-13 21:31:48 +08:00
a17b28ccc1 Modify FE QueryPlan UT test failure by accident (#2455) 2019-12-13 21:28:54 +08:00
cf6d705df9 Add intersect_count UDAF (#2418)
1. Because array types are not supported currently, variable arguments are used instead.

2. intersect_count returns the final count directly, not a bitmap as bitmap_union does, because returning a bitmap is more complex and needs more serialization. If a bitmap result from intersect_count is really needed, it can be added in another PR without compatibility problems.
2019-12-13 16:12:05 +08:00
8ba3c9d777 [Tag System] Forbid cluster related operations (#2429)
The multi cluster feature will be deprecated soon.
Add an FE config "disable_cluster_feature", which defaults to true, to
forbid any cluster-related operations, including:

    * create/drop cluster
    * add free backend/add backend to cluster/decommission cluster balance
    * change the backends num of cluster
    * link/migration db

* fix ut
2019-12-13 10:11:30 +08:00
59f5851c29 Fix bug that show tables from an unknown database doesn't throw an error (#2445) 2019-12-12 23:18:52 +08:00
3af03d6283 Fix sql mode Bug (#2374)
This commit fixes the bug below:

FE throws an unexpected exception when it encounters a query like:
Set sql_mode = '0,PIPES_AS_CONCAT'.

It also changes the sql mode analysis process: the analysis is no longer done in SetVar.class, but in VariableMgr.class.
2019-12-12 17:50:35 +08:00
c39d35df4c Add tablet compaction score metrics (#2427)
[Metric] Add tablet compaction score metrics

Backend:
    Add metric "tablet_max_compaction_score" to monitor the current max compaction
    score of tablets on this Backend. This metric will be updated each time
    the compaction thread picks tablets to compact.

Frontend:
    Add metric "tablet_max_compaction_score" for each Backend. These metrics will
    be updated when backends report tablets.
    Also add a calculated metric "max_tablet_compaction_core" to monitor the
    max compaction score of tablets on all Backends.
2019-12-12 17:46:59 +08:00
a5f52f80df Add bitmap_hash function (#2439)
Add a bitmap_hash function.
Add a murmur_hash3_32 hash function.
2019-12-12 16:55:07 +08:00
ded247f001 [Bug][Privilege] Missing current user identity when forwarding request to Master FE (#2443)
The current user identity should be passed to Master FE in forward request.
2019-12-12 16:27:48 +08:00
bf31bd238b Change default storage model from aggregate to duplicate(#2318) (#2412)
change default storage model from aggregate to duplicate
for sql  `create table t (k1 int) DISTRIBUTED BY HASH(k1) BUCKETS 10 PROPERTIES("replication_num" = "1");`
before: 
```
 CREATE TABLE `t` (
  `k1` int(11) NULL COMMENT ""
) ENGINE=OLAP
AGGREGATE KEY(`k1`)
COMMENT "OLAP"
DISTRIBUTED BY HASH(`k1`) BUCKETS 10
PROPERTIES (
"storage_type" = "COLUMN"
);
```
after:
```
CREATE TABLE `t` (
  `k1` int(11) NULL COMMENT ""
) ENGINE=OLAP
DUPLICATE KEY(`k1`)
COMMENT "OLAP"
DISTRIBUTED BY HASH(`k1`) BUCKETS 10
PROPERTIES (
"storage_type" = "COLUMN"
);
```

#2318
2019-12-12 14:30:30 +08:00
c07f37d78c [Segment V2] Add a control framework between FE and BE through heartbeat #2247 (#2364)
The control framework is implemented through heartbeat message. Use uint64_t as flags to control different functions. 
Now add a flag to set the default rowset type to beta.
2019-12-12 12:18:32 +08:00
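The flag word described in c07f37d78c can be sketched as a 64-bit bitmask. The flag name and its bit position below are assumptions made for the illustration; the commit message only states that a `uint64_t` carries the flags and that one flag sets the default rowset type to beta.

```python
from enum import IntFlag

MASK64 = (1 << 64) - 1  # the flags travel as a uint64_t


class HeartbeatFlags(IntFlag):
    """Hypothetical flag layout; bit positions are not taken from the source."""
    SET_DEFAULT_ROWSET_TYPE_TO_BETA = 1 << 0


def encode(flags: HeartbeatFlags) -> int:
    """FE side: pack the enabled flags into one unsigned 64-bit word."""
    return int(flags) & MASK64


def is_set(word: int, flag: HeartbeatFlags) -> bool:
    """BE side: test a single flag in the word received with the heartbeat."""
    return (word & int(flag)) != 0


if __name__ == "__main__":
    word = encode(HeartbeatFlags.SET_DEFAULT_ROWSET_TYPE_TO_BETA)
    print(is_set(word, HeartbeatFlags.SET_DEFAULT_ROWSET_TYPE_TO_BETA))
```

Packing switches into one integer lets later features claim a new bit without changing the heartbeat message layout.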
7f2144e7e5 Upgrade JMockit from version 1.13 to 1.48 (#2423) 2019-12-12 12:03:17 +08:00
72cbf6f800 Add bitmap_union_count function (#2425) 2019-12-11 22:28:20 +08:00
036d7da290 Improve publish version performance (#2382)
1. Reduce the publish version interval
2. Move the visible version check from `getReadyToPublishTransactions` to `finishTransaction`, and make the publish version tasks run concurrently instead of serially.
3. In `getReadyToPublishTransactions`, sort the transactionState by CommitTime so that low-version transactions are published first, reducing the wait time in `finishTransaction`.
2019-12-10 22:34:58 +08:00
8e6535053c [Tag System] Remove the 'isRestore' flag when creating table or partition (#2363)
The 'isRestore' flag is for the old version of the backup and restore process,
which was deprecated a long time ago. Remove it.

This commit is also a further step toward ISSUE #1723.
2019-12-10 16:37:44 +08:00
af3d901a06 Convert INT type to DATE type (#2393) 2019-12-07 21:56:52 +08:00
a3b7cf484b Set the load channel's timeout to be the same as the load job's timeout (#2405)
[Load] 

When performing a long-running load job, the following error may occur, causing the load to fail:

load channel manager add batch with unknown load id: xxx

One cause of this error is that Doris opened an unrelated channel during the load
process. This channel does not receive any data during the entire load process. Therefore,
after a fixed timeout, the channel is released.

After the entire load job is completed, it tries to close all open channels. When it tries to
close this channel, it finds that the channel no longer exists and reports an error.

This CL passes the timeout of the load job to the load channel, so that the timeout of load channels
is the same as the load job's.
2019-12-06 21:51:00 +08:00
55d64e3be8 Remove the readFields() method in Writable interface (#2394)
All classes that implement the Writable interface need only implement the write() method.
Each class should implement its own read() method according to its
situation.
2019-12-06 21:46:21 +08:00
a46bf1ada3 [Authorization] Modify the authorization checking logic (#2372)
**Authorization checking logic**

There are some problems with the current password and permission checking logic. For example:
First, we create a user by:
`create user cmy@"%" identified by "12345";`

Then 'cmy' can log in with password '12345' from any host.

Second, we create another user by:
`create user cmy@"192.168.%" identified by "abcde";`

"192.168.%" has a higher priority in the permission table than "%". So when "cmy" tries
to log in with password "12345" from host "192.168.1.1", it should match the second permission
entry and be rejected because of the invalid password.
But in the current implementation, Doris continues to check the password against the first entry and then lets it pass. So we should change it.

**Permission checking logic**

After a user logs in, it should have a unique identity obtained from the permission table. For example,
when "cmy" logs in from host "192.168.1.1", its identity should be `cmy@"192.168.%"`. Doris
should use this identity to check other permissions, rather than the user's real identity, which is
`cmy@"192.168.1.1"`.

**Black list**
Functionally speaking, Doris only supports a WHITE LIST, which allows users to log in from
the hosts in the white list. But in some cases, we do need a BLACK LIST function.
Fortunately, by changing the logic described above, we can simulate the effect of a BLACK LIST.

For example, first we add a user by:
`create user cmy@'%' identified by '12345';`

Now user 'cmy' can log in from any host. If we don't want 'cmy' to log in from host A, we
can add a new user by:
`create user cmy@'A' identified by 'other_passwd';`

Because "A" has a higher priority in the permission table than "%", if 'cmy' tries to log in from A using password '12345', it will be rejected.
2019-12-06 17:45:56 +08:00
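The matching rule described in a46bf1ada3 can be sketched as "pick the highest-priority entry that matches the host, then check the password against that entry only". The sketch below is an illustration under stated assumptions, not the Doris implementation: priority is approximated by the number of literal characters in the host pattern, passwords are compared in plain text, and all names are invented.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Entry:
    """Hypothetical row of the permission table."""
    user: str
    host_pattern: str  # '%' matches any run of characters
    password: str      # plain text only for the sake of the example


def _matches(pattern: str, host: str) -> bool:
    regex = ".*".join(re.escape(part) for part in pattern.split("%"))
    return re.fullmatch(regex, host) is not None


def _specificity(entry: Entry) -> int:
    # Assumption: more literal characters means higher priority.
    return len(entry.host_pattern.replace("%", ""))


def login(entries: List[Entry], user: str, host: str, password: str) -> Optional[str]:
    """Return the matched identity, or None if the login is rejected."""
    candidates = [e for e in entries if e.user == user and _matches(e.host_pattern, host)]
    if not candidates:
        return None
    best = max(candidates, key=_specificity)
    # Only the highest-priority entry is consulted; there is no fallback to "%".
    if best.password != password:
        return None
    return f'{best.user}@"{best.host_pattern}"'


if __name__ == "__main__":
    table = [Entry("cmy", "%", "12345"), Entry("cmy", "192.168.%", "abcde")]
    print(login(table, "cmy", "192.168.1.1", "12345"))
    print(login(table, "cmy", "192.168.1.1", "abcde"))
```

The returned string plays the role of the identity used for later permission checks, and adding an entry for a single host with a different password gives the black-list effect described in the commit message.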
9fbc1c7ee6 Support where/orderby/limit after "SHOW ALTER TABLE COLUMN" syntax (#2380)
Features:
1. Support WHERE/ORDER BY/LIMIT
2. Columns: TableName, CreateTime, FinishTime, State
3. Only "AND" between conditions
4. TableName and State columns only support the "=" operator
5. CreateTime and FinishTime columns support the "=", ">=", "<=", ">", "<", "!=" operators
6. CreateTime and FinishTime columns support Date and DateTime strings, e.g. "2019-12-04" or "2019-12-04 17:18:00"

TestCase:
MySQL [haibotest]> show alter table column where State='FINISHED' and CreateTime > '2019-12-03' order by FinishTime desc limit 0,2;
+-------+---------------+---------------------+---------------------+---------------+---------+---------------+---------------+---------------+----------+------+----------+---------+
| JobId | TableName     | CreateTime          | FinishTime          | IndexName     | IndexId | OriginIndexId | SchemaVersion | TransactionId | State    | Msg  | Progress | Timeout |
+-------+---------------+---------------------+---------------------+---------------+---------+---------------+---------------+---------------+----------+------+----------+---------+
| 11134 | test_schema_2 | 2019-12-03 19:21:42 | 2019-12-03 19:22:11 | test_schema_2 | 11135   | 11059         | 1:192010000   | 3             | FINISHED |      | N/A      | 86400   |
| 11096 | test_schema_3 | 2019-12-03 19:21:31 | 2019-12-03 19:21:51 | test_schema_3 | 11097   | 11018         | 1:2063361382  | 2             | FINISHED |      | N/A      | 86400   |
+-------+---------------+---------------------+---------------------+---------------+---------+---------------+---------------+---------------+----------+------+----------+---------+
2 rows in set (0.00 sec)
2019-12-06 16:24:44 +08:00
8e2277d997 Fix group by inf and nan duplicated (#2142 #2145) (#2401) 2019-12-06 16:19:08 +08:00
597a8b2146 Revert "Fix arithmetic operation between numeric and non-numeric (#2362)" (#2398)
This reverts commit 6857ffe1c5976ef06003aa479279368bafc581f1.
2019-12-06 14:58:38 +08:00
6857ffe1c5 Fix arithmetic operation between numeric and non-numeric (#2362)
Fix the bug that arithmetic operations between numeric and non-numeric values produce unexpected values.
After this patch you will get
mysql> select 1 +  "kks";
+-----------+
| 1 + 'kks' |
+-----------+
|         1 |
+-----------+
1 row in set (0.02 sec)

mysql> select 1 -  "kks";
+-----------+
| 1 - 'kks' |
+-----------+
|         1 |
+-----------+
1 row in set (0.01 sec)
2019-12-06 10:33:06 +08:00
27d6794b81 Support subquery with non-scalar result in Binary predicate and Between-and predicate (#2360)
This commit adds a new plan node named AssertNumRowsNode,
which is used to determine whether the number of rows exceeds the limit.
An AssertNumRowsNode is added to the subquery in a Binary predicate or Between-and predicate
to determine whether the subquery returns more than 1 row.
If the subquery returns more than 1 row, the query is cancelled.

For example:
There are 4 rows in table t1.
Query: select c1 from t1 where c1=(select c2 from t1);
Result: ERROR 1064 (HY000): Expected no more than 1 to be returned by expression select c2 from t1

ISSUE-2270
TPC-DS 6,54,58
2019-12-05 21:27:33 +08:00
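The behaviour that 27d6794b81 describes for AssertNumRowsNode can be sketched as a pass-through operator that fails as soon as the row limit is exceeded. This is a simplified model using a Python generator; the names `assert_num_rows` and `TooManyRowsError` are invented, and the error text only imitates the message shown in the commit.

```python
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


class TooManyRowsError(RuntimeError):
    """Raised when the child produces more rows than allowed (hypothetical name)."""


def assert_num_rows(rows: Iterable[T], limit: int, description: str) -> Iterator[T]:
    """Pass rows through unchanged, failing once more than `limit` rows are seen."""
    for count, row in enumerate(rows, start=1):
        if count > limit:
            raise TooManyRowsError(
                f"Expected no more than {limit} to be returned by expression {description}"
            )
        yield row


if __name__ == "__main__":
    subquery_rows = [10, 20, 30, 40]  # stands in for `select c2 from t1` with 4 rows
    try:
        list(assert_num_rows(subquery_rows, 1, "select c2 from t1"))
    except TooManyRowsError as error:
        print("ERROR:", error)
```

Because the check runs while rows stream through, the failure is raised on the second row rather than after the subquery has been fully evaluated.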