865 Commits

Author SHA1 Message Date
b08e08b3ba first 2020-05-14 09:25:51 +08:00
54e38ecda2 [Bug] Fix bug of transaction manager (#3565)
Fix bug of using wrong `abortTransaction()` method
2020-05-13 15:45:15 +08:00
ca7c0717cd Fix compile bug (#3557) 2020-05-12 10:24:37 +08:00
b648734441 [TxxMgr] Support txn management in db level and use ArrayDeque to improve txn task performance (#3369)
This PR is the first step to make Doris stream load more robust with higher concurrent
performance (#3368). The main work is to support txn management with db-level isolation
and to use ArrayDeque to store final-status txns.
2020-05-11 23:32:43 +08:00
4294301c53 Throw DdlException when use admin set frontend config (#3539)
If more than one config is set in a single set config stmt, an exception will be thrown
to forbid the operation.
2020-05-11 23:29:38 +08:00
edbeaf8e30 Throw a UserException when miss plugin's md5 file (#3542) 2020-05-11 15:33:35 +08:00
561765fc08 Identify old empty tablet when add tablet to meta in ReportHandler (#3547) 2020-05-11 09:50:43 +08:00
edb3ad696d [Deps] Remove redundant com.baidu:jprotobuf (#3322)
* exclude jprotobuf from jprotobuf-rpc-core
* add commons-io used in fe.
2020-05-10 17:10:46 +08:00
2586f09548 [Bug] Fix bug that SHOW DELETE not return Delete job info (#3515)
The callback added to the CallbackFactory should not be removed until the
transaction is aborted or visible. Otherwise, some callback methods may fail
to be called.
2020-05-08 13:04:20 +08:00
f90da72078 [Planner]Enhance AssertNumRowsNode (#3485)
Enhance AssertNumRowsNode to support equal, less than, greater than,... assert conditions
2020-05-08 12:49:48 +08:00
45814c85ac [BugFix] Fix the bug that FE web can't show each fragment execute time percent (#3497)
like this:
          (Active: 14.133ms, non-child: 93.20%)
2020-05-08 12:48:05 +08:00
084515317f [Bug] Fix constant In Predicate result error (#3511)
`select 1 not in (2, NULL, 1);` should return `0`
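The expected result follows from SQL's three-valued logic: `NOT IN` is false as soon as any list element equals the value, and it only becomes NULL when nothing matches but a NULL is present. A minimal Python sketch of that rule, where the `not_in` helper is hypothetical (not Doris code) and `None` stands in for NULL:

```python
from typing import Iterable, Optional


def not_in(value: Optional[int], candidates: Iterable[Optional[int]]) -> Optional[int]:
    """Evaluate `value NOT IN (candidates)` with SQL three-valued logic.

    Returns 1 (true), 0 (false), or None (NULL / unknown).
    """
    if value is None:
        return None  # comparing NULL with the list elements is unknown
    saw_null = False
    for candidate in candidates:
        if candidate is None:
            saw_null = True
        elif candidate == value:
            return 0  # a definite match makes NOT IN false, even with NULLs present
    # No match: unknown if a NULL was in the list, otherwise true.
    return None if saw_null else 1


print(not_in(1, [2, None, 1]))  # the case from this commit, prints 0
```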
2020-05-08 11:30:11 +08:00
d60bb81cb0 [SQL Function] Calculate 'case when expr' when possible (#3396)
Calculate 'case when expr' when possible
2020-05-07 22:04:09 +08:00
ca36dc697f [Bug] Fix bug that push down logic error on semi join (#3481)
For SQL like:
```
select * from
join1 left semi join join2
on join1.id = join2.id and join2.id > 1;
```

the predicate `join2.id > 1` can not be pushed down to table join1.
2020-05-07 09:30:30 +08:00
5e63629b8b [Decommission] Support NOT dropping BE after decommission (#3461)
Add a new config `drop_backend_after_decommission` in FE. If this config
is false, the BE will not be dropped after finishing the decommission operation.

This new config tries to solve the problem described in issue #3460.

TODO:
This method will generate a lot of data migration, so it is only a temporary solution.
After that, we should try to solve the problem of data balancing within the BE.

This CL also adds the documentation of the FE and BE configuration.
These documents are incomplete and can be extended later.
2020-05-06 17:14:24 +08:00
101628c813 [Bug] Fix bug of predicate pushdown logic (#3475)
When there is a subquery in the where clause, the query will be rewritten to a join operation.
And some auxiliary binary predicates will be generated. These binary predicates
will not go through the ExprRewriteRule, so they are not normalized as
"column to the left and constant to the right" format.

We need to take this case into account so that the `canPushDownPredicate()` judgement
will not throw an exception.
2020-05-06 15:15:37 +08:00
caa7a07c70 [Query Plan]Support simple transitivity on join predicate pushdown (#3453)
The current implementation is very simple and conservative, because our query planner is error-prone.

After we implement the new query planner, we could do this work with a `Predicate Equivalence Class` and a `PredicatePushDown` rule, like Presto.
2020-05-04 15:32:19 +08:00
a5922051c9 [Fix] Fix bug that rowset meta is deleted after compaction (#3451)
* [Fix] Fix bug that rowset meta is deleted after compaction

After compaction, the tablet rowset meta will be modified by
adding the new output rowsets and deleting the old input rowsets.
The output version may equal the input version.

So we should delete the "input" version from _rs_version_map
before adding the "output" version to _rs_version_map. Otherwise,
the new "output" version will be lost in _rs_version_map.
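Why the ordering matters can be sketched with a plain dictionary standing in for `_rs_version_map`. This is an illustration only: the function names, the `(start, end)` version tuples, and the string rowsets are assumptions, not the BE's actual types.

```python
def apply_compaction(rs_version_map, input_versions, output_version, output_rowset):
    """Fixed order: delete the inputs first, then add the output."""
    for version in input_versions:
        rs_version_map.pop(version, None)
    # Added last, so it survives even when it equals an input version.
    rs_version_map[output_version] = output_rowset
    return rs_version_map


def apply_compaction_buggy(rs_version_map, input_versions, output_version, output_rowset):
    """Buggy order: add the output first, then delete the inputs."""
    rs_version_map[output_version] = output_rowset
    for version in input_versions:
        # If an input version equals the output version, this removes the new entry.
        rs_version_map.pop(version, None)
    return rs_version_map


before = {(0, 5): "input rowset"}
print(apply_compaction(dict(before), [(0, 5)], (0, 5), "output rowset"))
print(apply_compaction_buggy(dict(before), [(0, 5)], (0, 5), "output rowset"))
```

With the buggy order the map ends up empty, which is the lost "output" version described above.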
2020-05-04 09:45:25 +08:00
da4d2d2699 [UT] Fix UT bug (#3456)
SSD cooldown time shouldn't be a fixed time in UT.
2020-05-03 16:24:08 +08:00
7ef1e2ce5b [Bug] Fix bug that load data to wrong temp partitions (#3422)
When loading data without specifying a partition, the data should only be loaded to
formal partitions, not to temp partitions.
2020-04-30 15:11:28 +08:00
c9ec4e8a73 [UT] Fix AlterTest UT failed (#3437) 2020-04-30 14:40:33 +08:00
332a3acedc [Bug] Fix bug that NPE when get table's storage format (#3401)
The OlapTable's tableProperty field may be null, so we should handle it carefully.
This is error-prone; I will try to refactor it later.
Fix #3400
2020-04-29 11:20:25 +08:00
dfaad33b8c [Thirdparty] Upgrade Google Guava lib to 29.0-jre (#3404)
Fix #3403
The new version of Guava has moved `toStringHelper` from `Objects` to `MoreObjects`.
This CL has passed our test environment and appears to run well.
2020-04-29 10:33:11 +08:00
9a934ec9f6 [Load] Add more info in SHOW LOAD result (#3391)
Fix #3390
This CL adds more info to the `JobDetails` column of the `SHOW LOAD` result for Broker Load Jobs.

For example:
```
{
	"Unfinished backends": {
		"9c3441027ff948a0-8287923329a2b6a7": [10002]
	},
        "All backends": {
		"9c3441027ff948a0-8287923329a2b6a7": [10002, 10004, 10006]
	},
	"ScannedRows": 2390016,
	"TaskNumber": 1,
	"FileNumber": 1,
	"FileSize": 1073741824
}
```

2 newly added keys:

`Unfinished backends` lists the BEs whose tasks are not yet finished.
`All backends` lists the BEs on which this job has tasks.

One more thing: I pass the Backend Id along with the heartbeat msg from FE to BE, so that each BE
knows its own Id.
2020-04-26 21:30:23 +08:00
5ec260887c [Dynamic Partition] Make config dynamic_partition_check_interval_seconds mutable (#3392) 2020-04-25 21:48:21 +08:00
42d14028a0 Use ThreadPoolManager to create threadPool and add some prometheus metrics about pool (#3386) 2020-04-25 15:57:15 +08:00
223ee85636 [Bug]Fix bug that PriorityQueue will throw IllegalArgumentException (#3393) 2020-04-25 15:49:34 +08:00
0e66385235 [SQL] Disable some unsupported syntax (#3357)
Disable some syntax when subquery is not binary predicate in case when clause.
2020-04-24 22:01:35 +08:00
07a9401f82 Forbidden correlated having clause (#3378)
1. The correlated slot ref should be bound by the agg tuple of outer query.
However, the correlated having clause can not be analyzed correctly so the result is incorrect.

For example: 

```
SELECT k1 FROM test GROUP BY k1 
HAVING EXISTS(SELECT k1 FROM baseall GROUP BY k1 HAVING SUM(test.k1) = k1);
```

The correlated predicate is not executed.

2. The limit offset should also be rewritten when there is subquery in having clause.

For example: 

```
select k1, count(*) cnt from test group by k1 having k1 in
(select k1 from baseall order by k1 limit 2) order by k1 limit 5 offset 3;
```

The new stmt should have a limit element with offset.
2020-04-24 21:34:40 +08:00
09eb40e356 [New Stmt] Alter replication number for table (#3360)
This CL adds a new command to set the replication number of a table in one step.
```
alter table test_tbl set ("replication_num" = "3");
```
It changes the replication num of an unpartitioned table.

and

```
alter table test_tbl set ("default.replication_num" = "3");
```

It changes default replication num of the specified table.
2020-04-23 21:58:09 +08:00
d854a79878 [Bug] isQuery field should be reset at the beginning of query execution (#3374)
If not reset, all queries coming from the same session will have the same isQuery field value.
This bug causes all entries in fe.audit.log to have the same IsQuery=true.

This CL also fixes another bug:
The resolved IPs of domain of a user should not appear in other user's white list. Fix #3380
2020-04-23 09:00:47 +08:00
a88ae53326 [Bug]Use OlapTableSink::close to replace OlapTableSink::finalize method to avoid OOM (#3363)
This CL mainly solves the problem that `OlapTableSink` objects are not
recycled immediately by the GC thread because the class overrides
the `finalize` method, which can cause OOM.
2020-04-22 19:51:04 +08:00
c6ac60bab9 [SegmentV2] Optimize the upgrade logic of SegmentV2 (#3340)
This CL mainly made the following modifications:

1. Reorganized SegmentV2 upgrade document.
2. When the variable `use_v2_rollup` is set to true, the base rollup in v2 format is forcibly queried for verifying the data.
3. Fix a problem that there is no persistent storage format information in the schema change operation that performs v2 conversion.
4. Allow users to directly create v2 format tables.
2020-04-21 10:45:29 +08:00
a2c8d14fd9 [Bug] Partition key's type has been changed after executing queries (#3348)
Expr's `uncheckedCastTo()` method should return a new instance of the cast expr.
The original expr should remain unchanged.
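The contract can be sketched as a cast that copies instead of mutating. This is a Python illustration, not the FE's Java code: the `Expr` class, its fields, and `unchecked_cast_to` are hypothetical stand-ins.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Expr:
    """Hypothetical stand-in for an expression node carrying a value and a type."""
    value: str
    type_name: str

    def unchecked_cast_to(self, target_type: str) -> "Expr":
        # Return a copy carrying the new type; the receiver is never modified.
        return replace(self, type_name=target_type)


partition_key = Expr("2020-04-21", "DATE")
cast_key = partition_key.unchecked_cast_to("DATETIME")

print(partition_key.type_name)  # still the original type
print(cast_key.type_name)
```

Because the cast works on a copy, an expr shared with table metadata (such as a partition key) keeps its type no matter how many queries cast it.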
2020-04-21 08:30:02 +08:00
46272a5621 [Bug] Fix bug of TransactionState SerDe error (#3356)
The TransactionState's coordinator should be created when deserialized from
old meta.
2020-04-21 08:24:10 +08:00
94b7bb5ad6 [Bug][Dynamic Partition]Fix Bug that dynamic partition properties is not consistent (#3359) 2020-04-20 23:52:47 +08:00
c69bf9ac44 [New Stmt] Add SHOW KEYS gramma (#3342)
Support `SHOW KEYS FROM table` for the data connectors of mainstream BI tools
like PowerBI/FineBI.

#3334
2020-04-20 15:58:20 +08:00
753d6cc19f Add LOG.isDebugEnabled for some debug logical of Coordinator (#3352)
This may or may not slightly affect performance.
2020-04-20 08:30:57 +08:00
929e93699a Fix Colocate Join Bug (#3354)
1. Fix an error in syncing colocate group status between FEs
2. Fix a missing call of EditLog.logColocateRemoveTable
2020-04-20 08:29:34 +08:00
c223d37c99 [Delete] Make some correct in delete operation (#3338)
#3190
1. Correct the directory of DeleteJob.java
2. Fix some logic fault in DeleteHandlerTest.java
3. Add timeout value in log and exception
2020-04-19 11:49:02 +08:00
77a7037346 Fix cooldown timestamp bug (#3336)
When adding a partition with the storage_cooldown_time property like this:
alter table tablexxx ADD PARTITION p20200421 VALUES LESS THAN("1588262400") ("storage_medium" = "SSD", "storage_cooldown_time" = "2020-05-01 00:00:00");
and then running show partitions from tablexxx;
the CooldownTime is wrong: 2610-02-17 10:16:40. What is more, the storage migration is based on the wrong timestamp.
The reason is that the result of DateLiteral.getLongValue is not a timestamp.
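The wrong value can be reproduced with plain arithmetic, under two assumptions that the message does not state: that the value was the date packed as decimal digits (`yyyyMMddHHmmss`), and that it was then read as epoch milliseconds. A Python sketch, using the +08:00 offset of this log:

```python
from datetime import datetime, timedelta, timezone

CST = timezone(timedelta(hours=8))  # the +08:00 offset used throughout this log
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

cooldown = datetime(2020, 5, 1, 0, 0, 0, tzinfo=CST)

# Assumption: the non-timestamp value is the date packed as decimal digits.
packed = int(cooldown.strftime("%Y%m%d%H%M%S"))  # 20200501000000

# Assumption: the packed value is misread as milliseconds since the epoch.
misread = (EPOCH + timedelta(milliseconds=packed)).astimezone(CST)

# The real epoch timestamp (in seconds) of the same cooldown time.
epoch_seconds = int(cooldown.timestamp())

print(misread.strftime("%Y-%m-%d %H:%M:%S"))  # 2610-02-17 10:16:40
print(epoch_seconds)                          # 1588262400
```

Under those assumptions the misread value matches the CooldownTime reported above, and the real timestamp matches the partition bound used in the example statement.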
2020-04-18 22:47:22 +08:00
0624f6b9eb [Doris On ES]Add simple explain for EsTable (#3341)
related issue: #3306
Note: this PR just removes es_scan_node_test.cpp, which is useless.

For the moment, just add a simple explain syntax for EsTable without translating the native predicates to ES queryDSL. That translation is better finished together with moving the predicate translation from Doris BE to Doris FE; the whole work is still WIP.
2020-04-18 10:04:03 +08:00
9331574818 [Transaction] Cancel all txns whose coordinate BE is down. (#3293)
This CL makes the following changes:
- Fix the problem that FE is not aware that the coordinate BE is down and so cannot cancel its txns, which can never finish.
- Do some code style refactoring

NOTICE: FE meta version upgrade to 83
2020-04-17 11:24:03 +08:00
224f5d8bad [SegmentV1] Enable to read and write boolean type data (#3324)
This PR enables reading and writing boolean type data for segment v1.
2020-04-16 23:39:08 +08:00
b29cb9dbb3 [Optimize][Delete] Simplify the delete process to make it fast (#3191)
Our current DELETE strategy reuses the LoadChecker framework.
LoadChecker runs jobs in different stages by polling them every 5 seconds.

There are four stages of a load job, Pending/ETL/Loading/Quorum_finish,
each of which is allocated to a LoadChecker. For example, when a load job is submitted,
it is initialized to the Pending state and then waits to be run by the Pending LoadChecker.
After the pending job is run, it moves to the ETL stage and then waits to be
run by the next LoadChecker (ETL). Because the interval of the LoadChecker is 5s,
in the worst case a job needs to wait for 20s during its life cycle.

In particular, DELETE jobs do not need to wait for polling; they can run the pushTask()
function directly to delete. In this commit, I add a delete handler to process
delete tasks concurrently.

All delete tasks are pushed to BE immediately, with no need to wait for a LoadChecker.
Since a delete job starts in the LOADING state, it previously waited for 2 LoadCheckers,
so at most 10s is saved (5s per LoadChecker). The delete process is now synchronous,
and users get a response only after the delete has finished or been canceled.
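The wait times quoted above follow from one poll interval per stage. A small Python sketch of that arithmetic; the constant and function names are illustrative, and only the stage names and the 5s interval come from the message:

```python
POLL_INTERVAL_SECONDS = 5
LOAD_STAGES = ["Pending", "ETL", "Loading", "Quorum_finish"]


def worst_case_wait(stages_polled: int, interval: int = POLL_INTERVAL_SECONDS) -> int:
    """Worst case: a job waits one full poll interval at every stage it passes through."""
    return stages_polled * interval


print(worst_case_wait(len(LOAD_STAGES)))  # a full load job: 20
print(worst_case_wait(2))                 # a delete job starting in Loading: 10
```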

If a delete is running over a certain period of time, it will be cancelled with a timeout exception.

NOTICE: this CL upgrades the FE meta version to 82
2020-04-16 10:32:44 +08:00
e61793763a [Bug] Use equals() method to judge whether "type" are equal (#3310)
I don't know why, but I found that sometimes when I use "==" to judge the equality of types,
it returns false, even if the types are exactly the same.

ISSUE: #3309

This CL only changes == to equals() to solve the problem, but the reason is still unknown.
2020-04-15 15:04:13 +08:00
9257535f91 [New Feature] Support setting replica quota in db level (#3283)
This PR limits replica usage: admins need to know the replica usage of every db and
table, and to be able to set a replica quota for every db.

```
ALTER DATABASE db_name SET REPLICA QUOTA quota; 
```
2020-04-14 22:25:32 +08:00
a467c6f81f [ES Connector] Add field context for string field keyword type (#3305)
This PR is just a transitional approach, but it is better to move the predicate transformation from Doris BE to Doris FE; in this way, Doris BE is only responsible for fetching data from ES.

Add an `enable_keyword_sniff` configuration item for creating an external Elasticsearch table. It defaults to true, and sniffs the `keyword` type on `text` analyzed fields, returning the `json_path` that substitutes for the original column name.

```
CREATE EXTERNAL TABLE `test` (
  `k1` varchar(20) COMMENT "",
  `create_time` datetime COMMENT ""
) ENGINE=ELASTICSEARCH
PROPERTIES (
"hosts" = "http://10.74.167.16:8200",
"user" = "root",
"password" = "root",
"index" = "test",
"type" = "doc",
"enable_keyword_sniff" = "true"
);
```
Note: `enable_keyword_sniff` defaults to "true".

Run this SQL:

```
select * from test where k1 = "wu yun feng"
```
Output predicate DSL:

```
{"term":{"k1.keyword":"wu yun feng"}}
```
In this PR I also remove the Elasticsearch version detection logic, which is useless for now; it may be needed in the future.
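The rewrite of the equality predicate can be sketched as follows. This is a Python illustration, not the BE's C++ code: `build_term_query` and the `keyword_columns` set are hypothetical names for "the columns that were sniffed as having a `keyword` sub-field".

```python
import json
from typing import Set


def build_term_query(column: str, value: str, keyword_columns: Set[str],
                     enable_keyword_sniff: bool = True) -> dict:
    """Build an ES term query, targeting the `.keyword` sub-field when it was sniffed."""
    field = column
    if enable_keyword_sniff and column in keyword_columns:
        # Substitute the json_path of the keyword sub-field for the original column name.
        field = f"{column}.keyword"
    return {"term": {field: value}}


query = build_term_query("k1", "wu yun feng", keyword_columns={"k1"})
print(json.dumps(query, separators=(",", ":")))
```

With sniffing enabled and `k1` sniffed as a keyword column, this prints the DSL shown above; with `enable_keyword_sniff=False` the term query targets `k1` directly.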
2020-04-13 23:07:33 +08:00
7c07083cd5 Forbidden multi subquery in having clause (#3291)
Multiple subqueries in the having clause need to be rewritten into joins over multiple tables, and the current rewriting rules would need to be reworked to support that.
This usage is not common, and there is no strong requirement from the business side.
This function will be added later if it is required.
2020-04-11 21:56:08 +08:00
5b69c70f9a [Bug] Fix bug that user plugin dir is removed after installing the plugin (#3302)
When a user installs an FE plugin from a directory, the directory should not
be removed after installation.
2020-04-11 20:30:14 +08:00