Commit Graph

1765 Commits

Author SHA1 Message Date
c9ec4e8a73 [UT] Fix AlterTest UT failed (#3437) 2020-04-30 14:40:33 +08:00
a6c0d376dd [Thirdparty] Update ORC lib download address (#3440) 2020-04-30 14:16:44 +08:00
25e475898e [Bug] Fix the error result when assert num rows node is used (#3436)
Before this commit, the child's open() function was never called.

If the assert num rows node has a child that processes data in its open function, the assert num rows node fetches no data from the child, so the result is empty (incorrect).

This error only appears in an inner subquery that has an aggregation function.
For example:

`select * from table where k1=(select k1 from (select avg(k1) from table) a);`

The first level of subquery returns a non-scalar value, so the assert num rows node is needed.
The second level of subquery has an aggregation function, so the child of the assert node is an aggregate node.

However, if the open stage of the aggregate node is not called, the get next stage of the aggregate node returns an empty set.
So the result is wrong.
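
The failure mode can be sketched in a few lines of Python. This is an illustrative model only, not Doris code: `AggregateNode`, `AssertNumRowsNode`, `run`, and the `open_child` switch are hypothetical stand-ins for the BE exec nodes, and the aggregate is assumed to do all of its work in `open`.

```python
class AggregateNode:
    """Hypothetical child node that computes its result during open()."""

    def __init__(self, values):
        self._values = values
        self._result = []  # stays empty until open() runs

    def open(self):
        # All aggregation work happens here (assumption for this sketch).
        self._result = [sum(self._values) / len(self._values)]

    def get_next(self):
        # Hand out the buffered rows once, then report an empty set.
        rows, self._result = self._result, []
        return rows


class AssertNumRowsNode:
    """Hypothetical parent node that accepts at most `limit` rows."""

    def __init__(self, child, limit=1, open_child=True):
        self._child = child
        self._limit = limit
        self._open_child = open_child  # False models the behavior before the fix

    def open(self):
        if self._open_child:
            self._child.open()

    def get_next(self):
        rows = self._child.get_next()
        if len(rows) > self._limit:
            raise RuntimeError("subquery returned more rows than expected")
        return rows


def run(open_child):
    """Open the assert node and fetch its rows."""
    node = AssertNumRowsNode(AggregateNode([1, 2, 3]), open_child=open_child)
    node.open()
    return node.get_next()
```

With `open_child=False` the aggregate never runs and `run` returns an empty list; with `open_child=True` it returns the single averaged row.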

Fixed #3435.
2020-04-30 14:15:50 +08:00
beb5b29f66 [Doc] Fix linked file not found (#3434)
The source file of the CONTRIBUTING.md soft link has changed, so make it a regular file.
2020-04-29 21:56:34 +08:00
73a3c59efb [Bug] Fix bug that help-resource.zip file is missing. (#3423) 2020-04-29 19:25:28 +08:00
74b987f053 [Bug] Fix bug that storage engine bg threads should start after env is ready 2020-04-29 11:21:19 +08:00
332a3acedc [Bug] Fix bug that NPE when get table's storage format (#3401)
The OlapTable's tableProperty field may be null, so it must be handled carefully.
This is error-prone; I will try to refactor it later.
Fix #3400
2020-04-29 11:20:25 +08:00
dfaad33b8c [Thirdparty] Upgrade Google Guava lib to 29.0-jre (#3404)
Fix #3403
The new version of Guava has moved `toStringHelper` from `Objects` to `MoreObjects`.
This CL has passed in our test environment and appears to run well.
2020-04-29 10:33:11 +08:00
432965e360 [Enhancement] documents rebuild with Vuepress (#3408) (#3414) 2020-04-29 09:14:31 +08:00
0430714ca9 Remove redundant call function _wait_in_flight_packet() (#3399)
The function `_wait_in_flight_packet` has been called in `_send_cur_batch`.
No need to call twice.
2020-04-27 20:45:25 +08:00
9a934ec9f6 [Load] Add more info in SHOW LOAD result (#3391)
Fix #3390
This CL adds more info to the `JobDetails` column of the `SHOW LOAD` result for Broker Load jobs.

For example:
```
{
	"Unfinished backends": {
		"9c3441027ff948a0-8287923329a2b6a7": [10002]
	},
	"All backends": {
		"9c3441027ff948a0-8287923329a2b6a7": [10002, 10004, 10006]
	},
	"ScannedRows": 2390016,
	"TaskNumber": 1,
	"FileNumber": 1,
	"FileSize": 1073741824
}
```

Two keys are newly added:

`Unfinished backends` lists the BEs whose tasks have not finished.
`All backends` lists the BEs on which this job has tasks.

In addition, the Backend Id is now passed along with the heartbeat msg from FE to BE, so that
each BE knows its own Id.
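
As a rough illustration of how a client might use the two new keys, the sketch below parses a `JobDetails` value and derives the backends that have already finished. The helper name `finished_backends` is hypothetical, and treating the map keys as per-task identifiers is an assumption; only the JSON layout comes from the example above.

```python
import json

JOB_DETAILS = """
{
    "Unfinished backends": {
        "9c3441027ff948a0-8287923329a2b6a7": [10002]
    },
    "All backends": {
        "9c3441027ff948a0-8287923329a2b6a7": [10002, 10004, 10006]
    },
    "ScannedRows": 2390016,
    "TaskNumber": 1,
    "FileNumber": 1,
    "FileSize": 1073741824
}
"""


def finished_backends(details_json):
    """Return, per key, the backend ids in 'All backends' that are no longer unfinished."""
    details = json.loads(details_json)
    unfinished = details["Unfinished backends"]
    result = {}
    for key, all_ids in details["All backends"].items():
        pending = set(unfinished.get(key, []))
        result[key] = [be_id for be_id in all_ids if be_id not in pending]
    return result
```

For the sample above, the only key maps to backends 10004 and 10006, since 10002 is still listed as unfinished.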
2020-04-26 21:30:23 +08:00
5ec260887c [Dynamic Partition] Make config dynamic_partition_check_interval_seconds mutable (#3392) 2020-04-25 21:48:21 +08:00
72f3082358 [Metrics] Add some metrics for container size in BE (#3246)
These metrics let us observe the workload of a BE. They are also a way to
check whether there is any problem in the BE, such as a container growing
too large and leading to OOM.

This patch adds the following metrics:
```
Name                                   Description
rowset_count_generated_and_in_use      The total count of rowset id generated and in use since BE last start
unused_rowsets_count                   The total count of unused rowset waiting to be GC
broker_count                           The total count of brokers in management
data_stream_receiver_count             The total count of data stream receivers in management
fragment_endpoint_count                The total count of fragment endpoints of data stream in management, should always equal to data_stream_receiver_count
active_scan_context_count              The total count of active scan contexts
plan_fragment_count                    The total count of plan fragments in executing
load_channel_count                     The total count of load channels in management
result_buffer_block_count              The total count of result buffer blocks for queries, each block has a limited queue size (default 1024)
result_block_queue_count               The total count of queues for fragments, each queue has a limited size (default 20, by config::max_memory_sink_batch_count)
routine_load_task_count                The total count of routine load tasks in executing
small_file_cache_count                 The total count of cached small files' digest info
stream_load_pipe_count                 The total count of stream load pipes, each pipe has a limited buffer size (default 1M)
tablet_writer_count                    The total count of tablet writers
brpc_endpoint_stub_count               The total count of brpc endpoints
```
2020-04-25 16:13:39 +08:00
42d14028a0 Use ThreadPoolManager to create threadPool and add some prometheus metrics about pool (#3386) 2020-04-25 15:57:15 +08:00
223ee85636 [Bug]Fix bug that PriorityQueue will throw IllegalArgumentException (#3393) 2020-04-25 15:49:34 +08:00
37fccd53c4 [Tablet] A small refactor on class Tablet (#3339)
There are no functional changes in this patch.
Key refactor points are:
- Remove meaningless return value of functions in class Tablet, and
  also some related functions in other classes
- Allow RowsetGraph::capture_consistent_versions to pass a nullptr
  to the output parameter
- Use CHECK instead of LOG(FATAL) to simplify code
2020-04-24 22:22:26 +08:00
0e66385235 [SQL] Disable some unsupported syntax (#3357)
Disable some syntax when the subquery in a case when clause is not a binary predicate.
2020-04-24 22:01:35 +08:00
4eb27bc7e3 [Profile] Make running profile clearer and more intuitive to improve usability (#3365) (#3383)
This CL mainly made the following modifications:
1. Delete Invalid method in Running Profile Class.
2. Move Memlimit Counter from blockmgr to fragment and add PeakMemUsage Counter
3. Fix the bug of buffer pool memlimit counter
4. Call compute_time_in_profile() before pretty_print() to show the _local_time_percent without child running profile
5. Add TransferThread ThreadToken count in AveThreadToken Counter
2020-04-24 21:38:55 +08:00
07a9401f82 Forbid correlated having clause (#3378)
1. The correlated slot ref should be bound by the agg tuple of the outer query.
However, a correlated having clause cannot be analyzed correctly, so the result is incorrect.

For example: 

```
SELECT k1 FROM test GROUP BY k1 
HAVING EXISTS(SELECT k1 FROM baseall GROUP BY k1 HAVING SUM(test.k1) = k1);
```

The correlated predicate is not executed.

2. The limit offset should also be rewritten when there is a subquery in the having clause.

For example: 

```
select k1, count(*) cnt from test group by k1 having k1 in
(select k1 from baseall order by k1 limit 2) order by k1 limit 5 offset 3;
```

The new stmt should have a limit element with offset.
2020-04-24 21:34:40 +08:00
7715deed4e [Doc] Add download link for 0.12.0 release (#3388) 2020-04-24 21:04:19 +08:00
09eb40e356 [New Stmt] Alter replication number for table (#3360)
This CL adds a new command to set the replication number of a table in one step.
```
alter table test_tbl set ("replication_num" = "3");
```
It changes the replication num of an unpartitioned table.

And

```
alter table test_tbl set ("default.replication_num" = "3");
```

It changes the default replication num of the specified table.
2020-04-23 21:58:09 +08:00
a58bc1957e Fix expect may produce incorrect values (#3381) 2020-04-23 09:35:41 +08:00
ad6698cd31 [Performance] Use Google/CCTZ to replace boost at timezone function (#3300)
NOTICE: the thirdparty dependencies need to be upgraded to add libcctz.
2020-04-23 09:26:04 +08:00
d854a79878 [Bug] isQuery field should be reset at the beginning of query execution (#3374)
If it is not reset, all queries that come from the same session will have the same isQuery field value.
This bug causes all entries in fe.audit.log to have the same IsQuery=true.
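
A minimal model of the stale-flag problem, assuming a per-session context object that outlives individual statements. `SessionContext`, `execute`, `audit_flags`, and the `reset_flag` switch are hypothetical names for this sketch and do not mirror the FE classes.

```python
class SessionContext:
    """Hypothetical per-session state shared by every statement in the session."""

    def __init__(self):
        self.is_query = False
        self.audit_log = []


def execute(session, statement, reset_flag=True):
    """Run one statement and record what the audit log would contain."""
    if reset_flag:
        # The fix: clear the flag at the beginning of every execution.
        session.is_query = False
    if statement.lstrip().lower().startswith("select"):
        session.is_query = True
    session.audit_log.append((statement, session.is_query))


def audit_flags(statements, reset_flag):
    """Execute statements in one session and return the logged IsQuery values."""
    session = SessionContext()
    for statement in statements:
        execute(session, statement, reset_flag=reset_flag)
    return [flag for _, flag in session.audit_log]
```

Without the reset, a non-query statement that follows a `select` in the same session is still logged with the flag set.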

This CL also fixes another bug:
The resolved IPs of a user's domain should not appear in another user's white list. Fix #3380
2020-04-23 09:00:47 +08:00
4a7a88ede1 [LSAN] Fix some memory leak detected by LSAN (#3326) 2020-04-22 22:59:44 +08:00
a88ae53326 [Bug]Use OlapTableSink::close to replace OlapTableSink::finalize method to avoid OOM (#3363)
This CL mainly solves the problem that the GC thread does not reclaim
`OlapTableSink` objects immediately, because the class overrides
the `finalize` method, which can cause OOM.
2020-04-22 19:51:04 +08:00
22e90f7260 [SegmentV2] Fix bloom filter bits buffer not initialize as 0 (#3372) 2020-04-22 19:50:05 +08:00
wyb
2de78e50e2 [Bug] Fix authorization missing when auditloader plugin redirect stream load (#3367)
HttpURLConnection can automatically redirect stream load to BE, but there is no authorization
information in the http request headers after the redirect.

HttpURLConnection probably removes the authorization info when it follows the redirect.

The solution is to set the followRedirect property to false on the connection object and perform the
redirect request manually.

#3364
2020-04-21 22:03:18 +08:00
5c53e0fee7 [UnitTest] Modify test to be compatible with coverage tool (#3366)
C++ R syntax is not compatible with coverage tools, so modify the syntax in the test cases.
2020-04-21 21:23:17 +08:00
c6ac60bab9 [SegmentV2] Optimize the upgrade logic of SegmentV2 (#3340)
This CL mainly made the following modifications:

1. Reorganized SegmentV2 upgrade document.
2. When the variable `use_v2_rollup` is set to true, the base rollup in v2 format is forcibly queried for verifying the data.
3. Fix a problem that there is no persistent storage format information in the schema change operation that performs v2 conversion.
4. Allow users to directly create v2 format tables.
2020-04-21 10:45:29 +08:00
b60aabda11 [Doris On ES] Pushdown some castexpr predicate to ES (#3351)
Process cast exprs such as k (float) > 2.0 or k (int) > 3.2. Doris On ES should skip the Doris native cast transformation on every row's column value and instead push this `cast semantic` down to Elasticsearch.

I believe that pushing down such predicates decreases the amount of data transmitted.

k1 is float:

```
k1 >= 5
```

push-down filter:

```
{"range":{"k1":{"gte":"5.000000"}}}
```
k2 is int:

```
k2 > 3.2
```

push-down filter:

```
{"range":{"k2":{"gte":"3.2"}}}
```
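
The shape of the pushed-down filter for the float case can be sketched as follows. This is not the Doris implementation: `build_range_filter` is a hypothetical helper, the six-decimal formatting for float columns is inferred from the `k1` example above, and the operator table simply lists the standard Elasticsearch range-query keys.

```python
import json

# Standard Elasticsearch range-query keys for each comparison operator.
ES_RANGE_OPS = {">": "gt", ">=": "gte", "<": "lt", "<=": "lte"}


def build_range_filter(column, op, literal, column_type):
    """Render a comparison predicate as a compact ES range filter string."""
    if column_type == "float":
        # Assumption: float columns render the literal with six decimals.
        value = f"{float(literal):.6f}"
    else:
        # Other columns keep the literal as written, with no cast applied.
        value = str(literal)
    query = {"range": {column: {ES_RANGE_OPS[op]: value}}}
    return json.dumps(query, separators=(",", ":"))
```

Calling `build_range_filter("k1", ">=", 5, "float")` yields `{"range":{"k1":{"gte":"5.000000"}}}`, matching the first filter shown above.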
2020-04-21 08:34:20 +08:00
a2c8d14fd9 [Bug] Partition key's type has been changed after executing queries (#3348)
Expr's `uncheckedCastTo()` method should return a new instance of the cast expr.
The original expr should remain unchanged.
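
The difference between casting in place and returning a new instance can be shown with a small sketch. The `Expr` class here is a hypothetical two-field stand-in, and `cast_in_place` exists only to contrast with the copy-based version; neither mirrors the FE code.

```python
import copy


class Expr:
    """Hypothetical expression node carrying only a value and a type name."""

    def __init__(self, value, type_name):
        self.value = value
        self.type_name = type_name

    def cast_in_place(self, target_type):
        # Problematic pattern: mutates the receiver, so every holder of this
        # expr (for example a partition key) sees its type change.
        self.type_name = target_type
        return self

    def unchecked_cast_to(self, target_type):
        # Safer pattern: copy first, then change the type on the copy only.
        result = copy.copy(self)
        result.type_name = target_type
        return result
```

With `unchecked_cast_to`, the original object keeps its type and the caller receives a distinct instance carrying the target type.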
2020-04-21 08:30:02 +08:00
46272a5621 [Bug] Fix bug of TransactionState SerDe error (#3356)
The TransactionState's coordinator should be created when deserialized from
old meta.
2020-04-21 08:24:10 +08:00
94b7bb5ad6 [Bug][Dynamic Partition]Fix Bug that dynamic partition properties is not consistent (#3359) 2020-04-20 23:52:47 +08:00
c69bf9ac44 [New Stmt] Add SHOW KEYS grammar (#3342)
Support `SHOW KEYS FROM table` for the data connectors of mainstream BI tools
such as PowerBI/FineBI.

#3334
2020-04-20 15:58:20 +08:00
753d6cc19f Add LOG.isDebugEnabled for some debug logical of Coordinator (#3352)
This may or may not affect performance very slightly.
2020-04-20 08:30:57 +08:00
929e93699a Fix Colocate Join Bug (#3354)
1. Fix the error when syncing colocate group status between FEs
2. Fix the missing call of EditLog.logColocateRemoveTable
2020-04-20 08:29:34 +08:00
c223d37c99 [Delete] Make some corrections in delete operation (#3338)
#3190
1. Correct the directory of DeleteJob.java
2. Fix some logic faults in DeleteHandlerTest.java
3. Add the timeout value to the log and the exception
2020-04-19 11:49:02 +08:00
1d3370532b [Doc] Fix some typo, mod routine load doc (#3350)
Fix BOOLEAN typo, improve the routine load sample
2020-04-19 11:39:10 +08:00
31ebb2496d [ISSUE #3190]Add documents for delete simplify (#3335) 2020-04-18 22:48:18 +08:00
77a7037346 Fix cooldown timestamp bug (#3336)
When adding a partition with the storage_cooldown_time property like this:
alter table tablexxx ADD PARTITION p20200421 VALUES LESS THAN("1588262400") ("storage_medium" = "SSD", "storage_cooldown_time" = "2020-05-01 00:00:00");
and then running show partitions from tablexxx;
the CooldownTime is wrong: 2610-02-17 10:16:40. What is more, the storage migration is based on the wrong timestamp.
The reason is that the result of DateLiteral.getLongValue is not a timestamp.
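
The wrong value can be reproduced under one assumption that the commit does not state explicitly: the long value is the date packed as the decimal number yyyyMMddHHmmss, and it was consumed as epoch milliseconds. The helper names below are hypothetical; the arithmetic is only meant to show that this assumption matches the reported CooldownTime.

```python
from datetime import datetime, timedelta, timezone

CST = timezone(timedelta(hours=8))  # the +08:00 offset used throughout this log


def packed_long_value(dt):
    """Pack a datetime as the decimal number yyyyMMddHHmmss (assumed layout)."""
    return int(dt.strftime("%Y%m%d%H%M%S"))


def from_epoch_millis(millis, tz=CST):
    """Interpret a number as milliseconds since the Unix epoch."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    return (epoch + timedelta(milliseconds=millis)).astimezone(tz)


cooldown = datetime(2020, 5, 1, 0, 0, 0, tzinfo=CST)

packed = packed_long_value(cooldown)             # 20200501000000
wrong = from_epoch_millis(packed)                # 2610-02-17 10:16:40+08:00

right_millis = int(cooldown.timestamp() * 1000)  # 1588262400000
right = from_epoch_millis(right_millis)          # 2020-05-01 00:00:00+08:00
```

Under this assumption, the packed value lands exactly on the 2610-02-17 10:16:40 shown by show partitions, while the real epoch value round-trips to the requested cooldown time.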
2020-04-18 22:47:22 +08:00
67b0da5652 Fix rowset_meta race condition for commit_txn in TxnManager (#3330) 2020-04-18 18:38:48 +08:00
0624f6b9eb [Doris On ES]Add simple explain for EsTable (#3341)
related issue: #3306
Note: this PR just removes es_scan_node_test.cpp, which is useless.

For the moment, this just adds a simple explain syntax for EsTable without translating the native predicates to ES queryDSL. That translation is better done together with moving the predicate translation from Doris BE to Doris FE; the whole work is still WIP.
2020-04-18 10:04:03 +08:00
9331574818 [Transaction] Cancel all txns whose coordinate BE is down. (#3293)
This CL does the following:
- Solve the problem that FE was not aware that a Coordinate BE was down and did not cancel its txns, even though those txns could never finish.
- Do some code style refactoring

NOTICE: FE meta version is upgraded to 83
2020-04-17 11:24:03 +08:00
f3e5320fea Fix document bug of storage_cooldown_time (#3333) 2020-04-17 09:34:28 +08:00
224f5d8bad [SegmentV1] Enable to read and write boolean type data (#3324)
This PR enables reading and writing boolean type data for segment v1.
2020-04-16 23:39:08 +08:00
b29cb9dbb3 [Optimize][Delete] Simplify the delete process to make it fast (#3191)
Our current DELETE strategy reuses the LoadChecker framework.
LoadChecker runs jobs in different stages by polling them every 5 seconds.

There are four stages of a load job, Pending/ETL/Loading/Quorum_finish,
and each of them is allocated to a LoadChecker. For example, if a load job is submitted,
it is initialized to the Pending state and then waits to be run by the Pending LoadChecker.
After the pending job is run, its stage changes to ETL, and it then waits to be
run by the next LoadChecker (ETL). Because the interval time of the LoadChecker is 5s,
in the worst case a pending job needs to wait for 20s during its life cycle.

In particular, DELETE jobs do not need to wait for polling; they can run the pushTask()
function directly to delete. In this commit, I add a delete handler to process
delete tasks concurrently.

All delete tasks are now pushed to BE immediately, without waiting for a LoadChecker.
Because a delete job starts in the LOADING state, it previously waited for 2 LoadCheckers,
so at most 10s is saved (5s per LoadChecker). The delete process is now synchronous,
and users get a response only after the delete has finished or been canceled.

If a delete runs longer than a certain period of time, it is cancelled with a timeout exception.
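
The 20s and 10s figures follow from simple arithmetic over the polling interval, sketched here. The stage names and the 5-second interval come from the description above; the function names are hypothetical.

```python
POLL_INTERVAL_SECONDS = 5
LOAD_STAGES = ["Pending", "ETL", "Loading", "Quorum_finish"]


def stages_from(start_stage):
    """Stages a job still has to pass through, beginning at start_stage."""
    return LOAD_STAGES[LOAD_STAGES.index(start_stage):]


def worst_case_wait(stage_count, interval=POLL_INTERVAL_SECONDS):
    """Upper bound on polling delay: one full interval per stage."""
    return stage_count * interval


# A regular load job waits on all four LoadCheckers in the worst case.
load_wait = worst_case_wait(len(stages_from("Pending")))    # 20 seconds

# A delete job used to start in the Loading stage, so it waited on two.
delete_wait = worst_case_wait(len(stages_from("Loading")))  # 10 seconds
```

Pushing delete tasks directly removes the `delete_wait` portion entirely, which is where the "at most 10s" saving comes from.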

NOTICE: this CL upgrades the FE meta version to 82
2020-04-16 10:32:44 +08:00
e61793763a [Bug] Use equals() method to judge whether "type" are equal (#3310)
I don't know why, but I found that "==" sometimes returns false when used to judge the equality of types,
even if the types are exactly the same.

ISSUE: #3309

This CL only changes == to equals() to solve the problem, but the reason is still unknown.
2020-04-15 15:04:13 +08:00
91438fcb40 [rowset id] Reduce memory of UniqueRowsetIdGenerator (#3316) 2020-04-14 22:27:49 +08:00
9257535f91 [New Feature] Support setting replica quota in db level (#3283)
This PR limits replica usage. Admins need to know the replica usage of every db and
table, and to be able to set a replica quota for every db.

```
ALTER DATABASE db_name SET REPLICA QUOTA quota; 
```
2020-04-14 22:25:32 +08:00