Commit Graph

263 Commits

Author SHA1 Message Date
44325ae850 [Bug-Fix] Bucket shuffle join fails when both tables have no data (#5145)
Bucket shuffle join is an algorithm for joining two tables. The left table is distributed by a column,
and the right table sends its data to the left table for the join operation.
This reduces the network cost. But when both tables are empty, bucket shuffle join fails.

Related Issue: #5144
2020-12-31 09:49:35 +08:00
2e95b1c389 [Enhancement] Make colocate table join more load balanced (#5104)
When two colocate tables are joined, the tablets that belong to the same bucket sequence
are distributed to the same host so that the join can be executed locally.
The host for each bucket sequence is chosen with a random strategy.
A random strategy cannot balance the load of a single query.

Therefore, this patch uses a round-robin strategy so that buckets are distributed evenly.
For example, if there are 6 bucket sequences and 3 hosts,
it is better to assign 2 bucket sequences to every host.
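
The 6-buckets-over-3-hosts example can be sketched as follows. This is a minimal illustration, not the Doris code: the class name `RoundRobinBucketAssigner` and the method `assign` are hypothetical, and host names are assumed to be unique.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not the Doris implementation.
final class RoundRobinBucketAssigner {
    // Assigns bucket sequences 0..bucketCount-1 to hosts in round-robin order.
    static Map<String, List<Integer>> assign(int bucketCount, List<String> hosts) {
        if (hosts.isEmpty()) {
            throw new IllegalArgumentException("no hosts available");
        }
        Map<String, List<Integer>> assignment = new LinkedHashMap<>();
        for (String host : hosts) {
            assignment.put(host, new ArrayList<>());
        }
        for (int bucketSeq = 0; bucketSeq < bucketCount; bucketSeq++) {
            // Bucket i goes to host (i mod hostCount), so per-host counts differ by at most 1.
            String host = hosts.get(bucketSeq % hosts.size());
            assignment.get(host).add(bucketSeq);
        }
        return assignment;
    }
}
```

With 6 bucket sequences and hosts `h1`, `h2`, `h3`, every host receives exactly 2 bucket sequences, whereas a random choice could place 3 or more on one host.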
2020-12-31 09:47:06 +08:00
d7a584ac59 [Rebalancer] support partition rebalancer (#5010)
RebalancerType can be configured via Config.rebalancer_type (BeLoad or Partition).
PartitionRebalancer is based on TwoDimensionalGreedyAlgo.
The two dimensions in Doris are cluster and partition. Only the replica count is considered,
not the replica size.
See #4845 for further details.
2020-12-31 09:41:38 +08:00
fd6fb90a5a [Bug] Hit none partition cache, but hit range is still right (#5065)
Doris supports two cache modes: sql_cache and partition_cache.
sql_cache uses the SQL string as the key and caches the whole data.
partition_cache splits the data by partition and caches each partition separately.
Therefore a query may hit only part of the partition_cache data.
If a query hits the left part of the data, the hit range is called left.
If a query hits the right part of the data, the hit range is called right.
And if a query hits all of the data, the hit range is called full.

When a query does not hit any partition cache, the algorithm still returns hit range right.
It should return hit range none.
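
The four outcomes described above can be sketched as follows. This is a simplified illustration under assumptions, not the Doris algorithm: `PartitionCacheHitRange` and `compute` are hypothetical names, and the partitions of a query are assumed to be ordered from left to right.

```java
// Hypothetical sketch; names and the boolean-array input are assumptions.
final class PartitionCacheHitRange {
    enum HitRange { NONE, LEFT, RIGHT, FULL }

    // hit[i] is true when partition i (ordered left to right) was found in the cache.
    static HitRange compute(boolean[] hit) {
        int n = hit.length;
        int leading = 0;
        while (leading < n && hit[leading]) {
            leading++;
        }
        int trailing = 0;
        while (trailing < n && hit[n - 1 - trailing]) {
            trailing++;
        }
        if (n == 0 || (leading == 0 && trailing == 0)) {
            // Nothing cached at either end: this is the case that must report NONE.
            // Simplification: hits only in the middle are also reported as NONE here.
            return HitRange.NONE;
        }
        if (leading == n) {
            return HitRange.FULL;
        }
        return leading > 0 ? HitRange.LEFT : HitRange.RIGHT;
    }
}
```

The important branch is the first one: with no cached partition at all, the result is `NONE` rather than falling through to `RIGHT`.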

Related issue: #5136
2020-12-31 09:40:31 +08:00
62604dfeac Improve the processing logic of Load statement derived columns (#5140)
* support transitive in load expr
2020-12-30 10:27:46 +08:00
cd865c95e0 Follower does not forward non-query statements to master repeatedly (#5160)
Co-authored-by: lanhuajian <lanhuajian@sankuai.com>
2020-12-29 10:29:26 +08:00
f7a325a08f [Refactor]Refactor function computeScanRangeAssignmentByColocate (#5097) 2020-12-26 14:38:39 +08:00
279ae1cb75 Add fuzzy_parse option to speed up json import (#5114)
Add a fuzzy_parse flag. If all objects in the json file have the same keys in the same order, we only need to parse the keys of the first row, and can then use the index instead of the key to parse each value.
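
The idea of resolving keys once and then reading by index can be sketched as follows. This is only an illustration of the technique, not the BE implementation: `FuzzyParseSketch` is a hypothetical name, and each row is assumed to be already tokenized into ordered `{key, value}` pairs instead of being real JSON.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; a row is an ordered array of {key, value} pairs.
final class FuzzyParseSketch {
    static List<String[]> parse(List<String[][]> rows, String[] columns) {
        List<String[]> out = new ArrayList<>();
        if (rows.isEmpty()) {
            return out;
        }
        // Key lookup happens once, on the first row only.
        String[][] first = rows.get(0);
        int[] position = new int[columns.length];
        for (int c = 0; c < columns.length; c++) {
            position[c] = -1;
            for (int p = 0; p < first.length; p++) {
                if (first[p][0].equals(columns[c])) {
                    position[c] = p;
                    break;
                }
            }
        }
        // Every row is then read by position, without comparing keys again.
        for (String[][] row : rows) {
            String[] values = new String[columns.length];
            for (int c = 0; c < columns.length; c++) {
                values[c] = position[c] < 0 ? null : row[position[c]][1];
            }
            out.add(values);
        }
        return out;
    }
}
```

This only gives correct results when every object really has the same keys in the same order, which is why the behavior is behind an explicit flag.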
2020-12-25 09:19:42 +08:00
cf3f830e9a [Bug-Fix] Fix 'Malformed packet' error when desc OlapTable with Rollup (#4455) (#5115)
Fix 'Malformed packet' error when desc OlapTable with Rollup #4455
2020-12-23 09:34:12 +08:00
c57145b4c2 [Bug] Fix bug that routine load may lose some data (#5093)
In the previous implementation, we tried to update the job progress, such as the
consumed Kafka offset, regardless of whether a subtask was committed or aborted.
Under normal circumstances, an aborted transaction does not consume any data
and all progress is 0, so even if we update the progress, it remains
unchanged.
However, under high cluster load, a subtask may fail halfway through execution on the BE side.
In that case, although the task is aborted, part of the progress is updated.
This causes the next subtask to skip that data, resulting in data loss.
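
The message does not spell out the fix. One way to avoid the problem, assumed here purely for illustration, is to apply reported offsets only when the subtask committed. All names below (`RoutineLoadProgressSketch`, `onSubtaskFinished`, `TxnState`) are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; assumes the fix is "update progress on commit only".
final class RoutineLoadProgressSketch {
    enum TxnState { COMMITTED, ABORTED }

    // Kafka partition id -> next offset to consume.
    private final Map<Integer, Long> progress = new HashMap<>();

    void onSubtaskFinished(TxnState state, Map<Integer, Long> reportedOffsets) {
        if (state != TxnState.COMMITTED) {
            // An aborted subtask may report partial offsets; ignoring them
            // lets the next subtask re-consume the same data.
            return;
        }
        progress.putAll(reportedOffsets);
    }

    long nextOffset(int partition) {
        return progress.getOrDefault(partition, 0L);
    }
}
```

With this rule, a subtask that fails halfway and reports offset 50 leaves the stored offset untouched, so no rows are skipped.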
2020-12-23 09:33:52 +08:00
6673306fda [DOC] fix toSql of ShowPartitionsStmt (#5070) 2020-12-19 11:18:00 +08:00
5bf84814cc [Doc] Improve broadcast instructions (#5048) 2020-12-19 11:16:59 +08:00
b485c10d56 [ODBC] ODBC Catalog does not show password in 'show resource' (#5088)
issue:#5087
2020-12-17 00:34:04 +08:00
9864a5d818 [Enhance] Modify the error message when mv column is transformed from base column in agg family table (#5084)
When a user tries to create a materialized view with an mv column that is transformed
from an original column in an agg family table, Doris now throws the error message
"The mv column of agg or uniq table cannot be transformed from original column"
instead of "column not exists".
2020-12-17 00:33:27 +08:00
ef15c5151c [BUG] Fix colocate balance bug when no available BE (#5079) 2020-12-17 00:32:42 +08:00
b640991e43 [Enhance] Add profile for load job (#5052)
Add a viewable profile for broker load. As with the query profile,
the user can set the session variable is_report_success to true before submitting the import job,
and then view the running profile of the job on the FE web page for easy analysis and debugging.
2020-12-16 23:52:10 +08:00
74bfd69595 [Bug] Forbid creating table with dynamic partition when FE.config dynamic_partition_enable=false (#5043)
- There is an FE configuration called dynamic_partition_enable
  which controls whether the dynamic partition function is enabled.
  When this configuration is false, no table supports dynamic partitioning.

- But when a user tried to create a dynamic partition table, Doris did not check this parameter.
  As a result, the user could create a dynamic partition table normally,
  but Doris could not actually create partitions for this table.

- This PR checks this config when building the table.
  A dynamic partition table can be created only when the dynamic_partition_enable configuration is true.
  If the configuration is false, the command to create a dynamic partition table directly reports an error.
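
The check described in the last item can be sketched as follows. This is a hedged illustration, not the FE code: `DynamicPartitionCheckSketch` and `check` are hypothetical, and recognizing a dynamic partition table by properties starting with `dynamic_partition.` is an assumption made for the example.

```java
import java.util.Map;

// Hypothetical sketch of rejecting the table at creation time.
final class DynamicPartitionCheckSketch {
    // Assumed prefix of dynamic partition table properties.
    private static final String PREFIX = "dynamic_partition.";

    static void check(boolean dynamicPartitionEnable, Map<String, String> tableProperties) {
        boolean wantsDynamicPartition = false;
        for (String key : tableProperties.keySet()) {
            if (key.startsWith(PREFIX)) {
                wantsDynamicPartition = true;
                break;
            }
        }
        if (wantsDynamicPartition && !dynamicPartitionEnable) {
            // Fail early instead of creating a table whose partitions are never added.
            throw new IllegalStateException(
                    "dynamic partition is disabled by FE config dynamic_partition_enable");
        }
    }
}
```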
2020-12-16 23:44:20 +08:00
dfa413335f [Heartbeat] Support FE heartbeat over thrift protocol to get stable response (#5027)
This PR lets the master FE get FE heartbeat responses over the thrift protocol instead of the http protocol.
2020-12-16 23:38:04 +08:00
650536d53e [Feature] Add Topn udaf (#4803)
For #4674
This is a udaf for approximate topn using the Space-Saving algorithm. At present, we can only calculate
the frequent items and their frequencies in a certain column, based on which we can implement
topN functions similar to those supported by Kylin in the future.

I have also added a test to calculate the accuracy of this algorithm. The following is a rough running result.
The data set has 1 million rows and follows the Zipfian distribution. Element cardinality
is the data cardinality, and the columns 20X, 50X and 100X correspond to space_expand_rate values of
20, 50 and 100, which set the number of counters in the Space-Saving algorithm.

```
zf exponent = 0.5
Element cardinality    20X     50X     100X
1000                   100%    100%    100%
10000                  100%    100%    100%
100000                 100%    100%    100%
500000                  94%     98%     99%

zf exponent = 0.6, 1
Element cardinality    20X     50X     100X
1000                   100%    100%    100%
10000                  100%    100%    100%
100000                 100%    100%    100%
500000                 100%    100%    100%
```
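
For readers unfamiliar with Space-Saving, the counter update rule can be sketched as follows. This is a textbook-style illustration, not the udaf's C++ code: `SpaceSavingSketch` and its methods are hypothetical names, and how the counter capacity is derived from space_expand_rate is not shown here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the Space-Saving update rule with a fixed number of counters.
final class SpaceSavingSketch {
    private final int capacity;
    private final Map<String, Long> counters = new HashMap<>();

    SpaceSavingSketch(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    void offer(String item) {
        Long count = counters.get(item);
        if (count != null) {
            counters.put(item, count + 1);
        } else if (counters.size() < capacity) {
            counters.put(item, 1L);
        } else {
            // Replace the item with the smallest counter; the newcomer inherits min + 1,
            // so counts are over-estimates but frequent items are never lost.
            String minItem = null;
            long minCount = Long.MAX_VALUE;
            for (Map.Entry<String, Long> e : counters.entrySet()) {
                if (e.getValue() < minCount) {
                    minCount = e.getValue();
                    minItem = e.getKey();
                }
            }
            counters.remove(minItem);
            counters.put(item, minCount + 1);
        }
    }

    long estimate(String item) {
        return counters.getOrDefault(item, 0L);
    }

    // Returns up to n monitored items, highest estimated count first.
    List<String> topN(int n) {
        List<String> items = new ArrayList<>(counters.keySet());
        items.sort((a, b) -> Long.compare(counters.get(b), counters.get(a)));
        return new ArrayList<>(items.subList(0, Math.min(n, items.size())));
    }
}
```

More counters relative to the requested N reduce the chance that a truly frequent item is evicted, which is consistent with the accuracy improving from 20X to 100X in the table above.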
2020-12-16 21:58:34 +08:00
ff4bd1223f [Profile] Add cpu time cost in query audit (#5051) 2020-12-13 22:22:15 +08:00
f847e22eeb [AuditLog] Send queryId to master FE (#5064)
To fix #4977, #4978 made the master FE return the queryId when a query finishes, so that the non-master FE can audit it.
But when the query fails (for example, on timeout), the client may not receive the right queryId for auditing.
In this PR:
The non-master FE sends the queryId to the master for the query;
Add more logs.
2020-12-13 22:05:35 +08:00
115d4332aa [ODBC] Support ODBC Sink for inserting data into ODBC external table (#5033)
issue:#5031

1. Support ODBC Sink for inserting data into an ODBC external table.
2. Support transactions for the ODBC sink to make sure that inserting data is atomic.
3. The document about the ODBC sink has been modified.
2020-12-13 21:53:27 +08:00
1267d6bf66 [Bug][MultiLoad] Fix multiload missing userinfo and rebase error (#5058) 2020-12-11 12:01:32 +08:00
e47fb502b2 [Compatibility] Support embedded quote in string literal (#5045)
```
mysql> select 'I''m a student';
+-----------------+
| 'I'm a student' |
+-----------------+
| I'm a student   |
+-----------------+

mysql> select "I""m a student";
+-----------------+
| 'I"m a student' |
+-----------------+
| I"m a student   |
+-----------------+

mysql> select 'I""m a student';
+------------------+
| 'I""m a student' |
+------------------+
| I""m a student   |
+------------------+

mysql> select "I''m a student";
+------------------+
| 'I''m a student' |
+------------------+
| I''m a student   |
+------------------+
```
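
The rule shown by the four queries is that a doubled quote character inside a literal stands for one quote only when it matches the enclosing quote. A minimal sketch of that decoding, with the hypothetical names `QuotedLiteralSketch` and `unquote`, and ignoring backslash escapes:

```java
// Hypothetical sketch; handles only doubled-quote escapes, not backslash escapes.
final class QuotedLiteralSketch {
    // Takes the literal including its surrounding quotes and returns its value.
    static String unquote(String literal) {
        int end = literal.length() - 1;
        if (end < 1) {
            throw new IllegalArgumentException("not a quoted literal");
        }
        char quote = literal.charAt(0);
        if ((quote != '\'' && quote != '"') || literal.charAt(end) != quote) {
            throw new IllegalArgumentException("not a quoted literal");
        }
        StringBuilder value = new StringBuilder();
        for (int i = 1; i < end; i++) {
            char c = literal.charAt(i);
            if (c != quote) {
                value.append(c);
            } else if (i + 1 < end && literal.charAt(i + 1) == quote) {
                // Two enclosing-quote characters in a row encode a single one.
                value.append(quote);
                i++;
            } else {
                throw new IllegalArgumentException("unescaped quote at index " + i);
            }
        }
        return value.toString();
    }
}
```

Applied to the four literals above, this yields `I'm a student`, `I"m a student`, `I""m a student` and `I''m a student`, matching the result rows.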
2020-12-10 21:34:06 +08:00
e278e0b3db [Load] Support full StreamLoad feature in multiload (#4717) 2020-12-10 09:37:18 +08:00
2dbcb726ac [Bug] Fix bug that failed to write meta image of load job (#5029)
In #4863, we added userInfo to the load job, but the userInfo must be analyzed
so that it can be written to the image.
2020-12-08 10:00:42 +08:00
6021d6fc7f [Performance Optimization] Remove push down conjuncts in olap scan node (#4999)
Push conjuncts down to the storage engine as much as possible.

The olap scan node does not need to filter data with the pushed-down conjuncts again.

Fix #4986
2020-12-06 08:50:08 +08:00
b954dfd82d [Bug] Fix the bug that Largeint and Decimal json load failed. (#4983)
Use the json load parameter "num_as_string" to parse json data with the flag kParseNumbersAsStringsFlag.
2020-12-06 08:49:30 +08:00
b1b99ae884 [Function] Support Decimal to calculate variance and standard deviation (#4959) 2020-12-06 08:49:01 +08:00
42dd821021 [Refactor] Private constructor for singleton (#4956) 2020-12-06 08:47:29 +08:00
c5f780305e [Repair] Add an option whether to allow the partition column to be NULL (#5013) 2020-12-05 14:58:32 +08:00
1ae6de7117 [Enhance] Add "statistics" meta table and fix some mysql compatibility problem (#4991)
1. Add metadata table 'statistics' to store index information;
2. In the header information returned by mysql, the data type length is returned according to the actual type.
2020-12-03 09:38:18 +08:00
bd558f1895 [Doris][Doris On ES] support prefix @ symbol for column name (#5006)
Support column names with a leading `@`, such as:

```
CREATE EXTERNAL TABLE `es_10` (
  `@k3` bigint(20) NULL COMMENT "",
  `@k1` boolean NULL COMMENT "",
  `@k2` varchar(20) NULL COMMENT ""
) ENGINE=ELASTICSEARCH
COMMENT "ELASTICSEARCH"
PROPERTIES (
"hosts" = "ip:port",
"user" = "root",
"password" = "",
"index" = "data_type_test",
"type" = "doc",
"transport" = "http"
); 
```
2020-12-03 09:33:49 +08:00
5215727b45 [Function] Let "str_to_date" return correct type (#5004)
The return type of str_to_date depends on whether the time part is included in the format.
If included, it is DATETIME, otherwise it is DATE.
If the format parameter is not constant, the return type will be DATETIME.
The above judgment has been completed in the FE query planning stage,
so here we directly set the value type to the return type set in the query plan.

For example:
A table has one varchar column k1 and 2 rows:
    "%Y-%m-%d"
    "%Y-%m-%d %H:%i:%s"
Query:
    SELECT str_to_date("2020-09-01", k1) from tbl;
Result will be:
    2020-09-01 00:00:00
    2020-09-01 00:00:00

Query:
     SELECT str_to_date("2020-09-01", "%Y-%m-%d");
Return type is DATE

Query:
     SELECT str_to_date("2020-09-01", "%Y-%m-%d %H:%i:%s");
Return type is DATETIME
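
The planning rule above can be sketched as follows. This is an illustration, not the FE code: `StrToDateTypeSketch` is a hypothetical name, a non-constant format is modeled as `null`, and the set of specifiers treated as a time part is an assumption based on MySQL's format specifiers.

```java
// Hypothetical sketch of choosing the return type from the format argument.
final class StrToDateTypeSketch {
    enum DateType { DATE, DATETIME }

    // Assumed set of time-related specifiers (hour, minute, second, microsecond, AM/PM).
    private static final String TIME_SPECIFIERS = "HhIiklrSsTfp";

    // constantFormat is null when the format argument is not a constant.
    static DateType returnType(String constantFormat) {
        if (constantFormat == null) {
            return DateType.DATETIME;
        }
        return hasTimePart(constantFormat) ? DateType.DATETIME : DateType.DATE;
    }

    static boolean hasTimePart(String format) {
        for (int i = 0; i + 1 < format.length(); i++) {
            if (format.charAt(i) == '%') {
                if (TIME_SPECIFIERS.indexOf(format.charAt(i + 1)) >= 0) {
                    return true;
                }
                i++; // skip the specifier character, which also covers "%%"
            }
        }
        return false;
    }
}
```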
2020-12-03 09:33:26 +08:00
204c15119f [Bug] ConcurrentModificationException when finish transaction (#5003) 2020-12-03 09:33:04 +08:00
b4c1eabe3f [Bug] fix finished load jobs cost too much heap (#4993)
Since the plan is retained in the task, memory usage grows too large if the task is not cleaned up, causing a memory leak or OOM.
When a load job finishes, there is no need to hold the tasks, which are the biggest memory consumers.
Fixes #4992
2020-12-02 17:11:27 +08:00
99404df8b2 [Bug][Compaction] Fix bug that output rowset is not deleted after compaction failure (#4964)
This CL fixes 2 bugs:

1. When the compaction fails, we must explicitly delete the output rowset,
otherwise the GC logic cannot process these rows.

2. Base compaction failed if the compaction process included a delete version in SegmentV2,
because the number of filtered rows was wrong.
2020-11-30 22:02:03 +08:00
27ef5b4d2c [Bug] Use the right queryId to audit master only query in non master (#4978)
Add queryId in TMasterOpResult.
Audit it in non master FE.
2020-11-29 11:14:17 +08:00
f944bf4d44 [Compile][Bug] Fix FE compilation bug (#4979)
[Bug] Fix the compile failure "cannot find symbol" for variable scanRangeLength, introduced by #4914 and #4912
2020-11-28 16:19:54 +08:00
f1248cb10e [BUG] Fix colocate balance bug when there is decommissioned be (#4955)
We should ignore decommissioned BEs when selecting BEs to balance the group bucketSeq.
2020-11-28 09:59:25 +08:00
2e9c8dda04 [Doris On ES][Bug-Fix] fix problem for selecting random be (#4972)
1. `Random().nextInt()` may return a negative value, which results in `java.lang.ArrayIndexOutOfBoundsException`.
Passing a positive bound avoids this problem.

```
int seed = new Random().nextInt(Short.MAX_VALUE) % nodesInfo.size();
```

2. `EsNodeInfo[] nodeInfos = (EsNodeInfo[]) nodesInfo.values().toArray()` may lead to `java.lang.ClassCastException: [Ljava.lang.Object; cannot be cast to [Lorg.apache.doris.external.elasticsearch.EsNodeInfo` in some JDK versions. Passing an array of the target type resolves this.

```
EsNodeInfo[] nodeInfos = nodesInfo.values().toArray(new EsNodeInfo[0]);
```
2020-11-28 09:57:44 +08:00
c6bc30e375 [Bug] Fix httpv2 append extra useless information in get_small_file api (#4953) 2020-11-28 09:52:52 +08:00
55ce88da34 [Schema change] Support more column types in schema change (#4938)
1. Support modifying the column type from CHAR to TINYINT/SMALLINT/INT/BIGINT/LARGEINT/FLOAT/DOUBLE/DATE,
and converting TINYINT/SMALLINT/INT/BIGINT/LARGEINT/FLOAT/DOUBLE to a wider range of numeric types (#4937)

2. Use templates to refactor the code of types.h and schema_change.cpp and delete redundant code.
2020-11-28 09:52:28 +08:00
3b56b601fb Show fe commit hash on proc (#4943)
Show FE's commit hash in SHOW PROC "/frontends" result.
2020-11-28 09:50:48 +08:00
0493eb172f [Optimize] optimize host selection strategy (#4914)
When choosing which replica's host executes the scan operation for a tablet,
a `round-robin` strategy is used for load balancing. `minAssignedBytes` is the current load of one host.
If a backend is temporarily not alive, one of the other replicas is chosen at random,
but the dead backend's `minAssignedBytes` is not decreased and the new choice's `minAssignedBytes`
is not increased. As a result, the recorded load of the backends is incorrect.
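
The bookkeeping that the paragraph says was missing can be sketched as follows. This is an assumption-laden illustration, not the coordinator code: `ScanHostLoadSketch`, `assign` and `fallback` are hypothetical names, and the fix is assumed to be moving the assigned bytes from the dead backend to the replacement.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical sketch of keeping per-host assigned bytes consistent on fallback.
final class ScanHostLoadSketch {
    private final Map<String, Long> assignedBytes = new HashMap<>();

    // Records that a scan of the given size was assigned to host.
    void assign(String host, long bytes) {
        assignedBytes.merge(host, bytes, Long::sum);
    }

    // Picks a random live replica and moves the load from the dead host to it.
    String fallback(String deadHost, List<String> aliveReplicaHosts, long bytes, Random random) {
        String chosen = aliveReplicaHosts.get(random.nextInt(aliveReplicaHosts.size()));
        assignedBytes.merge(deadHost, -bytes, Long::sum); // the dead host no longer carries this scan
        assignedBytes.merge(chosen, bytes, Long::sum);    // the replacement now does
        return chosen;
    }

    long loadOf(String host) {
        return assignedBytes.getOrDefault(host, 0L);
    }
}
```

Without the two `merge` calls in `fallback`, later assignments would keep treating the dead host as loaded and the replacement as idle.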
2020-11-28 09:48:13 +08:00
68db176013 [Refactor] Modify code write error (#4950)
* fix typo in udf: replace function

Co-authored-by: wangxixu <wangxixu@xiaomi.com>
2020-11-27 12:16:45 +08:00
37a6731244 [BUG] Fix Colocate table balance bug (#4936)
Fix bug that colocation group is always in unstable status.
2020-11-22 21:22:44 +08:00
584b33f95b [Bug] Fix the bug that NULL is not shown in CTE statement. (#4932)
All columns created in an inlineView are set to `allowNull = false`, which causes `NULL` data in a CTE to be ignored during processing.
So we should allow NULL for columns in an inlineView to ensure the query result is correct.
2020-11-22 20:58:03 +08:00
c28769c512 [Bug] Avoid partition prune if predicate is not with SlotRef (#4833) (#4921) 2020-11-22 20:49:20 +08:00
4f7c6da1f5 [Refactor] Refactor function getScanRangeLength (#4912)
getScanRangeLength always returns 1, so there is no need to maintain such a function.
2020-11-22 20:44:11 +08:00