When using linked schema change, we need to check if all rowsets are of the same type,
ALPHA or BETA. otherwise, we need to use direct schema change to convert the data.
Currently, if we encounter a problem with a replica of a tablet during the load process,
such as a write error, rpc error, -235, etc., it will cause the entire load job to fail,
which results in a significant reduction in Doris' fault tolerance.
This PR mainly changes:
1. refined the judgment of failed replicas in the load process, so that the failure of a few replicas will not affect the normal completion of the load job.
2. fix a bug introduced from #7754 that may cause BE coredump
Fix the following bugs.
1. `column1` created a bitmap index.
2. `column1` has a lot index items in the bitmap index, and the index page is divided into two levels.
3. `column1`'s value range is `[1000, 10000000]`.
4. the query condition is `column1 > 0`
5. the empty result will be returned, while the expected value should be 9999000 rows.
This PR mainly changes:
1. Help to Cancel the load job ASAP when encounter unqualified data.
Solution is described in #6318 .
Also replace some std::stringstream with fmt::memory_buffer to avoid performance issues.
2. fix a NPE bug when create user with empty host
3. fix compile warning after rebasing the master(vectorization)
1. Add a new FE config `colocate_group_relocate_delay_second`
The relocation of a colocation group may involve a large number of tablets moving within the cluster.
Therefore, we should use a more conservative strategy to avoid relocation of colocation groups as much as possible.
Relocation usually occurs after a BE node goes offline or goes down.
This config is used to delay the determination of BE node unavailability.
The default is 30 minutes, i.e., if a BE node recovers within 30 minutes, relocation of the colocation group
will not be triggered.
2. Change the priority of colocate tablet repair and balance task from HIGH to NORMAL
3. Add a new FE config allow_replica_on_same_host
If set to true, when creating table, Doris will allow to locate replicas of a tablet
on same host. And also the tablet repair and balance will be disabled.
This is only for local test, so that we can deploy multi BE on same host and create table
with multi replicas.
# Proposed changes
Issue Number: close#6238
Co-authored-by: HappenLee <happenlee@hotmail.com>
Co-authored-by: stdpain <34912776+stdpain@users.noreply.github.com>
Co-authored-by: Zhengguo Yang <yangzhgg@gmail.com>
Co-authored-by: wangbo <506340561@qq.com>
Co-authored-by: emmymiao87 <522274284@qq.com>
Co-authored-by: Pxl <952130278@qq.com>
Co-authored-by: zhangstar333 <87313068+zhangstar333@users.noreply.github.com>
Co-authored-by: thinker <zchw100@qq.com>
Co-authored-by: Zeno Yang <1521564989@qq.com>
Co-authored-by: Wang Shuo <wangshuo128@gmail.com>
Co-authored-by: zhoubintao <35688959+zbtzbtzbt@users.noreply.github.com>
Co-authored-by: Gabriel <gabrielleebuaa@gmail.com>
Co-authored-by: xinghuayu007 <1450306854@qq.com>
Co-authored-by: weizuo93 <weizuo@apache.org>
Co-authored-by: yiguolei <guoleiyi@tencent.com>
Co-authored-by: anneji-dev <85534151+anneji-dev@users.noreply.github.com>
Co-authored-by: awakeljw <993007281@qq.com>
Co-authored-by: taberylyang <95272637+taberylyang@users.noreply.github.com>
Co-authored-by: Cui Kaifeng <48012748+azurenake@users.noreply.github.com>
## Problem Summary:
### 1. Some code from clickhouse
**ClickHouse is an excellent implementation of the vectorized execution engine database,
so here we have referenced and learned a lot from its excellent implementation in terms of
data structure and function implementation.
We are based on ClickHouse v19.16.2.2 and would like to thank the ClickHouse community and developers.**
The following comment has been added to the code from Clickhouse, eg:
// This file is copied from
// https://github.com/ClickHouse/ClickHouse/blob/master/src/Interpreters/AggregationCommon.h
// and modified by Doris
### 2. Support exec node and query:
* vaggregation_node
* vanalytic_eval_node
* vassert_num_rows_node
* vblocking_join_node
* vcross_join_node
* vempty_set_node
* ves_http_scan_node
* vexcept_node
* vexchange_node
* vintersect_node
* vmysql_scan_node
* vodbc_scan_node
* volap_scan_node
* vrepeat_node
* vschema_scan_node
* vselect_node
* vset_operation_node
* vsort_node
* vunion_node
* vhash_join_node
You can run exec engine of SSB/TPCH and 70% TPCDS stand query test set.
### 3. Data Model
Vec Exec Engine Support **Dup/Agg/Unq** table, Support Block Reader Vectorized.
Segment Vec is working in process.
### 4. How to use
1. Set the environment variable `set enable_vectorized_engine = true; `(required)
2. Set the environment variable `set batch_size = 4096; ` (recommended)
### 5. Some diff from origin exec engine
https://github.com/doris-vectorized/doris-vectorized/issues/294
## Checklist(Required)
1. Does it affect the original behavior: (No)
2. Has unit tests been added: (Yes)
3. Has document been added or modified: (No)
4. Does it need to update dependencies: (No)
5. Are there any changes that cannot be rolled back: (Yes)
If an load task has a relatively short timeout, then we need to ensure that
each RPC of this task does not get blocked for a long time.
And an RPC is usually blocked for two reasons.
1. handling "memory exceeds limit" in the RPC
If the system finds that the memory occupied by the load exceeds the threshold,
it will select the load channel that occupies the most memory and flush the memtable in it.
this operation is done in the RPC, which may be more time consuming.
2. close the load channel
When the load channel receives the last batch, it will end the task.
It will wait for all memtables flushes to finish synchronously. This process is also time consuming.
Therefore, this PR solves this problem by.
1. Use timeout to determine whether it is a high-priority load task
If the timeout of an load task is relatively short, then we mark it as a high-priority task.
2. not processing "memory exceeds limit" for high priority tasks
3. use a separate flush thread to flush memtable for high priority tasks.
1. Allow set cpu_resource_limit
-1 means unlimited
2. Drop replica not in valid tag
Otherwise, the migration task from a resource group to another may never finish.
1. fix core dump when using multi explode_bitmap #7716
2. fix bug that json array extract by json path is wrong #7717
3. fix bug that after lateral view, the null value become non-null value #7718
4. fix bug that lateral view may return error: couldn't resolve slot descriptor 1. #7719
5. fix error result when using lateral view with where predicate #7720
Support general hints.
Sql example:
```sql
SELECT /*+ one_hint(1000000) another_hint(k = "v")*/ 1;
```
hints syntax is:
```
/*+ [ HINT_NAME( [ key [ =value ]? ]* ) ]+ */
```
- support multi hints, sep with space
- hint name could be any string in identifier format
- hint could have zero or more parameters, sep with comma
- hint parameter must have one key
- hint parameter could have zero or one value
- hint parameter‘s key and value connected by equal sign