Commit Graph

2023 Commits

Author SHA1 Message Date
04287cabb2 [Forbidden](Vec) Switch to non-vec engine when outer join + not null column (#8979)
* [Forbidden](Vec) Switch to non-vec engine when outer join + not null column

Vectorized code will core dump in the case of `outer join + not null column`, as in issue #7901.
So we need to fall back from vectorized mode to non-vectorized mode when we encounter this situation.

If a column on the nullable side of the outer join must return non-null, like count(*),
then there is no way to force the column to be nullable.
Vectorization cannot currently support this situation,
so it is necessary to fall back to non-vectorized processing.
For example:
  Query: set enable_vectorized_engine=true
  Query: select * from t1 left join (select k1, count(k2) as count_k2 from t2 group by k1) tmp on t1.k1=tmp.k1
  Result: Query goes non-vectorized engine
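The conflict described above can be sketched in plain Python. This is an illustrative model only, not Doris code: the helper names `count_by_key` and `left_join` and the sample rows are hypothetical. It shows that a count column is never NULL inside the subquery, yet the left join still has to pad unmatched rows with NULL.

```python
# Illustrative model only: plain Python, not Doris code.
from collections import defaultdict

def count_by_key(rows):
    """Model of `select k1, count(k2) ... group by k1`: the count is never NULL."""
    counts = defaultdict(int)
    for k1, k2 in rows:
        # count(k2) skips NULL inputs, but the group still exists with count 0.
        counts[k1] += 0 if k2 is None else 1
    return dict(counts)

def left_join(left_keys, right):
    """Model of a left outer join: unmatched left rows are padded with NULL (None)."""
    return [(k, right.get(k)) for k in left_keys]

t1 = [1, 2, 3]                       # hypothetical t1.k1 values
t2 = [(1, 10), (1, 11), (2, None)]   # hypothetical (k1, k2) rows of t2
tmp = count_by_key(t2)
joined = left_join(t1, tmp)
print(joined)  # [(1, 2), (2, 0), (3, None)]
```

The last row is the problem case: `count_k2` is declared non-null, but the join output needs a NULL there.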
2022-04-18 09:55:33 +08:00
0f8a7ff985 [Refactor](ReportHandler) Remove some unused schema_hash code in fe (#9005) 2022-04-17 10:01:34 +08:00
c7a098c1b0 [fix](sql_block_rule) optimization of alter sql_block_rule stmt (#8971)
Optimization of alter sql_block_rule stmt.
2022-04-16 11:05:31 +08:00
67c16f3a03 [fix](show-function) fix bug for show function (#9025)
`show full function` returned an incorrect result:
the INIT_FN and UPDATE_FN values were wrong.
2022-04-15 15:18:20 +08:00
7634e55513 [fix] fix p0 test failed because of char type cannot convert to datetime (#8996)
fix p0 test failed because of char type cannot convert to datetime
2022-04-15 15:16:00 +08:00
0fa917703e [Bug] Fix some node in vectorized not have V title (#9028)
Co-authored-by: lihaopeng <lihaopeng@baidu.com>
2022-04-15 11:25:52 +08:00
579aee110a [fix](ut)(compile) Fix BE compile bug and FE unit test (#9027)
1. The compile bug was introduced by #8855
2. The FE ut bug was introduced by #8848 and #8770
2022-04-14 17:37:41 +08:00
9ac6d23a44 [Feature]support stddev/variance agg functions to window function (#8962) 2022-04-14 12:07:26 +08:00
48c288af94 [refactor](fe) modify warning message of drop backends (#9006)
Modify warning message of drop backends
2022-04-14 11:46:29 +08:00
91200cc7a6 [fix] fix NPE when initialize GlobalState (#8990)
Introduced from #8695
The context object may be null for StreamLoadPlanner
2022-04-14 11:44:41 +08:00
18daefff80 [refactor](fe): remove unused code (#8986) 2022-04-14 11:44:21 +08:00
a1982c4391 [improvement] Use System.currentTimeMillis() to get the current millisecond (#8828) 2022-04-14 10:03:37 +08:00
bca121333e [feature](cold-hot) support s3 resource (#8808)
Add cold hot support in FE meta, support alter resource DDL in FE
2022-04-13 09:52:03 +08:00
7e08d3e320 Modify the maximum and minimum number of threads in jetty (#8960)
Co-authored-by: smallhibiscus <844981280>
2022-04-13 09:50:46 +08:00
d79e8a7b5a [fix](load) start transaction before we need it (#8819) (#8908) 2022-04-13 09:50:26 +08:00
b33ab960a8 [fix] move new add enum OFS of StorageType to last (#8983)
* [fix] move new add enum OFS of StorageType to last

* modify enum in gensrc/thrift/Types.thrift
2022-04-12 20:21:15 +08:00
6af1c52e13 [Feature] add support for tencent chdfs (#8963)
Co-authored-by: chengwu <chengwu@tencent.com>
2022-04-12 16:02:42 +08:00
51269efbb7 [improvement]Disable mini load (#8955)
Disable miniload by default
2022-04-12 16:01:03 +08:00
0d761f9909 [feature-wip][UDF][DIP-1] Support variable-size input and output for Java UDF (#8678)
This feature is proposed in DSIP-1. This PR supports variable-length input and output for Java UDF.
2022-04-11 09:36:16 +08:00
936b942e3a [fix](error-code) replace invalid format specifier (#8940)
change %lu and %ld to %d
2022-04-10 20:37:32 +08:00
1ee8633e5e [fix](account) use LOG.info instead of LOG.debug (#8911)
This complements (#8849)
2022-04-09 19:18:13 +08:00
ce6b5169c2 [fix](join) Fix error bucket num get in bucket shuffle join in dynamic partition (#8891) 2022-04-09 19:11:44 +08:00
a290104966 [fix](routine load) Routine load task doesn't reallocate when previous BE is down. (#8824)
If the previous BE is not alive, the task should be assigned to another available BE instead.
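A rough Python sketch of that rule, under stated assumptions: the function name `choose_backend` is hypothetical, backends are plain integer ids, and picking the smallest alive id as the replacement is an arbitrary choice made only for the example. It is not the scheduler's actual selection policy.

```python
# Illustrative model only: not the routine load scheduler itself.
def choose_backend(previous_be, alive_backends):
    """Pick the BE for the next routine load task.

    previous_be: id of the BE that ran the previous task, or None.
    alive_backends: set of ids of BEs that are currently alive.
    """
    if previous_be in alive_backends:
        return previous_be          # previous BE is still usable
    if not alive_backends:
        return None                 # nothing available to schedule on
    # Assumption for the sketch: any alive BE is acceptable, take the smallest id.
    return min(alive_backends)
```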
2022-04-09 19:02:55 +08:00
ddf7ef9327 [improvement](join) update broadcast join cost algorithm (#8695)
The broadcast join cost is currently based on the compressed data size,
so the amount of memory actually used may be significantly more than estimated.
This patch:
1. adds a compression ratio to the broadcast join cost, set to 5 based on experience.
2. adds a new session variable `auto_broadcast_join_threshold` to limit the memory used by broadcast joins, in bytes; the default value is 1073741824 (1GB).
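The arithmetic can be illustrated with a short Python sketch. The ratio 5 and the 1073741824-byte default come from the description above; the function names and the exact comparison (estimated size less than or equal to the threshold) are assumptions made for illustration, not the planner's actual code.

```python
# Illustrative model only: the comparison rule is an assumption.
COMPRESSION_RATIO = 5                        # from the patch description
AUTO_BROADCAST_JOIN_THRESHOLD = 1073741824   # default, in bytes (1GB)

def estimated_broadcast_memory(compressed_bytes, ratio=COMPRESSION_RATIO):
    """Estimate in-memory size from the compressed on-disk size."""
    return compressed_bytes * ratio

def can_broadcast(compressed_bytes, threshold=AUTO_BROADCAST_JOIN_THRESHOLD):
    """Assumed rule: broadcast only if the estimate fits under the threshold."""
    return estimated_broadcast_memory(compressed_bytes) <= threshold

MIB = 1024 * 1024
print(can_broadcast(200 * MIB))  # True: 1000 MiB estimated
print(can_broadcast(300 * MIB))  # False: 1500 MiB estimated
```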
2022-04-09 19:00:27 +08:00
Pxl
453485abfb [Bug] Fix some bugs(rewrite rule/symbol transport) of like predicate (#8770) 2022-04-08 14:32:09 +08:00
c5718928df [feature-wip](array-type) support explode and explode_outer table function (#8766)
explode(ArrayColumn) desc:
> Create a row for each element in the array column. 

explode_outer(ArrayColumn) desc:
> Create a row for each element in the array column. Unlike explode, if the array is null or empty, it returns null.

Usage example:
1. create a table with array column, and insert some data;
2. open enable_lateral_view and enable_vectorized_engine;
```
set enable_lateral_view = true;
set enable_vectorized_engine=true;
```
3. use explode_outer
```
> select * from array_test;
+------+------+--------+
| k1   | k2   | k3     |
+------+------+--------+
|    3 | NULL | NULL   |
|    1 |    2 | [1, 2] |
|    2 |    3 | NULL   |
|    4 | NULL | []     |
+------+------+--------+

> select k1,explode_column from array_test LATERAL VIEW explode_outer(k3) TempExplodeView as explode_column;
+------+----------------+
| k1   | explode_column |
+------+----------------+
|    1 |              1 |
|    1 |              2 |
|    2 |           NULL |
|    4 |           NULL |
|    3 |           NULL |
+------+----------------+
```
4. explode usage example. explode returns no rows when the array is null or empty
```
> select k1,explode_column from array_test LATERAL VIEW explode(k3) TempExplodeView as explode_column;
+------+----------------+
| k1   | explode_column |
+------+----------------+
|    1 |              1 |
|    1 |              2 |
+------+----------------+
```
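The difference between the two functions can be modeled in a few lines of Python. This is a sketch of the semantics shown in the examples above, not the BE implementation; `lateral_view` is a hypothetical helper, and the row order here follows the input order rather than the order Doris printed.

```python
# Illustrative model only: mirrors the semantics in the examples above.
def explode(arr):
    """One output element per array element; nothing for NULL or empty arrays."""
    if not arr:              # None or []
        return []
    return list(arr)

def explode_outer(arr):
    """Like explode, but a NULL or empty array yields a single NULL."""
    if not arr:
        return [None]
    return list(arr)

def lateral_view(rows, fn):
    """Pair each row's k1 with every element produced from its k3 column."""
    return [(k1, e) for k1, k3 in rows for e in fn(k3)]

# (k1, k3) pairs taken from the array_test table above
array_test = [(3, None), (1, [1, 2]), (2, None), (4, [])]
print(lateral_view(array_test, explode_outer))
# [(3, None), (1, 1), (1, 2), (2, None), (4, None)]
print(lateral_view(array_test, explode))
# [(1, 1), (1, 2)]
```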
2022-04-08 12:11:04 +08:00
fa8e4ec2f0 [fix] Disable cast operation of object type (#8882)
Disable cast between string and object types (bitmap, hll, quantile_state)
2022-04-08 09:13:56 +08:00
e3daa9580a [Fix](Lateral View) The Error expr type when exploding a function result of inline view (#8851)
Fixed #8850

The column in an inline view may be a function rather than a slotRef.
So when this column is used as the input of the explode function,
it cannot be converted to a slotRef.

The correct way is to treat it as an Expr and extract the required slotRefs for materialization.
For example:
```
with d as (select k1+k1 as k1_plus from table)
select k1_plus from d explode_split(k1_plus, ",")
```
FnExp: SlotRef<k1_plus>
SubstituteFnExpr: functionCallExpr<k1+k1>
originSlotRefList: SlotRef<k1>
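The three steps listed above (function expression, substituted expression, original slot refs) can be sketched with a toy expression tree in Python. The classes and helper names here are hypothetical stand-ins and do not reflect the FE's real Expr classes.

```python
# Illustrative model only: toy expression tree, not the FE's Expr classes.
from dataclasses import dataclass

@dataclass(frozen=True)
class SlotRef:
    name: str

@dataclass(frozen=True)
class FunctionCall:
    name: str
    args: tuple

def substitute(expr, mapping):
    """Replace inline-view column refs with the expressions that define them."""
    if isinstance(expr, SlotRef):
        return mapping.get(expr.name, expr)
    return FunctionCall(expr.name, tuple(substitute(a, mapping) for a in expr.args))

def collect_slot_refs(expr):
    """Collect the distinct SlotRefs an expression depends on, in first-seen order."""
    if isinstance(expr, SlotRef):
        return [expr]
    refs = []
    for arg in expr.args:
        for ref in collect_slot_refs(arg):
            if ref not in refs:
                refs.append(ref)
    return refs

# with d as (select k1+k1 as k1_plus from table)
inline_view = {"k1_plus": FunctionCall("add", (SlotRef("k1"), SlotRef("k1")))}
fn_expr = SlotRef("k1_plus")
substituted = substitute(fn_expr, inline_view)
print(collect_slot_refs(substituted))  # [SlotRef(name='k1')]
```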
2022-04-08 09:08:55 +08:00
318feb01f3 [improvement](account) support to account management sql (#8849)
Add [IF NOT EXISTS] / [IF EXISTS] support to the following statements:
- CREATE [IF NOT EXISTS] USER
- CREATE [IF NOT EXISTS] ROLE
- DROP [IF EXISTS] USER
- DROP [IF EXISTS] ROLE
2022-04-08 09:08:08 +08:00
32bba15e34 [refactor][fix] remove useless import in Config.java (#8878) 2022-04-07 11:40:05 +08:00
64d18364db [improvement](restore) set table property 'dynamic_partition.enable' to false after restore (#8852)
When restoring a table with dynamic partition properties, 'dynamic_partition.enable' is set to its value at backup time,
but Doris cannot turn on dynamic partitioning automatically on restore.
So we could see a table that never does dynamic partitioning even though dynamic_partition.enable is set to 'true'.
2022-04-07 11:34:01 +08:00
ce50c4d826 [feature](diagnose) support "ADMIN DIAGNOSE TABLET" stmt (#8839)
`ADMIN DIAGNOSE TABLET tablet_id`

This statement makes it easier to quickly diagnose the status of a tablet.
See "ADMIN-DIAGNOSE-TABLET.md" for details

```
mysql> admin diagnose tablet 10196;
+----------------------------------+------------------------------+------------+
| Item                             | Info                         | Suggestion |
+----------------------------------+------------------------------+------------+
| TabletExist                      | Yes                          |            |
| TabletId                         | 10196                        |            |
| Database                         | default_cluster:db1: 10192   |            |
| Table                            | tbl1: 10194                  |            |
| Partition                        | tbl1: 10193                  |            |
| MaterializedIndex                | tbl1: 10195                  |            |
| Replicas(ReplicaId -> BackendId) | {"10197":10002}              |            |
| ReplicasNum                      | OK                           |            |
| ReplicaBackendStatus             | Backend 10002 is not alive.  |            |
| ReplicaVersionStatus             | OK                           |            |
| ReplicaStatus                    | OK                           |            |
| ReplicaCompactionStatus          | OK                           |            |
+----------------------------------+------------------------------+------------+
```
2022-04-07 11:30:03 +08:00
e72ccfd80c [Refactor][httpv2]remove http v1 code (#8848)
http v2 has been tested in production and can completely replace the http v1 code. To simplify code maintenance, remove the previous http part of the code.
2022-04-07 08:38:29 +08:00
98cab78320 [refactor](schema_hash) remove schema_hash since every tablet id in be is unique (#8574) 2022-04-07 08:37:45 +08:00
319f1f634a [fix](ut) fix fe run CreateTableAsSelectStmtTest ,UserPropertyTest, ProjectPlannerFunctionTest and AggregateTest failed (#8838) 2022-04-06 15:23:49 +08:00
33736e45fa [fix](table-function) Fixed unreasonable nullable conversion (#8818) 2022-04-03 11:02:35 +08:00
a8417e6c8b [fix](restore) fix restore issue when meta version is too low (#8816)
When restoring a snapshot from 0.13 to master, the restore job stays pending for a long time.
Meanwhile, the log shows the error "Could not set meta version to 93 since it is lower than minimum required version 100".
We should cancel the restore job once we get that error.
2022-04-03 10:56:23 +08:00
eed4908790 [chore](deps) upgrade spring to 2.6.2 to 2.6.6 (#8802) 2022-04-03 10:52:31 +08:00
0e3b15f2d7 [fix](colocate) Fix the error colocate plan when query is (rollup + instance >1) (#8779)
The Repeat Node will change the fragment data partition.

So the output partition of child fragment is different from the data partition of current fragment.
When judging whether colocate can be enabled,
the current data partition of fragment should be used directly instead of the child's output partition.

Before this PR fix, queries with `rollup + concurrency greater than 1` may have incorrect results.
For example:
```
select t1.tc1,t1.tc2,sum(t1.tc3) as total from t1 join[shuffle] t1 t2 on t1.tc1=t2.tc1
group by rollup(tc1,tc2) order by t1.tc1,t1.tc2,total;
```

Fixed #8778
2022-04-03 10:19:39 +08:00
a75e4a1469 Window funnel (#8485)
Add new feature window funnel
2022-04-02 22:08:50 +08:00
13f1f94f86 [chore] upgrade log4j version to 2.17.2 (#8774)
upgrade log4j version to 2.17.2
2022-04-02 21:29:25 +08:00
4d516bece8 [feature-wip](array-type)Add element_at and subscript functions (#8597)
Describe the overview of changes.
1. add function element_at;
2. support element_subscript([]) to get an element of an array: col_array[N] <==> element_at(col_array, N);
3. return an error message instead of crashing the BE when an array function fails to execute;

element_at(array, index) desc:
> Returns the element of the array at the given **(1-based)** index.
> If **index < 0**, accesses elements from the last to the first.
> Returns NULL if the index exceeds the length of the array or the array is NULL.

Usage example:
1. create table with ARRAY type column and insert some data:
```
+------+------+--------+
| k1   | k2   | k3     |
+------+------+--------+
|    1 |    2 | [1, 2] |
|    2 |    3 | NULL   |
|    4 | NULL | []     |
|    3 | NULL | NULL   |
+------+------+--------+
```
2. enable vectorized:
```
set enable_vectorized_engine=true;
```
3. element_subscript([]) usage example:
```
> select k1,k3,k3[1] from array_test;
+------+--------+----------------------------+
| k1   | k3     | %element_extract%(`k3`, 1) |
+------+--------+----------------------------+
|    3 | NULL   |                       NULL |
|    1 | [1, 2] |                          1 |
|    2 | NULL   |                       NULL |
|    4 | []     |                       NULL |
+------+--------+----------------------------+
```
4. element_at function usage example:
```
> select k1,k3 from array_test where element_at(k3, -1) = 2;
+------+--------+
| k1   | k3     |
+------+--------+
|    1 | [1, 2] |
+------+--------+
```
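The indexing rules described above can be modeled in Python as follows. This is a sketch of the documented semantics, not the BE function; returning NULL for index 0 is an assumption, since the description does not say what happens in that case.

```python
# Illustrative model only: mirrors the element_at description above.
def element_at(arr, index):
    """1-based lookup; negative indexes count from the end; out of range gives None."""
    if arr is None or index == 0:   # index 0 handling is an assumption
        return None
    n = len(arr)
    pos = index - 1 if index > 0 else n + index
    if 0 <= pos < n:
        return arr[pos]
    return None

print(element_at([1, 2], 1))    # 1
print(element_at([1, 2], -1))   # 2
print(element_at([1, 2], 3))    # None
print(element_at(None, 1))      # None
```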
2022-04-02 12:03:56 +08:00
6c5bbc6e4c fix agg functions check failed from empty table (#8785)
fix agg functions check failed from empty table
2022-04-02 10:44:55 +08:00
9f80f6cf5e [Improvement](Planner)Enable hash join project (#8618) 2022-04-01 15:42:25 +08:00
6729d41c93 [improvement] add switch of quantile_state column (#8706)
Add switch for quantile_state column, default false.
2022-03-31 22:59:27 +08:00
d17ee7b476 [fix](sql-block-rule)Fix sql block rule bug (#8738)
1. Check the validity of sql_block_rule properties; the limits of a sql_block_rule cannot be set to negative values.
2. Optimize the conditions used to judge whether a query hits a sql_block_rule.
3. Check whether the sql_block_rule already exists when executing set property for "user" "sql_block_rule" = "xxx".
4. Add UT in SqlBlockRuleMgrTest.java.
2022-03-31 13:51:13 +08:00
b98da02611 [chore][fix](httpv2) Use mariadb-java-client for http query api (#8716)
In #8319, I removed the mysql-connector-java dependency because of license incompatibility.
But we need a MySQL-compatible driver for the http query api, so I chose mariadb-java-client,
which is under LGPL.
2022-03-30 09:59:45 +08:00
22cf6ea17c [chore] Modify build.sh and refactor dependency of FE submodules (#8732)
This PR fixes #8731 and refactors the `build.sh` script.

The build.sh script is currently responsible for the compilation of the following Doris components.
1. FE
    - fe-common
    - fe-core
    - spark-dpp
    - hive-udf
    - java-udf
    - ui
2. BE
    - palo_be
    - meta_tool
3. broker

In the FE module.
- The 4 submodules `fe-common, fe-core, spark-dpp and ui` together form Frontend.
- `spark-dpp, hive-udf and java-udf` can be compiled separately to produce jar packages for individual use.

In the BE module.
- `palo_be` can start the BE process separately.
- `meta_tool` can be compiled separately to produce binaries.

The modified build.sh script has the following changes:

1. There is no longer an option to compile `ui` separately; it is built together with `--fe`.
2. `fe/be/spark-dpp/hive-udf/java-udf/palo_be/meta_tool` can be compiled separately.
3. All components except `java-udf` are compiled by default (`java-udf` is in development).

Remaining issues:

Several submodules of FE have messy dependencies.
For example, `java-udf` depends on `fe-core`, and `fe-core` depends on `spark-dpp`,
resulting in a large binary jar of `java-udf`.
It needs to be reorganized afterwards.
2022-03-30 00:13:24 +08:00
3f5bc5206d [Improvement] broker load with hdfs support wildcard (#8718)
broker load with hdfs support wildcard
2022-03-29 18:21:41 +08:00
66a3c574df [Vectorized][Bug] fix percentile_approx function to return always nullable (#8572) 2022-03-29 14:47:39 +08:00