4264 Commits

Author SHA1 Message Date
e3daa9580a [Fix](Lateral View) The Error expr type when exploding a function result of inline view (#8851)
Fixed #8850

A column in an inline view may be a function rather than a SlotRef.
When such a column is used as the input of an explode function,
it cannot be converted to a SlotRef.

The correct approach is to treat it as an Expr and extract the required SlotRefs for materialization.
For example:
```
with d as (select k1+k1 as k1_plus from table)
select k1_plus from d explode_split(k1_plus, ",")
```
FnExp: SlotRef<k1_plus>
SubstituteFnExpr: functionCallExpr<k1+k1>
originSlotRefList: SlotRef<k1>
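
The three values listed above can be read as two steps: substitute the inline-view column with its defining expression, then collect the base slots that expression reads. Below is a minimal Python sketch of that idea, not the planner's actual code. The classes `SlotRef` and `FunctionCallExpr` and the helpers `substitute` and `collect_slot_refs` are hypothetical stand-ins, and only one level of inline view is handled.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass(frozen=True)
class SlotRef:
    name: str


@dataclass(frozen=True)
class FunctionCallExpr:
    fn: str
    args: tuple


Expr = Union[SlotRef, FunctionCallExpr]


def substitute(expr: Expr, alias_map: Dict[str, Expr]) -> Expr:
    """Replace inline-view output columns with the expressions that define them."""
    if isinstance(expr, SlotRef):
        return alias_map.get(expr.name, expr)
    return FunctionCallExpr(expr.fn, tuple(substitute(a, alias_map) for a in expr.args))


def collect_slot_refs(expr: Expr) -> List[SlotRef]:
    """Gather the distinct base slots an expression reads, in first-seen order."""
    if isinstance(expr, SlotRef):
        return [expr]
    found: List[SlotRef] = []
    for arg in expr.args:
        for slot in collect_slot_refs(arg):
            if slot not in found:
                found.append(slot)
    return found


# Inline view d defines k1_plus as k1 + k1 (hypothetical representation).
alias_map = {"k1_plus": FunctionCallExpr("add", (SlotRef("k1"), SlotRef("k1")))}
fn_input = SlotRef("k1_plus")                    # what explode_split receives
substituted = substitute(fn_input, alias_map)    # the function call k1 + k1
origin_slots = collect_slot_refs(substituted)    # the slots to materialize
print(substituted)
print(origin_slots)
```

Here `origin_slots` contains only `SlotRef("k1")`, which corresponds to the `originSlotRefList` entry in the message.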
2022-04-08 09:08:55 +08:00
318feb01f3 [improvement](account) support to account management sql (#8849)
Add [IF NOT EXISTS] and [IF EXISTS] support to the following statements:
- CREATE [IF NOT EXISTS] USER
- CREATE [IF NOT EXISTS] ROLE
- DROP [IF EXISTS] USER
- DROP [IF EXISTS] ROLE
2022-04-08 09:08:08 +08:00
0b98d78664 [improvement](hll) Optimize Hyperloglog (#8829)
At Meituan, PR #6625 was reverted due to an OOM problem.
We are now modifying the old HyperLogLog implementation instead; this work is based on PR #8555.
In our tests, it performs better than both the old HLL and the apache:master HLL.

Changes summary:

- use SIMD max to speed up the heavy function _merge_registers
- use phmap::flat_hash_set rather than std::set
- replace std::max
- other small changes
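
For context, merging two HyperLogLog sketches keeps, for every register, the larger of the two values, which is why a SIMD max instruction fits this step. The scalar Python sketch below only illustrates what is computed. It is not the Doris implementation: the function name merely echoes `_merge_registers`, and the four-register arrays are made up.

```python
def merge_registers(dst: bytearray, src: bytes) -> None:
    """Merge src into dst in place: each register keeps the larger value."""
    if len(dst) != len(src):
        raise ValueError("register arrays must have the same length")
    for i, value in enumerate(src):
        if value > dst[i]:
            dst[i] = value


a = bytearray([0, 3, 1, 5])
b = bytes([2, 1, 1, 7])
merge_registers(a, b)
print(list(a))  # [2, 3, 1, 7]
```

A SIMD version performs the same comparison on many registers per instruction instead of one per loop iteration.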
2022-04-08 09:06:08 +08:00
b88bf73ca7 [refactor][doc] Added doc for compilation, deployment and data export (#8776) 2022-04-08 09:04:03 +08:00
519305cb22 [feature-wip] (memory tracker) (step4) Switch TLS mem tracker to separate more detailed memory usage (#8669)
Based on #8605, separate out the memory usage of each operator from the Query/Load/StorageEngine mem tracker.
2022-04-08 09:02:26 +08:00
7fb4b6a6e2 [chore](tsan) add file mremap_fallback for tsan (#8665) 2022-04-08 09:01:53 +08:00
24bb9810b4 [doc](manager) Add space list documents (#8658)
Add space list and access control document. Remove some pictures to reduce the size of source code.
2022-04-08 09:01:23 +08:00
d51545a952 [fix](ut)(memory-leak) Fix be asan ut failed and hdfs file reader memory leak (#8905) 2022-04-08 00:07:00 +08:00
Pxl
2a25b90cb3 [Test] Fix explode test and build fail (#8885) 2022-04-07 14:23:57 +08:00
32bba15e34 [refactor][fix] remove useless import in Config.java (#8878) 2022-04-07 11:40:05 +08:00
c9cb07a270 [typo](doc)Update upgrade.md (#8866) 2022-04-07 11:36:39 +08:00
02be8176c3 [fix] access parallel_flat_hash_map via thread safely methods (#8854)
Iterators of parallel_flat_hash_map are not thread-safe, so
we should use if_contains instead.
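
The idea behind a callback-style lookup is that the caller never holds a reference into the map outside its lock. The Python sketch below is an analogy only: `ShardedMap` is a hypothetical toy class, not the phmap API, and it assumes one lock per shard.

```python
import threading
from typing import Any, Callable, Dict, Hashable, List


class ShardedMap:
    """Toy sharded map with one lock per shard (hypothetical, not phmap)."""

    def __init__(self, shards: int = 4) -> None:
        self._shards: List[Dict[Hashable, Any]] = [{} for _ in range(shards)]
        self._locks = [threading.Lock() for _ in range(shards)]

    def _index(self, key: Hashable) -> int:
        return hash(key) % len(self._shards)

    def insert(self, key: Hashable, value: Any) -> None:
        i = self._index(key)
        with self._locks[i]:
            self._shards[i][key] = value

    def if_contains(self, key: Hashable, fn: Callable[[Any], None]) -> bool:
        """Run fn(value) while the shard lock is held; return whether key existed."""
        i = self._index(key)
        with self._locks[i]:
            if key not in self._shards[i]:
                return False
            fn(self._shards[i][key])
            return True


m = ShardedMap()
m.insert("tablet_1", 42)
seen: List[Any] = []
found = m.if_contains("tablet_1", seen.append)
missing = m.if_contains("tablet_2", seen.append)
print(found, missing, seen)
```

Because the value is only touched inside the callback, a concurrent insert on the same shard cannot invalidate what the reader is looking at, which is the hazard with a bare iterator.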
2022-04-07 11:35:59 +08:00
64d18364db [improvement](restore) set table property 'dynamic_partition.enable' to false after restore (#8852)
When a table with dynamic partition properties is restored, 'dynamic_partition.enable' is set to its value at backup time.
However, Doris cannot turn on dynamic partitioning automatically during restore.
As a result, the table may never create dynamic partitions even though 'dynamic_partition.enable' is 'true'.
2022-04-07 11:34:01 +08:00
ce50c4d826 [feature](diagnose) support "ADMIN DIAGNOSE TABLET" stmt (#8839)
`ADMIN DIAGNOSE TABLET tablet_id`

This statement makes it easier to quickly diagnose the status of a tablet.
See "ADMIN-DIAGNOSE-TABLET.md" for details

```
mysql> admin diagnose tablet 10196;
+----------------------------------+------------------------------+------------+
| Item                             | Info                         | Suggestion |
+----------------------------------+------------------------------+------------+
| TabletExist                      | Yes                          |            |
| TabletId                         | 10196                        |            |
| Database                         | default_cluster:db1: 10192   |            |
| Table                            | tbl1: 10194                  |            |
| Partition                        | tbl1: 10193                  |            |
| MaterializedIndex                | tbl1: 10195                  |            |
| Replicas(ReplicaId -> BackendId) | {"10197":10002}              |            |
| ReplicasNum                      | OK                           |            |
| ReplicaBackendStatus             | Backend 10002 is not alive.  |            |
| ReplicaVersionStatus             | OK                           |            |
| ReplicaStatus                    | OK                           |            |
| ReplicaCompactionStatus          | OK                           |            |
+----------------------------------+------------------------------+------------+
```
2022-04-07 11:30:03 +08:00
ca4055244e [fix](storage) Fix core bug of convert to predicate column (#8833)
To reproduce:
When `enable_low_cardinality_optimize = true`, the following SQL query on the TPCH dataset causes a core dump:
```sql
select count(*) from lineitem where l_comment = 'ously even exc';
```

This SQL triggers `ColumnDictionary::convert_to_predicate_column_if_dictionary`. The call `res->reserve(_codes.size())` there is problematic because the current `_codes.size()` is smaller than its reserved value, so inserting a value into `PredicateColumn` causes a core dump.
2022-04-07 11:29:26 +08:00
e72ccfd80c [Refactor][httpv2]remove http v1 code (#8848)
HTTP v2 has been tested in production and can completely replace the old HTTP code. To simplify code maintenance, remove the previous HTTP v1 code.
2022-04-07 08:38:29 +08:00
98cab78320 [refactor](schema_hash) remove schema_hash since every tablet id in be is unique (#8574) 2022-04-07 08:37:45 +08:00
57638ae43d [Refactor][Doc]Add part of the document content for data import (#8772) 2022-04-07 08:37:11 +08:00
e53c90fbef min and max window function bug fix (#8822)
[Fix bug] min and max window function bug fix #8822
2022-04-07 08:36:33 +08:00
319f1f634a [fix](ut) fix fe run CreateTableAsSelectStmtTest ,UserPropertyTest, ProjectPlannerFunctionTest and AggregateTest failed (#8838) 2022-04-06 15:23:49 +08:00
f90a1a1919 [fix](ut)(compile) Fix ut failure at functions_geo and compilation bug (#8843) 2022-04-05 21:30:40 +08:00
Pxl
03c5d5d677 fix some error on build.sh && fix build fail with clang on runtime_profile (#8748) 2022-04-05 15:52:53 +08:00
d07b49247e rm sequential file (#8713)
[refactor]remove sequential file reader from env
2022-04-04 17:49:06 +08:00
fcefed7c1c [Bug][Vectorized] Fix core bug of segment vectorized (#8800)
* [Bug][Vectorized] Fix core bug of segment vectorized
1. Read table with delete condition
2. Read table with default value HLL/Bitmap Column

* refactor some code

Co-authored-by: lihaopeng <lihaopeng@baidu.com>
2022-04-03 19:50:25 +08:00
33736e45fa [fix](table-function) Fixed unreasonable nullable conversion (#8818) 2022-04-03 11:02:35 +08:00
a8417e6c8b [fix](restore) fix restore issue when meta version is too low (#8816)
When restoring a snapshot from 0.13 to master, the restore job stays pending for a long time,
while the log shows the error "Could not set meta version to 93 since it is lower than minimum required version 100".
The restore job should be cancelled as soon as that error occurs.
2022-04-03 10:56:23 +08:00
78b85414d6 [fix](debug) get_hash_value_fvn DCHECK failed (#8811)
* fix_get_hash_value_fvn

* fix compile
2022-04-03 10:55:15 +08:00
eed4908790 [chore](deps) upgrade spring from 2.6.2 to 2.6.6 (#8802) 2022-04-03 10:52:31 +08:00
f3c6ddf651 [feature](function) Support geolocation functions on vectorized engine (#8790) 2022-04-03 10:50:54 +08:00
586bec79f5 [fix](storage) Fix query result error due to find code by bound (#8787)
To reproduce the problem, run the following query on the SSB single table `lineorder_flat`:
```sql
SELECT
    sum(LO_REVENUE),
    (LO_ORDERDATE DIV 10000) AS year,
    P_BRAND
FROM lineorder_flat
WHERE P_BRAND >= 'MFGR#22211111' AND P_BRAND <= 'MFGR#22281111' AND S_REGION = 'ASIA' and (LO_ORDERDATE DIV 10000) = 1992
GROUP BY
    year,
    P_BRAND
ORDER BY
    year,
    P_BRAND;
```

With `enable_low_cardinality_optimize=false`, the query result is:
```sql
+-------------------+------+-----------+
| sum(`LO_REVENUE`) | year | P_BRAND   |
+-------------------+------+-----------+
|       65423264312 | 1992 | MFGR#2222 |
|       66936772687 | 1992 | MFGR#2223 |
|       64047191934 | 1992 | MFGR#2224 |
|       65744559138 | 1992 | MFGR#2225 |
|       66993045668 | 1992 | MFGR#2226 |
|       67411226147 | 1992 | MFGR#2227 |
|       69390885970 | 1992 | MFGR#2228 |
+-------------------+------+-----------+
```

With `enable_low_cardinality_optimize=true`, the query result is:
```sql
+-------------------+------+-----------+
| sum(`LO_REVENUE`) | year | P_BRAND   |
+-------------------+------+-----------+
|       66936772687 | 1992 | MFGR#2223 |
|       64047191934 | 1992 | MFGR#2224 |
|       65744559138 | 1992 | MFGR#2225 |
|       66993045668 | 1992 | MFGR#2226 |
|       67411226147 | 1992 | MFGR#2227 |
|       69390885970 | 1992 | MFGR#2228 |
+-------------------+------+-----------+
```

With the optimization enabled, the result is missing one row (`MFGR#2222`).

The reason is that 'MFGR#22211111' is not in the dictionary, so the boundary code is looked up instead (the `find_code_by_bound` method), and that lookup has a bug.
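
To illustrate what a boundary lookup should return when the bound itself is absent from the dictionary, here is a minimal Python sketch using binary search. It is not the code of `find_code_by_bound`: the helper `codes_in_range` is hypothetical, and the eight-entry dictionary is made up to mirror the brands in the result above.

```python
from bisect import bisect_left, bisect_right
from typing import List


def codes_in_range(dictionary: List[str], low: str, high: str) -> range:
    """Return the codes whose values v satisfy low <= v <= high.

    dictionary must be sorted; a value's code is its index in the list.
    Neither bound has to be present in the dictionary.
    """
    first = bisect_left(dictionary, low)    # first code with value >= low
    last = bisect_right(dictionary, high)   # one past the last code with value <= high
    return range(first, last)


# Made-up dictionary: MFGR#2221 .. MFGR#2228, already sorted.
brands = [f"MFGR#222{i}" for i in range(1, 9)]
codes = codes_in_range(brands, "MFGR#22211111", "MFGR#22281111")
print([brands[c] for c in codes])
```

With these inputs the selected brands run from `MFGR#2222` through `MFGR#2228`, seven values in total, which matches the row count of the correct result. An off-by-one at the lower bound would drop `MFGR#2222`, the same symptom as in the faulty output.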
2022-04-03 10:38:14 +08:00
0e3b15f2d7 [fix](colocate) Fix the error colocate plan when query is (rollup + instance >1) (#8779)
The Repeat Node will change the fragment data partition.

So the output partition of child fragment is different from the data partition of current fragment.
When judging whether colocate can be enabled,
the current data partition of fragment should be used directly instead of the child's output partition.

Before this fix, queries that combine rollup with a concurrency (instance count) greater than 1 could return incorrect results.
For example:
```
select t1.tc1,t1.tc2,sum(t1.tc3) as total from t1 join[shuffle] t1 t2 on t1.tc1=t2.tc1
group by rollup(tc1,tc2) order by t1.tc1,t1.tc2,total;
```

Fixed #8778
2022-04-03 10:19:39 +08:00
6cc8762ce7 [fix](load) fix concurrent synchronization problem in NodeChannel::try_send_batch (#8728)
The patch fixes two problems.
1. A memory-order problem when accessing _last_patch_processed_finished and in_flight. Since _last_patch_processed_finished is actually redundant, the patch removes it.
2. Synchronization in join on cid.

Fix for #8725.
2022-04-03 10:15:45 +08:00
4076c5466b [refactor][improvement](type_info) use template and single instance to refactor get type info logic (#8680)
1. use const pointer instead of shared_ptr
2. Restrict array types to support only primitive types and nest up to 9 levels.
2022-04-03 10:10:36 +08:00
6b0a642390 [feature][vectorized] Support explode json array func #8526 (#8539) 2022-04-03 10:06:47 +08:00
a75e4a1469 Window funnel (#8485)
Add new feature window funnel
2022-04-02 22:08:50 +08:00
13f1f94f86 [chore] upgrade log4j version to 2.17.2 (#8774)
upgrade log4j version to 2.17.2
2022-04-02 21:29:25 +08:00
3a77897072 [license] Add notice files for binary release (#8813)
For binary release, if a dependency has NOTICE file, we need to copy it to the binary release package.
2022-04-02 16:41:24 +08:00
0c98c1ee03 [Improvement][fix](compaction) Change min_compaction_failure_interval_sec to 5 and fix a bug of log (#8781)
see issue #8767
2022-04-02 13:00:56 +08:00
4d516bece8 [feature-wip](array-type)Add element_at and subscript functions (#8597)
Overview of changes:
1. add the function element_at;
2. support element_subscript([]) to get an element of an array, col_array[N] <==> element_at(col_array, N);
3. return an error message instead of crashing the BE when an array function fails to execute;

element_at(array, index) description:
> Returns the element of the array at the given **(1-based)** index.
> If **index < 0**, accesses elements from the last to the first.
> Returns NULL if the index exceeds the length of the array or the array is NULL.

Usage example:
1. create table with ARRAY type column and insert some data:
```
+------+------+--------+
| k1   | k2   | k3     |
+------+------+--------+
|    1 |    2 | [1, 2] |
|    2 |    3 | NULL   |
|    4 | NULL | []     |
|    3 | NULL | NULL   |
+------+------+--------+
```
2. enable vectorized:
```
set enable_vectorized_engine=true;
```
3. element_subscript([]) usage example:
```
> select k1,k3,k3[1] from array_test;
+------+--------+----------------------------+
| k1   | k3     | %element_extract%(`k3`, 1) |
+------+--------+----------------------------+
|    3 | NULL   |                       NULL |
|    1 | [1, 2] |                          1 |
|    2 | NULL   |                       NULL |
|    4 | []     |                       NULL |
+------+--------+----------------------------+
```
4. element_at function usage example:
```
> select k1,k3 from array_test where element_at(k3, -1) = 2;
+------+--------+
| k1   | k3     |
+------+--------+
|    1 | [1, 2] |
+------+--------+
```
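
The indexing rules described above can be summarized in a short Python sketch. It is an illustration of the documented semantics, not the BE implementation. `None` stands in for SQL NULL, and the handling of index 0 is an assumption, since the description does not cover it.

```python
from typing import List, Optional, TypeVar

T = TypeVar("T")


def element_at(array: Optional[List[T]], index: int) -> Optional[T]:
    """1-based lookup; negative indexes count from the end; None stands in for NULL."""
    if array is None:
        return None
    if index > 0:
        pos = index - 1
    elif index < 0:
        pos = len(array) + index
    else:
        return None  # index 0 is not covered by the description; treated as NULL (assumption)
    if 0 <= pos < len(array):
        return array[pos]
    return None


# The k3 values from the example table above.
rows = [[1, 2], None, [], None]
print([element_at(k3, 1) for k3 in rows])
print([element_at(k3, -1) for k3 in rows])
```

For the sample rows, only `[1, 2]` yields a value: 1 for index 1 and 2 for index -1, consistent with the two query results shown.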
2022-04-02 12:03:56 +08:00
8bb16bfeb3 [docs] minor update for broker load document (#8812)
[docs] minor update for broker load document
2022-04-02 10:56:04 +08:00
6c5bbc6e4c fix agg functions check failed from empty table (#8785)
fix agg functions check failed from empty table
2022-04-02 10:44:55 +08:00
3698176c40 use row_size as name of variable indicating rows rather than column_size (#8803)
use row_size as name of variable indicating rows rather than column_size
2022-04-02 10:38:16 +08:00
c31c6ae91a [improvement](storage) Add more detailed timer on SegmentIter in profile (#8768)
* [improvement](storage) Add more detailed timer on SegmentIter in profile

* add OutputColumnTime
2022-04-02 10:35:28 +08:00
5e908f5685 [doc] Update data-model-rollup.md (#8782)
* Update data-model-rollup.md
2022-04-02 10:35:02 +08:00
f3539cd3ba [refactor] remove useless code (#8773) 2022-04-02 10:28:16 +08:00
decdc8e8b9 [test][enhance][refactor] support suite block to specify multiple group, suppo… (#8792)
- support suite blocks that specify multiple groups
- TestAction supports comparing results against an iterator, a local file, or an HTTP stream
- support printing TeamCity service messages
- drop the logic that generated a groovy file for each sql file
- support 3 levels of parallelism: script file, suite block, thread action
- support specifying JAVA_OPTS for the boot shell
- avoid JVM metaspace OOM
- use -d to run the suites in given directories instead of -g; -g is now used to specify groups
2022-04-01 20:59:01 +08:00
9f80f6cf5e [Improvement](Planner)Enable hash join project (#8618) 2022-04-01 15:42:25 +08:00
2730235e5b [typo](docs) update documentation (#8756) 2022-04-01 10:21:03 +08:00
eb68dd0bb5 [fix](ut) Fix be ut not work for byte_buffer_test2 and json_scanner_with_jsonpath_test (#8791) 2022-04-01 10:12:47 +08:00
f315fbd5ac [fix] vectorization decimal avg inconsistent (#8746) 2022-03-31 23:00:40 +08:00