1598 Commits

Author SHA1 Message Date
797d9de192 [fix](Nereids) When col stats is Unknown, not expression should return the stats with selectivity of 1 2023-09-01 17:36:31 +08:00
e88c218390 [Improve](Job)Job internal interface provides immediate scheduling (#23735)
Delete meaningless job status
System scheduling is executed in the time wheel
Optimize window calculation code
2023-09-01 12:50:08 +08:00
d6450a3f1c [Fix](statistics)Fix external table auto analyze bugs (#23574)
1. Fix a bug where auto analyze on an external table recursively loads the schema cache.
2. Move some functions from the StatisticsAutoAnalyzer class to TableIf, so that external tables and internal tables can implement the logic separately.
3. Disable auto analyze for external catalogs by default; it can be enabled by adding the catalog property "enable.auto.analyze"="true"
2023-09-01 10:58:14 +08:00
b93a1a83a5 [opt](Nereids) let keywords list same with legacy planner (#23632) 2023-09-01 10:24:30 +08:00
52e645abd2 [Feature](Nereids): support cte for update and delete statements of Nereids (#23384) 2023-08-31 23:36:27 +08:00
7379cdc995 [feature](nereids) support subquery in select list (#23271)
1. add scalar subquery's output to LogicalApply's output
2. for IN and EXISTS subqueries, add the mark join slot to LogicalApply's output
3. forbid pushing down an alias through a join if the project list has any mark join slots.
4. move normalize aggregate rule to analysis phase
2023-08-31 15:51:32 +08:00
126606cb4d [Fix](cache) fix query cache returns wrong result after deleting partitions. (#23555)
The SQL cache uses only partitionKey, latestVersion and latestTime to decide whether a cached result can be returned. If we delete one or more partitions that are not the most recently updated partition, none of these values change, so the cache still hits.
The fix uses a field to save the partition count of each table, sums the counts, and sends the sum to BE. Two situations involve delete-partition operations:

- Only partitions are deleted, so the sum of partition counts is lower than before.
- Partitions are deleted and added together, so the latest time or latest version is higher than before.
2023-08-31 14:22:52 +08:00
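The cache check described in 126606cb4d can be sketched roughly as follows. This is a minimal illustration, not the Doris implementation: `CacheStamp`, `old_check` and `new_check` are hypothetical names, and the real check runs between FE and BE rather than in one function.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheStamp:
    """Hypothetical summary of the values the cache check compares."""
    partition_key: int
    latest_version: int
    latest_time: int
    partition_num_sum: int  # sum of partition counts over all queried tables


def old_check(cached: CacheStamp, current: CacheStamp) -> bool:
    # Before the fix: only key, version and time are compared.
    return (
        cached.partition_key == current.partition_key
        and cached.latest_version == current.latest_version
        and cached.latest_time == current.latest_time
    )


def new_check(cached: CacheStamp, current: CacheStamp) -> bool:
    # After the fix: the summed partition count must match as well.
    return old_check(cached, current) and (
        cached.partition_num_sum == current.partition_num_sum
    )
```

Deleting a partition that is not the latest one leaves key, version and time unchanged, so `old_check` still reports a hit, while `new_check` sees the lower partition count and reports a miss.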
Pxl
7f4f39551a [Bug](materialized-view) fix change base schema when create mv (#23607)
* fix change base schema when create mv

* fix

* fix
2023-08-30 21:00:12 +08:00
509d865760 [feature](Nereids): convert CaseWhen to If (#23040)
Add a rule to optimize CASE WHEN expression.
Rewrite rule to convert CASE WHEN to IF.

For example:
CASE WHEN a > 1 THEN 1 ELSE 0 END -> IF(a > 1, 1, 0)
2023-08-30 15:47:29 +08:00
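The rewrite in 509d865760 can be sketched on a toy expression model. The `CaseWhen` and `If` classes below are hypothetical stand-ins, and the sketch assumes the rule fires only for a CASE WHEN with a single WHEN branch, which the commit message does not state explicitly.

```python
from dataclasses import dataclass
from typing import Any, Optional, Tuple


@dataclass(frozen=True)
class CaseWhen:
    """Toy CASE WHEN: (condition, result) branches plus an optional ELSE value."""
    branches: Tuple[Tuple[Any, Any], ...]
    default: Optional[Any] = None  # None stands for a missing ELSE (NULL)


@dataclass(frozen=True)
class If:
    condition: Any
    then_value: Any
    else_value: Any


def case_when_to_if(expr: Any) -> Any:
    """Rewrite a single-branch CASE WHEN into IF; leave other shapes untouched."""
    if isinstance(expr, CaseWhen) and len(expr.branches) == 1:
        condition, result = expr.branches[0]
        return If(condition, result, expr.default)
    return expr
```

With this model, `case_when_to_if(CaseWhen((("a > 1", 1),), 0))` returns `If("a > 1", 1, 0)`, which mirrors the example in the commit message.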
3a0a79b4a0 [Improvement][SparkLoad] Use system env configs when users don't set env configs. (#21837) 2023-08-30 15:14:40 +08:00
e02747e976 [feature](Nereids) support struct type (#23597)
1. support struct data type
2. add array / map / struct literal syntax
3. fix array union / intersect / except type coercion
4. fix explicit cast data type check for array
5. fix bound function type coercion
2023-08-29 20:41:24 +08:00
8932a6fae7 [feature](Nereids) support Literal collate syntax (#23600)
Support the following SQL grammar, just for compatibility:

```sql
select table_name
from information_schema.tables
where table_schema collate utf8_general_ci = 'information_schema'
  and table_name collate utf8_general_ci = 'parameters';
```
2023-08-29 17:01:13 +08:00
4c00b1760b [feature](partial update) Support partial update for broker load (#22970) 2023-08-29 14:41:01 +08:00
0ff191cdf3 [feature](Nereids) add expr depth limit and expr children limit in Nereids (#23569)
#### `expr_depth_limit`

Default:3000

IsMutable:true

Limit on the depth of an expr tree.  Exceeding this limit may cause long analysis time while holding the db read lock.  Do not set this unless you know what you are doing.

#### `expr_children_limit`

Default:10000

IsMutable:true

Limit on the number of expr children of an expr tree.  Exceeding this limit may cause long analysis time while holding the database read lock.
2023-08-28 19:03:43 +08:00
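A rough sketch of how the two limits from 0ff191cdf3 could be enforced. The `Expr` class, `ExprLimitExceeded` and `check_expr_limits` are hypothetical, and the sketch assumes that `expr_children_limit` bounds the direct children of each node, which is one possible reading of the description.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Expr:
    """Toy expression node (hypothetical, not the Nereids class)."""
    name: str
    children: List["Expr"] = field(default_factory=list)


class ExprLimitExceeded(Exception):
    pass


def check_expr_limits(root: Expr, depth_limit: int = 3000,
                      children_limit: int = 10000) -> None:
    """Raise if the tree is deeper, or any node wider, than the limits."""
    # An explicit stack avoids hitting Python's recursion limit on deep trees.
    stack = [(root, 1)]
    while stack:
        node, depth = stack.pop()
        if depth > depth_limit:
            raise ExprLimitExceeded(f"depth {depth} exceeds {depth_limit}")
        if len(node.children) > children_limit:
            raise ExprLimitExceeded(
                f"{len(node.children)} children exceed {children_limit}")
        stack.extend((child, depth + 1) for child in node.children)
```

The defaults match the values listed in the commit message; the root node counts as depth 1 in this sketch.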
a70ebe87c5 [optimize](Nereids): speedup analyze (#23549)
- avoid some `withRowCount`
- ArrayList with size
- skip checking AnyDataType in checkPrimitiveInputDataTypesWithExpectType
2023-08-28 18:17:55 +08:00
Pxl
8e4c0d1e81 [Bug](materialized-view) fix divide double can not match mv (#23504)
* fix divide double can not match mv

* fix

* fix
2023-08-28 18:01:08 +08:00
10792ca0f7 [fix](nereids) Mistaken stats when analyzing table incrementally and partition number less than 512 #23507
Fix a bug that produced mistaken stats when analyzing a table incrementally with fewer than 512 partitions.
Fix a bug where the cron expression was lost during analysis.
Mark a system job as running after it is registered with AnalysisManager, to avoid submitting the same job again if the previous one takes a long time.
2023-08-28 17:31:36 +08:00
Pxl
6e82178847 [Bug](materialized-view) fix loaddb analyze failed on MaterializedIndexMeta (#23442)
* fix loaddb analyze failed on MaterializedIndexMeta

* update

* update
2023-08-28 15:18:18 +08:00
4c8fc06e40 [Feature](fe) Add admin set partition version statement (#23086)
This commit adds a statement to modify the visible version of a partition.
2023-08-28 14:31:54 +08:00
f7d2c1faf6 [feature](Nereids) support select key encryptKey (#23257)
Add select key

```
- CREATE ENCRYPTKEY key_name AS "key_string"
- select key my_key
+-----------------------------+
| encryptKeyRef('', 'my_key') |
+-----------------------------+
| ABCD123456789               |
+-----------------------------+
```
2023-08-28 14:07:26 +08:00
e84989fb6d [feature](Nereids) support map type (#23493) 2023-08-28 11:31:44 +08:00
db8d18eb40 [Enhance](auth)row policy support role (#23022)
```
CREATE ROW POLICY test_row_policy_1 ON test.table1 
AS {RESTRICTIVE|PERMISSIVE} [TO  user] [TO ROLE role] USING (id in (1, 2)); // add `to role`

DROP [ROW] POLICY [IF EXISTS] test_row_policy;//delete `for user` and `on table`

SHOW ROW POLICY [FOR user][FOR ROLE role] // add `for role`
```
2023-08-26 10:24:59 +08:00
006c88827f [fix](stats) Fix auto analyze (#20426)
We only reanalyze partitions whose lastVisibleTime is later than the job's update time, so we shouldn't set this field when creating system jobs.
2023-08-25 21:30:59 +08:00
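The selection rule stated in 006c88827f can be illustrated with a small filter. The function name and the dictionary shape are hypothetical; timestamps are plain integers here.

```python
from typing import Dict, List


def partitions_to_reanalyze(last_visible_time: Dict[str, int],
                            job_update_time: int) -> List[str]:
    """Return partitions whose lastVisibleTime is later than the job's update time."""
    return sorted(
        name
        for name, visible_time in last_visible_time.items()
        if visible_time > job_update_time
    )
```

If a newly created system job already carried a fresh update time, every existing partition would compare as older and be skipped, which appears to be why the field is left unset at creation.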
6d4f06689f [fix](Nereids) avoid Stats NaN (#23445)
tpcds 61 plan changed:
improved from 1.75 sec to 1.67 sec
2023-08-25 16:27:34 +08:00
ba931d9eed [fix](Nereids) infer predicates generate wrong result (#23456)
We use two facilities to do predicate inference: PredicatePropagation and
PullUpPredicates. In the previous implementation, we used a set to save
the intermediate result of PredicatePropagation. The purpose was to infer
new predicates through two equality relations. However, this is wrong,
because it could infer wrong predicates through outer joins. For example:

```sql
select a.c1
   from a
   left join b on a.c2 = b.c2 and a.c1 = '1'
   left join c on a.c2 = c.c2 and a.c1 = '2'
   inner join d on a.c3=d.c3
```

the predicates `a.c1 = '1'` and `a.c1 = '2'` should not be inferred as
filter to relation `a`.

This PR:
1. revert the change from PR #22145, commit 3c58e9ba
2. Remove the unreasonable restriction in PullupPredicate.
3. Use a new Filter node rather than a new otherCondition on the join node to
   save inferred predicates
2023-08-25 11:59:28 +08:00
372f83df5c [opt](Nereids) remove between expression to simplify planner (#23421) 2023-08-25 11:28:12 +08:00
69e75f04ab [fix](feut) should not enable InternalSchemaDb in fe ut (#23400) 2023-08-25 11:03:37 +08:00
6a4976921d [fix](auth)Disable column auth temporarily (#23295)
- add config `enable_col_auth` to temporarily disable column permissions (because the old and new planners have a bug when selecting from a view)
- Restore the old optimizer to the previous authentication method
- Support authentication in the new optimizer (legacy issue: when querying a view, the permissions of the base table are authenticated; the view's own permissions should be authenticated instead, which will be handled after the new optimizer is improved)
- fix: show grants for non-existent users
- fix: role `admin` cannot grant/revoke to/from user
2023-08-24 23:37:06 +08:00
320eda78e6 [fix](nereids) remove useless cast in in-predicate (#23171)
Consider the SQL "select * from test_simplify_in_predicate_t where a in ('1992-01-31', '1992-02-01', '1992-02-02', '1992-02-03', '1992-02-04');"
before:

```
|   0:VOlapScanNode                                                                                                                                                                                      |
|      TABLE: default_cluster:bugfix.test_simplify_in_predicate_t(test_simplify_in_predicate_t), PREAGGREGATION: OFF. Reason: No aggregate on scan.                                                      |
|      PREDICATES: CAST(a[#0] AS DATETIMEV2(0)) IN ('1992-01-31 00:00:00', '1992-02-01 00:00:00', '1992-02-02 00:00:00', '1992-02-03 00:00:00', '1992-02-04 00:00:00') AND __DORIS_DELETE_SIGN__[#1] = 0 |
|      partitions=0/1, tablets=0/0, tabletList=                                                                                                                                                          |
|      cardinality=1, avgRowSize=0.0, numNodes=1                                                                                                                                                         |
|      pushAggOp=NONE                                                                                                                                                                                    |
|      projections: a[#0]                                                                                                                                                                                |
|      project output tuple id: 1                                                                                                                                                                        |
|      tuple ids: 0  
```
after:

```
|   0:VOlapScanNode                                                                                                                                 |
|      TABLE: default_cluster:bugfix.test_simplify_in_predicate_t(test_simplify_in_predicate_t), PREAGGREGATION: OFF. Reason: No aggregate on scan. |
|      PREDICATES: a[#0] IN ('1992-01-31', '1992-02-01', '1992-02-02', '1992-02-03', '1992-02-04') AND __DORIS_DELETE_SIGN__[#1] = 0                |
|      partitions=0/1, tablets=0/0, tabletList=                                                                                                     |
|      cardinality=1, avgRowSize=0.0, numNodes=1                                                                                                    |
|      pushAggOp=NONE                                                                                                                               |
|      projections: a[#0]                                                                                                                           |
|      project output tuple id: 1                                                                                                                   |
|      tuple ids: 0  

```
2023-08-24 18:14:43 +08:00
c3327b51b9 [Fix](Nereids) add nereids load function in read fields of GlobalFunctionMgr and Database (#23248)
Add the Nereids load function to the read fields of GlobalFunctionMgr and Database
to fix some UDFs being lost when FE restarts and queries run with Nereids.
2023-08-23 15:59:50 +08:00
c7b9eb5f9c [enhancement](bitmap)support bitmap type for non-key column in unique table (#23228) 2023-08-23 14:21:22 +08:00
527293aa41 [refactor](dynamic table) remove dynamic table (#23298) 2023-08-23 14:15:14 +08:00
35d0c9e71e [refactor](nereids) Refactor stats collection framework (#22963)
* remove auto analyze grammar
* refactor ResultRow
2023-08-23 10:05:57 +08:00
8f48acaab1 [refactor](nereids) convert session var name "beNumForTest" #23255
This var is used for tests only, so keep "for_test" as the suffix.
2023-08-22 10:12:07 +08:00
b670dd0db7 [feature](Nereids) support array type (#22851)
FEATURE:
1. enable array type in Nereids
2. support generic on function signature
3. support array and map type in type coercion and type check
4. add element_at and element_slice syntax in Nereids parser

REFACTOR:
1. remove AbstractDataType

BUG FIX:
1. remove FROM from nonReserved keyword list

TODO:
1. support lambda expression
2. use Nereids' way to do function type coercion
3. use castIfnotSame when doing implicit cast on BoundFunction
4. let AnyDataType type coercion do the same thing as function type coercion
5. add the array functions below
- array_apply
- array_concat
- array_filter
- array_sortby
- array_exists
- array_first_index
- array_last_index
- array_count
- array_shuffle shuffle
- array_pushfront
- array_pushback
- array_repeat
- array_zip
- reverse
- concat_ws
- split_by_string
- explode
- bitmap_from_array
- bitmap_to_array
- multi_search_all_positions
- multi_match_any
- tokenize
2023-08-22 09:47:55 +08:00
8411705e36 [fix](nereids)scalar subquery shouldn't be used in mark join (#22907)
* [fix](nereids)scalar subquery shouldn't be used in mark join
2023-08-21 15:38:22 +08:00
Pxl
a11e0e3bc4 [Bug](agg) fix QUANTILE_UNION many problems (#23181)
fix QUANTILE_UNION many problems
2023-08-21 10:04:27 +08:00
6ffc26858a [Improvement](meta) add default_value column & is changed column for result of show_variables stmt (#23017)
* [Improvement](meta) add default_value column for result of show_variables stmt

* add Changed column to show whether value is modified

* fix code style issue
2023-08-20 20:48:45 +08:00
6bf65253d0 [fix](Nereids): unstable test when run single UT. (#23189) 2023-08-18 23:14:56 +08:00
10abbd2b62 [Feature](Export) support parallel export job using Job Schedule (#22854) 2023-08-18 22:24:42 +08:00
9cee0ecccc [fix](show-table-status) fix priv error on show table status stmt (#22918) 2023-08-18 18:30:09 +08:00
609d20de8c [refactor](nereids)remove ColumnStatistics.selectivity (#23039) 2023-08-18 16:45:54 +08:00
441032c3d8 [fix](Nereids): LogicalSink equals() shouldn't invoke super.equals() (#23145) 2023-08-18 14:05:48 +08:00
03d59ba81e [Fix](Nereids) fix sql-cache for nereids. (#22808)
1. Do not use ((LogicalPlanAdapter)parsedStmt).getStatementContext().getOriginStatement().originStmt.toLowerCase() as the cache key (do not invoke toLowerCase()). For example, select * from tbl1 where k1 = 'a' is different from select * from tbl1 where k1 = 'A', so the cache should miss.
2. According to issue 6735, the cache key should contain the DDL SQL of all views (including nested views).
2023-08-18 09:36:07 +08:00
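The two points of 03d59ba81e can be sketched as a key function. This is only an illustration: `sql_cache_key` is a hypothetical name, and hashing with SHA-256 and joining with a newline are assumptions made for the sketch, not details taken from the commit.

```python
import hashlib
from typing import Iterable


def sql_cache_key(origin_stmt: str, view_ddls: Iterable[str]) -> str:
    """Build a cache key from the statement text and every referenced view's DDL.

    The statement is NOT lowercased, so string literals that differ only in
    case produce different keys. View DDLs (including nested views) are part
    of the key, so redefining a view changes it.
    """
    parts = [origin_stmt] + sorted(view_ddls)
    return hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()
```

With this key, `select * from tbl1 where k1 = 'a'` and `select * from tbl1 where k1 = 'A'` map to different entries, and a change to any view definition invalidates the entry.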
hzq
38c182100a [refactor](mysql compatibility) An abstract class for all databases created for mysql compatibility (#23087)
Better code structure for mysql compatibility databases.
2023-08-18 09:16:23 +08:00
1f19d0db3e [improvement](tablet clone) improve tablet balance, scaling speed etc (#22317) 2023-08-17 22:30:49 +08:00
11d76d0ebe [fix](Nereids) non-inner join should not merge dist info (#22979)
1. left join should use left dist info.
2. right join should use right dist info.
3. full outer join should return ANY dist info.
2023-08-17 17:48:50 +08:00
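The three rules in 11d76d0ebe can be sketched as a small dispatch on the join type. The names below are hypothetical, and modeling distribution info as a set of column names whose merge is a set union is a simplifying assumption for the inner-join case.

```python
from enum import Enum
from typing import FrozenSet, Union


class JoinType(Enum):
    INNER = "inner"
    LEFT_OUTER = "left_outer"
    RIGHT_OUTER = "right_outer"
    FULL_OUTER = "full_outer"


ANY = "ANY"  # placeholder for "no particular distribution"

DistInfo = FrozenSet[str]


def join_output_dist(join_type: JoinType, left: DistInfo,
                     right: DistInfo) -> Union[DistInfo, str]:
    """Pick the output distribution info of a join from its children."""
    if join_type is JoinType.LEFT_OUTER:
        return left   # rule 1: left join uses the left dist info
    if join_type is JoinType.RIGHT_OUTER:
        return right  # rule 2: right join uses the right dist info
    if join_type is JoinType.FULL_OUTER:
        return ANY    # rule 3: full outer join returns ANY
    return left | right  # inner join: merging is allowed (assumed as union)
```

One way to read the rules: the NULL-padded side of an outer join no longer carries a reliable distribution on its own columns, so only the preserved side's info is kept.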
bf2b92f5e8 [fix](Nereids): PushdownDistinctThroughJoin don't push distinct for relation (#23066)
* [fix](Nereids): PushdownDistinctThroughJoin don't push distinct for relation.

* fix test
2023-08-17 14:50:34 +08:00
0594acfcf1 [fix](Nereids) scan should output all invisible columns (#23003) 2023-08-16 18:07:59 +08:00
f1880d32d9 [fix](nereids)bind slot failed because of "default_cluster" #23008
Slot binding failed for the following queries:
select tpch.lineitem.* from lineitem
select tpch.lineitem.l_partkey from lineitem

The unbound slot is tpch.lineitem.l_partkey, but the bound slot is default_cluster:tpch.lineitem.l_partkey, so they do not match.
We need to ignore the default_cluster: prefix when comparing dbName.
2023-08-16 17:22:44 +08:00
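The comparison described in f1880d32d9 can be sketched as follows. The helper names are hypothetical; only the `default_cluster:` prefix comes from the commit message.

```python
CLUSTER_PREFIX = "default_cluster:"


def strip_cluster_prefix(db_name: str) -> str:
    """Drop a leading 'default_cluster:' from a database name, if present."""
    if db_name.startswith(CLUSTER_PREFIX):
        return db_name[len(CLUSTER_PREFIX):]
    return db_name


def same_db(unbound_db: str, bound_db: str) -> bool:
    """Compare two database names while ignoring the cluster prefix."""
    return strip_cluster_prefix(unbound_db) == strip_cluster_prefix(bound_db)
```

Under this comparison, the unbound qualifier `tpch` matches the bound `default_cluster:tpch`, so both queries in the commit message would bind.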