
5755 Commits

Author SHA1 Message Date
90349f0e61 [Feature](Nereids) support mask function (#15120)
Support the functions mask, mask_first_n, and mask_last_n in Nereids.
2022-12-21 10:25:11 +08:00
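As a rough illustration of the three functions named in the commit above, the sketch below assumes Hive-style masking defaults: uppercase letters become `X`, lowercase letters become `x`, digits become `n`, and everything else passes through. Those defaults and the Python helpers are assumptions for illustration; the commit message itself does not spell out the semantics.

```python
def mask(s, upper="X", lower="x", digit="n"):
    """Mask every ASCII letter and digit in s (the defaults are assumed)."""
    out = []
    for ch in s:
        if ch.isascii() and ch.isupper():
            out.append(upper)
        elif ch.isascii() and ch.islower():
            out.append(lower)
        elif ch.isascii() and ch.isdigit():
            out.append(digit)
        else:
            out.append(ch)  # punctuation, spaces, non-ASCII: unchanged
    return "".join(out)


def mask_first_n(s, n):
    """Mask only the first n characters of s."""
    n = max(n, 0)
    return mask(s[:n]) + s[n:]


def mask_last_n(s, n):
    """Mask only the last n characters of s."""
    cut = max(len(s) - max(n, 0), 0)
    return s[:cut] + mask(s[cut:])


print(mask("abc123EFG"))
print(mask_first_n("1234-5678-8765-4321", 4))
print(mask_last_n("1234-5678-8765-4321", 4))
```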
d0d7a6d8ad [fix](multi-catalog) can't show databases when creating a new user in external catalog (#15204)
Fix bug: a new user who has been granted access to an external catalog can't show databases.
2022-12-21 08:58:06 +08:00
8969c19cd4 [fix](jdbc) fix create table like table of jdbc error (#15179)
When running CREATE TABLE LIKE against a JDBC table, it fails with an error like
'errCode = 2, detailMessage = Failed to execute CREATE TABLE LIKE baseall_mysql.
Reason: errCode = 2, detailMessage = property table_type must be set'
This PR fixes it.
2022-12-21 08:56:43 +08:00
5c35f02bdb [fix](nereids) add signature for IF to support HLL type (#15188) 2022-12-20 22:22:11 +08:00
c3712b1114 [bug](jdbc) fix error of jdbc with datetime type in oracle (#15205) 2022-12-20 22:05:55 +08:00
5cf21fa7d1 [feature](planner) mark join to support subquery in disjunction (#14579)
Co-authored-by: Gabriel <gabrielleebuaa@gmail.com>
2022-12-20 15:22:43 +08:00
d9550c311e [feature](Nereids) implement setOperation (#15020)
This PR implements SetOperation.

- Adapt to the EliminateUnnecessaryProject rule to ensure that the project under SetOperation is not deleted.
- Add predicate pushdown of SetOperation
- Optimization: Merge multiple SetOperations with the same type and the same qualifier
- Optimization: merge oneRowRelation and union
2022-12-20 15:14:29 +08:00
fdb54a346d [feature] (nereids) support aggregate function group_bit_and/or/xor (#15003)
Support the aggregate functions:

- group_bit_and
- group_bit_xor
- group_bit_or
2022-12-20 14:11:07 +08:00
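The three aggregates in the commit above fold a column of integers with a bitwise operator. A minimal sketch of that idea follows; skipping NULLs (`None`) and returning NULL for an empty group are assumptions modeled on common SQL aggregate behavior, not something the commit states.

```python
from functools import reduce
from operator import and_, or_, xor


def _fold(op, values):
    # Assumption: NULL inputs are ignored, and an empty group yields NULL.
    vals = [v for v in values if v is not None]
    return reduce(op, vals) if vals else None


def group_bit_and(values):
    return _fold(and_, values)


def group_bit_or(values):
    return _fold(or_, values)


def group_bit_xor(values):
    return _fold(xor, values)


column = [3, 1, 2, 4]
print(group_bit_and(column), group_bit_or(column), group_bit_xor(column))
```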
6712f1fc1d [fix](Nereids) encryption function with 4 params should auto-complete last param with config (#15038) 2022-12-20 13:55:54 +08:00
737fe49f6f [Bug](FE) fix compile error due to code refactor (#15192) 2022-12-20 13:20:55 +08:00
4979ad09c8 [fix](join)the policy to choose colocate join is not correct (#15140)
* [hotfix](dev-1.0.1) fix colocate join bug in vec engine after introducing output tuple (#10651)

To support vectorized outer join, we introduced an output tuple for the hash join node,
but it breaks the check for colocate join.
To solve this problem, we need to map the output slot ids to the slot ids of the hash join node's children,
so that the colocate join can be checked correctly.

* fix colocate join bug

* fix non vec colocate join issue

Co-authored-by: lichi <lichi@rateup.com.cn>

* add test cases

Co-authored-by: lichi <lichi@rateup.com.cn>
2022-12-20 09:44:47 +08:00
320b264c9d [feature](planner) compact multi-equals to in-predicate #14876 2022-12-20 09:43:34 +08:00
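The rewrite named in the commit above can be pictured as collapsing a chain such as `a = 1 OR a = 2 OR a = 3` into `a IN (1, 2, 3)`. The sketch below works on a hypothetical list of `(column, literal)` equality terms joined by OR; it is not the planner's actual rule, and any threshold the real rule may apply is not modeled.

```python
def compact_equals_to_in(disjuncts):
    """Group OR-ed equality terms by column and emit IN predicates.

    disjuncts: list of (column, literal) pairs, each meaning `column = literal`.
    Returns the rewritten predicate as a string.
    """
    by_col = {}
    for col, lit in disjuncts:
        vals = by_col.setdefault(col, [])
        if lit not in vals:  # drop duplicate literals
            vals.append(lit)

    parts = []
    for col, vals in by_col.items():
        if len(vals) == 1:
            parts.append(f"{col} = {vals[0]!r}")
        else:
            parts.append(f"{col} IN ({', '.join(repr(v) for v in vals)})")
    return " OR ".join(parts)


print(compact_equals_to_in([("a", 1), ("a", 2), ("a", 3)]))
```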
81c06e8edc [feature](nereids) add scalar function is_null_pred and is_not_null_pred (#15163) 2022-12-20 00:54:40 +08:00
918698151a [Fix](Nereids)fix be core when select constant expression (#15157)
Fix BE core dump when running `select !2`.
2022-12-20 00:48:11 +08:00
a84a590b4f [fix](nereids) estimate TimeStampArithmetic (#15061)
`select * from lineitem where  l_shipdate < date('1994-01-01') + interval '1' YEAR limit 1;`
causes a stack overflow.
2022-12-20 00:44:42 +08:00
4dece99c97 [fix](nereids)add estimation for full outer join (#14902) 2022-12-20 00:42:11 +08:00
a086f67255 [fix](nereids) stats calculator lost column statistics on limit node (#14759)
`select avg(id) from (select id from t1 limit 1);`
The SQL above hits an NPE because the stats for the limit node lose the column statistics.
2022-12-20 00:39:57 +08:00
21523f4db1 [fix](auth) fix bug that user info may be lost when upgrading to 1.2.0 (#15144)
* [fix](auth) fix bug that user info may be lost when upgrading to 1.2.0

* fix
2022-12-19 16:01:18 +08:00
f5823a90ff [fix](broker-load) fix broker load with hdfs failed to get right file type (#15138) 2022-12-19 16:00:58 +08:00
6be5670ce9 [Feature](multi catalog)Remove enable_multi_catalog config item, open this function to public. (#15130)
The multi-catalog feature is ready to use: remove the enable_multi_catalog switch from the FE config and open the feature to the public.
2022-12-19 14:29:13 +08:00
1597afcd67 [fix](multi-catalog) fix get many same name db/table when show where (#15076)
When running show databases/tables/table status where xxx, the statement is rewritten into a selectStmt that reads the result from
information_schema. It needs catalog info to scan the schema table; otherwise it may return
databases or tables with the same name from multiple catalogs.

For example:
mysql> show databases where schema_name='test';
+----------+
| Database |
+----------+
| test |
| test |
+----------+

MySQL [internal.test]> show tables from test where table_name='test_dc';
+----------------+
| Tables_in_test |
+----------------+
| test_dc |
| test_dc |
+----------------+
2022-12-19 14:27:48 +08:00
000972ae17 [fix](executor) fix some npe about getting catalog and add some error info (#15155) 2022-12-19 14:25:52 +08:00
7730a88d11 [fix](multi-catalog) add support for orc binary type (#15141)
Fix three bugs:
1. DataTypeFactory::create_data_type is missing the conversion of binary type, so OrcReader fails
2. ScalarType#createType is missing the conversion of binary type, so ExternalFileTableValuedFunction fails
3. fmt::format can't generate the right format string, and fails
2022-12-19 14:24:12 +08:00
e8bac706d3 [deps](FE)Upgrade the velocity version that hive-exec depends on to 2.3 (#15067) 2022-12-19 14:20:11 +08:00
b62a94ab46 [enhancement](metric)add one metric for the publish num per db (#14942)
Add a metric that tracks the number of published txns per db. Users can derive the relative txn processing speed per db from this metric and doris_fe_txn_num.
2022-12-19 14:18:11 +08:00
07f5d9562c [fix](brokerload) fix broker load failed caused by the error path (#15057) 2022-12-19 10:51:48 +08:00
17e14e9a63 [bug](udaf) fix java udaf incorrect get null value with row (#15151) 2022-12-19 10:07:12 +08:00
a75c302bdb [fix](schema) Fix create table error if Colocate tables not equal to bucket num (#15071)
Co-authored-by: hugoluo <hugoluo@tencent.com>
2022-12-19 09:24:14 +08:00
3506b568ff [Regression](multi catalog)P2 regression case for external hms catalog on emr. #15156 2022-12-19 09:21:48 +08:00
af4d9b636a [refactor](Nereids) Refactor aggregate function/plan/rules and support related cbo rules (#14827)
# Proposed changes

## refactor
- add AggregateExpression to shield the differences in AggregateFunction before and after disassembly
- request `GATHER` physicalProperties for the query: because a query always collects its result at the coordinator, using `GATHER` may select a better plan
- refactor `NormalizeAggregate`
- remove some physical fields from `LogicalAggregate`, such as `AggPhase` and `isDisassemble`
- remove `AggregateDisassemble` and `DistinctAggregateDisassemble`, and use `AggregateStrategies` to generate various kinds of PhysicalHashAggregate, such as `two phases aggregate` and `three phases aggregate`, so that cascades can automatically select the lowest-cost alternative
- move `PushAggregateToOlapScan` to `AggregateStrategies`
- separate the traverse and visit methods in FoldConstantRuleOnFE
  - if an expression does not implement the visit method, the traverse method can handle it and rewrite the children by default
  - if an expression implements the visit method, the user-defined traverse (which invokes the accept/visit method) returns quickly because the default visit method does not forward to the children, and the pre-processing in the traverse method is not skipped.

## new feature
- support `disable_nereids_rules` to skip some rules.

example:

1. create 1 bucket table `n`
```sql
CREATE TABLE `n` (
  `id` bigint(20) NOT NULL
) ENGINE=OLAP
DUPLICATE KEY(`id`)
COMMENT 'OLAP'
DISTRIBUTED BY HASH(`id`) BUCKETS 1
PROPERTIES (
"replication_allocation" = "tag.location.default: 1",
"in_memory" = "false",
"storage_format" = "V2",
"disable_auto_compaction" = "false"
);
```

2. insert some rows into `n`
```sql
insert into n select * from numbers('number'='20000000')
```

3. query table `n`
```sql
SET enable_nereids_planner=true;
SET enable_vectorized_engine=true;
SET enable_fallback_to_original_planner=false;
explain plan select id from n group by id;
```

The result shows that a one-stage aggregate is used:
```
| PhysicalHashAggregate ( aggPhase=LOCAL, aggMode=INPUT_TO_RESULT, groupByExpr=[id#0], outputExpr=[id#0], partitionExpr=Optional.empty, requestProperties=[GATHER], stats=(rows=1, width=1, penalty=2.0E7) ) |
| +--PhysicalProject ( projects=[id#0], stats=(rows=20000000, width=1, penalty=0.0) )                                                                                                                                                                                                |
|    +--PhysicalOlapScan ( qualified=default_cluster:test.n, output=[id#0, name#1], stats=(rows=20000000, width=1, penalty=0.0) )                                                                                                                                                    |
```

4. disable one stage aggregate
```sql
explain plan select
  /*+SET_VAR(disable_nereids_rules=DISASSEMBLE_ONE_PHASE_AGGREGATE_WITHOUT_DISTINCT)*/
  id
from n
group by id
```

The result is a two-stage aggregate:
```
| PhysicalHashAggregate ( aggPhase=GLOBAL, aggMode=BUFFER_TO_RESULT, groupByExpr=[id#0], outputExpr=[id#0], partitionExpr=Optional[[id#0]], requestProperties=[GATHER], stats=(rows=1, width=1, penalty=2.0E7) ) |
| +--PhysicalHashAggregate ( aggPhase=LOCAL, aggMode=INPUT_TO_BUFFER, groupByExpr=[id#0], outputExpr=[id#0], partitionExpr=Optional[[id#0]], requestProperties=[ANY], stats=(rows=1, width=1, penalty=2.0E7) )     |
|    +--PhysicalProject ( projects=[id#0], stats=(rows=20000000, width=1, penalty=0.0) )                                                                                                                                                                                                   |
|       +--PhysicalOlapScan ( qualified=default_cluster:test.n, output=[id#0, name#1], stats=(rows=20000000, width=1, penalty=0.0) )                                                                                                                                                       |
```
2022-12-18 21:49:29 +08:00
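The two plans in the commit above differ only in how the `group by id` is staged: a single aggregate over all rows, or a LOCAL aggregate per data fragment followed by a GLOBAL merge. The sketch below models that with plain Python sets; the fragment layout is hypothetical, and it is only meant to show that both stagings return the same groups.

```python
def one_phase_group_by(rows):
    """Single aggregate over all input rows (aggMode=INPUT_TO_RESULT)."""
    return sorted(set(rows))


def two_phase_group_by(fragments):
    """LOCAL aggregate per fragment, then a GLOBAL merge of the buffers."""
    # LOCAL phase (INPUT_TO_BUFFER): each fragment deduplicates its own rows.
    local_buffers = [set(fragment) for fragment in fragments]
    # GLOBAL phase (BUFFER_TO_RESULT): merge the partial results.
    merged = set().union(*local_buffers)
    return sorted(merged)


# Hypothetical split of the `id` column across three fragments.
fragments = [[1, 2, 2], [2, 3], [3, 3, 1]]
all_rows = [row for fragment in fragments for row in fragment]
print(one_phase_group_by(all_rows) == two_phase_group_by(fragments))
```

The local phase shrinks each fragment before anything is exchanged, which is why a cost-based optimizer may prefer it when the grouping key has few distinct values per fragment.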
6aba948df0 [fix](multi-catalog) hidden password for show create jdbc catalog (#15145)
When running show create catalog on a JDBC catalog, 'jdbc.password' is shown in plain text. Hide it, in the same way as other code that hides passwords.
2022-12-17 17:20:17 +08:00
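The fix described in the commit above amounts to masking sensitive property values before they are rendered. A minimal sketch follows; the helper name, the placeholder string, and the sample properties are all hypothetical, and only the `jdbc.password` key comes from the commit message.

```python
SENSITIVE_KEYS = {"jdbc.password"}  # key taken from the commit message
PLACEHOLDER = "*XXX"                # hypothetical masking string


def hide_sensitive(properties, sensitive_keys=SENSITIVE_KEYS):
    """Return a copy of properties with sensitive values masked."""
    return {
        key: PLACEHOLDER if key in sensitive_keys else value
        for key, value in properties.items()
    }


# Hypothetical catalog properties for illustration.
props = {"type": "jdbc", "jdbc.user": "root", "jdbc.password": "secret"}
print(hide_sensitive(props))
```

Returning a copy keeps the real password available to the code that opens the connection, while only the displayed form is masked.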
6d5251af78 [fix](subquery)fix bug of using constexpr as subquery's output (#15119) 2022-12-16 21:58:58 +08:00
4530b531e7 [fix](type) forbid time type when creating table (#15093) 2022-12-16 21:54:35 +08:00
67b9d469c1 [Bug](datev2) Fix compatible problems caused by datev2 (#15131)
This bug is introduced by #15094
2022-12-16 21:52:39 +08:00
3909970ce9 [fix](explain) fix explain output format problem (#15019) 2022-12-16 10:53:05 +08:00
728a238564 [vectorized](jdbc) fix external table of oracle with condition about … (#15092)
* [vectorized](jdbc) fix external table of oracle with condition about datetime report error

* formatter
2022-12-16 10:48:17 +08:00
0e1e5a802b [config](load) enable new load scan node by default (#14808)
Set FE `enable_new_load_scan_node` to true by default,
so that all load tasks (broker load, stream load, routine load, insert into) use FileScanNode instead of BrokerScanNode
to read data.

1. Support loading parquet file in stream load with new load scan node.
2. Fix bug that new parquet reader can not read column without logical or converted type.
3. Change the jsonb parser function to "jsonb_parse_error_to_null",
    so that if the input string is not a valid JSON string, the load task returns null for the jsonb column.
2022-12-16 09:41:43 +08:00
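Item 3 of the commit above describes an "error to null" parsing policy: invalid input yields null instead of failing the load. The sketch below shows that policy with Python's standard `json` module; the helper name is hypothetical and this is not the Doris function itself.

```python
import json


def parse_json_or_none(text):
    """Parse text as JSON, returning None instead of raising on bad input."""
    try:
        return json.loads(text)
    except (TypeError, ValueError):  # json.JSONDecodeError is a ValueError
        return None


rows = ['{"k": 1}', '{broken', '[1, 2, 3]']
print([parse_json_or_none(r) for r in rows])
```

One consequence of this policy is that a row holding the valid JSON literal `null` and a row holding malformed text both end up as null, so bad input is no longer distinguishable after loading.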
52e09e6b04 [fix](nereids) add hashcode and equal to TVFProperties to avoid duplicated error.(#15054) 2022-12-16 03:23:27 +08:00
5e0d44ff25 [fix](nereids) fix bug of expr rewrite and column prune rule of group by exprs (#15097) 2022-12-16 03:22:36 +08:00
8f914aa864 [feature](Nereids) support 'timestamp' type constructor (#15095)
sql like: select timestamp '2022-01-01 01:00:00' + interval '2' hours;
2022-12-16 03:20:56 +08:00
6ddbd204e7 [fix](Nereids): Update plan when prune column in DPHyp (#14880) 2022-12-15 21:59:55 +08:00
5ef4c42a80 [Bug](datev2) Fix wrong result when use datev2 as partition key (#15094) 2022-12-15 21:27:05 +08:00
bccea1c511 [Enhancement](partition prune): calculate the column ranges of compound predicates (#14886)
Doris does not handle disjunctive predicates well, which causes problems in partition pruning.
For example, SQL like the following triggers a full table scan without partition pruning:

select * from test.t1
where (dt between 20211121 and 20211122) or  (dt between 20211125 and 20211126)
2022-12-15 20:47:44 +08:00
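The idea in the commit above can be sketched as follows: compute the value ranges of `dt` implied by the OR of the two BETWEEN predicates, then keep only the partitions that intersect one of those ranges. The one-partition-per-day layout and the helper names are assumptions for illustration, not the actual Doris implementation.

```python
def ranges_of_or(*ranges):
    """Union closed integer ranges [lo, hi], merging any that overlap."""
    merged = []
    for lo, hi in sorted(ranges):
        if merged and lo <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged


def prune_partitions(partitions, column_ranges):
    """Keep the partitions whose [lo, hi] intersects any column range."""
    return [
        name
        for name, (p_lo, p_hi) in partitions.items()
        if any(p_lo <= hi and lo <= p_hi for lo, hi in column_ranges)
    ]


# Hypothetical layout: one partition per day from 20211120 to 20211127.
partitions = {f"p{d}": (d, d) for d in range(20211120, 20211128)}

# (dt between 20211121 and 20211122) or (dt between 20211125 and 20211126)
dt_ranges = ranges_of_or((20211121, 20211122), (20211125, 20211126))
print(prune_partitions(partitions, dt_ranges))
```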
6625e650c4 [fix](resource) HdfsStorage can get default.Fs from path or configuration (#15079) 2022-12-15 16:56:32 +08:00
face82e56a [fix](meta) fix bug that backend tag may change to default after upgrading to 1.2 (#15085) 2022-12-15 12:07:11 +08:00
67e4292533 [fix](iceberg-v2) icebergv2 filter data path (#14470)
1. An icebergv2 delete file may span many data paths, so the path of a file split is required as a predicate to filter the rows of the delete file
- create a delete file structure to save predicate parameters
- create a predicate for the file path
2. add some logs to print the row range
3. fix bug when creating file metadata
2022-12-15 10:18:12 +08:00
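Item 1 of the commit above relies on the fact that an Iceberg v2 position delete file stores `(file_path, pos)` rows, and one delete file can reference many data files. A minimal sketch of filtering by the split's path follows; the paths, the helper names, and the in-memory row representation are hypothetical.

```python
def deleted_positions(delete_rows, data_path):
    """Select the deleted row positions that belong to one data file."""
    return {pos for path, pos in delete_rows if path == data_path}


def apply_position_deletes(rows, delete_rows, data_path, first_row=0):
    """Drop the rows of a file split whose positions were deleted.

    first_row is the position in the data file of the split's first row.
    """
    deleted = deleted_positions(delete_rows, data_path)
    return [
        row
        for position, row in enumerate(rows, start=first_row)
        if position not in deleted
    ]


# Hypothetical delete file that covers two data files.
delete_rows = [("s3://b/a.parquet", 1), ("s3://b/b.parquet", 0), ("s3://b/a.parquet", 3)]
split_rows = ["r0", "r1", "r2", "r3"]
print(apply_position_deletes(split_rows, delete_rows, "s3://b/a.parquet"))
```

Without the path predicate, position 0 from `b.parquet` would wrongly delete the first row of `a.parquet`.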
46030d786a [Bug](datetimev2) Fix wrong result after insertion (#15052) 2022-12-15 09:54:18 +08:00
03847b6a3a [Feature](Api) Support operate node(fe/be). (#14904)
Support operating nodes (fe/be) via HTTP.
2022-12-14 23:18:56 +08:00
41838e6acb [fix](string-type) rectify string type' len to MAX_STRING_LENGTH (#14985)
cherry pick from #14587
2022-12-14 15:41:08 +08:00
9d6a81d1e3 [improvement](query)optimize select stmt with limit 0 (#14956) 2022-12-14 13:48:09 +08:00