Commit Graph

3223 Commits

Author SHA1 Message Date
562fb6db83 [fix](Nereids) event channel dead loop until queue is not empty (#14816) 2022-12-08 15:55:09 +08:00
1887881a61 [feature](Nereids) support push down no group agg to olap scan (#14683)
Use the zonemap to evaluate aggregates that have no group by expressions.
Supported aggregate functions:
- count
- min
- max

implementation in legacy planner: #12881
2022-12-08 15:34:39 +08:00
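The idea behind #14683 can be illustrated with a minimal sketch. It assumes each data segment keeps a small summary (row count plus the min and max of a column); the `ZoneMap` type and its fields are hypothetical and are not Doris's actual structures.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ZoneMap:
    """Hypothetical per-segment summary: row count plus column min/max."""
    num_rows: int
    min_value: Optional[int]
    max_value: Optional[int]


def aggregate_from_zone_maps(zone_maps: List[ZoneMap]) -> dict:
    """Answer count(*), min(col) and max(col) from summaries, without reading rows."""
    mins = [z.min_value for z in zone_maps if z.min_value is not None]
    maxs = [z.max_value for z in zone_maps if z.max_value is not None]
    return {
        "count": sum(z.num_rows for z in zone_maps),
        "min": min(mins) if mins else None,
        "max": max(maxs) if maxs else None,
    }
```

This only works when there is no group by and no filter that would exclude part of a segment, which is presumably why the commit restricts the pushdown to count, min, and max without grouping.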
2fb896d916 [feature](nereids) Support using join syntax (#14784) 2022-12-08 15:22:41 +08:00
Pxl
dbaa02d3a0 [Pipeline](fix) fix enable_pipeline_engine variable not work (#14909) 2022-12-08 14:52:52 +08:00
e62cc2ce76 [minor](typo) Fix typo (#14903) 2022-12-08 10:50:45 +08:00
27c8147a2b [fix](multi-catalog) use last used database for catalog when switch back (#14793)
Remember the last used database of every catalog and use it when switching back.
2022-12-08 10:32:30 +08:00
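The behavior described in #14793 can be sketched as a small per-session map from catalog name to last used database. The `SessionContext` class and its method names are hypothetical and only illustrate the bookkeeping, not the FE implementation.

```python
from typing import Dict, Optional


class SessionContext:
    """Hypothetical session state that remembers one database per catalog."""

    def __init__(self, default_catalog: str = "internal") -> None:
        self.current_catalog = default_catalog
        self.current_db: Optional[str] = None
        self._last_db: Dict[str, Optional[str]] = {}

    def use_db(self, db: str) -> None:
        self.current_db = db

    def switch_catalog(self, catalog: str) -> None:
        # Save the database of the catalog we are leaving...
        self._last_db[self.current_catalog] = self.current_db
        # ...and restore the one last used in the catalog we are entering.
        self.current_catalog = catalog
        self.current_db = self._last_db.get(catalog)
```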
962810b973 [Vectorized](jdbc) add check type for jdbc table (#14501) 2022-12-08 10:27:47 +08:00
56cc777087 [fix] 'SHOW ROLES' statement does not display resource privilege (#14812) (#14897) 2022-12-08 10:22:09 +08:00
167a20a03b [Minor](Planner): remove redundant SessionVariable (#14818) 2022-12-08 08:33:07 +08:00
6d6de0d408 [fix](multi-catalog) check new catalog name is used or not before rename (#14891) 2022-12-07 21:54:44 +08:00
0bc6f91c3a [Improvement](multi catalog)Add comment to external hms table show create table output. (#14861)
The show create table output for an hms external table did not contain the comment section.
This PR adds the comment to the output.
2022-12-07 21:12:30 +08:00
dfb02a7104 [refactor](statistics) Remove deprecated statistics related codes (#14797) 2022-12-07 20:41:00 +08:00
a3095e29d5 [fix](nereids)translate is not null predicate mistake (#14866)
The 'is not null' predicate was not translated correctly in ExpressionTranslator.
2022-12-07 20:14:13 +08:00
ec2539e2a3 [chore](macOS) Resolve the issue with missing python program (#14864) 2022-12-07 15:30:12 +08:00
Pxl
48a9166aa4 [Pipeline](sink) support olap table sink operator (#14872)
* support olap table sink operator

* update config
2022-12-07 15:29:56 +08:00
6b5e10c8be [fix](agg)having clause should use alias if there is no group by clause (#14831) 2022-12-07 14:13:17 +08:00
wxy
ad6a356a84 [fix](audit) fix duplicate audit log. (#14246)
fix duplicate audit log.
2022-12-07 13:54:21 +08:00
9d2cb133f2 [fix](jdbc) fix logger error of statusLogger unrecognized (#14854)
* [fix](jdbc) fix logger error of statusLogger unrecognized

* update
2022-12-07 11:43:05 +08:00
9e51e0263d [fix](memory leak) Fix load fragment QueryFragmentsCtx is not destroyed (#14840) 2022-12-07 08:45:53 +08:00
1304185adb [Regression](Fix) fix the regression of pipeline and ConcurrentModificationException failed (#14849)
* [fix](ut) try to fix ConcurrentModifycationException bug

* [Regression](Fix) fix the regression of pipeline and ConcurrentModificationException failed

Co-authored-by: morningman <morningman@163.com>
2022-12-06 15:34:32 +08:00
3e911a05b1 [fix](fe)fix select from temporary partition bug (#14809) 2022-12-06 14:32:35 +08:00
fb78807430 [Exec](Profile) Register to Fetch Result time and Write Result time in FE to debug (#14832) 2022-12-06 14:32:18 +08:00
e578e2cd98 [Enhancement](Nereids) Explain display extra information (#14802)
# Proposed changes

Issue Number: close #14554

## Problem summary

1. provide a function **Plan.extraPlans** that returns extra plans, eg: LogicalSubQueryAlias in LogicalCTE.
2. combine the extra plans and the children in the AbstractPlan.treeString(), distinguished by the * at the beginning.
```
========== PARSED PLAN ==========
LogicalCTE ( aliasQueries=[LogicalSubQueryAlias ( alias=s )] )
|-*LogicalSubQueryAlias ( alias=s )
|  +--LogicalProject ( projects=['s_suppkey] )
|     +--LogicalFilter ( predicates=('s_suppkey = '') )
|        +--LogicalCheckPolicy ( child=UnboundRelation ( nameParts=supplier ) )
|           +--UnboundRelation ( nameParts=supplier )
+--LogicalProject ( projects=[*] )
   +--LogicalJoin ( type=CROSS_JOIN, hashJoinConjuncts=[], otherJoinConjuncts=[] )
      |--LogicalSubQueryAlias ( alias=t1 )
      |  +--LogicalCheckPolicy ( child=UnboundRelation ( nameParts=s ) )
      |     +--UnboundRelation ( nameParts=s )
      +--LogicalSubQueryAlias ( alias=t2 )
         +--LogicalCheckPolicy ( child=UnboundRelation ( nameParts=s ) )
            +--UnboundRelation ( nameParts=s )
```
2022-12-06 12:28:40 +08:00
db4524c10e [Bug](date function) Fix date_add function (#14826) 2022-12-05 20:34:20 +08:00
ed96442b85 [fix](multi-catalog) fix persist issue about jdbc catalog and class loader issue #14794
Fix a bug: JDBC catalog/database/table should be added to GsonUtil.

Fix a class loader issue that sometimes causes ClassNotFoundException.

Fix the regression test to use a different catalog name.

Comment out 2 regression tests, which need to be fixed later:

regression-test/suites/query_p0/system/test_query_sys.groovy
regression-test/suites/statistics/alter_col_stats.groovy
2022-12-05 09:05:13 +08:00
5be8f9432e [fix](DOE) Support ES index which contains dynamic_templates (#14762)
Support ES indexes that contain dynamic_templates. Indexes without an explicit mapping are not supported.
2022-12-05 08:33:51 +08:00
852b03729f [Improvement](meta)add IsCurrent column in show catalogs result #14700
When a user has multiple catalogs and switches between them several times, it is easy to forget which catalog is in use. This adds an IsCurrent column to the show catalogs result to help.

mysql> show catalogs;
+-----------+-------------+----------+-----------+
| CatalogId | CatalogName | Type     | IsCurrent |
+-----------+-------------+----------+-----------+
| 136591    | es          | es       |           |
| 130100    | hive        | hms      | yes       |
| 0         | internal    | internal |           |
+-----------+-------------+----------+-----------+
2022-12-05 08:32:16 +08:00
ce95da8dfb [improvement](multi-catalog) support specify hadoop username (#14734)
Support setting "hadoop.username" property when creating hms catalog.
2022-12-04 21:09:39 +08:00
97dcd2b13a [feature](nereids) merge proj-proj in post process (#14730)
* merge proj-proj

* v2

This PR guarantees that the physical plan does not contain consecutive physical projects.
It is similar to the rewrite rule "merge projects", except that it works on the physical plan rather than the logical plan.

* move merge-proj code into Project.java
2022-12-03 23:41:02 +08:00
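Merging two consecutive projects, as in #14730, amounts to substituting the inner projection into the outer one. The sketch below is a simplified illustration that handles only plain column references; the dictionary representation is an assumption made for this example and is unrelated to the classes in Project.java.

```python
from typing import Dict


def merge_projects(outer: Dict[str, str], inner: Dict[str, str]) -> Dict[str, str]:
    """Collapse two stacked projections into one.

    Each projection maps an output column name to the input column it reads.
    The outer projection reads the outputs of the inner projection, so the
    merged projection reads the inner projection's inputs directly.
    """
    return {name: inner[source] for name, source in outer.items()}
```

For example, an inner project `{"a": "t.x", "b": "t.y"}` under an outer project `{"c": "a"}` collapses into the single project `{"c": "t.x"}`.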
283b23f6da [fix](planner) wrong results when select from view which has with clause (#14747) 2022-12-02 18:10:52 +08:00
12304bc0ee [Pipeline](exec) Support pipeline exec engine (#14736)
Co-authored-by: Lijia Liu <liutang123@yeah.net>
Co-authored-by: HappenLee <happenlee@hotmail.com>
Co-authored-by: Jerry Hu <mrhhsg@gmail.com>
Co-authored-by: Pxl <952130278@qq.com>
Co-authored-by: shee <13843187+qzsee@users.noreply.github.com>
Co-authored-by: Gabriel <gabrielleebuaa@gmail.com>

## Problem Summary:

### 1. Design

DSIP: https://cwiki.apache.org/confluence/display/DORIS/DSIP-027%3A+Support+Pipeline+Exec+Engine

### 2. How to use:

Set the session variable: `set enable_pipeline_engine = true;`
2022-12-02 17:11:34 +08:00
e9799fab09 [refactor](datev2) refine function expr for datev2 (#14697)
* [refactor](datev2) refine function expr for datev2

* update
2022-12-02 10:13:11 +08:00
228e9ed01c [fix](improvement)(meta) fix alter catalog properties issues and reformat code (#14745)
1. Fix NPE exception #14740.
2. Fix the following issue:
mysql> alter catalog xyz set properties ('hive.metastore.uris'='thrift://172.21.0.1:7004');
ERROR 1105 (HY000): errCode = 2, detailMessage = Can't modify the type of catalog property with name: xyz
3. Change behavior. Originally, the props in the set properties clause replaced all existing props. Now only the props listed in the set properties clause are replaced, and new props are added. This matches the behavior of the alter table property stmt.
2022-12-02 09:34:13 +08:00
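The behavior change in item 3 of #14745 is the difference between replacing a property map and merging into it. A minimal sketch, with hypothetical function names and example property values:

```python
from typing import Dict


def replace_all_props(existing: Dict[str, str], updates: Dict[str, str]) -> Dict[str, str]:
    """Old behavior: the set properties clause replaces every existing prop."""
    return dict(updates)


def merge_listed_props(existing: Dict[str, str], updates: Dict[str, str]) -> Dict[str, str]:
    """New behavior: only listed props are replaced, and new props are added."""
    merged = dict(existing)
    merged.update(updates)
    return merged
```

With the old behavior, setting only `hive.metastore.uris` would drop every other property of the catalog, including `type`; with the merge, unlisted properties are kept.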
e5000c708e [feature](statistics) Support for collecting statistics on materialized view (#14676)
1. Map multiple tasks to one Job
2. Remove the code for analyzing the whole default db, since this feature is not available, would create too many tasks, and the related code is confusing
3. Support analyzing materialized views
4. Abstract the common logic into BaseTask
2022-12-01 22:34:13 +08:00
2be8235d95 [feature](nereids) support timestampdiff function (#14662)
complete timeStampDiff
supported timeunit:
 - YEAR
 - MONTH
 - WEEK
 - DAY
 - HOUR
 - MINUTE
 - SECOND
2022-12-01 22:11:55 +08:00
14e208354d [Feature](Nereids) support nereids event for logging the cascades states and transformation. (#13659)
Add an event producer, channel, and consumer system to log the cascades states and transformations. You can turn it on using:
set enable_nereids_event = true;
For more information, see fe/fe-core/src/main/java/org/apache/doris/nereids/metrics/README.md
2022-12-01 21:42:40 +08:00
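The producer, channel, and consumer arrangement named in #13659 follows a common pattern: producers put events on a queue, and the channel hands each queued event to the registered consumers. The sketch below is a generic, single-threaded illustration; the `EventChannel` class and its methods are hypothetical and do not mirror the Nereids classes.

```python
import queue
from typing import Callable, List


class EventChannel:
    """Hypothetical channel that buffers events and fans them out to consumers."""

    def __init__(self) -> None:
        self._queue: "queue.Queue[str]" = queue.Queue()
        self._consumers: List[Callable[[str], None]] = []

    def add_consumer(self, consumer: Callable[[str], None]) -> None:
        self._consumers.append(consumer)

    def publish(self, event: str) -> None:
        # Producers only enqueue; they never call consumers directly.
        self._queue.put(event)

    def drain(self) -> int:
        """Deliver every queued event and return how many were delivered."""
        delivered = 0
        while True:
            try:
                event = self._queue.get_nowait()
            except queue.Empty:
                return delivered
            for consumer in self._consumers:
                consumer(event)
            delivered += 1
```

The loop exits as soon as the queue is empty rather than spinning, which is the kind of termination condition a channel like this has to get right.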
302da03b18 [enhancement](Nereids): Use long bitmap in DPHyp (#14725) 2022-12-01 20:47:45 +08:00
9dd1d989e8 [test](decimalv3) add regression test cases for decimalv3 (#14672) 2022-12-01 15:18:40 +08:00
f496d1972a [improvement](multi-catalog) return root cause of exception (#14708) 2022-12-01 14:58:05 +08:00
3c6b96b9be [enhancement](Nereids) avoid add project that output same with child to memo (#14180) 2022-12-01 10:49:44 +08:00
36737fe9f4 [feature](Nereids): Add cache to avoid repeatly calculation in DPhyp (#14585) 2022-11-30 21:35:45 +08:00
9bbbcf031c [enhancement](k8s) Support fqdn mode for be in k8s enviroment (#9172)
In a k8s environment, the ip of a pod can change, but the hostname of the pod is stable. When the host machine of a pod fails, k8s can reschedule the failed pod onto a new host machine and rebuild it. The newly created pod keeps the same hostname, but its ip address changes. When enable_fqdn_mode is true, FQDNManager detects the change of the be node's ip address.

Co-authored-by: caiconghui1 <caiconghui1@jd.com>
2022-11-30 20:42:15 +08:00
593a916ae6 [feature](nereids) split AggregateDisassemble into two rules (#14611)
# Proposed changes

Issue Number: close #14280

## Problem summary

The AggregateDisassemble rule is refactored and split into two rules, which do not depend on each other.
1. AggregateDisassemble splits the agg into two phases: Local, Global.
1.1. For the count function, the implementation is as follows: distinct_multi_count(update) + distinct_multi_count(merge)

2. DistinctAggregateDisassemble splits the agg into 4 stages: Local, Global, Distinct Local, Distinct Global.
2.1. For the count function, the implementation is as follows: distinct_multi_count(update) + distinct_multi_count(merge) + sum(update) + sum(merge)
2022-11-30 14:02:42 +08:00
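The Local and Global phases mentioned in #14611 follow the usual two-phase aggregation pattern: each node computes partial results over its own rows, and a final step merges the partials. The sketch below shows this for a grouped count; it is a generic illustration with hypothetical function names and does not reproduce the distinct_multi_count functions named in the commit.

```python
from collections import Counter
from typing import Iterable, List


def local_aggregate(group_keys: Iterable[str]) -> Counter:
    """Local phase: count rows per group key on a single node."""
    return Counter(group_keys)


def global_aggregate(partials: List[Counter]) -> Counter:
    """Global phase: merge the partial counts produced by every node."""
    total: Counter = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds counts per key
    return total
```

Splitting the work this way lets each node shrink its data before anything is sent over the network, since only one partial count per group key leaves the node.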
3ca3af2234 [improvement](planner)sort show catalogs result by name (#14684)
The results of show databases, show tables, and show data are all sorted by name, so make show catalogs behave the same way.
2022-11-30 11:55:14 +08:00
3ff409551c [enhencement](netty) bind netty's default logger when launching fe (#14675)
Doris FE uses log4j as its logger, while netty might use slf4j to choose a logger.
Some confusing behavior has been reported under such circumstances.
The binding does not take effect if the bind logic is moved to another file or to another place within PaloFe.java,
so it has to stay before the main function.
2022-11-30 11:54:39 +08:00
9272680d00 [feature](multi-catalog) support Jdbc catalog (#14527)
Issue Number: close #xxx

Add a jdbc catalog for the doris multi-catalog feature.
Currently, the jdbc catalog only supports the MySQL DBMS.

TODO:

- Support for PostgreSQL
- Support for other databases

Problem summary

For the jdbc catalog, we can create a catalog like:

CREATE CATALOG jdbc4 PROPERTIES (
    "type"="jdbc",
    "jdbc.user"="root",
    "jdbc.password"="123456",
    "jdbc.jdbc_url" = "jdbc:mysql://127.0.0.1:13396/demo?yearIsDateType=false",
    "jdbc.driver_url" = "file:/mnt/disk2/ftw/tools/jar/mysql-connector-java-5.1.47/mysql-connector-java-5.1.47.jar",
    "jdbc.driver_class" = "com.mysql.jdbc.Driver"
);
Note:
yearIsDateType is a parameter of the jdbc url:
If the yearIsDateType configuration property is set to false, then the returned object type is java.sql.Short. If set to true (the default), then the returned object is of type java.sql.Date with the date set to January 1st, at midnight.
For compatibility with mysql, FE forces yearIsDateType=false. If the user sets yearIsDateType=true, doris FE changes it to yearIsDateType=false.
2022-11-30 11:28:08 +08:00
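Forcing yearIsDateType=false, as the note in #14527 describes, can be sketched as a rewrite of the query string of the jdbc url. The function below is a hypothetical illustration of that rewrite, not the FE code, and it assumes the parameters are separated by `&` as in the example url.

```python
def force_year_is_date_type_false(jdbc_url: str) -> str:
    """Return the url with yearIsDateType=false, whatever the user supplied."""
    base, _, query = jdbc_url.partition("?")
    # Keep every parameter except any existing yearIsDateType setting.
    params = [
        param
        for param in query.split("&")
        if param and param.split("=", 1)[0] != "yearIsDateType"
    ]
    params.append("yearIsDateType=false")
    return base + "?" + "&".join(params)
```

Removing the user's value first, instead of appending a second one, avoids relying on how the driver resolves duplicate parameters.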
82f3980774 [feature](Nereids) estimation without column statistics (#14526)
Estimate plan cost without column statistics.
Change list:
1. Remove the original StatsCalculator, which is replaced by StatsCalculatorV2. Rename StatsCalculatorV2 to StatsCalculator
2. Remove FilterSelectivityCalculator, which is replaced by FilterEstimation
3. Remove session var: ENABLE_NEREIDS_STATS_DERIVE_V2
4. Add ColumnStatistics.isUnKnown, which means the column has not been analyzed and its stats are not accurate
5. Add estimatedRowCount() function for OLAP tables
6. Add unit tests for FilterEstimation and StatsCalculator
2022-11-30 11:27:51 +08:00
3a362fab76 [fix](fe)table function node use wrong info for projection (#14667) 2022-11-30 10:41:32 +08:00
ca90253b09 [config](storage-policy) add a FE config to disable storage policy by default (#14655)
The cold-hot separation feature is still under development, and some issues remain unsolved.
So this adds an fe config enable_storage_policy, false by default, to disable the creation and usage of storage policies by default.

This makes users aware that they are using an experimental feature at their own risk, and that it will not be formally released in v1.2.0.

Disable storage policy by default: users cannot use or create storage policies. Configured by enable_storage_policy.

Remove the property remote_storage_policy, which duplicates storage_policy.

Change the persist field in DataProperty.java.
Also remove remoteCooldownTime from DataProperty, because it can be obtained from StoragePolicy.
2022-11-30 10:04:33 +08:00
dd7ec8f4ca [improvement](test) add tpch1 orc for hive catalog and refactor some test dir (#14669)
Add a tpch 1g orc test case in hive docker.

Refactor some suite dirs of the catalog test cases.

Add "-internal" to the dlf endpoint, to support accessing oss within an aliyun vpc.
2022-11-30 10:03:58 +08:00