Commit Graph

832 Commits

Author SHA1 Message Date
a2c8d14fd9 [Bug] Partition key's type has been changed after executing queries (#3348)
Expr's `uncheckedCastTo()` method should return a new instance of the cast expr.
The original expr should remain unchanged.
2020-04-21 08:30:02 +08:00
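A minimal sketch of the "cast returns a new instance" idea from the commit above. The `Expr` class and `unchecked_cast_to` method here are hypothetical stand-ins written for illustration; they are not the Doris classes.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Expr:
    """Hypothetical stand-in for an expression node; not the Doris class."""
    name: str
    type_name: str

    def unchecked_cast_to(self, target_type: str) -> "Expr":
        # Return a copy carrying the new type; `self` is never modified.
        return replace(self, type_name=target_type)


partition_key = Expr(name="k1", type_name="INT")
casted = partition_key.unchecked_cast_to("BIGINT")
print(partition_key.type_name, casted.type_name)
```

Because the cast produces a copy, the partition key keeps its original type no matter how many queries cast it.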
46272a5621 [Bug] Fix bug of TransactionState SerDe error (#3356)
The TransactionState's coordinator should be created when it is deserialized from
old metadata.
2020-04-21 08:24:10 +08:00
94b7bb5ad6 [Bug][Dynamic Partition] Fix bug that dynamic partition properties are not consistent (#3359) 2020-04-20 23:52:47 +08:00
c69bf9ac44 [New Stmt] Add SHOW KEYS grammar (#3342)
Support `SHOW KEYS FROM table` for the data connectors of mainstream BI tools
such as PowerBI and FineBI.

#3334
2020-04-20 15:58:20 +08:00
753d6cc19f Add LOG.isDebugEnabled for some debug logic of Coordinator (#3352)
This may slightly affect performance, if at all.
2020-04-20 08:30:57 +08:00
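The guard pattern named in the commit above can be sketched with Python's standard `logging` module, whose `isEnabledFor` plays the role of `LOG.isDebugEnabled`. The logger name and the `describe_fragments` helper are made up for this illustration.

```python
import logging

logger = logging.getLogger("coordinator_sketch")
logger.setLevel(logging.INFO)  # debug output is disabled

calls = {"count": 0}


def describe_fragments() -> str:
    """Stand-in for an expensive string-building step (hypothetical)."""
    calls["count"] += 1
    return ", ".join(f"fragment-{i}" for i in range(1000))


# Unguarded: the argument is built even though the message is dropped.
logger.debug("fragments: %s", describe_fragments())

# Guarded: the expensive call is skipped when debug logging is off.
if logger.isEnabledFor(logging.DEBUG):
    logger.debug("fragments: %s", describe_fragments())

print(calls["count"])
```

Only the unguarded call pays for building the message, which is the cost the guard is meant to avoid.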
929e93699a Fix Colocate Join Bug (#3354)
1. Fix the error when syncing colocate group status between FEs
2. Fix the missing call to EditLog.logColocateRemoveTable
2020-04-20 08:29:34 +08:00
c223d37c99 [Delete] Make some corrections in delete operation (#3338)
#3190
1. Correct the directory of DeleteJob.java
2. Fix some logic faults in DeleteHandlerTest.java
3. Add timeout value in log and exception
2020-04-19 11:49:02 +08:00
77a7037346 Fix cooldown timestamp bug (#3336)
When adding a partition with the storage_cooldown_time property like this:
alter table tablexxx ADD PARTITION p20200421 VALUES LESS THAN("1588262400") ("storage_medium" = "SSD", "storage_cooldown_time" = "2020-05-01 00:00:00");
and then running show partitions from tablexxx;
the CooldownTime is wrong: 2610-02-17 10:16:40. What is more, storage migration is based on the wrong timestamp.
The reason is that the result of DateLiteral.getLongValue is not a timestamp.
2020-04-18 22:47:22 +08:00
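The wrong value reported above can be reproduced with a short sketch. It assumes that `DateLiteral.getLongValue` returns the date packed as `yyyyMMddHHmmss` and that the packed number was then read as epoch milliseconds; both are inferences from the reported output, not taken from the Doris source.

```python
from datetime import datetime, timedelta, timezone

CST = timezone(timedelta(hours=8))  # the +08:00 offset used in this log
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

cooldown = datetime(2020, 5, 1, 0, 0, 0, tzinfo=CST)

# Packed form yyyyMMddHHmmss (assumed to be what getLongValue returns).
packed = int(cooldown.strftime("%Y%m%d%H%M%S"))

# Misreading the packed number as epoch milliseconds gives a far-future date.
wrong = (EPOCH + timedelta(milliseconds=packed)).astimezone(CST)

# A real epoch-millisecond timestamp round-trips to the intended time.
epoch_ms = (cooldown - EPOCH) // timedelta(milliseconds=1)
right = (EPOCH + timedelta(milliseconds=epoch_ms)).astimezone(CST)

print(packed)
print(wrong.strftime("%Y-%m-%d %H:%M:%S"))
print(right.strftime("%Y-%m-%d %H:%M:%S"))
```

Under these assumptions the misread value lands on 2610-02-17 10:16:40, the same CooldownTime the message reports.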
0624f6b9eb [Doris On ES]Add simple explain for EsTable (#3341)
related issue: #3306
Note: this PR just removes es_scan_node_test.cpp, which is useless.

For the moment, this only adds a simple explain syntax for EsTable, without translating the native predicates to ES query DSL. That translation is better done together with moving predicate translation from Doris BE to Doris FE; the whole work is still WIP.
2020-04-18 10:04:03 +08:00
9331574818 [Transaction] Cancel all txns whose coordinate BE is down. (#3293)
This CL solves the following problem:
- FE cannot detect that the coordinator BE is down and cancel its txns, even though those txns can no longer finish.
- Do some code style refactoring

NOTICE: FE meta version is upgraded to 83
2020-04-17 11:24:03 +08:00
224f5d8bad [SegmentV1] Enable reading and writing boolean type data (#3324)
This PR enables reading and writing boolean type data for segment v1.
2020-04-16 23:39:08 +08:00
b29cb9dbb3 [Optimize][Delete] Simplify the delete process to make it fast (#3191)
Our current DELETE strategy reuses the LoadChecker framework.
LoadChecker runs jobs in different stages by polling them every 5 seconds.

There are four stages of a load job, Pending/ETL/Loading/Quorum_finish,
each of which is allocated to a LoadChecker. For example, if a load job is submitted,
it is initialized to the Pending state and then waits to be run by the Pending LoadChecker.
After the pending job has run, its stage changes to ETL, and it then waits to be
run by the next LoadChecker (ETL). Because the polling interval of the LoadChecker is 5s,
in the worst case a job needs to wait for 20s during its life cycle.

In particular, DELETE jobs do not need to wait for polling; they can run the pushTask()
function directly to delete. In this commit, I add a delete handler to process
delete tasks concurrently.

All delete tasks are pushed to BE immediately, without waiting for a LoadChecker.
Since a delete job used to start in the LOADING state and wait for 2 LoadCheckers,
at most 10s is saved (5s per LoadChecker). The delete process is now synchronous,
and users get a response only after the delete has finished or been cancelled.

If a delete is running over a certain period of time, it will be cancelled with a timeout exception.

NOTICE: this CL upgrades FE meta version to 82
2020-04-16 10:32:44 +08:00
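The waiting times quoted in the commit above follow from simple arithmetic, sketched here. The stage list and the 5-second interval come from the message; the `worst_case_wait` helper is hypothetical.

```python
POLL_INTERVAL_S = 5  # LoadChecker polling interval from the message above
STAGES = ["PENDING", "ETL", "LOADING", "QUORUM_FINISH"]


def worst_case_wait(start_stage: str) -> int:
    """Upper bound on polling wait for a job that enters at `start_stage`."""
    remaining = STAGES[STAGES.index(start_stage):]
    return POLL_INTERVAL_S * len(remaining)


print(worst_case_wait("PENDING"))  # load job passing all four checkers
print(worst_case_wait("LOADING"))  # delete job under the old strategy
```

A job entering at PENDING can wait up to 20s in total, while a delete job entering at LOADING could wait up to 10s, which is the saving the message describes.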
e61793763a [Bug] Use equals() method to judge whether "type" are equal (#3310)
I don't know why, but I found that sometimes when I use "==" to judge the equality of types,
it returns false, even if the types are exactly the same.

ISSUE: #3309

This CL only changes == to equals() to solve the problem, but the reason is still unknown.
2020-04-15 15:04:13 +08:00
9257535f91 [New Feature] Support setting replica quota in db level (#3283)
This PR limits replica usage. Admins need to know the replica usage of every db and
table, and be able to set a replica quota for every db.

```
ALTER DATABASE db_name SET REPLICA QUOTA quota; 
```
2020-04-14 22:25:32 +08:00
a467c6f81f [ES Connector] Add field context for string field keyword type (#3305)
This PR is just a transitional solution. It would be better to move the predicate transformation from Doris BE to Doris FE, so that Doris BE is only responsible for fetching data from ES.

Add an `enable_keyword_sniff` configuration item for creating an external Elasticsearch table. It defaults to true and sniffs the `keyword` type on `text` analyzed fields, returning the `json_path` that substitutes for the original column name.

```
CREATE EXTERNAL TABLE `test` (
  `k1` varchar(20) COMMENT "",
  `create_time` datetime COMMENT ""
) ENGINE=ELASTICSEARCH
PROPERTIES (
"hosts" = "http://10.74.167.16:8200",
"user" = "root",
"password" = "root",
"index" = "test",
"type" = "doc",
"enable_keyword_sniff" = "true"
);
```
Note: `enable_keyword_sniff` defaults to "true".

Run this SQL:

```
select * from test where k1 = "wu yun feng"
```
 Output predicate DSL:

```
{"term":{"k1.keyword":"wu yun feng"}}
```
In this PR, I also remove the Elasticsearch version detection logic, which is useless for now; it may be needed in the future.
2020-04-13 23:07:33 +08:00
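A rough sketch of how a sniffed field context could change the generated predicate in the commit above. The `term_predicate` function and the shape of the `sniffed` mapping are assumptions for illustration; only the resulting DSL string comes from the message.

```python
import json


def term_predicate(column: str, value: str, field_context: dict) -> str:
    """Build a term query, using the sniffed field path when one exists."""
    # `field_context` maps a column name to its `.keyword` sub-field path.
    path = field_context.get(column, column)
    return json.dumps({"term": {path: value}}, separators=(",", ":"))


# Assumed sniff result for the `test` index: k1 is a text field with a
# keyword sub-field.
sniffed = {"k1": "k1.keyword"}

print(term_predicate("k1", "wu yun feng", sniffed))
print(term_predicate("k1", "wu yun feng", {}))
```

With the sniffed context the predicate targets `k1.keyword`; without it the predicate falls back to the original column name.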
7c07083cd5 Forbid multiple subqueries in having clause (#3291)
Multiple subqueries in a having clause need to be rewritten into joins over multiple tables, which requires changing the current rewrite rules.
This way of writing queries is uncommon, and there is no strong requirement for it from the business side.
Support will be added later if it is required.
2020-04-11 21:56:08 +08:00
5b69c70f9a [Bug] Fix bug that user plugin dir is removed after installing the plugin (#3302)
When a user installs an FE plugin from a directory, the directory should not
be removed after installing.
2020-04-11 20:30:14 +08:00
3086790e06 Fix bug when using ZoneMap/BloomFilter on column with REPLACE/REPLACE_IF_NOT_NULL (#3288)
Currently, columns with REPLACE/REPLACE_IF_NOT_NULL can be filtered by ZoneMap/BloomFilter
when the rowset is a base rowset (its version starts with zero). We always considered this an optimization,
but in some cases it causes a bug.

create table test(
  k1 int,
  v1 int replace,
  v2 int sum
);
Suppose there are two records in two different versions:

1 2 2 on version [0-10]
1 3 1 on version 11
If I perform a query

select * from test where k1 = 1 and v1 = 3;
The result will be 1 3 1, which is wrong because the first record has been filtered out.
The right answer is 1 3 3: v2 should be summed.
Removing this optimization is necessary to make the result correct.
2020-04-10 10:22:21 +08:00
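The wrong and right results described in the commit above can be simulated in a few lines. This is only a model of the aggregation semantics (REPLACE keeps the newest value, SUM adds); the `aggregate` and `matches` helpers are hypothetical and not how the storage engine is implemented.

```python
# Rows are (k1, v1, v2); v1 uses REPLACE, v2 uses SUM, as in the table above.
base_rowset = [(1, 2, 2)]    # version [0-10]
delta_rowset = [(1, 3, 1)]   # version 11


def aggregate(rowsets):
    """Merge rowsets in version order: REPLACE keeps the newest v1, SUM adds v2."""
    merged = {}
    for rowset in rowsets:
        for k1, v1, v2 in rowset:
            _, old_v2 = merged.get(k1, (None, 0))
            merged[k1] = (v1, old_v2 + v2)
    return [(k1, v1, v2) for k1, (v1, v2) in merged.items()]


def matches(row):
    return row[0] == 1 and row[1] == 3   # where k1 = 1 and v1 = 3


# Buggy order: filter the base rowset on v1 before aggregation.
filtered_base = [r for r in base_rowset if matches(r)]
buggy = [r for r in aggregate([filtered_base, delta_rowset]) if matches(r)]

# Correct order: aggregate first, then apply the predicate on v1.
correct = [r for r in aggregate([base_rowset, delta_rowset]) if matches(r)]

print(buggy)
print(correct)
```

Filtering on the REPLACE column before merging drops the base row and loses its contribution to the SUM column.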
ce1d5ab9ab [Bug] Fix some bugs of install/uninstall plugins (#3267)
1. Avoid losing plugin if plugin failed to load when replaying
    When in replay process, the plugin should always be added to the plugin manager,
    even if that plugin failed to be loaded.

2. `show plugin` statement should show all plugins, not only the successfully installed plugins.

3. plugin's name should be unique globally and case insensitive.

4. Avoid creating new instances of plugins when doing metadata checkpoint.

5. Add a __builtin_ prefix for builtin plugins.
2020-04-09 23:04:28 +08:00
037bc53b54 [BUG] Fix cast result expr bug (#3279)
When the result type is a date type, the result expr type should not be cast.
Because in the FE function, the specific type of the date type is determined by the actual
type of the return value, not by the function return value type.

For example, the function `str_to_date` may return DATE or DATETIME, depending on the
format pattern.

DATE:
```
mysql> select str_to_date('11/09/2011', '%m/%d/%Y');
+---------------------------------------+
| str_to_date('11/09/2011', '%m/%d/%Y') |
+---------------------------------------+
| 2011-11-09                            |
+---------------------------------------+
```

DATETIME:
```
mysql> select str_to_date('2014-12-21 12:34:56', '%Y-%m-%d %H:%i:%s');
+---------------------------------------------------------+
| str_to_date('2014-12-21 12:34:56', '%Y-%m-%d %H:%i:%s') |
+---------------------------------------------------------+
| 2014-12-21 12:34:56                                     |
+---------------------------------------------------------+
```
2020-04-09 22:02:05 +08:00
8699bb7bd4 [Query] Optimize where clause by extracting the common predicate in the OR compound predicate. (#3278)
Queries like the one below cannot finish in an acceptable time. `store_sales` has 28 million rows and `customer_address` has 50,000 rows. For now Doris creates only one cross join node to execute this SQL.
Evaluating the where clause takes about 200-300 ns, and it is evaluated 28 million * 50,000 times, which is extremely large: 28,000,000 * 50,000 * 250 ns is about 350,000 seconds (roughly 4 days).

```
select avg(ss_quantity)
       ,avg(ss_ext_sales_price)
       ,avg(ss_ext_wholesale_cost)
       ,sum(ss_ext_wholesale_cost)
 from store_sales, customer_address 
 where  ((ss_addr_sk = ca_address_sk
  and ca_country = 'United States'
  and ca_state in ('CO', 'IL', 'MN')
  and ss_net_profit between 100 and 200  
     ) or
     (ss_addr_sk = ca_address_sk
  and ca_country = 'United States'
  and ca_state in ('OH', 'MT', 'NM')
  and ss_net_profit between 150 and 300  
     ) or
     (ss_addr_sk = ca_address_sk
  and ca_country = 'United States'
  and ca_state in ('TX', 'MO', 'MI')
  and ss_net_profit between 50 and 250  
     ))
```

But this SQL can be rewritten to:
```
select avg(ss_quantity)
       ,avg(ss_ext_sales_price)
       ,avg(ss_ext_wholesale_cost)
       ,sum(ss_ext_wholesale_cost)
 from store_sales, customer_address 
 where ss_addr_sk = ca_address_sk
  and ca_country = 'United States' and (((ca_state in ('CO', 'IL', 'MN')
  and ss_net_profit between 100 and 200  
     ) or
     (ca_state in ('OH', 'MT', 'NM')
  and ss_net_profit between 150 and 300  
     ) or
     (ca_state in ('TX', 'MO', 'MI')
  and ss_net_profit between 50 and 250  
     ))
 )
```
Therefore we can do a hash join first and then use
```
(((ca_state in ('CO', 'IL', 'MN')
  and ss_net_profit between 100 and 200  
     ) or
     (ca_state in ('OH', 'MT', 'NM')
  and ss_net_profit between 150 and 300  
     ) or
     (ca_state in ('TX', 'MO', 'MI')
  and ss_net_profit between 50 and 250  
     ))
 )
```
to filter the rows.

On the TPC-DS 10 GB dataset, the rewritten SQL takes only about 1 second.
2020-04-09 21:57:45 +08:00
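The rewrite in the commit above amounts to factoring out the conjuncts shared by every OR branch. A minimal sketch, treating predicates as plain strings; the `extract_common` helper is hypothetical and is not the Doris rewrite rule itself.

```python
def extract_common(disjuncts):
    """Split OR-ed conjunct lists into (common conjuncts, per-branch remainders)."""
    first, others = disjuncts[0], disjuncts[1:]
    common = [p for p in first if all(p in d for d in others)]
    rest = [[p for p in d if p not in common] for d in disjuncts]
    return common, rest


# Each inner list is one OR branch; predicates are plain strings for brevity.
where = [
    ["ss_addr_sk = ca_address_sk", "ca_country = 'United States'",
     "ca_state in ('CO', 'IL', 'MN')", "ss_net_profit between 100 and 200"],
    ["ss_addr_sk = ca_address_sk", "ca_country = 'United States'",
     "ca_state in ('OH', 'MT', 'NM')", "ss_net_profit between 150 and 300"],
    ["ss_addr_sk = ca_address_sk", "ca_country = 'United States'",
     "ca_state in ('TX', 'MO', 'MI')", "ss_net_profit between 50 and 250"],
]

common, rest = extract_common(where)
print(common)
for branch in rest:
    print(branch)
```

The shared equality `ss_addr_sk = ca_address_sk` ends up at the top level, where a planner can use it as a join condition. A complete rule would also need to handle a branch whose remainder is empty, which this sketch does not.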
c9c58342b2 [License] Add License to codes (#3272) 2020-04-07 16:35:13 +08:00
3f247b0d2d Fix cast date type return wrong result (#3214)
We have multiple date types, and we need to cast between them.
Without the cast, evaluating a BinaryPredicate will cause problems.
2020-04-03 12:08:18 +08:00
fcb651329c [Plugin] Making FE audit module pluggable (#3219)
Currently we have implemented the plugin framework in FE. 
This CL makes the original audit log logic pluggable.
The following classes are mainly implemented:

1. AuditPlugin
    The interface of audit plugin

2. AuditEvent
    An AuditEvent contains all information about an audit event, such as a query, or a connection.

3. AuditEventProcessor
    The audit event processor receives all audit events and delivers them to all installed audit plugins.

This CL implements two audit module plugins:

1. The builtin plugin `AuditLogBuilder`, which acts the same as the previous logic and saves the
    audit log to `fe.audit.log`

2. An optional plugin `AuditLoader`, which periodically inserts the audit log into a Doris table
    specified by the user. In this way, users can conveniently use SQL to query and analyze this
    audit log table.

Some documents are added:

1. HELP docs of install/uninstall/show plugin.
2. Rename the `README.md` in `fe_plugins/` dir to `plugin-development-manual.md` and move
    it to the `docs/` dir
3. `audit-plugin.md` to introduce the usage of `AuditLoader` plugin.

ISSUE: #3226
2020-04-03 09:53:50 +08:00
c9ff6f68d1 Fix Rewrite count distinct bitmap and hll order by bug (#3251) 2020-04-03 09:08:27 +08:00
d14726e05b Fix join hints not working when table reorder is needed (#3188)
* Fix join hints not working when table reorder is needed
Fix cross join numNodes not being computed

* Fix some typos

* Disable table reorder when there are join hints
2020-04-02 17:13:35 +08:00
390f462f55 [Bug] Fix read schema change job meta bug (#3244) 2020-04-02 12:31:46 +08:00
6252a271dd Rewrite count distinct bitmap and hll in order by and having (#3232) 2020-04-02 09:11:42 +08:00
29b37dad49 Sql reference of materialized view (#3208)
* Sql reference of materialized view

Sql reference of Create and drop materialized view in English and Chinese.

* Change description
2020-04-01 21:22:19 +08:00
9c937180cd [Alter]Clean SchemaChangeJobV2 when schema change CANCELLED or FINISHED (#3212)
SchemaChangeJobV2 uses too much memory in FE, which may cause a Full GC. But this data is useless after the job is done, so we need to clean it up.

NOTICE: update FE meta version to 80
2020-04-01 21:05:17 +08:00
34993a69a8 Fix colocate relocateGroup bug after decommission (#3239) 2020-04-01 18:50:36 +08:00
028da655a9 Increased compatibility with mysql (#3235)
Add divPrecisionIncrement and utf8-superset transform
2020-04-01 09:57:00 +08:00
68a801ffbe Support Java version 64 bits Integers for BITMAP type (#3090)
Forked from roaringbitmap's Roaring64NavigableMap; overrides the serialize/deserialize methods to stay compatible with BE's bitmap storage format.
2020-03-31 15:29:41 +08:00
0554e89645 [Alter] Fix bug of assertion failure when submitting schema change job (#3181)
When creating a schema change job, we will create a corresponding shadow replica for each replica.
Here we should check the state of the replica and only create replicas in the normal state.

The process here may need to be modified later. We should completely allow users to submit alter jobs
under any circumstances, and then in the job scheduling process, dynamically detect changes in the replicas
and do replica repairs, instead of forcing a check on submission.
2020-03-31 12:06:30 +08:00
e9b3584d45 [Bug] Fix bug that desc tbl all stmt throw error: Malformed packet (#3233) 2020-03-31 10:29:53 +08:00
4131afe316 [Bug] NPE when using unknown function in broker load process (#3225)
This CL fixes the bug described in issue #3224 by

1. Forbidding UDFs in the broker load process
2. Improving the function checking logic to avoid an NPE when trying to
   get the default database from ConnectionContext.
2020-03-30 18:34:41 +08:00
41f1ab006b Add curdate/now function in fe (#3215) 2020-03-28 13:39:54 +08:00
d3555e3624 [Conf][API Change] Change the default FE meta dir and BE storage_root_path
1. Change the word palo to doris in conf files.
2. Set default meta_dir to ${DORIS_HOME}/doris-meta
3. Comment out FE meta_dir, leaving it as ${DORIS_HOME}/doris-meta, as already set in FE Config.java.
4. Comment out BE storage_root_path, leaving it as ${DORIS_HOME}/storage, as already set in BE config.h.

NOTICE: default config is changed.
2020-03-27 20:42:12 +08:00
cb68e10217 [MaterializedView] Add 'IndexKeysType' field in 'Desc all table stmt' (#3209)
After Doris supported aggregation materialized views on duplicate tables,
the metadata shown by the desc stmt is sometimes confusing. The reason is that
the desc stmt shows no grouping information.

For example:
There are two materialized views as follows.
    1. create materialized view k1_k2 as select k1, k2 from table;
    2. create materialized view deduplicated_k1_k2 as select k1, k2 from table group by k1, k2;
Before this commit, the metadata in the desc stmt is the same for both.

   ```
    +-----------------------+-------+----------+------+-------+---------+-------+
    | IndexName             | Field | Type     | Null | Key   | Default | Extra |
    +-----------------------+-------+----------+------+-------+---------+-------+
    | k1_k2                 | k1    | TINYINT  | Yes  | true  | N/A     |       |
    |                       | k2    | SMALLINT | Yes  | true  | N/A     |       |
    | deduplicated_k1_k2    | k1    | TINYINT  | Yes  | true  | N/A     |       |
    |                       | k2    | SMALLINT | Yes  | true  | N/A     |       |
    +-----------------------+-------+----------+------+-------+---------+-------+
   ```

So, we need to show the KeysType of each materialized view in the desc stmt.
Now, the desc stmt output for all mvs is changed as follows:

    ```
    +-----------------------+---------------+-------+----------+------+-------+---------+-------+
    | IndexName             | IndexKeysType | Field | Type     | Null | Key   | Default | Extra |
    +-----------------------+---------------+-------+----------+------+-------+---------+-------+
    | k1_k2                 | DUP_KEYS      | k1    | TINYINT  | Yes  | true  | N/A     |       |
    |                       |               | k2    | SMALLINT | Yes  | true  | N/A     |       |
    | deduplicated_k1_k2    | AGG_KEYS      | k1    | TINYINT  | Yes  | true  | N/A     |       |
    |                       |               | k2    | SMALLINT | Yes  | true  | N/A     |       |
    +-----------------------+---------------+-------+----------+------+-------+---------+-------+
    ```

NOTICE: this modifies the columns of the `desc` stmt.
2020-03-27 20:36:02 +08:00
aa8b2f86c4 [Bug][Refactor] Fix the conflict of temp partition and dynamic partition operations (#3201)
The bug is described in issue: #3200.

This CL solves the problem by:
1. Refactoring the alter operation conflict checking logic by introducing new classes `AlterOperations` and `AlterOpType`.
2. Allowing add/drop of temporary partitions when the dynamic partition feature is enabled.
3. Allowing modification of a table's properties when the table has temporary partitions.
4. Making the property `dynamic_partition.enable` optional, with a default of true.
2020-03-27 20:25:15 +08:00
c1969a3fb3 [Conf] Make default_storage_medium configurable (#2980)
Doris supports choosing a storage medium when creating a table, and the cluster balance strategy
depends on the storage medium. Most users will not specify the storage medium when creating a table.
Even if they know that they should choose one, they have no idea which storage medium the
cluster uses. So I think we should make storage_medium and storage_cooldown_time
configurable, and this should be the admin's responsibility.

For example, suppose the cluster's storage medium is HDD and we need to change some of the machines to SSD.
After the change, the tablets created earlier are stored on HDD and cannot find a destination path
to migrate to, while users keep creating tables as usual. As a result, all tablets are stored on the old
machines and the new machines store only a few tablets. Without this config, the only way out is for the admin
to traverse all partitions in the cluster and change their storage_medium property, which increases
operation and maintenance costs.

So I add an FE config, default_storage_medium, so that users can set the default storage medium.
2020-03-27 20:22:18 +08:00
32c4fc691c Support determining isPreviousLoadFinished for some alter jobs at table level (#3196)
This PR reduces the time spent waiting for transactions in the same db to complete, by filtering the running transactions at the table level.

NOTICE: Update FE meta version to 79
2020-03-27 20:16:23 +08:00
c4c37a4394 Rewrite subquery in having clause (#3206)
The subquery in a having clause should be rewritten too.
If not, ExprRewriteRule will not be applied in the subquery.
For example:
select k1, sum (k2) from table group by k1 having sum(k2) > (select t1 from table2 where t2 between 1 and 2);
```t2 between 1 and 2``` should be rewritten to ```t2 >= 1 and t2 <= 2```.

Fixed #3205. TPC-DS 14 will be passed after this commit.
2020-03-26 21:13:57 +08:00
eda23b57f2 [Plugin] Create the FE plugin dir if missing (#3202)
The FE plugin dir should be created when initializing.
Also modify the pom.xml in fe_plugins dir to make it able to use custom maven setting.
2020-03-26 11:21:10 +08:00
f585f30b1e [Plugin] Add FE plugin framework (#2463)
issue #2344 

* Add install/uninstall Plugin statement
* Add show plugin statement
* Support installing plugins in two ways:
    * Built-in Plugin: use PluginMgr's register method.
    * Dynamic Plugin: install by SQL statement, with this process:
        1. check whether the Plugin is already installed
        2. download the Plugin file from a remote source or copy it from a local source
        3. extract the Plugin's .zip
        4. read the Plugin's plugin.properties, and check the Plugin's values
        5. dynamically load the .jar and init the Plugin's main class
        6. invoke the Plugin's init method
        7. register the Plugin into PluginMgr
        8. update meta

* Support FE Plugin dynamic uninstall process
    1. check whether the Plugin is installed
    2. invoke the Plugin's close method
    3. delete the Plugin from PluginMgr
    4. update meta

* Add audit plugin interface 
* Add plugin enable flags in Config
* Add plugin install path in Config; by default, plugins are installed in ${DORIS_FE_PATH}/plugins
* Add FE plugins project
* Add audit plugin demo

The usage:

```
// install plugin and show plugins;

mysql>
mysql> install plugin from "/home/users/seaven/auditplugin.zip";                                              
Query OK, 0 rows affected (0.05 sec)
mysql>
mysql> show plugins;
+-------------------+-------+---------------+---------+-------------+------------------------+--------+---------------------------------------+
| Name              | Type  | Description   | Version | JavaVersion | ClassName              | SoName | Sources                               |
+-------------------+-------+---------------+---------+-------------+------------------------+--------+---------------------------------------+
| audit_plugin_demo | AUDIT | just for test | 0.11.0  | 1.8.31      | plugin.AuditPluginDemo | NULL   | /home/users/hekai/auditplugindemo.zip |
+-------------------+-------+---------------+---------+-------------+------------------------+--------+---------------------------------------+
1 row in set (0.00 sec)

mysql> show plugins;
+-------------------+-------+---------------+---------+-------------+------------------------+--------+---------------------------------------+
| Name              | Type  | Description   | Version | JavaVersion | ClassName              | SoName | Sources                               |
+-------------------+-------+---------------+---------+-------------+------------------------+--------+---------------------------------------+
| audit_plugin_demo | AUDIT | just for test | 0.11.0  | 1.8.31      | plugin.AuditPluginDemo | NULL   | /home/users/hekai/auditplugindemo.zip |
+-------------------+-------+---------------+---------+-------------+------------------------+--------+---------------------------------------+
1 row in set (0.00 sec)

mysql> uninstall plugin audit_plugin_demo; 
Query OK, 0 rows affected (0.04 sec)
mysql> show plugins;
Empty set (0.00 sec)
```

TODO:

* Config.plugin_dir should be created if missing
2020-03-25 22:57:05 +08:00
c0282bbc58 Solve the problem of mv selector when there is having clause in query (#3176)
All columns that belong to the top tupleIds of the query should be considered in the mv selector.
For example:

`select k1 from table group by k1 having sum(v1) >1;`

The candidate index should contain k1 and v1 columns instead of only k1.
The rollup which only has k1 column should not be selected.

Issue #3174 describes this in detail.
2020-03-25 20:42:39 +08:00
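The selection rule in the commit above can be sketched as a set-containment check. The index names and the `candidate_indexes` helper are hypothetical; the point is only that columns used in the having clause count as required columns.

```python
def candidate_indexes(indexes: dict, required_columns: set) -> list:
    """Keep only the indexes that contain every column the query needs."""
    return [name for name, cols in indexes.items() if required_columns <= cols]


# Hypothetical indexes: a base table and a rollup holding only k1.
indexes = {
    "base_table": {"k1", "k2", "v1"},
    "rollup_k1": {"k1"},
}

# select k1 from table group by k1 having sum(v1) > 1;
required = {"k1"} | {"v1"}   # select/group-by columns plus having columns

print(candidate_indexes(indexes, required))
```

If only `k1` were treated as required, the rollup holding just `k1` would also qualify, and `sum(v1)` could not be computed from it.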
dfd1a33712 [Dynamic Partition] Unify dynamic partition name and range (#3193)
Generate partition names based on the granularity, e.g.:
Year: prefix2020
Day: prefix20200325
Week: prefix2020_#, where # is the week of the year.

At the same time, for all granularities, align the partition range to 00:00:00.
2020-03-25 18:37:05 +08:00
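The naming scheme in the commit above might look like the sketch below. The `partition_name` function is hypothetical, and the use of ISO week numbering is an assumption; the message only says that `#` is the week of the year.

```python
from datetime import date


def partition_name(prefix: str, day: date, granularity: str) -> str:
    """Name a dynamic partition from its granularity (sketch)."""
    if granularity == "YEAR":
        return f"{prefix}{day:%Y}"
    if granularity == "DAY":
        return f"{prefix}{day:%Y%m%d}"
    if granularity == "WEEK":
        # Assumption: ISO week numbering; the real rule may differ.
        iso_year, iso_week, _ = day.isocalendar()
        return f"{prefix}{iso_year}_{iso_week}"
    raise ValueError(f"unsupported granularity: {granularity}")


today = date(2020, 3, 25)
for granularity in ("YEAR", "DAY", "WEEK"):
    print(partition_name("prefix", today, granularity))
```

For 2020-03-25 this yields `prefix2020` and `prefix20200325`, matching the examples in the message, plus a week-based name whose number depends on the week rule chosen.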
71bc815b20 [SQL] Support subquery in case when statement (#3135)
#3153
Implement support for subqueries in case when statements, like:
```
SELECT CASE
        WHEN (
            SELECT COUNT(*) / 2
            FROM t
        ) > k4 THEN (
            SELECT AVG(k4)
            FROM t
        )
        ELSE (
            SELECT SUM(k4)
            FROM t
        )
    END AS kk4
FROM t;
```

This statement will be rewritten to:
```
SELECT CASE
        WHEN t1.a > k4 THEN t2.a
        ELSE t3.a
    END AS kk4
FROM t, (
        SELECT COUNT(*) / 2 AS a
        FROM t
    ) t1,  (
        SELECT AVG(k4) AS a
        FROM t
    ) t2,  (
        SELECT SUM(k4) AS a
        FROM t
    ) t3;
```
2020-03-25 17:12:54 +08:00
b2518fc285 [SQL] Support non-correlated subquery in having clause (#3150)
This commit supports non-correlated subqueries in the having clause.
For example:

select k1, sum(k2) from table group by k1 having sum(k2) > (select avg(k1) from table);

Non-scalar subqueries are also supported in Doris.
For example:

select k1, sum(k2) from table group by k1 having sum(k2) > (select avg(k1) from table group by k2);

Doris checks the number of rows returned by the subquery during execution.
If the subquery returns more than one row, the query throws an exception.

Implementation:

The entire outer query is regarded as an inline view of a new query.
The subquery in the having clause becomes a where predicate in this new query.

After this commit, TPC-DS queries 23, 24 and 44 are supported.

This commit also supports subqueries in ArithmeticExpr.
For example:

select k1  from table where k1=0.9*(select k1 from t);
2020-03-25 16:29:09 +08:00
3cff89df7f [Dynamic Partition] Support for automatically drop partitions (#3081) 2020-03-25 10:24:46 +08:00