doris/regression-test/suites/query_p0/cache/sql_cache.groovy
924060929 15f8014e4e [enhancement](Nereids) Enable parse sql from sql cache and fix some bugs (#33867)
* [enhancement](Nereids) Enable parse sql from sql cache (#33262)

Before this PR, a query had to pass through the parser, analyzer, rewriter, optimizer and translator before we could check whether it could use the SQL cache. If the query is very long, or it joins a large number of tables, the plan time is usually >= 500 ms.

This PR reduces that time by skipping the usual planning path: if nothing has changed, we can reuse the previous physical plan and query result. In some cases we must not parse SQL from the SQL cache, e.g. when the table structure, data, user policies, privileges or user variables have changed, or when the query contains non-deterministic functions.

In my test case, which queries a view with many joins and unions over tables with empty partitions, the query latency is about 3 ms. Without parsing SQL from the SQL cache, the plan time is about 550 ms.
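
The reuse conditions listed above can be pictured as comparing a snapshot taken when the result was cached with the current state. The following is only a minimal sketch under assumptions: `SqlCacheReuseSketch`, `Snapshot` and `canReuse` are hypothetical names invented for illustration, not classes from the Doris frontend, and the digest and version fields stand in for whatever the real implementation tracks.

```java
import java.util.Map;

/** Hypothetical sketch of a cache-reuse check; these names are not Doris classes. */
final class SqlCacheReuseSketch {

    /** What a cached result depended on at the moment it was stored (assumed fields). */
    static final class Snapshot {
        final Map<String, Long> tableVersions;   // table -> version, bumped on schema or data change
        final String policyAndPrivilegeDigest;   // digest of the user's policies and privileges
        final Map<String, String> userVariables; // user variables referenced by the query
        final boolean deterministic;             // false if the query calls non-deterministic functions

        Snapshot(Map<String, Long> tableVersions, String policyAndPrivilegeDigest,
                 Map<String, String> userVariables, boolean deterministic) {
            this.tableVersions = tableVersions;
            this.policyAndPrivilegeDigest = policyAndPrivilegeDigest;
            this.userVariables = userVariables;
            this.deterministic = deterministic;
        }
    }

    /** Returns true only if nothing the cached result depended on has changed. */
    static boolean canReuse(Snapshot cached, Snapshot current) {
        return cached.deterministic
                && current.deterministic
                && cached.tableVersions.equals(current.tableVersions)
                && cached.policyAndPrivilegeDigest.equals(current.policyAndPrivilegeDigest)
                && cached.userVariables.equals(current.userVariables);
    }

    public static void main(String[] args) {
        Snapshot cached = new Snapshot(Map.of("test.t", 7L), "policy-v1", Map.of(), true);
        Snapshot unchanged = new Snapshot(Map.of("test.t", 7L), "policy-v1", Map.of(), true);
        Snapshot afterInsert = new Snapshot(Map.of("test.t", 8L), "policy-v1", Map.of(), true);

        System.out.println(canReuse(cached, unchanged));   // nothing changed, so reuse is allowed
        System.out.println(canReuse(cached, afterInsert)); // table version moved, so reuse is refused
    }
}
```

Any single mismatch forces the query back onto the full planning path, which trades an occasional slow plan for never returning a stale result.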

## Features
1. Use `Config.sql_cache_manage_num` to control how many SQL cache entries can be reused on one FE.
2. If the explain output contains a `LogicalSqlCache` or `PhysicalSqlCache` node, the query can use the SQL cache, like this:
```sql
mysql> set enable_sql_cache=true;
Query OK, 0 rows affected (0.00 sec)

mysql> explain physical plan select * from test.t;
+----------------------------------------------------------------------------------+
| Explain String(Nereids Planner)                                                  |
+----------------------------------------------------------------------------------+
| cost = 3.135                                                                     |
| PhysicalResultSink[53] ( outputExprs=[c1#0, c2#1] )                              |
| +--PhysicalDistribute[50]@0 ( stats=3, distributionSpec=DistributionSpecGather ) |
|    +--PhysicalOlapScan[t]@0 ( stats=3 )                                          |
+----------------------------------------------------------------------------------+
4 rows in set (0.02 sec)

mysql> select * from test.t;
+------+------+
| c1   | c2   |
+------+------+
|    1 |    2 |
|   -2 |   -2 |
| NULL |   30 |
+------+------+
3 rows in set (0.05 sec)

mysql> explain physical plan select * from test.t;
+-------------------------------------------------------------------------------------------+
| Explain String(Nereids Planner)                                                           |
+-------------------------------------------------------------------------------------------+
| cost = 0.0                                                                                |
| PhysicalSqlCache[2] ( queryId=78511f515cda466b-95385d892d6c68d0, backend=127.0.0.1:9050 ) |
| +--PhysicalResultSink[52] ( outputExprs=[c1#0, c2#1] )                                    |
|    +--PhysicalDistribute[49]@0 ( stats=3, distributionSpec=DistributionSpecGather )       |
|       +--PhysicalOlapScan[t]@0 ( stats=3 )                                                |
+-------------------------------------------------------------------------------------------+
5 rows in set (0.01 sec)
```
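
To illustrate what a limit such as `Config.sql_cache_manage_num` implies, here is a minimal sketch of a size-bounded cache. It assumes least-recently-used eviction purely for illustration; the commit message does not say which eviction policy the frontend uses, and `BoundedSqlCacheSketch` is a hypothetical name, not a Doris class.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical size-bounded cache keyed by SQL text; LRU eviction is an assumption. */
final class BoundedSqlCacheSketch<V> {
    private final int capacity; // plays the role of sql_cache_manage_num
    private final LinkedHashMap<String, V> entries;

    BoundedSqlCacheSketch(int capacity) {
        this.capacity = capacity;
        // accessOrder = true makes iteration order least-recently-used first.
        this.entries = new LinkedHashMap<String, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, V> eldest) {
                // Called after each put; drop the eldest entry once the limit is exceeded.
                return size() > BoundedSqlCacheSketch.this.capacity;
            }
        };
    }

    synchronized void put(String sql, V value) {
        entries.put(sql, value);
    }

    synchronized V get(String sql) {
        return entries.get(sql);
    }

    synchronized int size() {
        return entries.size();
    }

    public static void main(String[] args) {
        BoundedSqlCacheSketch<String> cache = new BoundedSqlCacheSketch<>(2);
        cache.put("select 1", "result-1");
        cache.put("select 2", "result-2");
        cache.get("select 1");             // touch "select 1" so "select 2" becomes the eldest
        cache.put("select 3", "result-3"); // exceeds the limit and evicts "select 2"
        System.out.println(cache.get("select 2")); // evicted entry is no longer present
        System.out.println(cache.size());
    }
}
```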

(cherry picked from commit 03bd2a337d4a56ea9c91673b3bd4ae518ed10f20)

* fix

* [fix](Nereids) fix some sql cache consistence bug between multiple frontends (#33722)

Fix some SQL cache consistency bugs between multiple frontends that were introduced by "[enhancement](Nereids) Enable parse sql from sql cache" (#33262). The fix makes the row policy part of the SQL cache key.
Support dynamically updating the number of SQL cache keys managed by an FE.
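
As a rough illustration of making the row policy part of the cache key, the sketch below builds a composite key whose equality covers the policies as well as the user and SQL text. The class name and fields are assumptions made for this example and do not reflect the actual key layout in the frontend.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

/** Hypothetical composite cache key; the field choice is an assumption. */
final class SqlCacheKeySketch {
    private final String user;
    private final String sql;
    private final List<String> rowPolicies; // order-sensitive list of policy predicates

    SqlCacheKeySketch(String user, String sql, List<String> rowPolicies) {
        this.user = user;
        this.sql = sql;
        this.rowPolicies = List.copyOf(rowPolicies); // defensive, immutable copy
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof SqlCacheKeySketch)) {
            return false;
        }
        SqlCacheKeySketch other = (SqlCacheKeySketch) o;
        return user.equals(other.user)
                && sql.equals(other.sql)
                && rowPolicies.equals(other.rowPolicies);
    }

    @Override
    public int hashCode() {
        return Objects.hash(user, sql, rowPolicies);
    }

    public static void main(String[] args) {
        Map<SqlCacheKeySketch, String> cache = new HashMap<>();
        String sql = "select * from test.t";
        cache.put(new SqlCacheKeySketch("alice", sql, List.of("c1 > 0")), "cached rows");

        // Same user and SQL, but a different row policy: the lookup misses.
        System.out.println(cache.get(new SqlCacheKeySketch("alice", sql, List.of("c1 > 10"))));
        // Identical policy: the lookup hits.
        System.out.println(cache.get(new SqlCacheKeySketch("alice", sql, List.of("c1 > 0"))));
    }
}
```

With the policy inside the key, a frontend that has already seen a policy change simply misses the old entry instead of serving rows filtered under the previous policy.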

(cherry picked from commit 90abd76f71e73702e49794d375ace4f27f834a30)

* [fix](Nereids) fix bug of dry run query with sql cache (#33799)

1. dry run query should not use sql cache
2. fix test sql cache in cloud mode
3. enable cache OneRowRelation and EmptyRelation in frontend to skip parse sql

(cherry picked from commit dc80ecf7f33da7b8c04832dee88abd09f7db9ffe)

* remove cloud mode

* remove @NotNull
2024-04-19 15:22:14 +08:00


// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership. The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.
// The cases are copied from https://github.com/trinodb/trino/tree/master
// /testing/trino-product-tests/src/main/resources/sql-tests/testcases/aggregate
// and modified by Doris.
suite("sql_cache") {
    // TODO: regression-test does not support checking the query profile,
    // so this suite does not check whether the cache is used :)
    def tableName = "test_sql_cache"
    sql "ADMIN SET FRONTEND CONFIG ('cache_last_version_interval_second' = '0')"
    sql """ DROP TABLE IF EXISTS ${tableName} """
    sql """
        CREATE TABLE IF NOT EXISTS ${tableName} (
            `k1` date NOT NULL COMMENT "",
            `k2` int(11) NOT NULL COMMENT ""
        ) ENGINE=OLAP
        DUPLICATE KEY(`k1`, `k2`)
        COMMENT "OLAP"
        PARTITION BY RANGE(`k1`)
        (PARTITION p202205 VALUES [('2022-05-01'), ('2022-06-01')),
        PARTITION p202206 VALUES [('2022-06-01'), ('2022-07-01')))
        DISTRIBUTED BY HASH(`k1`, `k2`) BUCKETS 32
        PROPERTIES (
            "replication_allocation" = "tag.location.default: 1",
            "in_memory" = "false",
            "storage_format" = "V2"
        )
    """
    sql "sync"
    sql """ INSERT INTO ${tableName} VALUES
        ("2022-05-27",0),
        ("2022-05-28",0),
        ("2022-05-29",0),
        ("2022-05-30",0),
        ("2022-06-01",0),
        ("2022-06-02",0)
    """
    qt_sql_cache """
        select
            k1,
            sum(k2) as total_pv
        from
            ${tableName}
        where
            k1 between '2022-05-28' and '2022-06-30'
        group by
            k1
        order by
            k1;
    """
    sql "set enable_sql_cache=true "
    qt_sql_cache """
        select
            k1,
            sum(k2) as total_pv
        from
            ${tableName}
        where
            k1 between '2022-05-28' and '2022-06-30'
        group by
            k1
        order by
            k1;
    """
    qt_sql_cache """
        select
            k1,
            sum(k2) as total_pv
        from
            ${tableName}
        where
            k1 between '2022-05-28' and '2022-06-30'
        group by
            k1
        order by
            k1;
    """
    qt_sql_cache """
        select
            k1,
            sum(k2) as total_pv
        from
            ${tableName}
        where
            k1 between '2022-05-28' and '2022-05-28'
        group by
            k1
        order by
            k1
        union all
        select
            k1,
            sum(k2) as total_pv
        from
            ${tableName}
        where
            k1 between '2022-05-28' and '2022-05-28'
        group by
            k1
        order by
            k1;
    """
    qt_sql_cache """
        select
            k1,
            sum(k2) as total_pv
        from
            ${tableName}
        where
            k1 between '2022-05-28' and '2022-05-28'
        group by
            k1
        order by
            k1
        union all
        select
            k1,
            sum(k2) as total_pv
        from
            ${tableName}
        where
            k1 between '2022-05-28' and '2022-05-28'
        group by
            k1
        order by
            k1;
    """
    sql "SET enable_nereids_planner=true"
    sql "SET enable_fallback_to_original_planner=false"
    qt_sql_cache """
        select
            k1,
            sum(k2) as total_pv
        from
            ${tableName}
        where
            k1 between '2022-05-28' and '2022-06-30'
        group by
            k1
        order by
            k1;
    """
    qt_sql_cache """
        select
            k1,
            sum(k2) as total_pv
        from
            ${tableName}
        where
            k1 between '2022-05-28' and '2022-06-30'
        group by
            k1
        order by
            k1;
    """
    sql 'set default_order_by_limit = 2'
    sql 'set sql_select_limit = 1'
    qt_sql_cache """
        select
            k1,
            sum(k2) as total_pv
        from
            ${tableName}
        where
            k1 between '2022-05-28' and '2022-06-30'
        group by
            k1
        order by
            k1;
    """
    qt_sql_cache """
        select
            k1,
            sum(k2) as total_pv
        from
            ${tableName}
        where
            k1 between '2022-05-28' and '2022-06-30'
        group by
            k1
        order by
            k1;
    """
    sql "ADMIN SET FRONTEND CONFIG ('cache_last_version_interval_second' = '10')"
}