Commit Graph

18263 Commits

Author SHA1 Message Date
42e91149e4 [enhancement](auto-partition) Forbid use Auto and Dynamic partition at the same time (#33736) 2024-04-19 23:41:46 +08:00
bec7c36c46 [fix](stacktrace) Fix dwarf_location_info_mode is passed as parameter to stack trace (#33863)
dwarf_location_info_mode is passed as parameter to stack trace
2024-04-19 23:41:46 +08:00
ee687a43fd [fix](plsql) Fix regression test for routine select (#33860)
fix #33608, more comprehensive test
2024-04-19 23:41:46 +08:00
f2a0ac8ff2 [feature] (partition) Dynamic partition behavior changes (#33712) 2024-04-19 23:41:46 +08:00
25358564ca [Fix](compile) Fix gcc compile on master (#33864)
This was introduced by #33511, which wrongly used

ColumnStr<T> ();

This violates the C++20 standard (see https://wg21.cmeerw.net/cwg/issue2237) but has still been accepted by clang up until now (see llvm/llvm-project#58112).
2024-04-19 23:41:37 +08:00
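As an illustration of the gcc compile fix in 25358564ca, here is a minimal sketch of the construct that CWG issue 2237 disallows. `ColumnStrSketch` and its members are hypothetical stand-ins, not the actual Doris class. The rejected declaration appears only in a comment so that the block compiles in C++20 mode.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a string column templated on its offset type.
template <typename T>
class ColumnStrSketch {
public:
    // Ill-formed in C++20 (CWG 2237): a constructor may not be declared
    // with a template-id such as
    //     ColumnStrSketch<T>();
    // Portable form: use the injected-class-name without template arguments.
    ColumnStrSketch() = default;

    // Records the end offset of one appended string of `len` bytes.
    void append(std::size_t len) {
        total_ += static_cast<T>(len);
        offsets_.push_back(total_);
    }

    std::size_t size() const { return offsets_.size(); }
    T last_offset() const { return offsets_.empty() ? T{0} : offsets_.back(); }

private:
    T total_{0};
    std::vector<T> offsets_;
};
```

Instantiating the sketch with `std::uint32_t` and `std::uint64_t` offsets works the same way; only the spelling of the constructor declaration differs from the rejected form.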
74590e4836 [refine](node) Remove the cse DCHECK from the constructor (#33856)
It's possible that a failure in the fe caused the check to fail, and at that moment, it may not be possible to retrieve the corresponding query ID from be.out.
2024-04-19 23:41:37 +08:00
7e91e69eb9 [fix](compaction) fix single compaction (#33907)
* [fix](compaction)Fix single compaction to get all local versions #33849

add test and comment

* remove single replica compaction prepare input rowsets

revised
2024-04-19 23:30:25 +08:00
439027119e [fix](schema change) fix schema change check does not calculate reader merged rows (#33825) (#33908) 2024-04-19 22:57:25 +08:00
0ac7849a9d [exec](table_fun) opt bitmap/split vexplode table func performance (#33876) 2024-04-19 15:22:14 +08:00
15f8014e4e [enhancement](Nereids) Enable parse sql from sql cache and fix some bugs (#33867)
* [enhancement](Nereids) Enable parse sql from sql cache (#33262)

Before this PR, a query had to pass through the parser, analyzer, rewriter, optimizer and translator before we could check whether it could use the sql cache. If the query is very long, or the number of joined tables is large, the plan time is usually >= 500ms.

This PR reduces that time by skipping the full planning path, because we can reuse the previous physical plan and query result if nothing has changed. In some cases we should not parse sql from the sql cache, e.g. when the table structure, data, user policies, privileges or user variables have changed, or when the query contains non-deterministic functions.

In my test case, which queries a view with many joins and unions over tables with empty partitions, the query latency is about 3ms. Without parsing sql from the sql cache, the plan time is about 550ms.

## Features
1. Use Config.sql_cache_manage_num to control how many sql cache entries can be reused on one FE.
2. If the explain plan contains `LogicalSqlCache` or `PhysicalSqlCache`, the query can use the sql cache, like this:
```sql
mysql> set enable_sql_cache=true;
Query OK, 0 rows affected (0.00 sec)

mysql> explain physical plan select * from test.t;
+----------------------------------------------------------------------------------+
| Explain String(Nereids Planner)                                                  |
+----------------------------------------------------------------------------------+
| cost = 3.135                                                                     |
| PhysicalResultSink[53] ( outputExprs=[c1#0, c2#1] )                              |
| +--PhysicalDistribute[50]@0 ( stats=3, distributionSpec=DistributionSpecGather ) |
|    +--PhysicalOlapScan[t]@0 ( stats=3 )                                          |
+----------------------------------------------------------------------------------+
4 rows in set (0.02 sec)

mysql> select * from test.t;
+------+------+
| c1   | c2   |
+------+------+
|    1 |    2 |
|   -2 |   -2 |
| NULL |   30 |
+------+------+
3 rows in set (0.05 sec)

mysql> explain physical plan select * from test.t;
+-------------------------------------------------------------------------------------------+
| Explain String(Nereids Planner)                                                           |
+-------------------------------------------------------------------------------------------+
| cost = 0.0                                                                                |
| PhysicalSqlCache[2] ( queryId=78511f515cda466b-95385d892d6c68d0, backend=127.0.0.1:9050 ) |
| +--PhysicalResultSink[52] ( outputExprs=[c1#0, c2#1] )                                    |
|    +--PhysicalDistribute[49]@0 ( stats=3, distributionSpec=DistributionSpecGather )       |
|       +--PhysicalOlapScan[t]@0 ( stats=3 )                                                |
+-------------------------------------------------------------------------------------------+
5 rows in set (0.01 sec)
```

(cherry picked from commit 03bd2a337d4a56ea9c91673b3bd4ae518ed10f20)

* fix

* [fix](Nereids) fix some sql cache consistence bug between multiple frontends (#33722)

Fix some sql cache consistency bugs between multiple frontends, introduced by [enhancement](Nereids) Enable parse sql from sql cache #33262, by using the row policy as part of the sql cache key.
Support dynamically updating the number of sql cache keys managed by the FE.

(cherry picked from commit 90abd76f71e73702e49794d375ace4f27f834a30)

* [fix](Nereids) fix bug of dry run query with sql cache (#33799)

1. dry run query should not use sql cache
2. fix test sql cache in cloud mode
3. enable cache OneRowRelation and EmptyRelation in frontend to skip parse sql

(cherry picked from commit dc80ecf7f33da7b8c04832dee88abd09f7db9ffe)

* remove cloud mode

* remove @NotNull
2024-04-19 15:22:14 +08:00
c747714c18 [fix](memory) Fix ExecEnv destroy memory tracking (#33781)
Disable memory tracking when ExecEnv is destroyed.
Fix conversion of the memory tracker label to a query id.
2024-04-19 15:03:10 +08:00
f4704b3821 [improvement](storage) support glibc <2.21 for system call eventfd (#33218)
support glibc <2.21 for system call eventfd
2024-04-19 15:03:10 +08:00
Pxl
175e85d616 [Bug](runtime-filter) fix coredump on no null string type rf (#33869)
fix coredump on no null string type rf
2024-04-19 15:03:06 +08:00
8b061c7055 [Enhancement](group commit) Add fault injection case for group commit 2024-04-19 15:03:06 +08:00
ad75b9b142 [opt](auto bucket) add fe config autobucket_max_buckets (#33842) 2024-04-19 15:03:06 +08:00
e38d844d40 [fix](multi-table-load) fix single stream multi table load cannot finish (#33816) 2024-04-19 15:03:06 +08:00
659900040f [Fix](inverted index) fix wrong need read data opt when encounters columnA > columnB predicate (#33855) 2024-04-19 15:03:06 +08:00
1a6f8c443e [bugfix](paimon) Create paimon catalog with hadoop user (#33833)
When creating a catalog, paimon will create a warehouse on HDFS, so we need to use the corresponding user with permissions to create it.
2024-04-19 15:02:56 +08:00
6776a3ad1b [Fix](planner) fix create view star except and modify cast to sql (#33726) 2024-04-19 15:02:49 +08:00
a8ba933947 [Fix](nereids) fix bind order by expression logic (#33843) 2024-04-19 15:02:49 +08:00
ffd9da44a2 [fix](move-memtable) fix commit may fail due to duplicated reports (#32403) 2024-04-19 15:02:49 +08:00
2675e94a93 [feature](variable) add read_only and super_read_only (#33795) 2024-04-19 15:02:21 +08:00
56eb5ea00c [enhancement](partial-update) print more log while missed some rowsets (#33711) 2024-04-19 15:01:57 +08:00
5abc84af71 [fix](txn insert) Fix txn insert commit failed when schema change (#33706) 2024-04-19 15:01:57 +08:00
315f6e44c2 [Branch-2.1](Outfile) Fixed the problem that the concurrent Outfile wrote multiple Success files (#33870)
backport: #33016
2024-04-19 12:09:53 +08:00
561afde0c4 [feature](insert)support default value when create hive table (#33666)
Issue Number: #31442

Hive 3 supports creating tables with column default values.
When using Hive 3, we can write default values to the table.
2024-04-19 11:31:33 +08:00
734520a77b [bugfix](hive)delete write path after hive insert (#33798)
Issue #31442

1. delete files according to query id
2. delete write path after insert
2024-04-19 11:31:25 +08:00
Pxl
ba05ef4405 [Chore](runtime-filter) add tmp debug info to investigate unknown filter error #33857 2024-04-18 21:03:09 +08:00
1300317723 [Exec](join) Support column string64 to avoid join failed in string size overflow the uint32 (#33511) (#33850) 2024-04-18 19:43:08 +08:00
8f6f4cf0eb [Pick](Variant) pick #33734 #33766 #33707 to branch-2.1 (#33848)
* [Fix](Variant Type) forbit distribution info contains variant columns (#33707)

* [Fix](Variant) VariantRootColumnIterator::read_by_rowids with wrong null map size (#33734)

insert_range_from should start from `size` with `count` elements for null map

* [Fix](Variant) check column index validation for extracted columns (#33766)
2024-04-18 19:42:44 +08:00
c8a92b82cc [fix](restore) Reset index id for MaterializedIndexMeta (#33831) 2024-04-18 19:05:24 +08:00
46fa64f34b [minor](Nereids): remove useless getFilterConjuncts() filter() in Translator (#33801) 2024-04-18 19:05:24 +08:00
3eca9da0dd [refactor](filesystem)refactor filesystem interface (#33361)
1. Rename `list` to `globList`. The path passed to `list` must contain a wildcard character, and the corresponding HDFS interface is `globStatus`, so the new name is `globList`.
2. If you only need to list files under a given path, use the `listFiles` operation.
3. Merge `listLocatedFiles` function into `listFiles` function.
2024-04-18 19:05:24 +08:00
34a97d5e8b [fix](Nereids)fix unstable plan shape in limit_push_down case 2024-04-18 19:05:24 +08:00
657a29fd9e [refactor](partitioner) refine get channel id logics (#33765) 2024-04-18 19:05:24 +08:00
Pxl
f17ac173b4 [Improvementation](join) empty_block shall be set true when build block only one row (#33721)
empty_block shall be set true when build block only one row
2024-04-18 19:05:17 +08:00
ea19224d14 [exec](table_fun) opt numbers table func performance (#33804) 2024-04-18 19:04:03 +08:00
75b47b7189 [opt](nereids)clear min/max column stats if table is partially analyzed (#33685) 2024-04-18 19:04:03 +08:00
e106d34190 [enhancement](plsql) regression for routine select and show create procedure (#33608)
add regression for routines and show create procedure
Issue Number: close #31297
2024-04-18 19:04:03 +08:00
ad80a650e4 [fix][mow] segment_creator should not flush data when encouter abnormal exit (#33802) 2024-04-18 19:02:58 +08:00
04e30c91a0 [Fix](Variant) VariantRootColumnIterator::read_by_rowids with wrong null map size (#33734)
insert_range_from should start from `size` with `count` elements for null map
2024-04-18 19:02:58 +08:00
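The one-line description in 04e30c91a0 can be pictured with a small sketch. It assumes the source null map already holds `size` older rows followed by `count` newly read rows, so that only the range starting at `size` should be appended to the destination. That layout, the `NullMap` alias and the `append_null_range` helper are illustrative assumptions, not Doris code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical null map: one byte per row, 1 means NULL.
using NullMap = std::vector<std::uint8_t>;

// Appends src[start, start + length) to dst, using the same
// (start, length) argument order the commit message describes.
inline void append_null_range(NullMap& dst, const NullMap& src,
                              std::size_t start, std::size_t length) {
    if (start > src.size() || length > src.size() - start) {
        throw std::out_of_range("null map range exceeds source size");
    }
    const auto first = src.begin() + static_cast<std::ptrdiff_t>(start);
    dst.insert(dst.end(), first, first + static_cast<std::ptrdiff_t>(length));
}
```

Under these assumptions, appending from offset 0 instead of `size` would copy flags belonging to older rows, leaving the destination null map with the wrong contents or the wrong size.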
a05d738b6c [fix](planner) create view statement should forbid mv rewrite (#33784) 2024-04-18 19:02:58 +08:00
Pxl
8c535c51b5 [Improvement](materialized-view) support multiple agg function have same base table slot (#33774)
support multiple agg function have same base table slot
2024-04-18 19:02:49 +08:00
5a5b0c07d7 [fix](inverted index) fix incorrect case test_index_delete (#33609) 2024-04-18 19:02:49 +08:00
4de357ccfb [Fix](Variant Type) forbit distribution info contains variant columns (#33707) 2024-04-18 19:02:37 +08:00
a57e0d3500 [Pick](nerids) pick #33010 #32982 #33531 to branch 2.1 (#33829) 2024-04-18 18:40:36 +08:00
20b37e7a18 Add workload group id in workload policy's property (#33483) 2024-04-17 23:42:14 +08:00
048448eb32 [fix](Nereids) dphyper support evaluate join that has one side condition (#33702) 2024-04-17 23:42:14 +08:00
461561fed0 [minor](Nereids): remove useless stream filter() in Translator (#33758) 2024-04-17 23:42:14 +08:00
ee3b6fdf58 [fix](conf) make be conf disable_storage_page_cache modifiable (#33773)
Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>
2024-04-17 23:42:14 +08:00