5453 Commits

Author SHA1 Message Date
76da9f181d fix join count (#22619) 2023-08-05 12:19:28 +08:00
839b469879 [fix](meta) set parallel_pipeline_task_num when upgrading from 1.2 to 2.0 (#22618) 2023-08-05 11:04:39 +08:00
55d46e0ab9 [Fix](ut)Fix NPE due to lack of editlog object in unit test of production consumption model (#22625) 2023-08-04 22:32:15 +08:00
265cded7da [Fix](Planner) fix window function in aggregation (#22603)
Problem:
When a window function appears inside an aggregate function, the executor reports an error like: Required field 'node_type' was not present!

Example:
SELECT SUM(MAX(c1) OVER (PARTITION BY c2, c3)) FROM test_window_in_agg;

Reason:
When the aggregate is analyzed, the analytic expr (the carrier of the window function during analysis) is transferred to a slot and loses information. So when
it is serialized to a Thrift package, TExpr cannot determine the node_type of the analytic expr.

Solution:
Aggregate(window function) is not supported yet, so an error is now reported during analysis.
2023-08-04 19:15:51 +08:00
872280135d [exec](pipeline) revert FE pipeline instance num pr (#22617)
* Revert "[fix](executor) only mysql connect to set GlobalPipelineTask (#22205)"
* Revert "[feature](executor) using fe version to set instance_num (#22047)"
2023-08-04 19:07:14 +08:00
d974af5feb [Fix](Load)Multi table plan not include task info (#22613) 2023-08-04 18:52:22 +08:00
9f92861c91 [fix](stats) Load partition stats unexpectedly (#22589)
The syncLoadColStats method invoked a stale method to deserialize column stats after support for loading partition stats was added.
2023-08-04 18:50:38 +08:00
95aa4d8631 [Feature](Export) Supports concurrently export of table data (#21911) 2023-08-04 18:50:17 +08:00
672acb8784 [fix](show-table-status) fix hive view NPE and external meta cache refresh issue (#22377) 2023-08-04 16:55:10 +08:00
dc06c486e8 [fix](compatibility) Version 1.2 upgraded to 2.0 compatible with miniload metadata (#22590) 2023-08-04 16:52:51 +08:00
56e8ad197c [improvement](stats) Reduce unnecessary SQL from full auto analyze #22583
1. Remove a bunch of SQL statements related to partition information
2. Fix the duplicate submission of SQL statements
3. Fix a bug where a table's stats were not updated after the system job finished
2023-08-04 15:52:25 +08:00
7d1e08eafa [Fix](Nereids) rand() and uuid() should not fold constant (#22492)
rand() and uuid() should not be constant-folded, so the default value of fold constant for non-deterministic functions is changed to false.
2023-08-04 15:36:03 +08:00
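The idea behind 7d1e08eafa can be illustrated with a minimal sketch of constant folding that skips non-deterministic functions by default. This is not Doris code: the function registries, the tuple-based expression encoding, and the `fold_non_deterministic` flag are all hypothetical names chosen for the example.

```python
import random
import uuid

# Hypothetical function registries for illustration; not Doris's actual catalog.
DETERMINISTIC = {"abs": abs, "length": len}
NON_DETERMINISTIC = {"rand": random.random, "uuid": lambda: str(uuid.uuid4())}


def fold(expr, fold_non_deterministic=False):
    """Fold a call expression ("name", arg, ...) whose arguments are all literals.

    Returns a literal when folding applies, otherwise the (partly folded) call.
    """
    if not isinstance(expr, tuple):
        return expr  # already a literal
    name, *args = expr
    args = [fold(a, fold_non_deterministic) for a in args]
    all_literal = not any(isinstance(a, tuple) for a in args)
    if all_literal and name in DETERMINISTIC:
        return DETERMINISTIC[name](*args)
    if all_literal and fold_non_deterministic and name in NON_DETERMINISTIC:
        return NON_DETERMINISTIC[name](*args)
    # Leave the call in place so it is evaluated at execution time.
    return (name, *args)
```

With the default flag, `fold(("abs", -3))` becomes a literal while `fold(("rand",))` stays a call, so each evaluation at execution time can yield a different value instead of one value frozen at planning time.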
ef53a27887 [fix](nereids) allow in or exists subquery in binary operator (#22391)
Support a subquery in a binary operator, e.g. if( xx  in ( subquery ), 1, 0 )
2023-08-04 15:35:19 +08:00
d379b04b39 [fix](planner) fix bug of push conjuncts through second phase agg (#22417)
If there is a second-phase agg, the output of the first-phase agg is its intermediate tuple, not the output tuple.
This PR fixes that.
2023-08-04 15:21:18 +08:00
3d758de7a2 [improvement](binlog) gc be binlog metas when tablet is dropped. (#22447) 2023-08-04 14:38:13 +08:00
34164f69ba [Enhancement](binlog) Add Barrier log into BinlogManager (#22559)
Signed-off-by: Jack Drogon <jack.xsuperman@gmail.com>
2023-08-04 14:37:12 +08:00
34b7f381b1 [fix](multi catalog)Filter .hive-staging dir under hive file path. #22574
A Hive file path may contain temporary directories like these:

drwxrwxrwx   - root supergroup          0 2023-03-22 21:03 /usr/hive/warehouse/datalake_performance.db/clickbench_parquet_hits/.hive-staging_hive_2023-03-22_21-03-12_047_8461238469577574033-1
drwxrwxrwx   - root supergroup          0 2023-05-18 15:03 /usr/hive/warehouse/datalake_performance.db/clickbench_parquet_hits/.hive-staging_hive_2023-05-18_15-03-52_780_3065787006787646235-1
This causes an error when the BE tries to read these files, so they need to be filtered out during FE planning.
2023-08-04 14:14:53 +08:00
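As a rough illustration of the filtering described in 34b7f381b1, the sketch below drops any path that has a component starting with `.hive-staging`. It is not the FE implementation; the helper names and the `part-00000.parquet` file name are hypothetical, and matching on the name prefix is an assumption based on the listing above.

```python
# Assumption: staging directories are recognized by this name prefix,
# as in the directory listing quoted in the commit message.
STAGING_PREFIX = ".hive-staging"


def is_staging_path(path):
    """Return True if any component of the path is a Hive staging directory."""
    return any(part.startswith(STAGING_PREFIX) for part in path.split("/"))


def filter_staging(paths):
    """Keep only the paths that do not live under a staging directory."""
    return [p for p in paths if not is_staging_path(p)]


base = "/usr/hive/warehouse/datalake_performance.db/clickbench_parquet_hits"
candidates = [
    base + "/part-00000.parquet",  # hypothetical data file
    base + "/.hive-staging_hive_2023-03-22_21-03-12_047_8461238469577574033-1",
]
readable = filter_staging(candidates)
```

Checking every path component, not just the last one, also excludes files nested inside a staging directory.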
3d5b90befe [fix](tablet clone) fix not add colocate replica and print some logs #22378 2023-08-04 14:09:02 +08:00
658d75c816 [feature](Nereids): normalize join condition after expanding or condition NLJ (#22555) 2023-08-04 13:37:37 +08:00
d5a21de796 [Enhancement](planner)support fold constant for date_trunc() (#22122) 2023-08-04 13:32:48 +08:00
62b1a7bcf3 [tpcds](nereids) add rule to eliminate empty relation #22203
1. eliminate empty relation
2. constant folding after filter pushdown
2023-08-04 12:49:53 +08:00
0e9fad4fe9 [stats](nereids) improve Anti join stats estimation #22444
No impact on TPC-H
TPC-DS 16/69/94 improved
2023-08-04 12:48:39 +08:00
d3cab017ec [chore](topn-opt) temporary disable two phase read for TableQueryPlanActionQ (#22543) 2023-08-04 11:53:48 +08:00
479e62de0f [Fix](multi catalog)Fix hive partition contains special character bug (#22541)
A Hive partition path may contain special characters, so it needs to be encoded before a URI object is created from the file path.
2023-08-03 23:53:25 +08:00
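Why the encoding in 479e62de0f matters can be shown with a small Python analogue (Doris FE is Java, and the actual fix may use different APIs). The partition path, the `nn` authority, and the `to_uri` helper are hypothetical. Without encoding, a `#` in a partition value is parsed as the start of a URI fragment and the path is truncated.

```python
from urllib.parse import quote, unquote, urlsplit


def to_uri(scheme, authority, path):
    # Percent-encode everything except "/" and "=" so characters such as
    # "#", "?", " " or "%" cannot be misread as URI syntax.
    return f"{scheme}://{authority}{quote(path, safe='/=')}"


raw = "/warehouse/t/pt=a#b/000000_0"  # hypothetical partition path

# Naive: the text after "#" becomes the fragment, so the path is cut short.
naive = urlsplit("hdfs://nn" + raw)

# Encoded: the whole path survives and can be decoded back to the original.
encoded = urlsplit(to_uri("hdfs", "nn", raw))
```

Here `naive.path` ends at `pt=a`, while `unquote(encoded.path)` gives back the full original path.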
3447a70b25 [Fix](planner)fix delete stmt contains where but delete all data. (#22563) 2023-08-03 23:44:05 +08:00
a6f6b351fe [feature](profile) add DORIS_BUILD_SHORT_HASH in profile #22516 2023-08-03 21:25:26 +08:00
151120c907 [Improvement](statistics)Improve show analyze performance. #22484 2023-08-03 21:22:37 +08:00
469886eb4e [FIX](array)fix if function for array() #22553
2023-08-03 19:40:45 +08:00
60ca5b0bad [Improvement](statistics)Return meaningful error message when show column stats column name doesn't exist (#22458)
The error message was unhelpful when show column stats referenced a column that does not exist:
```
MySQL [hive.tpch100]> show column stats `lineitem` (l_extendedpric);
ERROR 1105 (HY000): errCode = 2, detailMessage = Unexpected exception: null
```

This PR shows a meaningful message:
```
mysql> show column stats `lineitem` (l_extendedpric);
ERROR 1105 (HY000): errCode = 2, detailMessage = Column: l_extendedpric not exists
```
2023-08-03 16:35:14 +08:00
27f6e4649e [improvement](stats) Catch exception properly #22503
Catch the exception instead of throwing it directly to the caller, to avoid unexpectedly interrupting the upper-level logic.
2023-08-03 15:16:55 +08:00
3961b8df76 [refactor](Nereids) mv top-n two phase read rule from post processor to rewriter (#22487)
Use three new plan nodes to represent deferred materialization of TopN.
Example:

```
-- SQL
select * from t1 order by c1 limit 10;

-- PLAN
+------------------------------------------+
| Explain String                           |
+------------------------------------------+
| PhysicalDeferMaterializeResultSink       |
| --PhysicalDeferMaterializeTopN           |
| ----PhysicalDistribute                   |
| ------PhysicalDeferMaterializeTopN       |
| --------PhysicalDeferMaterializeOlapScan |
+------------------------------------------+
```
2023-08-03 14:28:13 +08:00
4f9969ce1e [feature](show-frontends-disk) Add Show frontend disks (#22040)
Co-authored-by: yuxianbing <yuxianbing@yy.com>
Co-authored-by: yuxianbing <iloveqaz123>
2023-08-03 14:04:48 +08:00
4322fdc96d [feature](Nereids): add or expansion in CBO(#22465) 2023-08-03 13:29:33 +08:00
85a95e206e [bugfix](profile) not output some variables correctly (#22537) 2023-08-03 13:17:02 +08:00
e670d84b72 [feature](executor) using max_instance_num to limit automatically instance (#22521) 2023-08-03 13:12:32 +08:00
596fd4d86d [improvement](file-scan) reduce the min size of file split (#22412)
Reduce from 128MB to 8MB,
so that users can set `file_split_size` more flexibly.
2023-08-03 11:42:00 +08:00
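To see what the lower bound in 596fd4d86d means in practice, here is a small arithmetic sketch. It assumes the requested `file_split_size` is clamped up to the minimum; that clamping rule and the helper names are assumptions for illustration, not taken from the Doris source.

```python
MB = 1024 * 1024
OLD_MIN_SPLIT = 128 * MB  # lower bound before this change
NEW_MIN_SPLIT = 8 * MB    # lower bound after this change


def effective_split_size(file_split_size, min_split):
    # Assumption: a requested size below the lower bound is raised to it.
    return max(file_split_size, min_split)


def split_count(file_size, split_size):
    # Ceiling division: a trailing partial chunk still needs its own split.
    return -(-file_size // split_size)


requested = 16 * MB
file_size = 1024 * MB

before = split_count(file_size, effective_split_size(requested, OLD_MIN_SPLIT))
after = split_count(file_size, effective_split_size(requested, NEW_MIN_SPLIT))
```

Under these assumptions, a 16MB request on a 1GB file was silently raised to 128MB before the change (8 splits), and is honored afterwards (64 splits).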
fb644ad691 [improvement](stats) Add more logs and config options (#22436)
1. Add more logs and make error messages clearer
2. Sleep a while between analyze retries
3. Make the concurrency of sync analyze configurable
4. Ignore internal columns like delete sign to save resources
2023-08-03 09:55:29 +08:00
e5028314bc [Feature](Job)Support scheduler job (#21916) 2023-08-02 21:34:43 +08:00
8cac8df40c [Fix](Planner) fix create view tosql not include partition (#22482)
Problem:
When creating a view with a join over table partitions, an error such as "Unknown column" is raised.

Example:
CREATE VIEW my_view AS SELECT t1.* FROM t1 PARTITION(p1) JOIN t2 PARTITION(p2) ON t1.k1 = t2.k1;
select * from my_view ==> errCode = 2, detailMessage = Unknown column 'k1' in 't2'

Reason:
When creating a view, we run tosql first in order to persist the view SQL. When running tosql on a table reference, the PARTITION
keyword was removed to keep the SQL string neat. But once the PARTITION keyword is removed, the partition name is treated as an alias.
So the "PARTITION" keyword cannot be removed.

Solution:
Add the "PARTITION" keyword back to the tosql string.
2023-08-02 20:04:59 +08:00
527782f3d3 [fix](nereids)move RecomputeLogicalPropertiesProcessor rule before topn optimization (#22488)
The topn optimization changes MutableState, so the RecomputeLogicalPropertiesProcessor rule needs to move before it.
2023-08-02 17:36:56 +08:00
ddd90855a9 [vectorized](udaf) java udaf support with map type (#22397)
* test
* remove some unused
* update
* add case
2023-08-02 15:03:44 +08:00
16461fdc1c [feature](Nereids): pushdown COUNT through join (#22455) 2023-08-02 14:55:25 +08:00
41f984bb39 [fix](fe) Fix stmt forward #22469
The call to String.format() contains an orphan %s that causes the following error.
Introduced by #21205
2023-08-02 10:34:04 +08:00
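The class of bug fixed in 41f984bb39 is a format string with more `%s` placeholders than arguments. In Java, `String.format()` throws `MissingFormatArgumentException` in that case; the Python sketch below is only an analogue showing how such a mismatch can be caught before formatting. The `safe_format` and `count_placeholders` helpers and the sample message are hypothetical.

```python
import re


def count_placeholders(template):
    # Count "%s" occurrences, ignoring the literal-percent escape "%%".
    return len(re.findall(r"%s", template.replace("%%", "")))


def safe_format(template, *args):
    """Format only when the placeholder count matches the argument count."""
    expected = count_placeholders(template)
    if expected != len(args):
        raise ValueError(
            f"template expects {expected} argument(s), got {len(args)}"
        )
    return template % args


ok = safe_format("forward %s to %s", "stmt", "master")

try:
    safe_format("forward %s to %s", "stmt")  # orphan %s: one argument missing
    orphan_detected = False
except ValueError:
    orphan_detected = True
```

A check like this turns a runtime formatting failure into an explicit, descriptive error at the call site.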
19d1f49fbe [improvement](compaction) compaction policy and options in the properties of a table (#22461) 2023-08-01 22:02:23 +08:00
809f67e478 [fix](nereids)fix bug of cast expr to decimalv3 without any check (#22466) 2023-08-01 21:59:47 +08:00
94dee833cd [fix](multi-catalog)fix compatible with hdfs HA empty prefix (#22424) 2023-08-01 21:48:16 +08:00
b8399148ef [fix](DOE) es catalog not working with pipeline,datetimev2, array and esquery (#22046) 2023-08-01 21:45:16 +08:00
d5d82b7c31 [stats](nereids) fix bug for avg-size (#22421) 2023-08-01 17:13:00 +08:00
d4a6ef3f8c [fix](Nereids) fix test framework of hypergraph (#22434) 2023-08-01 16:20:07 +08:00
26737dddff [feature](Nereids): pushdown MIN/MAX/SUM through join (#22264)
* [minor](Nereids): add more comment to explain code

* [feature](Nereids): pushdown MIN/MAX/SUM through join
2023-08-01 13:23:55 +08:00