Before:
mysql [test]>select cast(1 as DECIMALV3(16, 2)) / cast(3 as DECIMALV3(16, 2));
+-----------------------------------------------------------+
| CAST(1 AS DECIMALV3(16, 2)) / CAST(3 AS DECIMALV3(16, 2)) |
+-----------------------------------------------------------+
| 0.00 |
+-----------------------------------------------------------+
mysql [test]>select * from divtest;
+------+------+
| id | val |
+------+------+
| 3 | 5.00 |
| 2 | 4.00 |
| 1 | 3.00 |
+------+------+
mysql [test]>select cast(1 as decimalv3(16,2)) / val from divtest;
+-------------------------------------+
| CAST(1 AS DECIMALV3(16, 2)) / `val` |
+-------------------------------------+
| 0 |
| 0 |
| 0 |
+-------------------------------------+
After:
mysql [test]>select cast(1 as DECIMALV3(16, 2)) / cast(3 as DECIMALV3(16, 2));
+-----------------------------------------------------------+
| CAST(1 AS DECIMALV3(16, 2)) / CAST(3 AS DECIMALV3(16, 2)) |
+-----------------------------------------------------------+
| 0.33 |
+-----------------------------------------------------------+
mysql [test]>select cast(1 as decimalv3(16,2)) / val from divtest;
+-------------------------------------+
| CAST(1 AS DECIMALV3(16, 2)) / `val` |
+-------------------------------------+
| 0.250000 |
| 0.200000 |
| 0.333333 |
+-------------------------------------+
This is because, in the previous code, the constant 1.000 was transformed into 1.
remove "ReduceType"
* [Improve](performance) introduce SchemaCache to cache TabletSchema & Schema
1. When the system is under a high-concurrency load of wide-table point queries, the frequent memory allocation and deallocation of Schema becomes an evident system bottleneck, and the initialization of TabletSchema and Schema also becomes a CPU hotspot. Therefore, a SchemaCache is introduced to cache these resources for reuse (a sketch of the caching idea follows this list).
2. Wrap some variables in std::unique_ptr
Performance:
| Setting | QPS | Avg response time | P99 response time |
|--------------------|-----|-------------------|-------------------|
| SchemaCache enabled | 501 | 20ms | 34ms |
| SchemaCache disabled | 321 | 31ms | 61ms |
* handle schema change with schema version
* remove useless header
* rebase
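The actual SchemaCache is implemented in the BE in C++; the following Java-style sketch (all names hypothetical) only illustrates the caching idea referenced above: key the cached TabletSchema/Schema by tablet id and schema version so that concurrent point queries reuse one instance instead of rebuilding it per request.
```
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// The real cache lives in the BE (C++); this sketch only shows the idea:
// key the cached schema by tablet id and schema version so that concurrent
// point queries reuse one instance instead of rebuilding it per request.
class SchemaCacheSketch<S> {

    private final Map<String, S> cache = new ConcurrentHashMap<>();

    // Keying by schema version also covers the "handle schema change with
    // schema version" case: a new version simply gets a new cache entry.
    static String cacheKey(long tabletId, int schemaVersion) {
        return tabletId + "-" + schemaVersion;
    }

    S getOrCreate(long tabletId, int schemaVersion, Supplier<S> builder) {
        // Build the schema at most once per key and reuse it afterwards.
        return cache.computeIfAbsent(cacheKey(tabletId, schemaVersion), key -> builder.get());
    }
}
```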
1. Delete the stats for data size, since collecting it costs too much time and is not useful.
2. Make the task timeout configurable, since it is common to analyze a very large table for which the default 10 min is not suitable.
This PR fixes a bug introduced by PR #19909.
The bug failed to set the column unique id for Iceberg tables, which caused all query results for Iceberg tables to be NULL.
```
mysql> select * from iceberg_partition_lower_case_parquet limit 1;
+------+------+------+---------+
| k1 | k2 | k3 | city |
+------+------+------+---------+
| NULL | NULL | NULL | Beijing |
+------+------+------+---------+
1 row in set (0.60 sec)
```
After fix:
```
mysql> select * from iceberg_partition_lower_case_parquet limit 1;
+------+------+------+---------+
| k1 | k2 | k3 | city |
+------+------+------+---------+
| 1 | k2_1 | k3_1 | Beijing |
+------+------+------+---------+
1 row in set (0.35 sec)
```
When an OLAP table has dynamic partition enabled and the table is dropped and then recovered, the table should be added to the DynamicPartitionScheduler again.
---------
Co-authored-by: caiconghui1 <caiconghui1@jd.com>
Refactoring the filtering conditions in the current ExecNode from an expression tree to an array can simplify the process of adding runtime filters. It eliminates the need for complex merge operations and removes the requirement for the frontend to combine expressions into a single entity.
By representing the filtering conditions as an array, each condition can be treated individually, making it easier to add runtime filters without the need for complex merging logic. The array can store the individual conditions, and the runtime filter logic can iterate through the array to apply the filters as needed.
This refactoring simplifies the codebase, improves readability, and reduces the complexity associated with handling filtering conditions and adding runtime filters. It separates the conditions into discrete entities, enabling more straightforward manipulation and management within the execution node.
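A minimal sketch of the idea, using hypothetical types that stand in for the real planner classes: instead of folding every filter condition into one nested AND expression that has to be merged whenever a runtime filter arrives, the conditions live in a flat list, so adding a runtime filter is a plain append and evaluation is a loop over the list.
```
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for the planner's expression types.
interface Expr {}

class RuntimeFilterExpr implements Expr {}

class CompoundAnd implements Expr {
    final Expr left;
    final Expr right;
    CompoundAnd(Expr left, Expr right) { this.left = left; this.right = right; }
}

class ConjunctsSketch {

    // Old shape: one expression tree, so attaching a runtime filter means
    // rebuilding a compound AND node around the existing tree.
    static Expr mergeIntoTree(Expr existingTree, Expr runtimeFilter) {
        return new CompoundAnd(existingTree, runtimeFilter);
    }

    // New shape: a flat list of conjuncts, so attaching a runtime filter is
    // a plain append and evaluation is a loop over the list.
    static List<Expr> appendRuntimeFilter(List<Expr> conjuncts, Expr runtimeFilter) {
        List<Expr> result = new ArrayList<>(conjuncts);
        result.add(runtimeFilter);
        return result;
    }
}
```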
Extend the functionality of advanced materialized views.
This feature is already supported by the legacy planner via PR #19650.
This PR implements it in Nereids, covering the features below:
1. Support multiple columns in an aggregate function, e.g. select sum(c1 + c2) from t1;
2. Support complex expressions, e.g. select abs(c1), sum(abs(c1+1) + 1) from t1;
TODO:
1. Support adding a where clause in the materialized view
* [Bug](point query) checkAndSetPointQuery before checkEnableTwoPhaseRead
1. checkEnableTwoPhaseRead relies on the short circuit flag
2. Add more metrics to display the lookup profile
* fix rebase
Previously, a string could not be used as the variable key in a hint.
Before this PR
mysql> SET enable_nereids_planner=true;
Query OK, 0 rows affected (0.01 sec)
mysql> set enable_fallback_to_original_planner=false;
Query OK, 0 rows affected (0.10 sec)
mysql> explain select /*+ SET_var("enable_nereids_planner" = "false") */ 1;
ERROR 1105 (HY000): Exception, msg: Nereids cannot parse the SQL, and fallback disabled. caused by:
no viable alternative at input 'select /*+ SET_var("enable_nereids_planner"'(line 1, pos 27)
After this PR
mysql> SET enable_nereids_planner=true;
Query OK, 0 rows affected (0.01 sec)
mysql> set enable_fallback_to_original_planner=false;
Query OK, 0 rows affected (0.10 sec)
mysql> select /*+ SET_var("enable_nereids_planner" = "false") */ 1;
+------+
| 1 |
+------+
| 1 |
+------+
1 row in set (0.00 sec)
Support the string for the hint key in the Parser.
Before this change, when optimizing with the Nereids planner, the input was always serialized to memory first, and when a bug happened it was dumped to a minidump file while catching the exception.
We found that this serialization hurts performance when the statistics message is too large or when the optimization time is small enough for the overhead to matter.
So minidump is now used ONLY when the minidump switch is explicitly enabled (set enable_minidump=true;).
1. Before this PR, if a rowset did not contain a column that should be read for the related SlotDescriptor, `insert_default` was called on the column, but that is not the real default value. The real default value information should be provided by the frontend side.
2. Support fetch when light schema change is not enabled, but disable it for the AGG and UNIQUE MOR models.
When the PostgreSQL bit type size is 1, it is read as a java.lang.Boolean via JDBC, and if we match it against string,
it will display true or false. But the normal display should be a number,
so when the size of the bit type is detected to be 1, it is matched with boolean instead.
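A small sketch of the type mapping described above, with hypothetical names rather than the real Doris JDBC catalog code:
```
// Hypothetical names. PostgreSQL bit(1) comes back from the JDBC driver as
// java.lang.Boolean, so treating it as a string would print "true"/"false";
// mapping it to a boolean type keeps the numeric 0/1 display.
class PgTypeMappingSketch {

    enum TargetType { BOOLEAN, STRING }

    static TargetType mapPostgresBitType(int columnSize) {
        // bit(1) is surfaced as Boolean -> map it to BOOLEAN;
        // bit(n) with n > 1 keeps the string mapping.
        return columnSize == 1 ? TargetType.BOOLEAN : TargetType.STRING;
    }
}
```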
1. Fix duplicate '/' in front-end request URIs.
2. When the FileSystemSeparator is '\\', replace '\\' with '/'.
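A minimal sketch of both fixes, with a hypothetical helper name:
```
// Hypothetical helper illustrating both fixes.
class PathNormalizeSketch {

    static String normalize(String rawPath) {
        // Treat the Windows file-system separator '\' as '/'.
        String path = rawPath.replace('\\', '/');
        // Collapse any run of consecutive '/' into a single one.
        return path.replaceAll("/+", "/");
    }

    public static void main(String[] args) {
        // "\\" in the Java literal is a single backslash.
        System.out.println(normalize("/api//tablet\\schema")); // prints /api/tablet/schema
    }
}
```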
Co-authored-by: labuladuo <labuladuo@douyu.tv>
Support the operator `PartitionTopN`, which partitions first and then performs the topn operation in each partition. It is used in the following cases:
```
-- Support push the filter down to the window and generate the PartitionTopN.
-- The plan change from `window -> filter` to `partitionTopN -> window -> filter`.
explain select * from (select * , row_number() over(partition by b order by a) as num from t ) tt where num <= 10;
-- Support push the limit down to the window and generate the PartitionTopN.
-- The plan change from `window -> limit` to `partitionTopN -> window -> limit `.
explain select row_number() over(partition by b order by a) as num from t limit 10;
-- Support push the topn down to the window and generate the PartitionTopN.
-- The plan change from `window -> topn` to `partitionTopN -> window -> topn `.
explain select row_number() over(partition by b order by a) as num from t order by num limit 10;
```
The detailed design of the FE part:
1. Add the following rewrite rules:
- PUSHDOWN_FILTER_THROUGH_WINDOW
- PUSH_LIMIT_THROUGH_PROJECT_WINDOW
- PUSH_LIMIT_THROUGH_WINDOW
- PUSHDOWN_TOP_N_THROUGH_PROJECTION_WINDOW
- PUSHDOWN_TOP_N_THROUGH_WINDOW
2. Add the PartitionTopN node(LogicalPlan/ PhysicalPlan/ TranslatorPlan)
3. For the rewritten plan, there are several requirements that need to be met:
- For the `Filter` part, only `<`/`<=`/`=` conditions are considered, and the filter conditions will be stored.
- For the `Window` part, only one window function is supported, and it must be `row_number`, `rank`, or `dense_rank`. The `partition by` key and the `order by` key cannot both be empty, and the `Window Frame` should be `UNBOUNDED to CURRENT`.
4. For the `PhysicalPartitionTopN`, the requested property is `Any` and the output property is its child's property.
These are the most important details; for the other parts, you can check the code directly.
Issue Number #18646
BE Part #19708
If the user manually removed a Hive partition (removing the partition dir through HDFS), Doris would fail to query the Hive
table with the error message `get file split failed for table`, because the Hive metadata still contains the removed partition.
This PR fixes the bug by skipping the non-existent dirs.
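A sketch of the "skip the non-existent dirs" idea using the Hadoop FileSystem API; the class and method names here are hypothetical:
```
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical names. Before building file splits, drop partition locations
// that the Hive metastore still lists but that were removed from HDFS by hand.
class PartitionPruneSketch {

    static List<Path> keepExistingPartitions(FileSystem fs, List<Path> partitionDirs)
            throws IOException {
        List<Path> existing = new ArrayList<>();
        for (Path dir : partitionDirs) {
            if (fs.exists(dir)) {
                existing.add(dir);
            }
            // else: the partition dir was removed manually; skip it instead of
            // failing the whole query with "get file split failed for table".
        }
        return existing;
    }
}
```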
The variable logType in ExternalCatalog is not persisted to disk; after a refresh it becomes NULL and causes an NPE. This PR fixes that bug.
Also, the old type variable in ExternalCatalog is removed; logType is used instead.
Support collecting Hive external table statistics by running SQL against the Hive table.
By running SQL, we can collect all the statistics that are collected for OLAP tables, including the min and max values of string columns.
With 3 BEs (16 cores, 64 GB), it takes less than 2 minutes to collect TPCH 100GB statistics for all columns of all tables,
and also less than 2 minutes to collect all-column statistics for the SSB 100GB tables.
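An illustrative sketch of the kind of per-column aggregation that can be run against the Hive table; the exact set of statistics and the SQL shape used by Doris may differ:
```
// Hypothetical sketch of the SQL that could be generated per column; the exact
// statistics and query shape used by Doris may differ.
class HiveStatsSqlSketch {

    static String buildColumnStatsSql(String catalog, String db, String table, String column) {
        return "SELECT COUNT(1) AS row_count, "
                + "COUNT(DISTINCT " + column + ") AS ndv, "
                + "SUM(CASE WHEN " + column + " IS NULL THEN 1 ELSE 0 END) AS null_count, "
                + "MIN(" + column + ") AS min_value, "
                + "MAX(" + column + ") AS max_value "
                + "FROM " + catalog + "." + db + "." + table;
    }
}
```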
Support reading Hudi MOR tables by using the JNI connector.
Note:
the FE part in this PR is not fully complete, and the BE part will be supplemented in the next PR.