FIX
1. Remove the float and double literal toString and getStringValue functions introduced by
PR #23504 and PR #23271.
These functions lead to wrong cast results for double and float literals.
2. Fix compute signature for datetimev2 always producing scale 6.
3. Fix the stats calculator failing when generating node stats with two identical columns.
4. Fix constant folding on FE failing when casting double to an integral type.
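The source does not spell out how the string helpers produce wrong cast results. One plausible mechanism, assumed here purely for illustration, is that the shortest decimal text of a 32-bit float (Java's `Float.toString(1.1f)` returns `"1.1"`) re-parsed as a double is not the same value as widening the float directly. The Python sketch below reproduces that discrepancy; `to_float32` is a hypothetical helper, not Doris code.

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float (a 64-bit double) to the nearest 32-bit float,
    then return it widened back to a double."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# Path A: cast the float literal to double by widening its binary value.
widened = to_float32(1.1)

# Path B: go through the decimal text "1.1" and parse that as a double.
via_text = float("1.1")

# The two paths disagree in the low-order digits, so a cast that silently
# takes path B returns a different double than a binary widening would.
print(widened, via_text, widened == via_text)
```

Whichever path is considered correct for a given cast, the point of the sketch is only that the two are not interchangeable.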
TODO
After fixing the first problem, some MV matching cases no longer work well; fix them later:
- test_dup_mv_div
- test_dup_mv_json
- test_tcu
The materialized view definition is as follows:
> select l_linenumber, o_custkey
> from orders
> left join lineitem on lineitem.L_ORDERKEY = orders.O_ORDERKEY
> where o_custkey = 1;
When the query is as follows, it can be rewritten by the MV above.
This requires that the query has null-rejecting filters on the right input of the join;
the currently supported filter operators are "=", "<", "<=", ">", ">=", "<=>".
> select IFNULL(orders.O_CUSTKEY, 0) as custkey_not_null,
> case when l_linenumber in (1,2,3) then l_linenumber else o_custkey end as case_when
> from orders
> inner join lineitem on orders.O_ORDERKEY = lineitem.L_ORDERKEY
> where o_custkey = 1 and l_linenumber > 0;
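The reason a null-rejecting filter matters can be checked on a toy data set: once a predicate such as `l_linenumber > 0` discards the NULL-padded rows that a left join produces, the left join returns the same rows as the inner join. The sketch below uses Python's built-in `sqlite3` module rather than Doris, and the table contents are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (o_orderkey INTEGER, o_custkey INTEGER);
    CREATE TABLE lineitem (l_orderkey INTEGER, l_linenumber INTEGER);
    -- Order 2 has no lineitem, so a left join pads it with NULLs.
    INSERT INTO orders VALUES (1, 1), (2, 1), (3, 2);
    INSERT INTO lineitem VALUES (1, 1), (1, 2), (3, 1);
""")


def run(join_kind: str, where: str):
    """Run the same query shape with a different join type and filter."""
    sql = f"""
        SELECT o_orderkey, l_linenumber
        FROM orders
        {join_kind} JOIN lineitem ON l_orderkey = o_orderkey
        WHERE {where}
        ORDER BY o_orderkey, l_linenumber
    """
    return conn.execute(sql).fetchall()


# Without a filter on the right side, the left join keeps the padded row.
left_unfiltered = run("LEFT", "o_custkey = 1")

# With a null-rejecting filter, left join and inner join agree.
left_filtered = run("LEFT", "o_custkey = 1 AND l_linenumber > 0")
inner_filtered = run("INNER", "o_custkey = 1 AND l_linenumber > 0")

print(left_unfiltered)
print(left_filtered == inner_filtered)
```

This equivalence is what lets an inner-join query be answered from a view defined with a left join, as in the example above.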
Query rewrite by MV supports bitmap_union and bitmap_union_count roll-up. The aggregate functions that support roll-up are listed below:
| Function in query  | Function in materialized view | Function after roll-up |
|--------------------|-------------------------------|------------------------|
| max                | max                           | max                    |
| min                | min                           | min                    |
| sum                | sum                           | sum                    |
| count              | count                         | sum                    |
| count(distinct )   | bitmap_union                  | bitmap_union_count     |
| bitmap_union       | bitmap_union                  | bitmap_union           |
| bitmap_union_count | bitmap_union                  | bitmap_union_count     |
This depends on https://github.com/apache/doris/pull/29256
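The roll-up table can be read as a lookup from (query function, MV function) to the function applied over the MV's pre-aggregated rows. The sketch below encodes that table and checks two rows of it on made-up data, modelling a bitmap as a Python `set`; `ROLLUP` and `rollup_function` are hypothetical names, not Doris identifiers.

```python
# (function in query, function in MV) -> function applied after roll-up
ROLLUP = {
    ("max", "max"): "max",
    ("min", "min"): "min",
    ("sum", "sum"): "sum",
    ("count", "count"): "sum",
    ("count(distinct)", "bitmap_union"): "bitmap_union_count",
    ("bitmap_union", "bitmap_union"): "bitmap_union",
    ("bitmap_union_count", "bitmap_union"): "bitmap_union_count",
}


def rollup_function(query_func: str, mv_func: str):
    """Return the roll-up function name, or None if the pair is unsupported."""
    return ROLLUP.get((query_func, mv_func))


# Toy MV rows grouped by (day, city); the query groups by day only.
mv_rows = [
    {"day": "d1", "city": "a", "cnt": 2, "users": {1, 2}},
    {"day": "d1", "city": "b", "cnt": 3, "users": {2, 3}},
]

# count rolls up as a sum of the partial counts.
rolled_count = sum(row["cnt"] for row in mv_rows)

# count(distinct) rolls up as the cardinality of the unioned bitmaps;
# summing per-group distinct counts would double-count user 2.
rolled_bitmap = set().union(*(row["users"] for row in mv_rows))
distinct_users = len(rolled_bitmap)

print(rollup_function("count", "count"), rolled_count, distinct_users)
```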
When an aggregate function is rolled up, we should check that the function arguments in the query and in the MV are equal.
For example, with the MV definition and query SQL below, the rewrite should not succeed, because the argument of the bitmap_union_basic field is
not equal to the argument of the `count(distinct case when o_shippriority > 10 and o_orderkey IN (1, 3) then o_custkey else null end)` field in the query
mv def:
> select l_shipdate, o_orderdate, l_partkey, l_suppkey,
> sum(o_totalprice) as sum_total,
> max(o_totalprice) as max_total,
> min(o_totalprice) as min_total,
> count(*) as count_all,
> bitmap_union(to_bitmap(case when o_shippriority > 1 and o_orderkey IN (1, 3) then o_custkey else null end)) as bitmap_union_basic
> from lineitem
> left join orders on lineitem.l_orderkey = orders.o_orderkey and l_shipdate = o_orderdate
> group by
> l_shipdate,
> o_orderdate,
> l_partkey,
> l_suppkey;
query sql:
> select t1.l_partkey, t1.l_suppkey, o_orderdate,
> sum(o_totalprice),
> max(o_totalprice),
> min(o_totalprice),
> count(*),
> count(distinct case when o_shippriority > 10 and o_orderkey IN (1, 3) then o_custkey else null end)
> from (select * from lineitem where l_shipdate = '2023-12-11') t1
> left join orders on t1.l_orderkey = orders.o_orderkey and t1.l_shipdate = o_orderdate
> group by
> o_orderdate,
> l_partkey,
> l_suppkey;
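The argument check can be sketched as a comparison of the two aggregate arguments after trivial normalization. The version below compares whitespace-normalized text only, which is an assumption made to keep the example short; a real optimizer would presumably compare expression trees. `can_roll_up` is a hypothetical helper name.

```python
import re


def normalize(expr: str) -> str:
    """Collapse runs of whitespace so that formatting differences are ignored."""
    return re.sub(r"\s+", " ", expr.strip())


def can_roll_up(query_arg: str, mv_arg: str) -> bool:
    """Allow roll-up only when both aggregate arguments are the same expression."""
    return normalize(query_arg) == normalize(mv_arg)


# Argument inside bitmap_union(to_bitmap(...)) in the MV definition.
mv_arg = ("case when o_shippriority > 1 and o_orderkey IN (1, 3) "
          "then o_custkey else null end")

# Argument inside count(distinct ...) in the query: note "> 10" versus "> 1".
query_arg = ("case when o_shippriority > 10 and o_orderkey IN (1, 3) "
             "then o_custkey else null end")

print(can_roll_up(query_arg, mv_arg))  # the arguments differ, so no rewrite
```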
Add some cases for index compaction:
1. index compaction with duplicate key table
2. index compaction with unique key table
3. optimize index compaction with multiple segments in one inverted index
Fix query rewrite by MV failing on self joins. After the fix, a query like the following can be rewritten:
def mv = """
select
a.o_orderkey,
count(distinct a.o_orderstatus) num1,
SUM(CASE WHEN a.o_orderstatus = 'o' AND a.o_shippriority = 1 AND a.o_orderdate = '2023-12-08' AND b.o_orderdate = '2023-12-09' THEN a.o_shippriority+b.o_custkey ELSE 0 END) num2,
SUM(CASE WHEN a.o_orderstatus = 'o' AND a.o_shippriority = 1 AND a.o_orderdate >= '2023-12-01' AND a.o_orderdate <= '2023-12-09' THEN a.o_shippriority+b.o_custkey ELSE 0 END) num3,
SUM(CASE WHEN a.o_orderstatus = 'o' AND a.o_shippriority in (1,2) AND a.o_orderdate >= '2023-12-08' AND b.o_orderdate <= '2023-12-09' THEN a.o_shippriority-b.o_custkey ELSE 0 END) num4,
AVG(a.o_totalprice) num5,
MAX(b.o_totalprice) num6,
MIN(a.o_totalprice) num7
from
orders a
left outer join orders b
on a.o_orderkey = b.o_orderkey
and a.o_custkey = b.o_custkey
group by a.o_orderkey;
"""
def query = """
select
a.o_orderkey,
SUM(CASE WHEN a.o_orderstatus = 'o' AND a.o_shippriority = 1 AND a.o_orderdate = '2023-12-08' AND b.o_orderdate = '2023-12-09' THEN a.o_shippriority+b.o_custkey ELSE 0 END) num2,
SUM(CASE WHEN a.o_orderstatus = 'o' AND a.o_shippriority = 1 AND a.o_orderdate >= '2023-12-01' AND a.o_orderdate <= '2023-12-09' THEN a.o_shippriority+b.o_custkey ELSE 0 END) num3,
SUM(CASE WHEN a.o_orderstatus = 'o' AND a.o_shippriority in (1,2) AND a.o_orderdate >= '2023-12-08' AND b.o_orderdate <= '2023-12-09' THEN a.o_shippriority-b.o_custkey ELSE 0 END) num4,
AVG(a.o_totalprice) num5,
MAX(b.o_totalprice) num6,
MIN(a.o_totalprice) num7
from
orders a
left outer join orders b
on a.o_orderkey = b.o_orderkey
and a.o_custkey = b.o_custkey
group by a.o_orderkey;
"""