An equality predicate on a column whose inverted index uses a parser
requires remaining_after_evaluate. In this situation, the column data must
still be read, so the read cannot be optimized away.
## Proposed changes
From (#36637)
This PR:
1. picks #35630, which was previously reverted by #36098.
2. picks #36344 from master.
These two PRs fix an existing bug in auto partition load.
---------
Co-authored-by: Kaijie Chen <ckj@apache.org>
pick from master #36316
The data type of the expression cast(xx as decimal) may be decimalv3 or
decimalv2, depending on the value of enable_decimal_conversion in the FE
conf file. If enable_decimal_conversion is true, the data type is
decimalv3(9, 0), but it was decimalv3(38, 9) in the 2.0 releases. This PR
changes the data type to match the 2.0 releases and keep the behavior
consistent.
pick from master #35773
This PR introduces an optimization that adjusts the penalty applied
during join operations based on the volume of data on the build side.
Specifically, when the number of rows and width of the tables being
joined are equal, the materialization costs are now considered more
accurately. The update ensures that joins with a larger dataset on the
build side incur a higher penalty, improving overall query performance
and resource allocation.
cherry-pick #36193
Problem:
When using a leading hint like:
leading(t1 {t2 t3} {t4 t5} t6)
the correct plan is not generated, because levellist cannot express
enough information about the braces.
Solution:
Remove the levellist representation of leading levels and use a reverse
Polish expression instead.
Algorithm:
leading(t1 {t2 t3} {t4 t5} t6)
==>
stack, top to bottom: (t1 t2 t3 join join t4 t5 join t6 join). When
generating the leading join, pop items from the stack: when an item is a
table, make a logical scan; when it is a join operator, make a logical
join and push it back onto the stack.
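
The reverse Polish approach can be sketched in a few lines of Python. This is only an illustration under assumptions, not the planner code: the function names (`to_postfix`, `build_plan`, `render`) and the tuple-based plan nodes are hypothetical, and the sketch evaluates the postfix list left to right. It emits one join marker per binary join, so six tables yield five markers.

```python
def to_postfix(hint):
    """Convert the body of a leading hint, e.g. 't1 {t2 t3} t4', to postfix."""
    tokens = hint.replace("{", " { ").replace("}", " } ").split()
    out = []
    counts = [0]  # counts[-1]: operands seen so far at the current brace level
    for tok in tokens:
        if tok == "{":
            counts.append(0)
            continue
        if tok == "}":
            counts.pop()  # the closed group becomes one operand of the outer level
        else:
            out.append(tok)
        counts[-1] += 1
        if counts[-1] > 1:
            out.append("join")  # join the new operand with what came before it
    return out


def build_plan(postfix):
    """Evaluate the postfix list with a stack of (hypothetical) plan nodes."""
    stack = []
    for item in postfix:
        if item == "join":
            right = stack.pop()
            left = stack.pop()
            stack.append(("join", left, right))
        else:
            stack.append(("scan", item))
    return stack[0]


def render(plan):
    """Print a plan node as a parenthesized join expression."""
    if plan[0] == "scan":
        return plan[1]
    return "(" + render(plan[1]) + " JOIN " + render(plan[2]) + ")"


postfix = to_postfix("t1 {t2 t3} {t4 t5} t6")
plan_text = render(build_plan(postfix))
```

Because braces only change where the join markers land in the postfix list, nested groups need no separate level bookkeeping once the list is built.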
1. `std::string` to `std::wstring` conversion only supports ASCII
characters. For non-ASCII characters, we need to use
`StringUtil::string_to_wstring`
2. Fix index_tool check_terms_stats_v2 and add field info to its output
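
To illustrate why item 1 matters, the following Python sketch contrasts a byte-by-byte widening with a real UTF-8 decode. It assumes the input bytes are UTF-8 and that the ASCII-only conversion widens each byte into one wide character; both are assumptions for illustration, and `widen_bytes` and `decode_utf8` are hypothetical names, not Doris functions.

```python
def widen_bytes(raw):
    # Naive conversion: one output character per input byte.
    return "".join(chr(b) for b in raw)


def decode_utf8(raw):
    # Proper conversion: multi-byte sequences become single characters.
    return raw.decode("utf-8")


ascii_raw = "doris".encode("utf-8")
cjk_raw = "数据".encode("utf-8")

ascii_ok = widen_bytes(ascii_raw) == decode_utf8(ascii_raw)
cjk_widened_len = len(widen_bytes(cjk_raw))
cjk_decoded_len = len(decode_utf8(cjk_raw))
```

For ASCII input both paths agree, while for the two-character CJK string the naive path produces one character per UTF-8 byte, so terms built from it would not match the indexed text.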
pick from master #36321
cherry-pick #34313 to branch-2.1
MergePercentileToArray merges several percentile calls on the same column
into one percentile_array call. For example:
select ss_item_sk, percentile(ss_quantity,0.9), percentile(ss_quantity,0.6), percentile(ss_quantity,0.3)
from store_sales group by ss_item_sk;
==>
select ss_item_sk, percentile_array(ss_quantity,[0.3,0.6,0.9]) from store_sales group by ss_item_sk;
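
The grouping step behind this rewrite can be sketched as follows. This is a minimal Python illustration under assumptions, not the rule's implementation: `merge_percentiles` is a hypothetical helper, the levels are sorted only to match the example above, and the idea that each original output is read back from an array slot is an assumption about how results are mapped.

```python
from collections import defaultdict


def merge_percentiles(calls):
    """calls: list of (column, level) pairs, one per percentile(column, level)."""
    levels_by_col = defaultdict(set)
    for col, level in calls:
        levels_by_col[col].add(level)
    # One percentile_array(column, [levels...]) per distinct column.
    merged = {col: sorted(levels) for col, levels in levels_by_col.items()}
    # For each original call, the array slot that holds its result.
    slots = [(col, merged[col].index(level)) for col, level in calls]
    return merged, slots


merged, slots = merge_percentiles(
    [("ss_quantity", 0.9), ("ss_quantity", 0.6), ("ss_quantity", 0.3)]
)
```

Here `merged` describes the single percentile_array call, and `slots` records which array element corresponds to each original percentile expression.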
Support the ipv4/ipv6 data types with inverted index, so that conjunct
expressions on IP columns using `>`, `<`, `>=`, `<=`, `in`, and `not in`
can be sped up by the inverted index.
## Proposed changes
Issue #31442
1. The seventh parameter of `ZonedDateTime.of` is in nanoseconds, so the
microsecond value must be multiplied by 1000.
2. When writing to a non-partitioned iceberg table, the data path has an
extra slash.
pick from master #34548
The modification involving CloudGlobalTransactionMgr was not picked up
to 2.1 because the 2.1 branch does not yet have the Thunderbolt
CloudGlobalTransactionMgr
## Proposed changes
This PR fixes several failing regression tests that check shape.