pick from master #36759
Multi-statement support was added by PR #3050, but the implementation
has a minor issue.
According to the MySQL developer documentation at
https://dev.mysql.com/doc/dev/mysql-server/latest/page_protocol_command_phase_sp.html#sect_protocol_command_phase_sp_multi_statement
the server should only process multiple statements when the client sets
CLIENT_MULTI_STATEMENTS. When the client does not set
CLIENT_MULTI_STATEMENTS, the server should treat the query as a single
statement.
Doris behaves slightly differently from MySQL server. Doris always
treats the query as multiple statements, but returns multiple results
only when the client sets CLIENT_MULTI_STATEMENTS. When the client does
not set CLIENT_MULTI_STATEMENTS, Doris returns only the result of the
last statement.
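The Doris behavior described above can be sketched roughly as follows. This is an illustrative model only, not Doris code: `split_statements`, `doris_results`, and the `execute` callback are hypothetical names, and the split on `;` is deliberately naive. The flag value `1 << 16` is the CLIENT_MULTI_STATEMENTS capability bit from the MySQL protocol.

```python
CLIENT_MULTI_STATEMENTS = 1 << 16  # capability bit defined by the MySQL protocol


def split_statements(sql):
    # Naive split on ';' (ignores quotes and comments); for illustration only.
    return [s.strip() for s in sql.split(";") if s.strip()]


def doris_results(sql, capabilities, execute):
    # Hypothetical model: every statement is always executed...
    results = [execute(stmt) for stmt in split_statements(sql)]
    # ...but all results are returned only if the client set the flag.
    if capabilities & CLIENT_MULTI_STATEMENTS:
        return results
    # Otherwise only the last statement's result is returned.
    return results[-1:]
```

For example, with `execute=str.upper` standing in for real execution, `doris_results("select 1; select 2", 0, str.upper)` yields only the result of `select 2`.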
Co-authored-by: Mingyu Chen <morningman.cmy@gmail.com>
pick from master #36478
Introduce a new rule, VARIANT_SUB_PATH_PRUNING, to prune variant sub
paths. For example, if variant slot v in table t has two sub paths, 'c1'
and 'c2', then after this rule select v['c1'] from t scans only sub path
'c1' of v, which reduces scan time.
This rule accomplishes all the work using two components. The Collector
traverses from the top down, collecting all the element_at functions on
the variant types, and recording the required path from the original
variant slot to the current element_at. The Replacer traverses from the
bottom up, generating the slots for the required sub path on scan,
union, and cte consumer. Then, it replaces the element_at with the
corresponding slot.
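A much simplified sketch of the collect-then-replace idea, assuming expressions are modeled as nested tuples. The tuple encoding, the function names, and the `v.c1` naming of generated slots are all assumptions made for illustration; the real rule works on plan nodes and also handles union and CTE consumers, which this sketch does not.

```python
def path_of(expr):
    """Return (root_slot, path) if expr is a chain of element_at calls over a slot."""
    keys = []
    while expr[0] == "element_at":
        keys.append(expr[2])
        expr = expr[1]
    if expr[0] == "slot" and keys:
        return expr[1], tuple(reversed(keys))
    return None


def collect(exprs):
    # Collector: record every required sub path per variant slot.
    required = {}
    for e in exprs:
        found = path_of(e)
        if found:
            required.setdefault(found[0], set()).add(found[1])
    return required


def replace(exprs, required):
    # Replacer: create one slot per required sub path, then swap it in.
    sub_slots = {
        (slot, path): f"{slot}.{'.'.join(path)}"
        for slot, paths in required.items()
        for path in paths
    }
    rewritten = []
    for e in exprs:
        found = path_of(e)
        rewritten.append(("slot", sub_slots[found]) if found in sub_slots else e)
    return rewritten, sorted(sub_slots.values())
```

With `select v['c1'] from t` modeled as `[("element_at", ("slot", "v"), "c1")]`, the scan only needs to produce the single sub-path slot.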
cherry-pick #36161 to branch-2.1
The NormalizeAggregate rewrite logic has a bug for SQL like this:
SELECT
CASE
1 WHEN CAST( NULL AS SIGNED ) THEN NULL
WHEN COUNT( DISTINCT CAST( NULL AS SIGNED ) ) THEN NULL
ELSE null
END ;
This is the plan after NormalizeAggregate. The LogicalAggregate outputs
only `count(DISTINCT cast(NULL as SIGNED))`#3 and does not output
cast(NULL as SIGNED)#2, but the upper project uses cast(NULL as
SIGNED)#2, so Doris reports the error "cast(NULL as SIGNED) not in
aggregate's output".
LogicalResultSink[29] ( outputExprs=[__case_when_0#1] )
+--LogicalProject[26] ( distinct=false, projects=[CASE WHEN (1 = cast(NULL as SIGNED)#2) THEN NULL WHEN (1 = count(DISTINCT cast(NULL as SIGNED))#3) THEN NULL ELSE NULL END AS `CASE WHEN (1 = cast(NULL as SIGNED)) THEN NULL WHEN (1 = count(DISTINCT cast(NULL as SIGNED))) THEN NULL ELSE NULL END`#1], excepts=[] )
   +--LogicalAggregate[25] ( groupByExpr=[], outputExpr=[count(DISTINCT cast(NULL as SIGNED)#2) AS `count(DISTINCT cast(NULL as SIGNED))`#3], hasRepeat=false )
      +--LogicalProject[24] ( distinct=false, projects=[cast(NULL as SIGNED) AS `cast(NULL as SIGNED)`#2], excepts=[] )
         +--LogicalOneRowRelation ( projects=[0 AS `0`#0] )
The problem is that cast(NULL as SIGNED)#2 should not be output by
LogicalAggregate; cast(NULL as SIGNED) should be computed in
LogicalProject.
This PR changes the rewrite logic for the upper project's projections:
aggregateOutputs is rewritten and becomes the projections of the
upper-level LogicalProject. During rewriting, expressions inside an
aggregate function can be rewritten with aggregate function arguments
and group by expressions, but expressions outside an aggregate function
can only be rewritten with group by expressions.
---------
Co-authored-by: moailing <moailing@selectdb.com>
This PR:
1. picks #35630, which was previously reverted in #36098.
2. picks #36344 from master.
These two PRs fix existing bugs in auto partition load.
---------
Co-authored-by: Kaijie Chen <ckj@apache.org>
pick from master #36316
The datatype of the expression cast( xx as decimal ) may be decimalv3 or
decimalv2, depending on the value of enable_decimal_conversion in the FE
conf file. If enable_decimal_conversion is true, the datatype is
decimalv3(9, 0), but it was decimalv3(38, 9) in 2.0 releases. This PR
changes the datatype to match the 2.0 releases to keep the behavior
consistent.
bp: #36432
## Proposed changes
## Fixed Bugs introduced from #34307
1. `FileSystemCacheKey.equals()` compares properties with `==`, so a new
file system is created for each partition.
2. `dfsFileSystem` is not synchronized, so more file systems are created
than needed.
3. `jobConf.iterator()` produces more than 2000 key-value pairs.
bp #36045, and turn on batch split, which was turned off in #36109.
Generate and get split batches concurrently.
Remove the synchronization from `SplitSource.getNextBatch` so that each caller gets its splits concurrently, and have `SplitAssignment` generate splits asynchronously.
pick from master #35773
This PR introduces an optimization that adjusts the penalty applied
during join operations based on the volume of data on the build side.
Specifically, when the number of rows and width of the tables being
joined are equal, the materialization costs are now considered more
accurately. The update ensures that joins with a larger dataset on the
build side incur a higher penalty, improving overall query performance
and resource allocation.
cherry-pick #36193
Problem:
When using a leading hint like:
leading(t1 {t2 t3} {t4 t5} t6)
the correct plan is not generated, because the levellist cannot carry
enough information about the braces.
Solution:
Remove the levellist representation of leading levels and use a reverse
Polish expression instead.
Algorithm:
leading(t1 {t2 t3} {t4 t5} t6)
==>
stack, top to bottom: (t1 t2 t3 join join t4 t5 join join t6 join)
When generating the leading join, pop items from the stack: when the
item is a table, make a LogicalScan; when it is a join operator, make a
LogicalJoin and push it back onto the stack.
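A minimal sketch of the reverse Polish idea, assuming a well-formed hint body and that no table is literally named `join`. The function names and the tuple plan representation are hypothetical, not the Doris implementation. Six tables need five join operators, and the postfix sequence produced below reflects that.

```python
import re


def tokenize(hint_body):
    # Split "t1 {t2 t3} {t4 t5} t6" into table names and braces.
    return re.findall(r"[{}]|[^\s{}]+", hint_body)


def to_postfix(tokens):
    out = []
    counts = [0]  # operands seen so far at each open brace level
    for tok in tokens:
        if tok == "{":
            counts.append(0)
            continue
        if tok == "}":
            counts.pop()  # the closed group becomes one operand of its parent
        else:
            out.append(tok)
        counts[-1] += 1
        if counts[-1] > 1:
            out.append("join")  # join the new operand with everything before it
    return out


def build_plan(postfix):
    stack = []
    for tok in postfix:
        if tok == "join":
            right = stack.pop()
            left = stack.pop()
            stack.append(("join", left, right))  # push the join back
        else:
            stack.append(("scan", tok))
    if len(stack) != 1:
        raise ValueError("malformed postfix sequence")
    return stack[0]
```

Evaluating the postfix sequence with a stack keeps the brace grouping without any explicit level bookkeeping: `{t2 t3}` is joined first and then joined to `t1` as a single operand.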
cherry pick from #36326
An exception is thrown in TryEliminateUninterestedPredicates for this case:
CREATE TABLE `tbltest` (
`id` INT NULL,
`col2` VARCHAR(255) NULL,
`col3` VARCHAR(255) NULL,
`dt` DATE NULL
) ENGINE=OLAP
DUPLICATE KEY(`id`, `col2`)
PARTITION BY RANGE(`dt`)
(PARTITION p20240617 VALUES [('2024-06-17'), ('2024-06-18')))
DISTRIBUTED BY HASH(`id`) BUCKETS 10
PROPERTIES (
"replication_allocation" = "tag.location.default: 1"
);
select * from tbltest
where
case
when col2 = 'xxx' and col3='yyy' then false -- note this is not about partition column
when col2 in ('xxx') then false
when col2 like 'xxx%' then false
else true
end
`CaseWhen` requires its children to be `WhenClause`s, but TryEliminateUninterestedPredicates may rewrite a WhenClause into a true/false predicate, which causes this exception:
ERROR 1105 (HY000): errCode = 2, detailMessage = The children format needs to be [WhenClause+, DefaultValue?]
The original extractor (TryEliminateUninterestedPredicates.java) caused some errors while trying to derive the expressions that can be used for pruning partitions.
I wrote a new extractor (with unit tests) for pruning partitions; I think it is simpler and more reliable.
The theory of the extractor is simple:
A: Sort expressions into two kinds:
1. evaluable-expression (marked as E).
Expressions that can be evaluated in the partition pruning stage.
In other words: they contain no non-partition slots and no non-deterministic expressions.
2. un-evaluable-expression (marked as UE).
Expressions that can NOT be evaluated in the partition pruning stage.
In other words: they contain non-partition slots or non-deterministic expressions.
B: Traverse the predicate, looking only at AND and OR operators, and apply these rules:
(E and UE) -> (E and TRUE) -> E
(UE and UE) -> TRUE
(E and E) -> (E and E)
(E or UE) -> TRUE
(UE or UE) -> TRUE
(E or E) -> (E or E)
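The rules above can be sketched as a small recursive function. This is a toy model under stated assumptions, not the extractor in the PR: the `Pred`, `And`, and `Or` classes and the `extract` function are hypothetical, each leaf carries the set of slots it references, and anything other than AND/OR (for example a whole CASE WHEN) is treated as a single leaf.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pred:
    text: str
    slots: frozenset
    deterministic: bool = True


@dataclass(frozen=True)
class And:
    left: object
    right: object


@dataclass(frozen=True)
class Or:
    left: object
    right: object


TRUE = Pred("TRUE", frozenset())


def extract(expr, partition_slots):
    if isinstance(expr, And):
        left = extract(expr.left, partition_slots)
        right = extract(expr.right, partition_slots)
        if left == TRUE:
            return right  # (UE and E) -> E, (UE and UE) -> TRUE
        if right == TRUE:
            return left  # (E and UE) -> E
        return And(left, right)  # (E and E) stays as is
    if isinstance(expr, Or):
        left = extract(expr.left, partition_slots)
        right = extract(expr.right, partition_slots)
        if left == TRUE or right == TRUE:
            return TRUE  # any UE under OR makes the whole OR un-evaluable
        return Or(left, right)  # (E or E) stays as is
    # Leaf: evaluable only if deterministic and built from partition slots only.
    evaluable = expr.deterministic and expr.slots <= partition_slots
    return expr if evaluable else TRUE
```

Because a CASE WHEN is a leaf here, it is either kept whole or replaced by TRUE, so its WhenClause children are never rewritten individually.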
Currently the index change job and the clone task can run at the same
time. If the clone task gets stuck, the index change job gets stuck as
well and keeps retrying. To solve this problem, we follow the approach
of the alter job: make the index change job mutually exclusive with the
clone task, and introduce a timeout to prevent infinite retries of build
index.
Add the following checks and status in FE.
1. Check if table is stable (build index is not allowed when clone is in
progress)
1.1. The tablet is HEALTHY.
1.2. The tablet is not in the tablet scheduler; if it is, the tablet is
currently being cloned.
2. When creating the index change job, set the timeout at the same time.
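The two checks can be sketched as below. This is a hypothetical model, not FE code: `Tablet`, `table_is_stable`, `IndexChangeJob`, and the field names are invented for illustration, and the timeout value is left to the caller.

```python
from dataclasses import dataclass


@dataclass
class Tablet:
    tablet_id: int
    healthy: bool


def table_is_stable(tablets, scheduled_tablet_ids):
    # Stable means every tablet is HEALTHY and none is queued for clone.
    return all(
        t.healthy and t.tablet_id not in scheduled_tablet_ids for t in tablets
    )


@dataclass
class IndexChangeJob:
    created_at: float  # seconds, set when the job is created
    timeout_s: float   # set at creation time, as in check 2

    def is_timed_out(self, now):
        # A timed-out job should stop retrying instead of looping forever.
        return now - self.created_at > self.timeout_s
```

A build index request would be rejected while `table_is_stable` returns False, and a running job would be cancelled once `is_timed_out` returns True.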
pick from master #35724