pick from master #36759
Multi-statement support was added by PR #3050,
but there is a minor issue in the implementation.
According to the MySQL developer documentation at
https://dev.mysql.com/doc/dev/mysql-server/latest/page_protocol_command_phase_sp.html#sect_protocol_command_phase_sp_multi_statement
the server should process multiple statements only
when the client sets CLIENT_MULTI_STATEMENTS.
When the client does not set CLIENT_MULTI_STATEMENTS, the server should
treat the query as a single statement.
Doris behaves slightly differently from the MySQL server. Doris always
treats the query as multiple statements, but returns multiple results only
when the client sets CLIENT_MULTI_STATEMENTS. When the client does not set
CLIENT_MULTI_STATEMENTS, Doris returns only the result of the last statement.
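The difference between the two behaviors can be sketched roughly as follows.
This is an illustrative model only, not Doris or MySQL code: `split_statements`,
`run`, `mysql_style` and `doris_style` are hypothetical names, and the naive
split on `;` stands in for a real parser. The flag value is the one defined by
the MySQL protocol.

```python
CLIENT_MULTI_STATEMENTS = 1 << 16  # capability flag bit in the MySQL protocol


def split_statements(query):
    # Naive split on ';' for illustration; a real parser must respect
    # quotes and comments.
    return [s.strip() for s in query.split(";") if s.strip()]


def run(stmt):
    # Hypothetical stand-in for executing one statement.
    return f"result of <{stmt}>"


def mysql_style(query, client_flags):
    # MySQL: split into multiple statements only when the client set the flag.
    if client_flags & CLIENT_MULTI_STATEMENTS:
        return [run(s) for s in split_statements(query)]
    # Otherwise the whole text is handed over as one statement
    # (a real server would typically reject the embedded ';').
    return [run(query.strip())]


def doris_style(query, client_flags):
    # Doris: always execute every statement ...
    results = [run(s) for s in split_statements(query)]
    if client_flags & CLIENT_MULTI_STATEMENTS:
        return results
    # ... but return only the last result when the flag is not set.
    return results[-1:]
```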
Co-authored-by: Mingyu Chen <morningman.cmy@gmail.com>
pick from master #36478
Introduce a new rule, VARIANT_SUB_PATH_PRUNING, to prune variant sub paths.
For example, if variant slot v in table t has two sub paths, 'c1' and 'c2',
then with this rule `select v['c1'] from t` scans only sub path 'c1'
of v, which reduces scan time.
This rule accomplishes all the work using two components. The Collector
traverses from the top down, collecting all the element_at functions on
the variant types, and recording the required path from the original
variant slot to the current element_at. The Replacer traverses from the
bottom up, generating the slots for the required sub paths on scan,
union, and CTE consumer. Then, it replaces the element_at with the
corresponding slot.
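A toy model of the collect-then-replace idea, assuming expressions are plain
tuples such as `("element_at", child, key)` and `("slot", name)`. All names
here (`path_of`, `collect`, `make_slots`, `replace`) are hypothetical, and the
sketch works on a single expression tree rather than on a plan, so it does not
model the top-down and bottom-up plan traversals of the actual rule.

```python
def path_of(expr):
    """Return (root_slot, path) if expr is element_at calls chained on a slot."""
    keys = []
    while isinstance(expr, tuple) and expr[0] == "element_at":
        keys.append(expr[2])
        expr = expr[1]
    if isinstance(expr, tuple) and expr[0] == "slot":
        return expr[1], tuple(reversed(keys))
    return None


def collect(expr, required):
    """Record the required sub path for every element_at chain on a slot."""
    if not isinstance(expr, tuple):
        return
    hit = path_of(expr)
    if hit and hit[1]:
        required.setdefault(hit[0], set()).add(hit[1])
        return
    for child in expr[1:]:
        collect(child, required)


def make_slots(required):
    """Create one (hypothetically named) slot per required sub path."""
    return {(root, path): root + "." + ".".join(path)
            for root, paths in required.items() for path in paths}


def replace(expr, slots):
    """Swap each collected element_at chain for its sub-path slot."""
    if not isinstance(expr, tuple):
        return expr
    hit = path_of(expr)
    if hit and hit in slots:
        return ("slot", slots[hit])
    return (expr[0],) + tuple(replace(c, slots) for c in expr[1:])
```

With `v['c1'] + 1` encoded as
`("add", ("element_at", ("slot", "v"), "c1"), ("literal", 1))`, `collect`
records only the path `('c1',)` for `v`, so a scan would need to produce just
that one sub path.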
cherry-pick #36161 to branch-2.1
The NormalizeAggregate rewrite logic has a bug. For SQL like this:
SELECT
CASE
1 WHEN CAST( NULL AS SIGNED ) THEN NULL
WHEN COUNT( DISTINCT CAST( NULL AS SIGNED ) ) THEN NULL
ELSE null
END ;
The plan after NormalizeAggregate is shown below. The LogicalAggregate
outputs only `count(DISTINCT cast(NULL as SIGNED))`#3 and does not output
cast(NULL as SIGNED)#2, but the upper project uses cast(NULL as SIGNED)#2,
so Doris reports the error "cast(NULL as SIGNED) not in aggregate's output".
LogicalResultSink[29] ( outputExprs=[__case_when_0#1] )
+--LogicalProject[26] ( distinct=false, projects=[CASE WHEN (1 = cast(NULL as SIGNED)#2) THEN NULL WHEN (1 = count(DISTINCT cast(NULL as SIGNED))#3) THEN NULL ELSE NULL END AS `CASE WHEN (1 = cast(NULL as SIGNED)) THEN NULL WHEN (1 = count(DISTINCT cast(NULL as SIGNED))) THEN NULL ELSE NULL END`#1], excepts=[] )
   +--LogicalAggregate[25] ( groupByExpr=[], outputExpr=[count(DISTINCT cast(NULL as SIGNED)#2) AS `count(DISTINCT cast(NULL as SIGNED))`#3], hasRepeat=false )
      +--LogicalProject[24] ( distinct=false, projects=[cast(NULL as SIGNED) AS `cast(NULL as SIGNED)`#2], excepts=[] )
         +--LogicalOneRowRelation ( projects=[0 AS `0`#0] )
The problem is that cast(NULL as SIGNED)#2 should not be output by
LogicalAggregate; cast(NULL as SIGNED) should be computed in
LogicalProject.
This PR changes the rewrite logic for the upper project's projections:
aggregateOutputs is rewritten and becomes the upper-level LogicalProject
projections. During the rewriting process, expressions inside an
aggregate function can be rewritten with both the aggregate function
arguments and the group by expressions, but expressions outside an
aggregate function can be rewritten only with the group by expressions.
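The rewrite rule can be sketched on a toy expression tree. This is a
simplified illustration under assumed conventions, not the NormalizeAggregate
implementation: expressions are tuples, `AGG_FUNCS`, `rewrite` and the slot
strings are hypothetical, and the later step that replaces the aggregate
function itself with its output slot is not modeled.

```python
AGG_FUNCS = {"count", "sum", "min", "max", "avg"}  # assumed set, for illustration


def rewrite(expr, group_by_slots, agg_arg_slots, inside_agg=False):
    """Replace sub-expressions with slots produced below the aggregate."""
    # Group by expressions may be substituted anywhere.
    if expr in group_by_slots:
        return group_by_slots[expr]
    # Aggregate-argument slots may be substituted only inside an aggregate call.
    if inside_agg and expr in agg_arg_slots:
        return agg_arg_slots[expr]
    if not isinstance(expr, tuple):
        return expr
    inside = inside_agg or expr[0] in AGG_FUNCS
    return (expr[0],) + tuple(
        rewrite(child, group_by_slots, agg_arg_slots, inside)
        for child in expr[1:]
    )
```

For the failing query, with `cast = ("cast", "NULL", "SIGNED")` registered
only as an aggregate argument slot, the `cast` inside `count(...)` is replaced
by its slot while the `cast` compared directly against `1` is left as an
expression for the upper project to compute.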
---------
Co-authored-by: moailing <moailing@selectdb.com>
## Proposed changes
Fix a type-check error reported by UBSan.
```
/root/doris/be/src/vec/exec/format/parquet/fix_length_plain_decoder.h:75:78: runtime error: member call on address 0x5582f35db5c0 which does not point to an object of type 'doris::vectorized::ColumnVector<signed char>'
0x5582f35db5c0: note: object is of type 'doris::vectorized::ColumnVector<int>'
83 55 00 00 78 c0 b0 5a 82 55 00 00 02 00 00 00 00 00 00 00 10 a0 00 d7 83 55 00 00 10 a0 00 d7
^~~~~~~~~~~~~~~~~~~~~~~
vptr for 'doris::vectorized::ColumnVector<int>'
doris::Status doris::vectorized::FixLengthPlainDecoder::_decode_values<false>(COW<doris::vectorized::IColumn>::mutable_ptr<doris::vectorized::IColumn>&, std::shared_ptr<doris::vectorized::IDataType const>&, doris::vectorized::ColumnSelectVector&, bool) at fix_length_plain_decoder.h:75:78
```
An equality predicate on a column that has an inverted index with a parser
requires remaining_after_evaluate. In this situation, the optimization that
avoids reading the column data cannot be applied.
## Proposed changes
From (#36637)