if left bucket has no data, we do not generate left bucket instance.
These join should reserve all right side data. But because left instance
is not exists. So right data will be discard since no dest be set.
We ban these join temporarily until we could generate all instance
for left side in Coordinator.
For better performance and elasticity, we move memtable from loadchannel to
sink, VTabletSinkV2 is introduced, then there are VTabletWriter and
VTabletSinkV2 distributing rows to tablets. where clauses on mvs are
executed in VTabletWriter, while VTabletSinkV2 needs it too. So common code
is moved to row distribution.
Actually, we can layer code by rows' data flow, then the code is much more
understood and maintainable.
ScanNode -> Sink/Writer (RowDistribution -> IndexChannel / DeltaWriter)
In the workflow Code Checks, we use the event pull_request_target which has write permission to enable the actions to comment on our PRs. We should be careful with the write permission and must forbid from running any user code. The previous PR #24761 tried its best to achieve this goal.
However, there is a scenario lacking of consideration (See #26494). #26494 attacks the workflow by git submodule way. This PR fixes this scenario by checkouting the external action explicitly in the workflow.
1. not collect partition stats anymore
2. merge insert of stats
3. delete period collector since it is useless
4. remove enable_auto_sample
5. move some config related to stats to global session variable
Before this PR, when analyze a table, the insert count equals column count times 2
After this PR, insert count of analyze table would reduce to column count / insert_merge_item_count.
According to my test, when analyzing tpch lineitem, the insert sql count is 1
In the previous implementation, `bin/start_fe.sh --version` will
complain that "Frontend running as process xxx. Stop it first."
To show version
1. `bin/start_fe.sh --version` will print version info to fe.out
2. `bin/start_fe.sh --console --version` will print version info to stdout
Doris FE will check if stream load http request has auth token after checking password failed;
Plugin audit-log loader can use auth token if plugin config set use_auth_token to true
Co-authored-by: Mingyu Chen <morningman.cmy@gmail.com>
consider sql having in-subquery
SELECT count(*)
FROM sub_query_correlated_subquery6
WHERE k1 IN
(SELECT k1
FROM
(**SELECT k1,
sum(k3) AS bbb,
count(k2) AS aaa
FROM sub_query_correlated_subquery7
WHERE k1 > 0
AND k3 > 0
GROUP BY k1** ) y
WHERE y.aaa>0
AND k1>1);
The subquery part having agg is un-correlated, which can be unnested.
on the other side:
SELECT count(*)
FROM sub_query_correlated_subquery6
WHERE k1 IN
(SELECT k1
FROM
(**SELECT k1,
sum(k3) AS bbb,
count(k2) AS aaa
FROM sub_query_correlated_subquery7
WHERE k1 > 0
AND k3 > 0 and sub_query_correlated_subquery6.k1 > 2
GROUP BY k1** ) y
WHERE y.aaa>0
AND k1>1);
The subquery part having agg is correlated, which can't be unnested.
`VFileScanner` will try to append late arrival runtime filters in each loop of `ScannerScheduler::_scanner_scan`. However, `VFileScanner::_get_next_reader` only generates the `_push_down_conjuncts` in the first loop, so the late arrival runtime filters are ignored.