disable is null predicate constant fold rule for inline view
consider sql
select c.*
from (
select a.*, b.x
from test_insert a left join
(select 'some_const_str' x from test_insert) b on true
) c
where c.x is null;
when push “c.x is null” into c, after folding constant rule, it will get empty result. Because x is 'some_const_str' and "x is null" will be evaluated to false. This is wrong.
Optimize orc/parquet string dict filter in not_single_conjunct case. We can optimize this processing to filter block firstly by dict code, then filter by not_single_conjunct. Because dict code is int, it will filter faster than string.
For example:
```
select count(l_receiptdate) from lineitem_date_as_string where l_shipmode in ('MAIL', 'SHIP') and l_commitdate < l_receiptdate and l_receiptdate >= '1994-01-01' and l_receiptdate < '1995-01-01';
```
`l_receiptdate` and `l_shipmode` will using string dict filtering, and `l_commitdate < l_receiptdate` is the an not_single_conjunct which contains dict filter field. We can optimize this processing to filter block firstly by dict code, then filter by not_single_conjunct. Because dict code is int, it will filter faster than string.
### Test Result:
Before:
mysql> select count(l_receiptdate) from lineitem_date_as_string where l_shipmode in ('MAIL', 'SHIP') and l_commitdate < l_receiptdate and l_receiptdate >= '1994-01-01' and l_receiptdate < '1995-01-01';
+----------------------+
| count(l_receiptdate) |
+----------------------+
| 49314694 |
+----------------------+
1 row in set (6.87 sec)
After:
mysql> select count(l_receiptdate) from lineitem_date_as_string where l_shipmode in ('MAIL', 'SHIP') and l_commitdate < l_receiptdate and l_receiptdate >= '1994-01-01' and l_receiptdate < '1995-01-01';
+----------------------+
| count(l_receiptdate) |
+----------------------+
| 49314694 |
+----------------------+
1 row in set (4.85 sec)
forbid one phase agg for pattern: agg-unionAll
one phase agg plan: agg-union-hashDistribute-children
two phase agg plan: agg(global) - hashDistribute-agg(local)-union-randomDistribute
the key point is the cost of randomDistribute is much lower than the hashDistribute, and hence two-phase agg wins.
if left bucket has no data, we do not generate left bucket instance.
These join should reserve all right side data. But because left instance
is not exists. So right data will be discard since no dest be set.
We ban these join temporarily until we could generate all instance
for left side in Coordinator.
For better performance and elasticity, we move memtable from loadchannel to
sink, VTabletSinkV2 is introduced, then there are VTabletWriter and
VTabletSinkV2 distributing rows to tablets. where clauses on mvs are
executed in VTabletWriter, while VTabletSinkV2 needs it too. So common code
is moved to row distribution.
Actually, we can layer code by rows' data flow, then the code is much more
understood and maintainable.
ScanNode -> Sink/Writer (RowDistribution -> IndexChannel / DeltaWriter)
In the workflow Code Checks, we use the event pull_request_target which has write permission to enable the actions to comment on our PRs. We should be careful with the write permission and must forbid from running any user code. The previous PR #24761 tried its best to achieve this goal.
However, there is a scenario lacking of consideration (See #26494). #26494 attacks the workflow by git submodule way. This PR fixes this scenario by checkouting the external action explicitly in the workflow.
1. not collect partition stats anymore
2. merge insert of stats
3. delete period collector since it is useless
4. remove enable_auto_sample
5. move some config related to stats to global session variable
Before this PR, when analyze a table, the insert count equals column count times 2
After this PR, insert count of analyze table would reduce to column count / insert_merge_item_count.
According to my test, when analyzing tpch lineitem, the insert sql count is 1
In the previous implementation, `bin/start_fe.sh --version` will
complain that "Frontend running as process xxx. Stop it first."
To show version
1. `bin/start_fe.sh --version` will print version info to fe.out
2. `bin/start_fe.sh --console --version` will print version info to stdout
Doris FE will check if stream load http request has auth token after checking password failed;
Plugin audit-log loader can use auth token if plugin config set use_auth_token to true
Co-authored-by: Mingyu Chen <morningman.cmy@gmail.com>