Use id as where predicate to load column statistic cache. This could improve performance, because id is the first order key in column statistics table.
Remove retry load when load stats cache fail. This case usually happens when BE is down or BE OOM, retry doesn't work in these cases and may increase BE work load.
We change memtable size from 200MB to 100MB to achieve smoother flush
performance. We change loadStreamPerNode from 20 to 60 to avoid stream
rpc to be the bottleneck when enable memtable_on_sink_node. We change
default s3&broker load parallelsim to make the most of CPUs on moderm
multi-core systems.
Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>
project child not always NamedExpression
failed msg
```
org.apache.doris.common.AnalysisException: errCode = 2, detailMessage = class org.apache.doris.nereids.trees.expressions.literal.VarcharLiteral cannot be cast to class org.apache.doris.nereids.trees.expressions.NamedExpression (org.apache.doris.nereids.trees.expressions.literal.VarcharLiteral and org.apache.doris.nereids.trees.expressions.NamedExpression are in unnamed module of loader 'app')
at org.apache.doris.qe.StmtExecutor.executeByNereids(StmtExecutor.java:623) ~[classes/:?]
at org.apache.doris.qe.StmtExecutor.execute(StmtExecutor.java:478) ~[classes/:?]
at org.apache.doris.qe.StmtExecutor.execute(StmtExecutor.java:457) ~[classes/:?]
at org.apache.doris.qe.ConnectProcessor.handleQuery(ConnectProcessor.java:245) ~[classes/:?]
at org.apache.doris.qe.MysqlConnectProcessor.handleQuery(MysqlConnectProcessor.java:166) ~[classes/:?]
at org.apache.doris.qe.MysqlConnectProcessor.dispatch(MysqlConnectProcessor.java:193) ~[classes/:?]
at org.apache.doris.qe.MysqlConnectProcessor.processOnce(MysqlConnectProcessor.java:246) ~[classes/:?]
at org.apache.doris.mysql.ReadListener.lambda$handleEvent$0(ReadListener.java:52) ~[classes/:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
at java.lang.Thread.run(Thread.java:829) ~[?:?]
Caused by: java.lang.ClassCastException: class org.apache.doris.nereids.trees.expressions.literal.VarcharLiteral cannot be cast to class org.apache.doris.nereids.trees.expressions.NamedExpression (org.apache.doris.nereids.trees.expressions.literal.VarcharLiteral and org.apache.doris.nereids.trees.expressions.NamedExpression are in unnamed module of loader 'app')
at org.apache.doris.nereids.trees.plans.physical.PhysicalSetOperation.pushDownRuntimeFilter(PhysicalSetOperation.java:178) ~[classes/:?]
at org.apache.doris.nereids.trees.plans.physical.PhysicalHashJoin.pushDownRuntimeFilter(PhysicalHashJoin.java:229) ~[classes/:?]
at org.apache.doris.nereids.processor.post.RuntimeFilterGenerator.pushDownRuntimeFilterCommon(RuntimeFilterGenerator.java:386) ~[classes/:?]
```
We will Integrate new load job manager into new job scheduling framework so that the insert into task can be scheduled after the broker load sql is converted to insert into TVF(table value function) sql.
issue: https://github.com/apache/doris/issues/24221
Now support:
1. load data by tvf insert into sql, but just for simple load(columns need to be defined in the table)
2. show load stmt
- job id, label name, job state, time info
- simple progress
3. cancel load from db
4. support that enable new load through Config.enable_nereids_load
5. can replay job after restarting doris
TODO:
- support partition insert job
- support show statistics from BE
- support multiple task and collect task statistic
- support transactional task
- need add ut case
* [improvement] (nereids) Get partition related table disable nullable field and modify regression test, complete agg mv rules.
* make filed not null to create partition mv
Fix partition name NPE and sample for all table during auto analyze.
Sample for all tables because getData may have latency, which may cause full analyze a huge table and use too much resource. Sample for all tables to avoid this. Will improve the strategy later.