## Proposed changes
1. When checking whether a data type can be applied, do not throw an exception if the actual data type is a subclass of the signature data type.
2. Merge `SlotBinder` and `FunctionBinder` into `ExpressionAnalyzer` to avoid rewriting the whole expression tree multiple times.
3. `ExpressionAnalyzer.buildCustomSlotBinderAnalyzer()` provides finer-grained code for binding slots from different parts with different priorities.
4. The original slot binder has O(n^2) complexity; this PR uses `Scope.nameToSlot` to support O(n) binding.
5. Replace some `Collection.stream()` calls with `ImmutableXxx.builder()` to remove method calls that are difficult for the JVM to inline in the hot path, e.g. `Expression.<init>` and `AbstractTreeNode.<init>`.
6. Replace some `ImmutableXxx.copyOf(xxx)` calls with `Utils.fastToImmutableList(xxx)` to skip an additional copy of the array.
7. Set the initial size of `Immutable.builder()` to skip useless resizing.
8. Lazily compute and cache some heavy operations, such as `Scope.nameToSlot` and `CaseWhen.computeDataTypesForCoercion()`.
(cherry picked from commit 83c2f5a95827136aac4f0a78c5e841e9a099858c)
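Items 4 and 8 above can be illustrated with a minimal sketch. It is not the actual `Scope` implementation: `ScopeSketch`, `findSlots`, and the use of plain strings as slots are hypothetical stand-ins, and exact-name matching is an assumption. The idea is that the name index is built once, on first use, in a single pass, so each later lookup is a hash probe instead of a scan over all slots.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a scope that owns an ordered list of slot names. */
final class ScopeSketch {
    private final List<String> slots;
    // Built lazily on first lookup, then reused (not thread-safe in this sketch).
    private Map<String, List<Integer>> nameToSlot;

    ScopeSketch(List<String> slots) {
        this.slots = new ArrayList<>(slots);
    }

    /** Builds the name index in one O(n) pass over the slots, only once. */
    private Map<String, List<Integer>> nameToSlot() {
        if (nameToSlot == null) {
            // Pre-size the map so it does not need to resize while filling.
            Map<String, List<Integer>> index = new HashMap<>(slots.size() * 2);
            for (int i = 0; i < slots.size(); i++) {
                index.computeIfAbsent(slots.get(i), k -> new ArrayList<>(1)).add(i);
            }
            nameToSlot = index;
        }
        return nameToSlot;
    }

    /** Returns the positions of all slots with exactly this name. */
    List<Integer> findSlots(String name) {
        return nameToSlot().getOrDefault(name, List.of());
    }

    public static void main(String[] args) {
        ScopeSketch scope = new ScopeSketch(List.of("id", "name", "id"));
        System.out.println(scope.findSlots("id"));
        System.out.println(scope.findSlots("missing"));
    }
}
```

Binding n names against a scope this way costs one index build plus n lookups, rather than n scans over the slot list.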
Add a materialized view availability regression test.
When the MV's refresh_time is within the grace_period (unit: seconds), the materialized view is used for query rewrite regardless of whether the base table has been updated.
When the MV's refresh_time is outside the grace_period (unit: seconds), the base table is checked for updates; if it has been updated, the materialized view is not used for query rewrite.
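The rule above can be sketched as a small decision function. This is only an illustration: the class and parameter names are hypothetical, "within the grace period" is assumed to mean that the time elapsed since the last refresh is at most grace_period, and treating the boundary as inclusive is also an assumption.

```java
/** Hypothetical sketch of the grace_period availability rule. */
final class MvGracePeriodSketch {
    /**
     * @param nowSeconds          current time, in seconds
     * @param refreshTimeSeconds  time of the last MV refresh, in seconds
     * @param gracePeriodSeconds  the grace_period setting, in seconds
     * @param baseTableUpdated    whether the base table changed since the refresh
     */
    static boolean canUseForRewrite(long nowSeconds, long refreshTimeSeconds,
                                    long gracePeriodSeconds, boolean baseTableUpdated) {
        // Assumption: the boundary itself still counts as "within".
        boolean withinGrace = nowSeconds - refreshTimeSeconds <= gracePeriodSeconds;
        if (withinGrace) {
            // Inside the grace period the base table state is ignored.
            return true;
        }
        // Outside the grace period the MV is usable only if the base table is unchanged.
        return !baseTableUpdated;
    }

    public static void main(String[] args) {
        System.out.println(canUseForRewrite(100, 90, 20, true));
        System.out.println(canUseForRewrite(100, 50, 20, true));
        System.out.println(canUseForRewrite(100, 50, 20, false));
    }
}
```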
The `ADMIN SHOW` statements cannot be executed with newer versions of the MySQL 8.x JDBC driver,
so this PR renames these statements and removes the `ADMIN` keyword:
1. ADMIN SHOW CONFIG -> SHOW CONFIG
2. ADMIN SHOW REPLICA -> SHOW REPLICA
3. ADMIN DIAGNOSE TABLET -> SHOW TABLET DIAGNOSIS
4. ADMIN SHOW TABLET -> SHOW TABLET
For compatibility, the old statements are still supported, but their use is not recommended.
They will be removed in a later version.
- `CANCEL MATERIALIZED VIEW TASK taskId on mvName`
- `CANCEL MATERIALIZED VIEW TASK`, `tasks("type"="mv")` and `jobs("type"="mv")` now check authorization using the privileges of the MV
- `tasks` and `jobs` add the columns `mvName` and `mvDbName`; you can use `select * from tasks("type"="mv") where MvName="xxx"` to get all tasks of an MV
- fix the `desc mv all` error
- fix p0: the task sequence is incorrect
In some cases, an incorrect setting causes a BE core dump:
10:18:19 *** SIGFPE integer divide by zero (@0x564853c204c8) received by PID 2132555
int max_scanners =
config::doris_scanner_thread_pool_thread_num / state->query_parallel_instance_num();
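The crash comes from the division above when the divisor is zero. The source does not show the actual fix, so the following is only a sketch of one possible guard, under stated assumptions: it is written in Java rather than the BE's C++, the class and method names are hypothetical, and clamping the divisor to at least 1 is an assumed policy. In Java the unguarded division would throw `ArithmeticException` rather than raise SIGFPE, but the guard has the same shape.

```java
/** Hypothetical sketch of guarding an integer division against a zero divisor. */
final class ScannerLimitSketch {
    /**
     * @param scannerThreadNum     size of the scanner thread pool
     * @param parallelInstanceNum  number of parallel instances; may be 0 if misconfigured
     */
    static int maxScanners(int scannerThreadNum, int parallelInstanceNum) {
        // Assumed policy: treat a zero or negative instance count as 1.
        int divisor = Math.max(1, parallelInstanceNum);
        return scannerThreadNum / divisor;
    }

    public static void main(String[] args) {
        System.out.println(maxScanners(48, 4));
        System.out.println(maxScanners(48, 0));
    }
}
```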
add:
select * from jobs("type"="mv");
select * from tasks("type"="mv");
select * from jobs("type"="insert");
select * from tasks("type"="insert");
Add a privilege check for `mv_infos("database"="xxx")`.
Change JobType from MTMV to MV.
Design Documentation Linked to #25514
The regression test adds a new group: arrow_flight_sql.
`./run-regression-test.sh -g arrow_flight_sql` runs the regression test using jdbc:arrow-flight-sql for all suites whose group contains arrow_flight_sql.
`./run-regression-test.sh -g p0,arrow_flight_sql` runs the regression test using jdbc:arrow-flight-sql for all suites whose group contains arrow_flight_sql, and using jdbc:mysql for the other suites whose group contains p0 but not arrow_flight_sql.
Note that the query result formats of jdbc:arrow-flight-sql, jdbc:mysql, and the mysql client differ, for example:
Datetime field type: jdbc:mysql returns 2010-01-02T05:09:06, the mysql client returns 2010-01-02 05:09:06, and jdbc:arrow-flight-sql also returns 2010-01-02 05:09:06.
Array and Map field types: jdbc:mysql returns ["ab", "efg", null], {"f1": 1, "f2": "a"}; jdbc:arrow-flight-sql returns ["ab","efg",null], {"f1":1,"f2":"a"}, without the spaces.
Float field type: jdbc:mysql and the mysql client return 6.333, while jdbc:arrow-flight-sql returns 6.333000183105469, in query_p0/subquery/test_subquery.groovy.
If the query result is empty, jdbc:arrow-flight-sql returns empty and jdbc:mysql returns \N.
`use database;` and the query should be executed as two separate SQL statements whenever possible, otherwise the results may not be as expected. For example, for `USE information_schema; select cast ("0.0101031417" as datetime)` the result is 2000-01-01 03:14:1 (constant fold), while for `select cast ("0.0101031417" as datetime)` the result is null (no constant fold).
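The Float difference noted above is consistent with a 32-bit float being widened to a 64-bit double before it is printed. Whether the arrow-flight-sql path actually does this is an assumption, not something stated here. The sketch below (hypothetical class and method names) only shows the numeric effect.

```java
/** Sketch: a float prints compactly, but widening to double exposes extra digits. */
final class FloatWideningSketch {
    /** Widens a 32-bit float to a 64-bit double without changing its value. */
    static double widen(float value) {
        return value;
    }

    public static void main(String[] args) {
        float stored = 6.333f;
        // Printed as a float: the shortest decimal that identifies the float.
        System.out.println(Float.toString(stored));
        // Printed as a double: more digits are needed to identify the same value.
        System.out.println(Double.toString(widen(stored)));
        // The widened value is not the double closest to 6.333.
        System.out.println(widen(stored) == 6.333);
    }
}
```

A test that compares such values across drivers would need a tolerance or a float-typed comparison rather than string equality.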
In addition, Doris jdbc:arrow-flight-sql still has unfinished parts, such as:
Unsupported data type: Decimal256. INVALID_ARGUMENT: [INTERNAL_ERROR]Fail to convert block data to arrow data, error: [E3] write_column_to_arrow with type Decimal256
Unsupported null value of map key. INVALID_ARGUMENT: [INTERNAL_ERROR]Fail to convert block data to arrow data, error: [E33] Can not write null value of map key to arrow.
Unsupported data type: ARRAY<MAP<TEXT,TEXT>>
`jdbc:arrow-flight-sql` does not support connecting to a specific DB name, such as `jdbc:arrow-flight-sql://127.0.0.1:9090/{db_name}`. To be compatible with regression-test, `use db_name` is added before all SQL statements when `jdbc:arrow-flight-sql` runs the regression test.
`select timediff("2010-01-01 01:00:00", "2010-01-02 01:00:00");` fails with the error java.lang.NumberFormatException: For input string: "-24:00:00"
Introduction to Main Classes:
- MTMVService: MTMV services for other modules to call
- MTMVHookService: All operations that affect the MTMV
- MTMVJobManager: All operations that affect the MTMV job
- MTMVCacheManager: All operations that affect the MTMV Cache
- MTMVTask & MTMVJob: Inherit from the job framework
Concurrent schema change and txn may cause a deadlock. An example:
1. Txn T is committed but not published;
2. A schema change or rollup runs on T's related partition and adds alter replica R;
3. sc/rollup adds a sched txn watermark M;
4. The FE is restarted;
5. After the FE restarts, T's loadedTblIndexes is cleared because it is not saved to disk;
6. T publishes its version to all tablets, including sc/rollup's new alter replica R;
7. Since R does not contain the txn data, T fails and then keeps waiting for R's data;
8. sc/rollup waits for the txns before M to finish; only after that does it let R copy history data;
9. Since T is not finished, sc/rollup keeps waiting, so R never copies history data;
10. Txn T and sc/rollup wait for each other forever, which is a deadlock.

Fix: because sc/rollup ensures double write after the sched watermark M, when finishing a transaction and checking an alter replica:
- if the txn id is bigger than M, check it just like a normal replica;
- otherwise skip checking this replica; the BE will modify the history data later.
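
The fix rule can be sketched as a single predicate. This is an illustration only: the class, method, and parameter names are hypothetical and do not come from the Doris code base.

```java
/** Hypothetical sketch of the replica check used when finishing a transaction. */
final class AlterReplicaCheckSketch {
    /**
     * @param isAlterReplica  whether the replica was created by sc/rollup
     * @param txnId           id of the transaction being finished
     * @param watermarkTxnId  the sched watermark M recorded by sc/rollup
     * @return true if the replica must hold the txn data for the txn to finish
     */
    static boolean mustCheckReplica(boolean isAlterReplica, long txnId, long watermarkTxnId) {
        if (!isAlterReplica) {
            // Normal replicas are always checked.
            return true;
        }
        // Double write is guaranteed only for txns after M, so only those are checked;
        // older txns are skipped because the BE fills in history data later.
        return txnId > watermarkTxnId;
    }

    public static void main(String[] args) {
        System.out.println(mustCheckReplica(true, 5, 10));
        System.out.println(mustCheckReplica(true, 11, 10));
        System.out.println(mustCheckReplica(false, 5, 10));
    }
}
```

With this rule, txn T from the example (whose id is below M) no longer waits on R, so sc/rollup can proceed and the cycle is broken.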