The minidump file is meant to capture as much information as possible, but when the switch is turned off, these methods should not be called (after refactor PR #20049). Other places that do extra work after the minidump feature was added have also been checked.
This PR fixes the following two problems:
Problem 1: Altering a column comment makes adding a dynamic partition fail (issue #10811). Reproduction steps:
1. create a table with a dynamic partition policy;
2. restart the FE;
3. alter the distribution column's comment;
4. alter dynamic_partition.end to trigger adding a new partition via the dynamic partition scheduler.
Then we get the following error log, and the new partition fails to be created:
dynamic add partition failed: errCode = 2, detailMessage = Cannot assign hash distribution with different distribution cols. default is: [id int(11) NULL COMMENT 'new_comment_of_id'], db: default_cluster:example_db, table: test_2
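A minimal SQL sketch of these steps, assuming illustrative schema and property values (the database/table/column names follow the error log above):

```sql
-- Step 1: create a table with a dynamic partition policy (values illustrative).
CREATE TABLE example_db.test_2 (
    id INT COMMENT 'old_comment_of_id',
    dt DATE
) DUPLICATE KEY(id, dt)
PARTITION BY RANGE(dt) ()
DISTRIBUTED BY HASH(id) BUCKETS 4
PROPERTIES (
    "dynamic_partition.enable"    = "true",
    "dynamic_partition.time_unit" = "DAY",
    "dynamic_partition.start"     = "-7",
    "dynamic_partition.end"       = "3",
    "dynamic_partition.prefix"    = "p"
);

-- Step 2: restart the FE.

-- Step 3: alter the distribution column's comment.
ALTER TABLE example_db.test_2 MODIFY COLUMN id COMMENT 'new_comment_of_id';

-- Step 4: enlarge dynamic_partition.end so the scheduler tries to add a new partition.
ALTER TABLE example_db.test_2 SET ("dynamic_partition.end" = "5");
```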
Problem 2: Renaming a distribution column makes inserts into old partitions fail (#20405).
The key point of the reproduction steps is restarting the FE.
It seems all versions are affected, including master, lts-1.1, and so on.
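A hedged sketch of problem 2, reusing the illustrative table above:

```sql
-- Rename the distribution column, then restart the FE.
ALTER TABLE example_db.test_2 RENAME COLUMN id id_new;

-- After the restart, inserting into an old (pre-rename) partition fails.
INSERT INTO example_db.test_2 VALUES (1, '2023-06-01');
```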
Currently, SQL block rules can only be applied to query statements, while they are also useful for other statements such as INSERT / DELETE / ALTER / DROP. This PR removes that limitation and expands the usage scenarios.
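A hedged sketch of what this enables (the rule name and regex are illustrative):

```sql
-- Create a global rule; before this PR it would only match queries,
-- after this PR it can also block other statement types.
CREATE SQL_BLOCK_RULE block_drop_rule
PROPERTIES (
    "sql"    = "DROP TABLE .*",
    "global" = "true",
    "enable" = "true"
);

-- With the rule enabled, a matching non-query statement is rejected:
DROP TABLE example_tbl;
```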
1. Implement write/read (persistence) for AnalysisManager.
2. Previously, if a database or table had any column with a complex type, the ANALYZE statement would fail directly. This PR makes it possible to ignore complex-type columns and analyze the rest.
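A hedged sketch of the behavior change in item 2 (table and column names are illustrative):

```sql
-- A table mixing scalar and complex-type columns.
CREATE TABLE t_mixed (
    id   INT,
    tags ARRAY<STRING>            -- complex type; previously made ANALYZE fail
) DUPLICATE KEY(id)
DISTRIBUTED BY HASH(id) BUCKETS 1
PROPERTIES ("replication_num" = "1");

-- With this PR, the complex-type column is skipped and the rest are analyzed.
ANALYZE TABLE t_mixed;
```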
In the legacy planner, when we create a new exchange, it inherits its child's limit and offset.
But in Nereids we should not do this, because when we need to set a limit or offset, we set it manually.
In this PR, we use a new constructor of ExchangeNode to ensure the limit and offset are not set unexpectedly.
This PR does the following:
1. This PR is a substantial refactor of the JDBC client architecture. The previous monolithic JDBC client has been split into an abstract base class `JdbcClient` and a set of database-specific subclasses (e.g., `JdbcMySQLClient`, `JdbcOracleClient`, etc.), and the configuration the JdbcClient requires has been abstracted into a config object. This allows for improved modularity, easier addition of support for new databases, and cleaner, more maintainable code. The change is backward-compatible and does not affect existing functionality.
2. As a result of the client refactoring, OceanBaseClient can automatically recognize whether it is operating in MySQL or Oracle mode, so the oceanbase_mode property has been removed from the JDBC catalog. However, because of this removal, when creating a single OceanBase JDBC table, the table type now needs to be specified as oceanbase (MySQL mode) or oceanbase_oracle (Oracle mode); see the sketch after this list. Note that this is a change in usage behavior.
3. For the PostgreSQL JDBC catalog, I did two things:
1. Added adaptation for MATERIALIZED VIEW and FOREIGN TABLE.
2. Fixed reading of jsonb, which had been incorrectly changed to json in a previous PR.
4. Fix some JDBC catalog test cases.
5. Update the OceanBase JDBC docs.
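A hedged sketch of the new table-type usage from item 2, assuming a pre-created JDBC resource (all names are illustrative):

```sql
CREATE EXTERNAL TABLE ob_example (
    id   INT,
    name VARCHAR(32)
) ENGINE = JDBC
PROPERTIES (
    "resource"   = "jdbc_resource",      -- assumed pre-created JDBC resource
    "table"      = "ob_example",
    "table_type" = "oceanbase"           -- or "oceanbase_oracle" for Oracle mode
);
```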
And, thanks @wolfboys for the guidance.
fix the bugs below:
1. The filter's expression was not checked; aggregate functions, grouping scalar functions, and window expressions should not appear in a filter.
2. Should not change the nullability of an aggregate function when it is used as a window function in a window expression.
3. Bitmap and other metric types should not appear in the ORDER BY or PARTITION BY of a window expression.
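Hedged examples of statements that should now be rejected (table and column names are illustrative):

```sql
-- Bug 1: a window expression in a filter should be rejected.
SELECT id FROM t WHERE sum(c1) OVER (PARTITION BY id) > 1;

-- Bug 3: a bitmap column in the window's PARTITION BY should be rejected.
SELECT count(c1) OVER (PARTITION BY bm_col) FROM t;
```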
1. In the past, we used a BE table named `analysis_jobs` to persist the status of analyze jobs/tasks. This had many flaws: for example, if a BE crashed, the analyze job/task would fail, but its status could never get updated.
2. Support `DROP ANALYZE JOB [job_id]` to delete an analyze job.
3. Support `SHOW ANALYZE TASK STATUS [job_id]` to get the task status of a specific job.
4. Restrict the execution condition of auto analyze: an auto analyze job can be executed again only when its last execution finished a while ago.
5. Support analyzing a whole DB.
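Illustrative usage of the new statements (the job id is hypothetical, and the whole-DB form for item 5 is an assumed spelling):

```sql
ANALYZE DATABASE example_db;       -- item 5: analyze the whole DB
SHOW ANALYZE TASK STATUS 10086;    -- item 3: task status of a specific job
DROP ANALYZE JOB 10086;            -- item 2: delete the analyze job
```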
Parallel scanning can result in some read amplification. For example, `select * from xx limit 1` actually requires only one row of data, but parallel scanning of multiple tablets causes read amplification, leading to performance bottlenecks in high-concurrency scenarios. This PR adds a SessionVariable to enforce serial scanning, which helps mitigate this issue.
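A hedged usage sketch; the session variable name below is an assumption standing in for the one this PR adds:

```sql
-- Assumed variable name for the serial-scan switch added by this PR.
SET enable_scan_node_run_serial = true;
-- Tablets are now scanned serially, avoiding read amplification for small limits.
SELECT * FROM xx LIMIT 1;
```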
The Nereids planner includes all column indexes in TFileScanRangeParams, which may make column projection incorrect for
text-format tables: the CSV reader uses the column index positions to split a line, so extra column indexes produce
wrong split results. This PR resets the column indexes after projection, removing the useless ones.
fix 3 bugs:
1. failed to insert into a table with a materialized view (mv).
```sql
create table t (
id int,
c1 int,
c2 int,
c3 int
) duplicate key(id)
distributed by hash(id) buckets 4;
create materialized view k12s3m as select id, sum(c1), max(c3) from t group by id;
insert into t select -4, -4, -4, 'd';
```
the insert raises an exception because the mv column is not handled. Now we add a target column and value as the defineExpr.
2. failed to insert into a table without specifying all the columns.
```sql
insert into t(c1, c2) select c1, c2 from t
```
where t is t(id ukey, c1, c2, c3); this inserted too much data. We fix it by changing the output partitions.
3. failed to insert into a table with a complex select.
When the select statement has a join or an aggregation it failed; we fix this bug in a way similar to the 2nd one.
When using Nereids, if a comparison operator is applied to a bitmap type, an analysis exception needs to be thrown.
For example:
select id from (select BITMAP_EMPTY() as c0 from expr_test) as ref0 where c0 = 1 order by id
Here c0 in ref0 is of bitmap type; this scenario is not supported right now.
update in-filter usage in pipeline mode:
1. If the target is local, we use an in-or-bloom filter and let the BE choose between in and bloom according to the actual number of distinct values.
2. Set the default runtime_filter_max_in_num to 1024.
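For reference, the default can still be overridden per session:

```sql
-- Override the new 1024 default for the current session.
SET runtime_filter_max_in_num = 2048;
```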
Support refreshing the LDAP cache:
refresh ldap all;
refresh ldap;
refresh ldap for user1;
Support caching non-existent LDAP users.
After LDAP is enabled, when logging in with a Doris user that does not exist in the LDAP service, this avoids accessing the LDAP service on every authentication in scenarios such as `show databases;` that require a lot of authentication.