This commits forbid struct and map type to be distributed key/aggregation key.
The sql such as:
select distinct stuct_col from struct_table
will report an error.
1、support stream load with json, csv format for map
2、fix olap convertor when compaction action in map column which has null
3、support select outToFile for map
4、add some regression-test
Steps:
1. drop the old MTMV jobs
2. clear the old task records and clean the running and pending tasks
3. set the new scheduler info in MTMV and replay it in followers.
4. create a job in the master node.
Note that if you change the refresh info of MTMV, the old MTMV tasks will be cleaned.
The background is described in this issue: #15723,
where users used Apache Druid to satisfy such lambada requirements before.
We will not make Doris dropping data not belonged to current time window automatically like Druid,
which is not flexible. We demand a ability to support mutable/immutable partition, the PR works this way:
1. Support mutable property for a partition.
2. The mutable property of a partition is passed from FE to BE in a load procedure
3. If a record's partition is immutable, we mark this row as "un selected" which will not be included in computation of 'max_filter_ratio',
so that data write to immutable partition will be neglected and not cause load failure.
Use Example:
1. Add immutable partition or modify an partition to be immutable:
- alter table test_tbl add [temporary] partition xxx values less than ('xxx') ('mutable' = 'true');
- alter table test_tbl modify partition xx set ('mutable' = 'false');
2. Write 5 records into table, two of then belongs to immutable partition
Introduced a new function non_nullable to BE, which can extract concrete data column from a nullable column. If the input argument is already not a nullable column, raise an error.
From the original logic, query like `select * from a where exists (select * from b order by 1) order by 1 limit 1` is a query contains subquery,
but the top query will pass `checkEnableTwoPhaseRead` and set `isTwoPhaseOptEnabled=true`.So check the double plan is a general topn query plan is needed, and rollback the needMaterialize flag setted by the previous `analyze`.
* [improve](dynamic table) refine SegmentWriter columns writer generate
```
Dynamic Block consists of two parts, dynamic part of columns and static part of columns
static dynamic
| ----- | ------- |
the static ones are original _tablet_schame columns
the dynamic ones are auto generated and extended from file scan.
```
**We should only consisder to use Block info to generte columns when it's a dynamic table load procudure.**
And seperate the static ones and dynamic ones
* test
Set column names from path to lower case in case-insensitive case.
This is for Iceberg columns from path. Iceberg columns are case sensitive,
which may cause error for table with partitions.
Currently not support insert {1, 'a'} into struct<f1:tinyint, f2:varchar(20)>
This commit will support implicitly cast the char type in the struct to varchar.
Add implicitly cast for struct-type.
colocated join is depended on if the both side of the join conjuncts are simple column with same distribution policy etc. So the key is to figure out the original source column in scan node if there is one. To do that, we should check the slot from both lhs and rhs of outputSmap in join node.
In current implementation, the class Auth is used for:
Manager all authentication and authorization info such as user, role, password, privileges.
Provide an interface for privilege checking
Some user may want to integrate external access management system such as Apache Ranger.
So we should provide a way to let user set their own access controller.
This PR mainly changes:
A new class SystemAccessController
This access controller is used to check the global level privileges and resource privileges.
A new interface CatalogAccessController
This interface is used to check catalog/database/tbl level privileges.
It has a default implements InternalCatalogAccessController.
All privilege checking methods are moved from Auth to either SystemAccessController or
InternalCatalogAccessController
A new class AccessControllerManager
This is the entry point of privilege authentication. All methods previously called from Auth
now are called from AccessControllerManager
Now, user can implement the interface CatalogAccessController to use their own access controller.
And when creating external catalog, user can specified the access controller class name, so that
different external catalog can use different access controller.
For performance reason, we want to remove constant column from groupingExprs.
For example:
`select sum(T.A) from T group by T.B, 'xyz'` is equivalent to `select sum(T.A) from T group by T.B`
We can remove constant column `abc` from groupingExprs.
But there is an exception when all groupingExpr are constant
For example:
sql1: `select 'abc' from t group by 'abc'`
is not equivalent to
sql2: `select 'abc' from t`
sql3: `select 'abc', sum(a) from t group by 'abc'`
is not equivalent to
sql4: `select 1, sum(a) from t`
(when t is empty, sql3 returns 0 tuple, sql4 return 1 tuple)
We need to keep some constant columns if all groupingExpr are constant.
Consider sql5 `select a from (select "abc" as a, 'def' as b) T group by b, a;`
if the constant column `a` is in select list, this column should not be removed.
sql5 is transformed to
sql6 `select a from (select "abc" as a, 'def' as b) T group by a;`
Use FE cluster token to auth stream load.
This auth is only open for be, and fe auth still only support http basic auth.
I will use this auth for mysql load to build a no-auth stream load from fe to be.
And this will avoid double auth in mysql load.
More information to see the design doc.
Doris can use mysql-jdbc-jar to connect doris database, but doris has some data type that mysql without.
Such as DecimalV3 and Date/DatetimeV2
I add some case judgments in `Mysql Catalog` , so that Jdbc catalog can identify the data type of DORIS
when pg table have some unsupported column type like: point, polygon, jsonb......
jdbc catalog will convert it to string type in doris. but get result set in java is org.postgresql.util.PGobject
Some test need this pr: #16442
change implement of auth to rbac
each user has one default role which can not be drop;
if you grant priv to user,it will grant to default role ,
In the current pr, the user can still only have one role other than the default role, but in the future, the user and role will be many-to-many
rename PaloRole,PaloAuth,PaloPrivilege to Role,Auth,Privilege
This commit support:
1、Insert + select for struct/map type
2、Json stream load for struct type
3、m[key] function for map type
How to use:
Set the fe config to create table for struct and map type
1、admin set frontend config("enable_struct_type" = "true");
2、admin set frontend config("enable_map_type" = "true");
#16547
Co-authored-by: xy720 <xuyang25@baidu.com>
Co-authored-by: amory <wangqiannan@selectdb.com>
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
Co-authored-by: hucheng01 <hucheng01@baidu.com>
There is something wrong with the `test_broker_load` suite(s3 auth problem).
So I ignore this case temporarily.
cc @wsjz , please help to solve it and add it back