Support specifying AccessControllerFactory when creating catalog
create catalog hive properties(
...
"access_controller.class" = "org.apache.doris.mysql.privilege.RangerAccessControllerFactory",
"access_controller.properties.prop1" = "xxx",
"access_controller.properties.prop2" = "yyy",
...
)
So that users can specify their own access controller, such as RangerAccessController.
Add interface to check column level privilege
A new method, checkColsPriv(), is added to CatalogAccessController
for checking column-level privileges.
TODO:
Support GRANT statements for column-level privileges in Doris
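As a rough illustration of the kind of check checkColsPriv() enables, here is a minimal sketch; the interface and signature below are illustrative assumptions, not the exact Doris CatalogAccessController API.

```java
import java.util.Set;

// Illustrative only: the real CatalogAccessController.checkColsPriv() signature may differ.
interface ColumnLevelChecker {
    // Return true if `user` may access every column in `cols` of ctl.db.tbl.
    boolean checkColsPriv(String user, String ctl, String db, String tbl, Set<String> cols);
}

class DenyAllColumnChecker implements ColumnLevelChecker {
    @Override
    public boolean checkColsPriv(String user, String ctl, String db, String tbl, Set<String> cols) {
        // Trivial policy: deny all column-level access until column-level grants are supported.
        return false;
    }
}
```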
Add TestExternalCatalog/Database/Table/ScanNode
These classes are used for FE unit tests. In a unit test you can run
create catalog test1 properties(
"type" = "test",
"catalog_provider.class" = "org.apache.doris.datasource.ColumnPrivTest$MockedCatalogProvider",
"access_controller.class" = "org.apache.doris.mysql.privilege.TestAccessControllerFactory",
"access_controller.properties.key1" = "val1",
"access_controller.properties.key2" = "val2"
);
to create a test catalog, with catalog_provider specifying the class that mocks database/table/schema metadata.
Set roles in the current user identity in the connection context.
The roles can be used for authorization in the access controller.
1. DeployManager adds the ability to obtain domain names from third-party systems.
2. When DeployManager determines whether a node exists, also check by domain name.
3. Rename Backend.getHost() to getIp().
4. Remove the logic for handling UnknownHostException in FQDNManager, because there are two cases of UnknownHostException: if it occurs temporarily, we can wait for the next detection; if the node has been deleted, the handling can be left to DeployManager.
This commit forbids struct and map types from being used as distribution keys or aggregation keys.
SQL such as:
select distinct struct_col from struct_table
will report an error.
Steps:
1. Drop the old MTMV jobs.
2. Clear the old task records and clean up the running and pending tasks.
3. Set the new scheduler info in the MTMV and replay it on the followers.
4. Create a job on the master node.
Note that if you change the refresh info of an MTMV, the old MTMV tasks will be cleaned up.
The background is described in issue #15723,
where users previously used Apache Druid to satisfy such lambda-architecture requirements.
We will not make Doris automatically drop data that does not belong to the current time window, as Druid does,
because that approach is not flexible. Instead, we need the ability to support mutable/immutable partitions. This PR works as follows:
1. Support a mutable property for a partition.
2. The mutable property of a partition is passed from FE to BE during a load.
3. If a record's partition is immutable, we mark the row as "unselected", so it is not included in the computation of 'max_filter_ratio';
data written to an immutable partition is therefore ignored and does not cause the load to fail.
Use Example:
1. Add a partition with a mutable property, or modify a partition to be immutable:
- alter table test_tbl add [temporary] partition xxx values less than ('xxx') ('mutable' = 'true');
- alter table test_tbl modify partition xx set ('mutable' = 'false');
2. Write 5 records into the table, two of them belonging to an immutable partition (see the sketch below).
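A minimal sketch of the filter-ratio arithmetic described above (hypothetical names, not the actual BE code): rows routed to an immutable partition are counted as unselected and excluded from the ratio.

```java
class LoadFilterRatio {
    // Illustrative arithmetic only (not the actual BE code): rows routed to an
    // immutable partition are marked "unselected" and excluded from max_filter_ratio.
    static boolean exceedsMaxFilterRatio(long totalRows, long unselectedRows,
                                         long filteredRows, double maxFilterRatio) {
        long consideredRows = totalRows - unselectedRows;
        if (consideredRows <= 0) {
            return false;
        }
        return (double) filteredRows / consideredRows > maxFilterRatio;
    }
}
```

With the example above, the 2 rows written to the immutable partition are unselected, so only the other 3 rows are considered for max_filter_ratio and the write to the immutable partition does not fail the load.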
Support a config for whether to push down to ES (e.g. LIKE transformed to a wildcard query), and refactor some code
Transforming a predicate into a wildcard query and pushing it down to ES increases CPU consumption on the ES side,
so I add a switch to control it.
* fix bug, add remote meta for compaction
In the original logic, a query like `select * from a where exists (select * from b order by 1) order by 1 limit 1` contains a subquery,
but the top-level query still passes `checkEnableTwoPhaseRead` and sets `isTwoPhaseOptEnabled=true`. So we need to double-check that the plan is a general top-n query plan, and roll back the needMaterialize flag set by the previous `analyze`.
Set column names from the path to lower case when matching is case-insensitive.
This is for Iceberg columns parsed from the path. Iceberg column names are case sensitive,
which may cause errors for tables with partitions.
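A minimal sketch of the normalization step, assuming a hypothetical helper (the actual change lives in the file-scan path):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class PartitionColumnNormalizer {
    // Lower-case partition column names parsed from the file path so they match the
    // case-insensitive Doris schema (illustrative helper, not the actual scan-path code).
    static List<String> toLowerCase(List<String> columnsFromPath) {
        List<String> normalized = new ArrayList<>(columnsFromPath.size());
        for (String col : columnsFromPath) {
            normalized.add(col.toLowerCase(Locale.ROOT));
        }
        return normalized;
    }
}
```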
Now we use one thrift message per fragment instance. However, many of those messages are identical across instances of the same fragment. This PR extracts the shared parts so the common thrift message only needs to be sent once per fragment (a rough sketch follows).
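As a rough illustration of the idea (all names below are hypothetical, not the actual thrift structs): fragment-level fields that are identical across instances are carried once, and only instance-specific fields repeat.

```java
import java.util.List;

// Hypothetical illustration, not the actual thrift structs.
class FragmentExecRequest {
    String fragmentPlan;                 // shared by all instances, serialized once
    String descriptorTable;              // shared by all instances, serialized once
    List<InstanceParams> instances;      // one small entry per instance

    static class InstanceParams {
        String instanceId;               // differs per instance
        int senderId;                    // differs per instance
    }
}
```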
The previous logic returned at most the number of available compute nodes (cn). Instead,
if the number of cn is less than expectBeNum, we need to fill in with mix nodes
until the total reaches expectBeNum or the mix nodes are also used up, as sketched below.
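A hedged sketch of that fill-in policy (hypothetical names; the real scheduler code differs):

```java
import java.util.ArrayList;
import java.util.List;

class NodeSelector {
    // Prefer compute nodes; if fewer than expectBeNum are available,
    // top up with mix nodes until the quota is met or both lists are exhausted.
    static List<String> select(List<String> computeNodes, List<String> mixNodes, int expectBeNum) {
        List<String> selected = new ArrayList<>();
        for (String cn : computeNodes) {
            if (selected.size() >= expectBeNum) {
                return selected;
            }
            selected.add(cn);
        }
        for (String mix : mixNodes) {
            if (selected.size() >= expectBeNum) {
                break;
            }
            selected.add(mix);
        }
        return selected;
    }
}
```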
1. Change mv rewrite from bottom-up to top-down.
2. Keep compatibility with old-version mv.
3. Restore some UT code (but keep it disabled).
4. Fix some UTs broken by [fix](planner) fix bug for missing slot #16601 and [Feature](Materialized-View) support multiple slot on one column in materialized view #16378.
Currently, inserting {1, 'a'} into struct<f1:tinyint, f2:varchar(20)> is not supported.
This commit supports implicitly casting the char type inside the struct to varchar,
i.e. it adds implicit casts for the struct type (see the sketch below).
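As a small illustration of the idea (hypothetical helper, not the Doris analyzer code), an implicit cast can be wrapped around each struct field whose literal type differs from the target field type, e.g. the char literal 'a' cast to varchar(20):

```java
// Hypothetical sketch, not the Doris analyzer: wrap a field expression in a CAST
// when its type differs from the target struct field type (e.g. char -> varchar(20)).
class StructFieldImplicitCast {
    static String castIfNeeded(String fieldExpr, String sourceType, String targetType) {
        if (sourceType.equalsIgnoreCase(targetType)) {
            return fieldExpr;
        }
        return "CAST(" + fieldExpr + " AS " + targetType + ")";
    }

    public static void main(String[] args) {
        // {1, 'a'} into struct<f1:tinyint, f2:varchar(20)>: the char literal gets cast.
        System.out.println(castIfNeeded("'a'", "char", "varchar(20)")); // CAST('a' AS varchar(20))
    }
}
```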
Colocated join depends on whether both sides of the join conjuncts are simple columns with the same distribution policy, etc. So the key is to figure out the original source column in the scan node, if there is one. To do that, we should check the slots from both the lhs and rhs of outputSmap in the join node, as sketched below.
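A conceptual sketch of that lookup (a plain Java Map stands in for the join node's outputSmap; not the planner's actual API): follow the substitution chain from an output slot until reaching a slot produced directly by a scan node.

```java
import java.util.Map;

class SourceColumnResolver {
    // Follow the mapping until the slot no longer maps to anything, i.e. it is the
    // original column produced by a scan node (conceptual illustration only).
    static String resolveOriginalColumn(String slot, Map<String, String> outputSmap) {
        String current = slot;
        while (outputSmap.containsKey(current)) {
            current = outputSmap.get(current);
        }
        return current;
    }
}
```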
In the current implementation, the class Auth is used to:
- manage all authentication and authorization info such as users, roles, passwords, and privileges;
- provide an interface for privilege checking.
Some users may want to integrate an external access management system such as Apache Ranger,
so we should provide a way to let users set their own access controller.
This PR mainly changes:
A new class SystemAccessController
This access controller is used to check the global level privileges and resource privileges.
A new interface CatalogAccessController
This interface is used to check catalog/database/table-level privileges.
It has a default implementation, InternalCatalogAccessController.
All privilege checking methods are moved from Auth to either SystemAccessController or
InternalCatalogAccessController
A new class AccessControllerManager
This is the entry point for privilege checks. All methods previously called on Auth
are now called through AccessControllerManager.
Now, users can implement the CatalogAccessController interface to use their own access controller.
And when creating an external catalog, users can specify the access controller class name, so that
different external catalogs can use different access controllers (a rough sketch follows).
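As a rough sketch of how such a pluggable controller could look; the interface and factory method names below are assumptions for illustration, not the exact Doris API.

```java
import java.util.Map;

// Illustrative only: method names and signatures are assumptions,
// not the exact Doris CatalogAccessController / factory interfaces.
interface MyCatalogAccessController {
    boolean checkDbPriv(String user, String catalog, String db, String wanted);
    boolean checkTblPriv(String user, String catalog, String db, String tbl, String wanted);
}

interface MyAccessControllerFactory {
    // Receives the "access_controller.properties.*" values from the catalog definition.
    MyCatalogAccessController createAccessController(Map<String, String> properties);
}

class RangerLikeFactory implements MyAccessControllerFactory {
    @Override
    public MyCatalogAccessController createAccessController(Map<String, String> properties) {
        // A real controller would use the properties to connect to the external policy system.
        return new MyCatalogAccessController() {
            public boolean checkDbPriv(String user, String catalog, String db, String wanted) {
                return true; // delegate to the external policy engine in a real implementation
            }
            public boolean checkTblPriv(String user, String catalog, String db, String tbl, String wanted) {
                return true;
            }
        };
    }
}
```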
Support IPv6 in Apache Doris. The main changes are:
1. Enable binding to an IPv6 address if the network priority in the config file contains an IPv6 CIDR string.
2. BRPC and HTTP support binding to an IPv6 address.
3. BRPC and HTTP support accessing IPv6 services.
Hive 1.x may write ORC files with internal column names (_col0, _col1, _col2, ...).
This causes query results to be NULL because the column names in the ORC file do not match
the column names in the Doris table schema. This PR supports querying Hive ORC files with internal column names.
For now, we have not seen this problem with Parquet files; we will send a new PR to fix Parquet if any problem shows up in the future.
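A minimal sketch of the positional-mapping idea (hypothetical helper, not the actual ORC reader code): internal names are resolved back to the Doris schema by ordinal position.

```java
import java.util.List;

class OrcInternalColumnMapper {
    // Hive 1.x ORC files may store internal names _col0, _col1, ...
    // Map them back to the Doris schema by ordinal position (illustrative sketch).
    static String resolve(String orcColumnName, List<String> dorisSchemaColumns) {
        if (orcColumnName.matches("_col\\d+")) {
            int ordinal = Integer.parseInt(orcColumnName.substring(4));
            if (ordinal < dorisSchemaColumns.size()) {
                return dorisSchemaColumns.get(ordinal);
            }
        }
        return orcColumnName;
    }
}
```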
In the previous version, if the output slot of an analyticExpr was not materialized, the analyticExpr was pruned.
But there are some cases where it cannot be pruned.
For example:
SELECT
count(*)
FROM T1,
(SELECT dd
FROM (
SELECT
1.1 as cc,
ROW_NUMBER() OVER() as dd
FROM T2
) V1
ORDER BY cc DESC
limit 1
) V2;
The analyticExpr (ROW_NUMBER() OVER() as dd) is not materialized, but we still have to generate a
WindowGroup for it:
tmp.dd is used by the upper count(*), so we have to generate data for tmp.dd.
In this fix, if an inline view only outputs one column (in this example, 'dd'), we materialize this column.
TODO:
In order to prune 'ROW_NUMBER() OVER() as dd', we need to rethink the rule of choosing a column
for count(*). (refer to SingleNodePlanner.materializeTableResultForCrossJoinOrCountStar)
V2 can be transformed to
(SELECT cc
FROM (
SELECT
1.1 as cc,
ROW_NUMBER() OVER() as dd
FROM T2
) V1
ORDER BY cc DESC
limit 1
) V2;
Besides the byte size of cc and dd, we need to consider the cost of generating cc and dd.
MySQL load can read files local to the FE server node, but this causes a security issue: a user could use it to probe the FE node's local files.
For this reason, add a configuration named mysql_load_server_secure_path to set a secure path from which data can be loaded.
By default, the feature of loading FE local files is disabled by this configuration (a sketch of the check follows).
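A minimal sketch of the path check this config enables (illustrative, not the actual FE code): an empty value disables FE-local loads entirely, and any requested file must stay inside the configured secure directory.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class MysqlLoadPathChecker {
    // Illustrative sketch: mysql_load_server_secure_path gates which FE-local files
    // may be read by MySQL load. An empty value disables the feature entirely.
    static boolean isAllowed(String securePath, String requestedFile) {
        if (securePath == null || securePath.isEmpty()) {
            return false; // feature disabled by default
        }
        Path root = Paths.get(securePath).toAbsolutePath().normalize();
        Path file = Paths.get(requestedFile).toAbsolutePath().normalize();
        return file.startsWith(root); // must stay inside the secure directory
    }
}
```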
For performance reasons, we want to remove constant columns from groupingExprs.
For example:
`select sum(T.A) from T group by T.B, 'xyz'` is equivalent to `select sum(T.A) from T group by T.B`,
so we can remove the constant column 'xyz' from groupingExprs.
But there is an exception when all groupingExprs are constant.
For example:
sql1: `select 'abc' from t group by 'abc'`
is not equivalent to
sql2: `select 'abc' from t`
sql3: `select 'abc', sum(a) from t group by 'abc'`
is not equivalent to
sql4: `select 1, sum(a) from t`
(when t is empty, sql3 returns 0 tuples, sql4 returns 1 tuple)
We need to keep some constant columns if all groupingExprs are constant.
Consider sql5: `select a from (select "abc" as a, 'def' as b) T group by b, a;`.
If the constant column `a` is in the select list, this column should not be removed.
sql5 is transformed to
sql6: `select a from (select "abc" as a, 'def' as b) T group by a;`