The statistics framework could not be reused by the new optimizer before, so this PR abstracts some interfaces to make it reusable, as sketched below.
1. Make Slot extend Id
2. Add new interfaces: ExprStats, PlanStats
3. Move the definition of PlanNode.NodeType to the statistics sub-directory
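A minimal sketch of what the reusable abstraction could look like; the member names below are illustrative assumptions, not the exact signatures in this PR:
```java
// Hypothetical sketch of the statistics interfaces; actual members may differ.

/** Statistics exposed by an expression, usable by both optimizers. */
interface ExprStats {
    long getNdv();           // number of distinct values
    double getSelectivity(); // estimated selectivity as a predicate
}

/** Statistics exposed by a plan node. */
interface PlanStats {
    long getRowCount();                // estimated output rows
    ExprStats getExprStats(Slot slot); // per-slot expression statistics
}

/** Stand-in for Doris's Id base class; Slot extends it so slots can key stats maps. */
class Id {
    protected final int id;
    Id(int id) { this.id = id; }
}

class Slot extends Id {
    Slot(int id) { super(id); }
}
```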
Add a remote storage policy config to the table properties. It sets the storage policy for tables and partitions in `CREATE TABLE` and `ALTER TABLE`.
This policy is used when a partition is being migrated from local to remote storage.
Grammar:
1.
```sql
CREATE TABLE TblPxy1
(...)
ENGINE=olap
DISTRIBUTED BY HASH (aa) BUCKETS 1
PROPERTIES(
"remote_storage_policy" = "testPolicy3"
);
```
2.
`ALTER TABLE TblPxy01 SET ("remote_storage_policy" = "testPolicy3");`
3.
`ALTER TABLE TblPxy01 MODIFY PARTITION p2 SET ("remote_storage_policy" = "testPolicy3");`
FEFunctionSignature does not support ArrayType as an argument type, so the following SQL failed:
`> select array_contains([1,2,3], 1);`
ERROR 1105 (HY000): errCode = 2, detailMessage = Unexpected exception: org.apache.doris.catalog.ArrayType cannot be cast to org.apache.doris.catalog.ScalarType
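The exception comes from unconditionally casting an argument type to ScalarType during signature matching. A minimal sketch of the guard that avoids the ClassCastException, using simplified stand-ins for the catalog types:
```java
// Hypothetical sketch; Type/ScalarType/ArrayType stand in for org.apache.doris.catalog types.
abstract class Type {}

class ScalarType extends Type {
    final String name;
    ScalarType(String name) { this.name = name; }
}

class ArrayType extends Type {
    final Type itemType;
    ArrayType(Type itemType) { this.itemType = itemType; }
}

class SignatureMatcher {
    /** Match an argument type without assuming it is scalar. */
    static boolean matches(Type declared, Type actual) {
        if (declared instanceof ArrayType && actual instanceof ArrayType) {
            // Recurse into the element type instead of casting to ScalarType,
            // which previously threw ClassCastException for array arguments.
            return matches(((ArrayType) declared).itemType, ((ArrayType) actual).itemType);
        }
        return declared instanceof ScalarType && actual instanceof ScalarType
                && ((ScalarType) declared).name.equals(((ScalarType) actual).name);
    }
}
```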
Currently, setting a variable with the `global` keyword does not affect the current session's value of that variable, which often confuses users.
This CL mainly changes:
1. Also change the session variable when setting a global variable, as sketched below.
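A minimal sketch of the new behavior, assuming a simplified variable manager (the class below is hypothetical; Doris's actual variable management is more involved):
```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: SET GLOBAL now also updates the current session.
class SimpleVariableMgr {
    private final Map<String, String> globalVars = new ConcurrentHashMap<>();
    private final Map<String, String> sessionVars = new ConcurrentHashMap<>();

    void setVar(String name, String value, boolean isGlobal) {
        if (isGlobal) {
            globalVars.put(name, value); // persists for new sessions
        }
        // After this change the current session sees the new value immediately,
        // instead of only sessions created after the SET GLOBAL.
        sessionVars.put(name, value);
    }
}
```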
Currently, an `Array<T>` column consists of an `offsets` column and a `data` column, and the `offsets` column is typed UInt32.
If we call array_union repeatedly to merge arrays, the cumulative element count can exceed the UInt32 range and overflow.
So we need to widen the offsets type before the `Array Data Type` release.
1. fix all checkstyle warnings
2. change all checkstyle rules to error severity
3. remove some Javadoc rules
a. RequireEmptyLineBeforeBlockTagGroup
b. JavadocStyle
c. JavadocParagraph
4. suppress some rules for old code
a. all Javadoc rules affect only Nereids
b. DeclarationOrder affects only Nereids
c. OverloadMethodsDeclarationOrder affects only Nereids
d. VariableDeclarationUsageDistance affects only Nereids
e. suppress OneTopLevelClass on org/apache/doris/load/loadv2/dpp/ColumnParser.java
f. suppress OneTopLevelClass on org/apache/doris/load/loadv2/dpp/SparkRDDAggregator.java
g. suppress LineLength on org/apache/doris/catalog/FunctionSet.java
h. suppress LineLength on org/apache/doris/common/ErrorCode.java
* [Vectorized][Function] add orthogonal bitmap aggregate functions
save files for the orthogonal bitmap functions
add files needed to rebase
update the functions file
* refactor the union_count function
refactor the orthogonal union count functions
* remove `bool is_variadic`
Support querying Hive tables on S3. Pass the AK/SK, region, and S3 endpoint to the Hive table when creating the external table.
Example CREATE TABLE SQL:
```
CREATE TABLE `region_s3` (
`r_regionkey` integer NOT NULL,
`r_name` char(25) NOT NULL,
`r_comment` varchar(152) )
engine=hive
properties
("database"="default",
"table"="region_s3",
"hive.metastore.uris"="thrift://127.0.0.1:9083",
"AWS_ACCESS_KEY"="YOUR_ACCESS_KEY",
"AWS_SECRET_KEY"="YOUR_SECRET_KEY",
"AWS_ENDPOINT"="s3.us-east-1.amazonaws.com",
"AWS_REGION"="us-east-1");
```
This PR supports:
1. remove the generic type from Operator, and remove some NODE_TYPE usages from Plan and Expression
2. refactor the Plan and NODE_TYPE generic types
3. support child class matching by TypePattern
4. analyze the Operator code and generate patterns, which makes it easy to create rules.
e.g.
```java
class LogicalJoin extends LogicalBinaryOperator;
class PhysicalFilter extends PhysicalUnaryOperator;
```
will generate the code
```java
interface GeneratedPatterns extends Patterns {
    default PatternDescriptor<LogicalBinaryPlan<LogicalJoin, Plan, Plan>, Plan> logicalJoin() {
        return new PatternDescriptor<LogicalBinaryPlan<LogicalJoin, Plan, Plan>, Plan>(
                new TypePattern(LogicalJoin.class, Pattern.FIXED, Pattern.FIXED),
                defaultPromise()
        );
    }

    default <C1 extends Plan, C2 extends Plan>
            PatternDescriptor<LogicalBinaryPlan<LogicalJoin, C1, C2>, Plan>
            logicalJoin(PatternDescriptor<C1, Plan> child1, PatternDescriptor<C2, Plan> child2) {
        return new PatternDescriptor<LogicalBinaryPlan<LogicalJoin, C1, C2>, Plan>(
                new TypePattern(LogicalJoin.class, child1.pattern, child2.pattern),
                defaultPromise()
        );
    }

    default PatternDescriptor<PhysicalUnaryPlan<PhysicalFilter, Plan>, Plan> physicalFilter() {
        return new PatternDescriptor<PhysicalUnaryPlan<PhysicalFilter, Plan>, Plan>(
                new TypePattern(PhysicalFilter.class, Pattern.FIXED),
                defaultPromise()
        );
    }

    default <C1 extends Plan>
            PatternDescriptor<PhysicalUnaryPlan<PhysicalFilter, C1>, Plan>
            physicalFilter(PatternDescriptor<C1, Plan> child1) {
        return new PatternDescriptor<PhysicalUnaryPlan<PhysicalFilter, C1>, Plan>(
                new TypePattern(PhysicalFilter.class, child1.pattern),
                defaultPromise()
        );
    }
}
```
so we no longer have to add a pattern by hand for each new operator.
This feature uses JSR 269 annotation processing to hook into compile time, and ANTLR4 to analyze the `Operator` source code, from which the corresponding patterns are generated.
Pattern generation steps (a toy illustration of the TypePattern matching follows the list):
1. The maven-compiler-plugin in pom.xml compiles fe-core in three passes. The first pass compiles `PatternDescribable.java` and `PatternDescribableProcessor.java`.
2. The second pass compiles `PatternDescribableProcessPoint.java` with the annotation processor `PatternDescribableProcessor` enabled; the processor receives the event and learns that the `PatternDescribableProcessPoint` class carries the `PatternDescribable` annotation.
3. `PatternDescribableProcessor` does not process `PatternDescribableProcessPoint` itself; instead it finds all Java files under the `operatorPath` specified in pom.xml and parses them into Java ASTs (abstract syntax trees).
4. `PatternDescribableProcessor` collects the Java ASTs and uses `PatternGeneratorAnalyzer` to analyze them, finds the child classes of `PlanOperator`, and generates `GeneratedPatterns.java` from the ASTs.
5. The third pass compiles `GeneratedPatterns.java` together with the remaining Java files.
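The child-class matching mentioned in point 3 of the feature list boils down to an instanceof check on the operator class, so a pattern written for a parent class also matches its subclasses. A self-contained toy illustration, with hypothetical names:
```java
// Toy model of child-class matching by TypePattern; names are illustrative.
abstract class Op {}
class LogicalBinaryOperator extends Op {}
class LogicalJoin extends LogicalBinaryOperator {}

class TypePatternDemo {
    static class TypePattern {
        final Class<? extends Op> opClass;
        TypePattern(Class<? extends Op> opClass) { this.opClass = opClass; }

        // A pattern declared for a parent class matches any subclass instance.
        boolean matches(Op op) { return opClass.isInstance(op); }
    }

    public static void main(String[] args) {
        TypePattern binary = new TypePattern(LogicalBinaryOperator.class);
        System.out.println(binary.matches(new LogicalJoin())); // true
    }
}
```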
1. Provide an FE config to test reliability in the single-replica case when tablet scheduling is frequent.
2. Following #6063, apply most of that fix to the current code.
At present, Doris can only access a Kerberos-enabled Hadoop cluster through a broker; Doris BE itself does not support access to Kerberos-authenticated HDFS files.
This PR aims to solve that problem.
When creating a Hive external table, users just specify the following properties to access HDFS data with Kerberos authentication enabled:
```sql
CREATE EXTERNAL TABLE t_hive (
k1 int NOT NULL COMMENT "",
k2 char(10) NOT NULL COMMENT "",
k3 datetime NOT NULL COMMENT "",
k5 varchar(20) NOT NULL COMMENT "",
k6 double NOT NULL COMMENT ""
) ENGINE=HIVE
COMMENT "HIVE"
PROPERTIES (
'hive.metastore.uris' = 'thrift://192.168.0.1:9083',
'database' = 'hive_db',
'table' = 'hive_table',
'dfs.nameservices'='hacluster',
'dfs.ha.namenodes.hacluster'='n1,n2',
'dfs.namenode.rpc-address.hacluster.n1'='192.168.0.1:8020',
'dfs.namenode.rpc-address.hacluster.n2'='192.168.0.2:8020',
'dfs.client.failover.proxy.provider.hacluster'='org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider',
'dfs.namenode.kerberos.principal'='hadoop/_HOST@REALM.COM',
'hadoop.security.authentication'='kerberos',
'hadoop.kerberos.principal'='doris_test@REALM.COM',
'hadoop.kerberos.keytab'='/path/to/doris_test.keytab'
);
```
If you want to `select into outfile` to an HDFS cluster with Kerberos authentication enabled, you can refer to the following SQL statement:
```sql
select * from test into outfile "hdfs://tmp/outfile1"
format as csv
properties
(
'fs.defaultFS'='hdfs://hacluster/',
'dfs.nameservices'='hacluster',
'dfs.ha.namenodes.hacluster'='n1,n2',
'dfs.namenode.rpc-address.hacluster.n1'='192.168.0.1:8020',
'dfs.namenode.rpc-address.hacluster.n2'='192.168.0.2:8020',
'dfs.client.failover.proxy.provider.hacluster'='org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider',
'dfs.namenode.kerberos.principal'='hadoop/_HOST@REALM.COM',
'hadoop.security.authentication'='kerberos',
'hadoop.kerberos.principal'='doris_test@REALM.COM',
'hadoop.kerberos.keytab'='/path/to/doris_test.keytab'
);
```
Issue Number: close #9621
Add the following physical operators: PhysicalAgg, PhysicalSort, PhysicalHashJoin.
Add the basic logic of the plan translator:
1. add a new agg phase enum for Nereids
2. remove the Analyzer from PlanContext.java
3. implement PlanTranslator::visitPhysicalFilter (see the sketch below)
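A minimal sketch of a visitor-based translation for a physical filter, with simplified stand-ins for the actual FE classes:
```java
// Hypothetical sketch; Expression, PlanNode, and SelectNode stand in for real FE types.
interface Expression {}
interface PlanNode {}

class PhysicalFilter {
    final Expression predicate;
    PhysicalFilter(Expression predicate) { this.predicate = predicate; }
}

/** Legacy-planner-style node that applies a conjunct on top of its child. */
class SelectNode implements PlanNode {
    final PlanNode child;
    final Expression conjunct;
    SelectNode(PlanNode child, Expression conjunct) {
        this.child = child;
        this.conjunct = conjunct;
    }
}

class PlanTranslatorSketch {
    // Translate the child first, then attach the filter predicate as a conjunct.
    PlanNode visitPhysicalFilter(PhysicalFilter filter, PlanNode translatedChild) {
        return new SelectNode(translatedChild, filter.predicate);
    }
}
```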
Issue Number: close #9633
Scalar expressions are rewritten using a visitor-pattern traversal.
The abstract class ExpressionVisitor contains visit methods for all the predicates to rewrite.
We provide a rewrite-rule interface, ExpressionRewriteRule. The AbstractExpressionRewriteRule class implements this interface and extends ExpressionVisitor; to implement an expression rewrite rule, directly override the traversal methods that AbstractExpressionRewriteRule provides.
There are two rules to refer to: NormalizeExpressionRule and SimplifyNotExprRule.
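A minimal sketch of the rule pattern, with simplified expression and visitor classes; the real AbstractExpressionRewriteRule covers many more node types, and the rewrite shown is only in the spirit of SimplifyNotExprRule:
```java
// Hypothetical sketch of a visitor-based expression rewrite rule.
abstract class Expr {
    abstract Expr accept(ExprRewriteVisitor visitor);
}

class SlotRef extends Expr {
    Expr accept(ExprRewriteVisitor v) { return this; } // leaf: nothing to rewrite
}

class Not extends Expr {
    final Expr child;
    Not(Expr child) { this.child = child; }
    Expr accept(ExprRewriteVisitor v) { return v.visitNot(this); }
}

abstract class ExprRewriteVisitor {
    Expr visitNot(Not not) { return not; } // default: leave the node unchanged
}

/** Eliminates double negation: NOT(NOT(x)) => x. */
class SimplifyNot extends ExprRewriteVisitor {
    @Override
    Expr visitNot(Not not) {
        Expr child = not.child.accept(this); // rewrite children bottom-up
        if (child instanceof Not) {
            return ((Not) child).child; // NOT(NOT(x)) => x
        }
        return new Not(child);
    }
}
```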
Change the DatabaseIf APIs' return type to TableIf.
Use generics in DatabaseIf to avoid changing the return type in Database.
Currently the Database class uses the Table type; I am trying to avoid changing it to TableIf, because doing so would require changing a lot of code.
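A sketch of the generics approach, using simplified stand-ins for the catalog interfaces:
```java
import java.util.Optional;

// Hypothetical sketch: DatabaseIf is parameterized by its table type, so Database
// can keep returning the concrete Table while other implementations return any TableIf.
interface TableIf {
    String getName();
}

class Table implements TableIf {
    private final String name;
    Table(String name) { this.name = name; }
    public String getName() { return name; }
}

interface DatabaseIf<T extends TableIf> {
    Optional<T> getTable(String tableName);
}

// Binding T to the concrete Table keeps existing callers compiling unchanged.
class Database implements DatabaseIf<Table> {
    public Optional<Table> getTable(String tableName) {
        return Optional.empty(); // lookup omitted in this sketch
    }
}
```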
When planning a bucket shuffle join, we need to know the left table's bucket number.
Currently, we use the tablet number directly, based on the assumption that the left table has only one partition.
But when the left table is a colocated table, it can have more than one partition; for example, a colocated table with 4 partitions and 8 buckets per partition has 32 tablets, while the bucket number should be 8.
In this case, some data from the right table is dropped incorrectly, producing wrong query results.
To reproduce, follow the regression test in this PR.
* [fix](fe) a select stmt makes BE core dump when its castExpr is like cast(int as array<>)
* fix implicit cast scalar type bug
* Revert "fix implicit cast scalar type bug"
This reverts commit 1f05b6bab72430214dca88f386b50ef9a081e60a.
* only check array cast, retrigger
Add the ntile function.
For the non-vectorized engine, I implemented it like Impala does, rewriting ntile into row_number and count.
For the vectorized engine, I implemented WindowFunctionNTile.
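For reference, ntile semantics can be derived from row_number and count alone, which is the essence of the Impala-style rewrite. A self-contained sketch of the bucket assignment (not necessarily the PR's exact rewrite):
```java
// Hypothetical sketch of ntile(n): with cnt rows and n buckets, the first
// (cnt % n) buckets get ceil(cnt / n) rows and the rest get floor(cnt / n).
class NtileDemo {
    static long ntile(long rowNumber, long cnt, long n) {
        long small = cnt / n;        // rows per "small" bucket
        long big = small + 1;        // rows per "big" bucket
        long bigBuckets = cnt % n;   // number of big buckets
        long bigRows = bigBuckets * big;
        if (rowNumber <= bigRows) {
            return (rowNumber - 1) / big + 1;
        }
        return bigBuckets + (rowNumber - bigRows - 1) / small + 1;
    }

    public static void main(String[] args) {
        // 10 rows into 4 buckets => bucket sizes 3,3,2,2
        for (long rn = 1; rn <= 10; rn++) {
            System.out.print(ntile(rn, 10, 4) + " "); // 1 1 1 2 2 2 3 3 4 4
        }
    }
}
```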