This PR does the following:
1. Change the nullable mode of `from_unixtime` and `parse_url` from `DEPEND_ON_ARGUMENT` to `ALWAYS_NULLABLE`; this nullable configuration was missing previously.
2. Add new interfaces for the original `NullableMode`. This change is inspired by Scala's mix-in traits: it lets us quickly understand a function's traits without reading lengthy procedural code, and saves us from writing boilerplate, e.g. `class Substring extends ScalarFunction implements ImplicitCastInputTypes, PropagateNullable`. These are the interfaces (see the sketch after this list):
- PropagateNullable: equivalent to `NullableMode.DEPEND_ON_ARGUMENT`
- AlwaysNullable: equivalent to `NullableMode.ALWAYS_NULLABLE`
- AlwaysNotNullable: equivalent to `NullableMode.ALWAYS_NOT_NULLABLE`
- ComputeNullable: equivalent to `NullableMode.CUSTOM` (covers the remaining cases)
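To make the mix-in style concrete, here is a minimal Java sketch; apart from the four interface names above, every class and method in it is an illustrative assumption, not the actual Doris code:
```java
// Marker interfaces declare the nullable trait right in the class declaration.
interface PropagateNullable {}

interface AlwaysNullable {}

interface AlwaysNotNullable {}

// CUSTOM mode still needs a computed answer, so this trait carries a method.
interface ComputeNullable {
    boolean nullable();
}

// Hypothetical base class standing in for the real ScalarFunction.
abstract class ScalarFunction {}

// Reading the declaration alone tells us the function's nullable behavior.
class Substring extends ScalarFunction implements PropagateNullable {}

class FromUnixtime extends ScalarFunction implements AlwaysNullable {}
```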
3. Add `GenerateScalarFunction` to generate Nereids-style function code from legacy functions. It does not actually generate any new function class yet, because the function traits are not ready for use: I still need to add traits for the legacy functions' `CompareMode` and `NonDeterministic`, following the same idea as `ComputeNullable`.
* [tracing] Support OpenTelemetry collector.
1. Support exporting traces to multiple distributed tracing systems via the collector.
2. Support using the collector to process traces.
In #9755, we split the plan into plan & operator, but in subsequent development we found that the rules became complex and counter-intuitive:
1. We must create an operator instance, then wrap it into a plan according to the operator type.
2. The relational algebra (operator) does not contain children.
e.g.
```java
logicalProject().then(project -> {
    List<NamedExpression> boundSlots =
            bind(project.operator.getProjects(), project.children(), project);
    LogicalProject op = new LogicalProject(flatBoundStar(boundSlots));
    // wrap a plan
    return new LogicalUnaryPlan(op, project.child());
})
```
After combining operator and plan, the code becomes:
```java
logicalProject().then(project -> {
    List<NamedExpression> boundSlots =
            bind(project.getProjects(), project.children(), project);
    return new LogicalProject(flatBoundStar(boundSlots), project.child());
})
```
Originally, we thought the split of plan & operator would be convenient for `Memo.copyIn()`, because Memo does not know how to re-create a plan (assembling child plans from the children groups) from the plan type alone, so every plan must provide a `withChildren()` abstract method to assemble its children. The fewer plan types, the lower the code cost (logical/physical crossed with leaf/unary/binary gives about 6 plan classes, with no concrete plans such as LogicalAggregatePlan).
But that convenience had the negative effect of being difficult to understand: people had to learn the concept before they could develop new rules, and the rules became ugly. So we combine plan & operator to make rules as simple as possible; the trade-off is that we must override the withXxx methods in every concrete plan, e.g. LogicalAggregate and PhysicalHashJoin, as sketched below.
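For illustration, a minimal sketch of that trade-off; all names below are assumptions standing in for the real Doris classes:
```java
import java.util.Collections;
import java.util.List;

// After the merge, every concrete plan is its own operator and must know how
// to re-create itself with new children, so Memo.copyIn can re-assemble plans
// from the children groups without knowing each concrete type.
interface PlanSketch {
    List<PlanSketch> children();

    PlanSketch withChildren(List<PlanSketch> children);
}

class LogicalFilterSketch implements PlanSketch {
    private final String predicate;
    private final PlanSketch child;

    LogicalFilterSketch(String predicate, PlanSketch child) {
        this.predicate = predicate;
        this.child = child;
    }

    @Override
    public List<PlanSketch> children() {
        return Collections.singletonList(child);
    }

    @Override
    public PlanSketch withChildren(List<PlanSketch> children) {
        // Keep this node's own state, swap in the child from the memo group.
        return new LogicalFilterSketch(predicate, children.get(0));
    }
}
```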
This PR supports:
1. Remove the generic type from operator, and remove some NODE_TYPE generics from plan and expression.
2. Refactor the Plan and NODE_TYPE generic types.
3. Support child class matching by TypePattern.
4. Analyze the operator code and generate patterns, which makes it easy to create rules.
e.g.
```java
class LogicalJoin extends LogicalBinaryOperator {}
class PhysicalFilter extends PhysicalUnaryOperator {}
```
will generate the following code:
```java
interface GeneratedPatterns extends Patterns {
    default PatternDescriptor<LogicalBinaryPlan<LogicalJoin, Plan, Plan>, Plan> logicalJoin() {
        return new PatternDescriptor<LogicalBinaryPlan<LogicalJoin, Plan, Plan>, Plan>(
                new TypePattern(LogicalJoin.class, Pattern.FIXED, Pattern.FIXED),
                defaultPromise()
        );
    }

    default <C1 extends Plan, C2 extends Plan>
            PatternDescriptor<LogicalBinaryPlan<LogicalJoin, C1, C2>, Plan>
            logicalJoin(PatternDescriptor<C1, Plan> child1, PatternDescriptor<C2, Plan> child2) {
        return new PatternDescriptor<LogicalBinaryPlan<LogicalJoin, C1, C2>, Plan>(
                new TypePattern(LogicalJoin.class, child1.pattern, child2.pattern),
                defaultPromise()
        );
    }

    default PatternDescriptor<PhysicalUnaryPlan<PhysicalFilter, Plan>, Plan> physicalFilter() {
        return new PatternDescriptor<PhysicalUnaryPlan<PhysicalFilter, Plan>, Plan>(
                new TypePattern(PhysicalFilter.class, Pattern.FIXED),
                defaultPromise()
        );
    }

    default <C1 extends Plan>
            PatternDescriptor<PhysicalUnaryPlan<PhysicalFilter, C1>, Plan>
            physicalFilter(PatternDescriptor<C1, Plan> child1) {
        return new PatternDescriptor<PhysicalUnaryPlan<PhysicalFilter, C1>, Plan>(
                new TypePattern(PhysicalFilter.class, child1.pattern),
                defaultPromise()
        );
    }
}
```
With this in place, we no longer have to hand-write patterns for new operators.
This feature uses JSR 269 (annotation processing) to do work at compile time, and ANTLR4 to analyze the source code of `Operator` classes, from which we generate the corresponding patterns.
Pattern generation steps:
1. The maven-compiler-plugin in pom.xml compiles fe-core in three passes. The first pass compiles `PatternDescribable.java` and `PatternDescribableProcessor.java`.
2. The second pass compiles `PatternDescribableProcessPoint.java` with the annotation processor `PatternDescribableProcessor` enabled; the processor receives the compile event and learns that the `PatternDescribableProcessPoint` class carries the `PatternDescribable` annotation.
3. `PatternDescribableProcessor` does not process `PatternDescribableProcessPoint` itself; instead it finds all Java files under the `operatorPath` specified in pom.xml and parses them into Java ASTs (abstract syntax trees), as sketched below.
4. `PatternDescribableProcessor` collects the Java ASTs and uses `PatternGeneratorAnalyzer` to analyze them, finds the child classes of `PlanOperator`, and generates `GeneratedPatterns.java` from the ASTs.
5. The third pass compiles `GeneratedPatterns.java` together with the other Java files.
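As a rough sketch of the JSR 269 hook (the annotation's package and everything inside `process` are assumptions; the real processor parses the operator sources with ANTLR4 before writing the file):
```java
import java.io.IOException;
import java.io.Writer;
import java.util.Set;
import javax.annotation.processing.AbstractProcessor;
import javax.annotation.processing.RoundEnvironment;
import javax.annotation.processing.SupportedAnnotationTypes;
import javax.annotation.processing.SupportedSourceVersion;
import javax.lang.model.SourceVersion;
import javax.lang.model.element.TypeElement;
import javax.tools.Diagnostic;
import javax.tools.JavaFileObject;

// Minimal JSR 269 skeleton: triggered when the second compile pass sees the
// PatternDescribable annotation, it emits a source file that the third pass
// will then compile.
@SupportedAnnotationTypes("org.apache.doris.nereids.pattern.generator.PatternDescribable")
@SupportedSourceVersion(SourceVersion.RELEASE_8)
public class PatternProcessorSketch extends AbstractProcessor {
    @Override
    public boolean process(Set<? extends TypeElement> annotations, RoundEnvironment roundEnv) {
        if (annotations.isEmpty()) {
            return false; // nothing to do in this round
        }
        try {
            JavaFileObject file = processingEnv.getFiler()
                    .createSourceFile("org.apache.doris.nereids.pattern.GeneratedPatterns");
            try (Writer writer = file.openWriter()) {
                // The real processor walks the operator ASTs here and writes
                // one pattern factory method per operator class.
                writer.write("// generated pattern methods\n");
            }
        } catch (IOException e) {
            processingEnv.getMessager().printMessage(Diagnostic.Kind.ERROR, e.getMessage());
        }
        return true;
    }
}
```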
Currently, we use `UtFrameUtils` to start an FE server in the FE unit tests.
Each test class has to do some initialization and cleanup with the JUnit4
`@BeforeClass` and `@AfterClass` annotations, which is redundant and boring.
Besides, almost all the APIs in `UtFrameUtils` take a `ConnectContext` parameter, which is not easy to use.
This PR proposes an inheritance-based approach: wrap all the common logic in a base class `TestWithFeService`,
leveraging the JUnit5 `@BeforeAll` and `@AfterAll` annotations to narrow the setup and cleanup lifecycle down to each test class instance.
A derived concrete test class can then directly use the utility methods inherited from the base class,
without calling a util class and passing a `ConnectContext` argument.
`UtFrameUtils` and `DorisAssert` are marked as deprecated. We can remove these two classes
if this refactor works well for a while.
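A rough sketch of the lifecycle this enables (class and method names are assumptions, not the actual `TestWithFeService` API); note that `@TestInstance(PER_CLASS)` is what lets JUnit5 run `@BeforeAll`/`@AfterAll` on instance methods, once per test class:
```java
import org.junit.jupiter.api.AfterAll;
import org.junit.jupiter.api.BeforeAll;
import org.junit.jupiter.api.TestInstance;

// The base class owns the FE server lifecycle, once per test class instance.
@TestInstance(TestInstance.Lifecycle.PER_CLASS)
abstract class TestWithFeServiceSketch {
    @BeforeAll
    void setUp() throws Exception {
        // start a single FE server and create the shared ConnectContext here
    }

    @AfterAll
    void tearDown() {
        // stop the FE server and clean up runtime directories here
    }
}

// A derived test inherits the lifecycle and helper methods; no UtFrameUtils
// calls and no ConnectContext argument passing.
class MyQueryTestSketch extends TestWithFeServiceSketch {
    // @Test methods would use inherited helpers directly
}
```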
Nereids (new optimizer) code base
Nereids is the new query planner for Doris. It includes three main parts: parser, analyzer, and optimizer.
The parser, generated by ANTLR4, transforms SQL into a tree-structured logical plan. Analysis and optimization are performed on this logical plan tree. Each transformation is defined as a rule, and rules are applied to the logical plan using pattern matching. The implementation of the optimizer follows the approach of the Cascades paper.
In #8319, I removed the mysql-connector-java dependency because of license incompatibility.
But we need a MySQL-compatible driver for the HTTP query API, so I chose mariadb-java-client,
which is under the LGPL.
1. mysql-connector-java
mysql-connector-java is under the GPLv2 license, which is not compatible with APLv2, and Doris does not use it.
2. org.json
org.json is under the JSON license, which is not compatible with APLv2. I replaced it with `json-simple`.
This bug was introduced in #8112.
Also, I changed the `grpc-netty` dependency to `grpc-netty-shaded` to avoid a dependency conflict:
```
java.lang.NoSuchMethodError: io.netty.buffer.PooledByteBufAllocator.
```
Closes related issue #7389.
Support creating Iceberg external tables in Doris.
This is the first step toward supporting Iceberg external tables.
### Create Iceberg external table
This PR provides two ways to create Iceberg external tables. Neither way requires explicitly specifying column definitions; Doris converts them automatically based on Iceberg's column definitions.
1. Create an Iceberg external table directly
```sql
CREATE [EXTERNAL] TABLE table_name
ENGINE = ICEBERG
[COMMENT "comment"]
PROPERTIES (
    "iceberg.database" = "iceberg_db_name",
    "iceberg.table" = "iceberg_table_name",
    "iceberg.hive.metastore.uris" = "thrift://192.168.0.1:9083",
    "iceberg.catalog.type" = "HIVE_CATALOG"
);
```
2. Create an Iceberg database and automatically create all the tables under that db.
```sql
CREATE DATABASE db_name
[COMMENT "comment"]
PROPERTIES (
    "iceberg.database" = "iceberg_db_name",
    "iceberg.hive.metastore.uris" = "thrift://192.168.0.1:9083",
    "iceberg.catalog.type" = "HIVE_CATALOG"
);
```
### Show table creation
1. For an individual table, you can view its creation statement with `SHOW CREATE TABLE` (see `help show create table`).
```sql
mysql> show create table iceberg_db.logs_1;
+--------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+--------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| logs_1 | CREATE TABLE `logs_1` (
`level` varchar(-1) NOT NULL COMMENT "null",
`event_time` datetime NOT NULL COMMENT "null",
`message` varchar(-1) NOT NULL COMMENT "null"
) ENGINE=ICEBERG
COMMENT "ICEBERG"
PROPERTIES (
"iceberg.database" = "doris",
"iceberg.table" = "logs_1",
"iceberg.hive.metastore.uris" = "thrift://10.10.10.10:9087",
"iceberg.catalog.type" = "HIVE_CATALOG"
) |
+--------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
```
2. For an Iceberg database, you can view the creation records with `SHOW TABLE CREATION` (see `help show table creation`).
```sql
mysql> show table creation from iceberg_db;
+--------+---------+---------------------+---------------------------------------------------------+
| Table | Status | Create Time | Error Msg |
+--------+---------+---------------------+---------------------------------------------------------+
| logs | fail | 2021-12-14 13:50:10 | Cannot convert unknown type to Doris type: list<string> |
| logs_1 | success | 2021-12-14 13:50:10 | |
+--------+---------+---------------------+---------------------------------------------------------+
2 rows in set (0.00 sec)
```
This is a new syntax that shows table creation records in an Iceberg database.
Syntax:
```sql
SHOW TABLE CREATION [FROM db] [LIKE mask]
```
Users can directly query data in Hive tables from Doris, and can use joins for complex queries, without laboriously importing the data from Hive.
The main changes are listed below:
FE:
- Extend HiveScanNode from BrokerScanNode.
- HiveMetaStoreClientHelper communicates with Hive and HDFS (see the sketch after this list).
BE:
- Treat HiveScanNode as BrokerScanNode, and treat HiveTable as BrokerTable.
- broker_scanner.cpp: support reading columns from the HDFS path.
- orc_scanner.cpp: support reading HDFS files.
POM:
- Add hive.version=2.3.7, hive-metastore and hive-exec.
- Add hadoop.version=2.8.0, hadoop-hdfs.
- Upgrade commons-lang to fix incompatibility with Java 9 and later.
Thrift:
- Add THiveTable.
- Add read_by_column_def in TBrokerRangeDesc.
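For context, a hedged sketch of the kind of call `HiveMetaStoreClientHelper` wraps, assuming hive-metastore 2.3.7 on the classpath; the URI, database, and table names are placeholders:
```java
import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.metastore.HiveMetaStoreClient;
import org.apache.hadoop.hive.metastore.api.Table;

public class HiveMetaStoreSketch {
    public static void main(String[] args) throws Exception {
        HiveConf conf = new HiveConf();
        // Point the client at the metastore thrift endpoint (placeholder URI).
        conf.set("hive.metastore.uris", "thrift://192.168.0.1:9083");
        HiveMetaStoreClient client = new HiveMetaStoreClient(conf);
        // Fetch table metadata; FE needs this to plan the scan over HDFS files.
        Table table = client.getTable("hive_db", "hive_tbl");
        System.out.println(table.getSd().getLocation());
        client.close();
    }
}
```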
Add a `use_path_style` property for S3.
Upgrade hadoop-common and hadoop-aws to 2.8.0 to support the path-style property.
Fix some S3 URI bugs.
Add some logs for tracing the load process.
This commit is the first stage of #6287.
In this commit, we support:
1. Sync job
   1) Creating sync jobs and data channels in FE.
   2) Pausing sync jobs.
   3) Resuming sync jobs.
   4) Stopping sync jobs.
   5) Showing sync jobs.
2. Canal
   1) Subscribing to and fetching canal binlog data when a sync job is created.
1. Add a new dynamic partition property `create_history_partition`. If set to true, Doris will create all partitions from `start` to `end`.
2. Add a new FE config `max_dynamic_partition_num` to limit the number of partitions created when creating a single table.