doris

Author	SHA1	Message	Date
Xinyi Zou	d96e2dfefb	[feature-wip](arrow-flight)(step5) Support JDBC and PreparedStatement and Fix Bug (#27661 )	2023-11-29 21:17:20 +08:00
zy-kkk	34c3cde0de	Revert "[feature-wip](catalog) support deltalake catalog step1-metadata (#22493 )" (#27095 ) This reverts commit 5b641ebd40fff71e632ee9be4ede58b744b602b9. Currently, Deltalake Catalog is not a usable feature. We will continue to implement it in the datalake plug-in system in the future, so we will delete it from the FE code for now.	2023-11-20 16:10:33 +08:00
Lei Zhang	3044b8397e	[feature](fe) Add coverage tool for FE UT (#26203 )	2023-11-11 19:54:04 +08:00
zhiqiang	a5565f68b2	[Refactor](opentelemetry) Remove opentelemetry (#26605 )	2023-11-09 18:05:34 +08:00
JingDas	e3d0e55794	[feature-wip] (Nereids) Support transforming trino dialect SQL to logical plan (#21855 ) Support transforming trino dialect SQL to logical plan (#21854) ## Proposed changes Issue Number: #21854 Use io.trino.sql.tree.AstVisitor as the visitor: visit the corresponding trino node and transform it to a doris logical plan. ## Further comments Here are some examples of function transformation: the ascii('a') function in doris and the codepoint('a') function in trino have the same feature and the same method signature, so we can use [TrinoFnCallTransformer](`3b37b76886/fe/fe-core/src/main/java/org/apache/doris/nereids/parser/trino/TrinoFnCallTransformer.java`) to handle them. Another example, for ComplexTransformer: the date_diff('second', TIMESTAMP '2020-12-25 22:00:00', TIMESTAMP '2020-12-25 21:00:00') function in trino and the seconds_diff(2020-12-25 22:00:00, 2020-12-25 21:00:00) function in doris have different method signatures, so we cannot handle them with TrinoFnCallTransformer alone and should handle them with an individual complex transformer [DateDiffFnCallTransformer](`3b37b76886/fe/fe-core/src/main/java/org/apache/doris/nereids/parser/trino/DateDiffFnCallTransformer.java`).	2023-10-16 05:10:55 -05:00
Xinyi Zou	87a30dc41d	[feature-wip](arrow-flight)(step3) Support authentication and user session (#24772 )	2023-09-27 14:53:58 +08:00
Calvin Kirs	ce8dfd3561	[Chore](dependency)grpc library dependencies are unified (#24794 )	2023-09-23 14:29:34 +08:00
Adonis Ling	b86f09418f	[chore](build) Fix the FE build on CentOS 6 (#24798 ) Using grpc-java whose version is newer than 1.34.0 will break the build on CentOS 6 due to the obsolete GLIBC.	2023-09-22 19:58:12 +08:00
Calvin Kirs	85a1fbd5d3	[Improve](stats)Use Log4j class library instead of Quartz (#24732 ) The new Quartz version does not support Java 8	2023-09-22 15:23:58 +08:00
Calvin Kirs	c832e018d0	[Dependence](Fe)Upgrade Fe dependencies (#24606 ) * be scanner - Upgrade avro to 1.11.2 fe - Upgrade quartz to 2.5.0-rc1 - Upgrade maxcompute to 0.45-2-publish - Binding avro-ipc to 1.11.2 * Binding hbase version to 2.5.5 binding nimbusds version to 9.35	2023-09-22 10:14:42 +08:00
Xinyi Zou	fc12362a6d	[feature-wip](arrow-flight)(step2) FE support Arrow Flight server (#24314 ) This is a POC, the design documentation will be updated soon	2023-09-20 14:42:54 +08:00
morrySnow	da5c78019c	[opt](fe-ui) support reading hardware info from aarch64 MacOS (#23708 ) update the versions of oshi and jna to support reading hardware info from aarch64 MacOS	2023-08-31 18:16:33 +08:00
zy-kkk	5b641ebd40	[feature-wip](catalog) support deltalake catalog step1-metadata (#22493 )	2023-08-29 10:31:37 +08:00
Calvin Kirs	e17779f193	[Dependency](fe)Upgrade dependency version (#22496 ) Upgrade guava to 32.1.2-jre Set ck dependency scope to provided Upgrade okio to 3.4.0 Upgrade snake yaml to 1.33 Upgrade aws-java-sdk to 1.12.519 Upgrade hadoop to 3.3.6	2023-08-11 10:54:37 +08:00
AKIRA	582acad8a1	[feature](stats) Enable period time with cron expr (#22095 ) Support grammar such as ANALYZE TABLE test WITH CRON "* * * * * ?" Such a job is scheduled as the cron expr specifies, but natively supports minute-level scheduling only	2023-07-26 17:25:57 +08:00
AKIRA	964ac4e601	[opt](nereids) Retry when async analyze task failed (#21889 ) Retry at most 5 times when async analyze task execution failed	2023-07-26 17:16:56 +08:00
zhangdong	7fcf702081	[improvement](multi catalog)paimon support filesystem metastore (#21910 ) 1.support filesystem metastore 2.support predicate and project when split 3.fix partition table query error todo: Now you need to manually put paimon-s3-0.4.0-incubating.jar in be/lib/java_extensions when use s3 filesystem doc pr: #21966	2023-07-24 22:02:57 +08:00
Calvin Kirs	30b1b93353	[dependency](fe)Dependency version upgrade (#21191 ) Keep hadoop-aliyun version consistent with hadoop main version (3.3.5) upgrade jackson to 2.14.3 upgrade netty version to 4.1.94.final binding check.freamework version to 3.32.0 upgrade snappy-java to 1.1.10.1 upgrade hudi version to 0.13.1 upgrade spring version to 2.7.13 upgrade orc version to 1.8.4 revert nonsensical changes	2023-06-29 10:01:33 +08:00
slothever	d4240ac21b	[fix](multi-catalog)add oss sdk, supported oss properties (#21029 )	2023-06-26 13:00:44 +08:00
Siyang Tang	46f0295b78	[feature](load-refactor-with-tvf) S3 load with S3 tvf and native insert (#19937 )	2023-06-25 17:45:31 +08:00
lexluo09	57656b2459	[Enhancement](java-udf) java-udf module split to sub modules (#20185 ) The java-udf module has become increasingly large and difficult to manage, making it inconvenient to package and use as needed. It needs to be split into multiple sub-modules, such as: java-common, java-udf, jdbc-scanner, hudi-scanner, paimon-scanner. Co-authored-by: lexluo <lexluo@tencent.com>	2023-06-13 09:41:22 +08:00
Ashin Gau	9a83d78dfe	[Enhancement](hudi) support hudi mor table, step2 follow #19909 (#20570 ) PR(https://github.com/apache/doris/pull/19909) has implemented the framework of hudi reader for MOR table. This PR completes all functions of reading MOR table and enables end-to-end queries. Key Implementations: 1. Use hudi meta information to generate the table schema, not from hive client. 2. Use hive client to list hudi partitions, so it strongly depends on the sync-tools(https://hudi.apache.org/docs/syncing_metastore/) which sync the partitions of hudi into hive metastore. However, we may get the hudi partitions directly from the .hoodie directory. 3. Remove `HudiHMSExternalCatalog`, because other catalogs like glue are compatible with hive catalog. 4. Read the COW table originally from c++. 5. Hudi RecordReader will use ProcessBuilder to start a hotspot debugger process, which may get stuck when attaching to the origin JNI process, so I use a tricky method to kill this useless process.	2023-06-10 12:25:53 +08:00
yuxuan-luo	fe63a0a3bb	[Feature](multi-catalog)support paimon catalog (#19681 ) CREATE CATALOG paimon_n2 PROPERTIES ( "dfs.ha.namenodes.HDFS1006531" = "nn2,nn1", "dfs.namenode.rpc-address.HDFS1006531.nn2" = "172.16.65.xx:4007", "dfs.namenode.rpc-address.HDFS1006531.nn1" = "172.16.65.xx:4007", "hive.metastore.uris" = "thrift://172.16.65.xx:7004", "type" = "paimon", "dfs.nameservices" = "HDFS1006531", "hadoop.username" = "hadoop", "paimon.catalog.type" = "hms", "warehouse" = "hdfs://HDFS1006531/data/paimon1", "dfs.client.failover.proxy.provider.HDFS1006531" = "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider" );	2023-06-06 15:08:30 +08:00
slothever	b7fc17da68	[feature-wip](multi-catalog)(step2)support read max compute data by JNI (#19819 ) Issue Number: #19679	2023-06-05 22:10:08 +08:00
Nick Young	499f443779	[feature](iceberg) Support read iceberg data on gcs (#19815 )	2023-05-20 12:40:03 +08:00
luozenglin	f68d3a660e	[improvement](opentelemetry) upgrade opentelemetry jar to v1.26.0 and opentelemetry-cpp to v1.8.3 (#19733 ) why upgrade? anything wrong? Try to fix the problem about opentelemetry::v1::ext::http::client::curl::HttpOperation::Send(), I have updated the pr info.	2023-05-18 18:46:20 +08:00
slothever	3f2d1ae9a4	[feature-wip](multi-catalog)(step1)support connect to max compute (#19606 ) Issue Number: #19679 support connect to max compute metadata by odps sdk	2023-05-16 11:30:27 +08:00
Adonis Ling	ccd22c508a	[chore](fe) Fix the build on Centos 6 (#19255 )	2023-05-06 14:50:56 +08:00
Mingyu Chen	c9fa10ac10	[fix](doc) avoid generating config doc automatically (#19302 ) After #19246, when compiling FE, it will automatically generate the Config and Session Variables doc and overwrite the original one. Need to avoid it because it is not ready to use yet	2023-05-05 20:39:05 +08:00
Mingyu Chen	70236adc1f	[Refactor](doc)(config)(variable) use script to generate doc for FE config and session variables (#19246 ) The documentation of configs (FE and BE) and session variables is hard to maintain, because developers need to modify both code and document, and some configs' documentation is missing. So I plan to write the document of config or variables directly in code, and use a script to generate the document automatically. How To This CL mainly changes: Add fields in the Config and Session Variables annotations: description: The description of the config or variable item. It is a String array; the first element is in Chinese, the second is in English. options: the valid options if the config or variable is an enum. Add a script docs/generate-config-and-variable-doc.sh. Simply run sh docs/generate-config-and-variable-doc.sh and it will generate docs of FE config and variables, and save them under docs/admin-manual/config/fe-config.md and docs/advanced/variables.md, both in Chinese and in English. There are template markdowns for this script to read and replace with real doc content. TODO Too many descriptions need to be filled in. I will finish them in the next PR. For now the original doc remains unchanged. Find a way to check the description field of config and variables, to make sure we won't miss it. Generate doc for BE config.	2023-05-05 14:42:43 +08:00
Calvin Kirs	5459cd9c30	[Improve](fe)Upgrade dependencies and optimize jar package management (#18882 ) bind netty-version to 4.1.89-final bind jettison to 1.5.4 upgrade hadoop version to 3.3.5 upgrade range-plugins-common to 2.4.0 bind bcprov-jdk15on to 2.4.0 upgrade and bind woodstox to 6.5.1 upgrade and bind kerby to 2.0.3 upgrade hudi to 0.13.0 upgrade parquet to 1.13.0 upgrade maven-source-plugin to 3.2.1 upgrade maven-assembly-plugin to 3.3.0 upgrade maven-javadoc-plugin to 3.3.2 upgrade maven-shade-plugin to 3.3.4 upgrade maven-clean-plugin to 3.1.0 Remove meaningless plugins Optimize doris maven path Unify the Java modules for management in fe	2023-05-04 10:07:37 +08:00
Calvin Kirs	57982ddc46	[Fix](catalog)Fix hudi-catalog get file split error (#18644 ) (#18673 ) `hudi-common` depends on `parquet-avro`, but the dependency scope is `provided`. When we use `hudi-catalog`, `HoodieAvroWriteSupport` will be called. This method depends on `parquet-avro`, so it will generate ClassNotFound.	2023-04-16 21:56:14 +08:00
Calvin Kirs	b39846c2c7	[Fix](Catalog)Delete duplicate defined dependencies to avoid class loading exceptions (#18628 ) `iceberg-hive-metastore` and `hive-storage-api` have been defined in hive-catalog-shade, and some classes in the shade have been renamed, so we cannot declare them again. The classes in the shade should be kept. The `hive-metastore-api` used in `ranger` can also use the jar in the `shade`. Since we rename the tool class used inside the `hive`, this has no effect.	2023-04-13 22:12:19 +08:00
Calvin Kirs	75fd4b70fa	[improve](fe)Optimize fe binary package packaging (#18554 )	2023-04-12 12:58:45 +08:00
Calvin Kirs	5f981b0b1f	[fix](catalog)Use hive-catalog-shade to solve thrift version compatibility issues (#18504 ) `Hive 3` uses the `thrift-0.9.3` package, and `Doris` uses the `thrift-0.16.0` package. These two packages are not compatible, so we use the `hive-shade` package to manage hive dependencies in a unified way. This jar package renames the `thrift` classes, so the conflict can be resolved.	2023-04-11 13:19:39 +08:00
slothever	d0219180a9	[feature-wip](multi-catalog)add properties converter (#18005 ) Refactor properties of each cloud , use property converter to convert properties accessing fe metadata and be data. user docs #18287	2023-04-06 09:55:30 +08:00
Mingyu Chen	c2dd005efb	[fix](chore) fix BE compile and FE protoc artifact issue (#18120 ) add <optional> head to solve the compilation issue use 3.12.9 as the protoc.artifact's version, because there is no 3.12.21 See: https://repo.maven.apache.org/maven2/com/google/protobuf/protoc/ Remove --show-progress arguments of wget because it is not supported in low version wget	2023-03-27 08:53:42 +08:00
zhangdong	93cfd5cd2b	[Enhance](ComputeNode)support k8s watch (#17442 ) 1.Add the watch mechanism to listen for changes in k8s statefulSet and update nodes in time. 2.For broker, there is only one name by default when using deployManager 3.Refactoring code makes it easier to understand and maintain 4.Fix jar package conflicts between okhttp-ws and okhttp Previously, k8sDeployManager.getGroupHostInfos called the endpoints() interface of k8s. If a pod was unexpectedly restarted, k8sDeployManager would delete the pre-restart pod from the fe or be list and add the post-restart pod to it, which does not meet our expectations. Now, after fqdn is enabled, we call the statefulSets() interface of k8s to listen for the number of replicas and determine whether nodes need to be brought online or offline. In addition, the watch mechanism is added to avoid the possible A-B-A problem caused by timed polling. For the sake of stability, when the watch mechanism does not receive messages for a period of time, it degrades to polling mode. Several environment variables have been added for the statefulset names: ENV_FE_STATEFULSET, ENV_FE_OBSERVER_STATEFULSET, ENV_BE_STATEFULSET, ENV_BROKER_STATEFULSET, ENV_CN_STATEFULSET. They correspond one-to-one with ENV_FE_SERVICE, ENV_FE_OBSERVER_SERVICE, ENV_BE_SERVICE, ENV_BROKER_SERVICE, ENV_CN_SERVICE. If a serviceName is configured, the corresponding statefulsetName must be configured, otherwise the program cannot be started.	2023-03-20 11:36:32 +08:00
morrySnow	295b26db00	[chore](fe) update aspectj-maven-plugin to 1.14.0 version (#17890 ) In #17797 , we introduced aspectj to help log exceptions easily. However, plugin version 1.11 does not support jdk9 and later. To support compiling FE with jdk11: update aspectj-maven-plugin to 1.14.0 version; add new dependency org.aspectj.aspectjrt 1.9.7 to fe-core. According to: aspectj java version compatibility, aspectj-maven-plugin issue, aspectj release note, intro to aspectj	2023-03-19 14:50:09 +08:00
NetShrimp	0ec10d4836	[Enhancement](fe exception) write a java annotation to catch throwable from a method and print log (#17797 ) How does it work? Aspectj is used to implement the aspect function of annotations. During compilation, the aspectj-maven-plugin plugin automatically weaves the code with aspect annotations into the generated class files. When to use it? When a method wants to add a try catch to save exception information, the LogException annotation can be used. When there is a method that does not allow errors, the NoException annotation can be used. What is the result of adding this annotation? The LogException annotation automatically captures exceptions into the Log file, so the code can be more concise. The NoException annotation automatically captures the exception into the Log file and exits the program when an exception occurs.	2023-03-17 08:52:27 +08:00
Adonis Ling	310bdb60f4	[chore](maven) Prefer protoc in thirdparty to the one in maven artifacts (#17596 ) The prebuilt protoc-gen-grpc-java binary uses glibc on Linux and the version of glibc which Centos 6 uses is too old.	2023-03-09 16:21:38 +08:00
Calvin Kirs	b6128f9b65	[dependenct](fe) Replace jackson-mapper-asl with fasterxml-jackson (#17303 )	2023-03-09 09:35:58 +08:00
Calvin Kirs	d908d5fe01	[dependency](fe)Dependency Upgrade (#17377 ) * Upgrade log4j to 2.X - binding log4j version to 2.18.0 - used log4j-1.2-api to complete a smooth upgrade * Upgrade fileupload to 1.5 * Upgrade commons-io to 2.7 * Upgrade commons-compress to 1.22 * Upgrade gson to 2.8.9 * Upgrade guava to 30.0-jre * Binding jackson version to 2.14.2 * Upgrade netty-all to 4.1.89.final * Upgrade protobuf to 3.21.12 * Upgrade kafka-clients to 3.4.0 * Upgrade calcite version to 1.33.0 * Upgrade aws-java-sdk to 1.12.302 * Upgrade hadoop to 3.3.4 * Upgrade zookeeper to 3.4.14 * Binding tomcat-embed-core to 8.5.86 * Upgrade apache parent pom to 25 * Use hive-exec-core as a hive dependency, add the missing jar-hive-serde separately * Basic public dependencies are extracted to parent dependencies * Use jackson uniformly as the basic json tool * Remove springloaded, spring-boot-devtools has the same functionality * Modify the spark-related dependency scope to provided, which should be provided at runtime	2023-03-08 14:28:40 +08:00
Tiewei Fang	48c2d806d7	[enhencement](jdbc catalog) Use Druid instead of HikariCP in JdbcClient (#17395 ) This pr does three things: 1. Use Druid instead of HikariCP in JdbcClient 2. when download udf jar, add the name of the jar package after the local file name. 3. refactor some jdbcResource code	2023-03-07 08:51:10 +08:00
Yulei-Yang	449f2953c9	[Improvement](auth)(step-1) add ranger authorizer for hms catalog (#17153 )	2023-03-03 09:45:08 +08:00
slothever	51bbae27b8	[feature-wip](iceberg) add dlf and glue catalog impl for iceberg catalog (#16602 ) iceberg catalog supports DLF on Alibaba Cloud and AWS Glue Catalog	2023-02-23 14:02:41 +08:00
Adonis Ling	d56043ab5a	[feature-wip](MTMV) Support setting variables in query statement (#16060 ) ## Use case ```shell mysql> CREATE TABLE t_user ( -> event_day DATE, -> id bigint, -> username varchar(20) -> ) -> DISTRIBUTED BY HASH(id) BUCKETS 10 -> PROPERTIES ('replication_num' = '1'); Query OK, 0 rows affected (0.07 sec) mysql> CREATE TABLE t_user_pv( -> event_day DATE, -> id bigint, -> pv bigint -> ) -> DISTRIBUTED BY HASH(id) BUCKETS 10 -> PROPERTIES ('replication_num' = '1'); Query OK, 0 rows affected (0.09 sec) mysql> CREATE MATERIALIZED VIEW mv -> BUILD IMMEDIATE REFRESH COMPLETE -> KEY (username) -> DISTRIBUTED BY HASH(username) BUCKETS 10 -> PROPERTIES ('replication_num' = '1') -> AS SELECT /*+ SET_VAR(exec_mem_limit=1048576, query_timeout=3600) */ t1.username ,t2.pv FROM t_user t1 LEFT JOIN t_user_pv t2 on t1.id = t2.id; Query OK, 0 rows affected (0.10 sec) ```	2023-01-30 01:05:41 +08:00
jiafeng.zhang	da28d2faee	[deps](http)Upgrade springboot version to 2.7.8 (#16158 ) * Upgrade springboot version to 2.7.8 * fix	2023-01-28 20:13:50 +08:00
Mingyu Chen	726427b795	[refactor](fe) refactor and upgrade dependency tree of FE and support AWS glue catalog (#16046 ) 1. Spark dpp Move `DppResult` and `EtlJobConfig` to the sparkdpp package in the `fe-common` module, so that `fe-core` no longer depends on the `spark-dpp` module and `spark-dpp.jar` is not moved into `fe/lib`, which reduces the size of FE output. 2. Modify start_fe.sh Modify the CLASSPATH to make sure that doris-fe.jar is at the front, so that classes with the same qualified name are loaded from doris-fe.jar first. 3. Upgrade hadoop and hive version hadoop: 2.10.2 -> 3.3.3 hive: 2.3.7 -> 3.1.3 4. Override the IHiveMetastoreClient implementations from dependency `ProxyMetaStoreClient.java` for Aliyun DLF. `HiveMetaStoreClient.java` for the original Apache Hive metastore. Because I need to modify some of their methods to make them compatible with different versions of Hive. 5. Exclude some unused dependencies to reduce the size of FE output Now it is only 370MB (Before is 600MB) 6. Upgrade aws-java-sdk version to 1.12.31 7. Support AWS Glue Data Catalog 8. Remove HudiScanNode(no longer supported)	2023-01-20 14:42:16 +08:00
jiafeng.zhang	d48abd91df	[deps](fe)upgrade deps version (#15262 ) upgrade hadoop version to 2.10.2 jackson-databind to 2.14.1	2022-12-24 22:18:10 +08:00

121 Commits