doris

Author	SHA1	Message	Date
Zhengguo Yang	57b7a416d2	[chore](build) add apache snapshot maven repo to repositories (#11549 )	2022-08-06 07:15:28 +08:00
Zhengguo Yang	95091256b0	[chore](deps) update bdbje tp doris bdbje, update libhdfs3 to improve performance (#11497 )	2022-08-04 17:10:56 +08:00
jiafeng.zhang	388db05ef9	[bugfix](log4j) Upgrade log4j to 2.18.0 (#11368 )	2022-07-31 22:21:33 +08:00
huangzhaowei	6963c41a04	[dependency] Upgrade Apache Commons Validator version to the latest one (#10508 )	2022-07-22 17:03:46 +08:00
Stalary	68b9a2936a	[improvement](doe) Step1: Fe generates the DSL and is used to explain (#9895 ) For the first step, I will only change FE and then change BE once I make sure the DSL is ok.	2022-07-18 23:20:58 +08:00
Gabriel	e769597fd2	[Improvement] (datetime) support microsecond for date literal (#10917 ) * [Improvement] (datetime) support microsecond for date literal * remove joda dependency	2022-07-18 21:39:39 +08:00
Mingyu Chen	67f341f44e	[TLP](step-1) Remove incubator prefix (#10230 ) Remove some `incubator-` prefix in source code. The document is not modified, will be done in next PR.	2022-06-19 19:34:52 +08:00
morrySnow	b7b78ae707	[style](fe)the last step of fe CheckStyle (#10134 ) 1. fix all checkstyle warning 2. change all checkstyle rules to error 3. remove some java doc rules a. RequireEmptyLineBeforeBlockTagGroup b. JavadocStyle c. JavadocParagraph 4. suppress some rules for old codes a. all java doc rules only affect on Nereids b. DeclarationOrder only affect on Nereids c. OverloadMethodsDeclarationOrder only affect on Nereids d. VariableDeclarationUsageDistance only affect on Nereids e. suppress OneTopLevelClass on org/apache/doris/load/loadv2/dpp/ColumnParser.java f. suppress OneTopLevelClass on org/apache/doris/load/loadv2/dpp/SparkRDDAggregator.java g. suppress LineLength on org/apache/doris/catalog/FunctionSet.java h. suppress LineLength on org/apache/doris/common/ErrorCode.java	2022-06-17 21:02:45 +08:00
Zhengguo Yang	24ad11af6a	[deps] upgrade fabric8 k8s client to compitable new k8s cluster (#9933 )	2022-06-06 10:00:36 +08:00
dujl	8092439634	[feature](hudi) Step2: Support query hudi external table(include cow and mor table) (#9752 ) support query cow and mor hudi table.	2022-05-30 09:43:36 +08:00
dujl	72e0042efb	[feature-wip](hudi) Step1: Support create hudi external table (#9559 ) support create hudi table support show create table for hudi table ### Design 1. create hudi table without schema(recommanded) ```sql CREATE [EXTERNAL] TABLE table_name ENGINE = HUDI [COMMENT "comment"] PROPERTIES ( "hudi.database" = "hudi_db_in_hive_metastore", "hudi.table" = "hudi_table_in_hive_metastore", "hudi.hive.metastore.uris" = "thrift://127.0.0.1:9083" ); ``` 2. create hudi table with schema ```sql CREATE [EXTERNAL] TABLE table_name [(column_definition1[, column_definition2, ...])] ENGINE = HUDI [COMMENT "comment"] PROPERTIES ( "hudi.database" = "hudi_db_in_hive_metastore", "hudi.table" = "hudi_table_in_hive_metastore", "hudi.hive.metastore.uris" = "thrift://127.0.0.1:9083" ); ``` When create hudi table with schema, the columns must exist in corresponding table in hive metastore.	2022-05-17 11:30:23 +08:00
morrySnow	122cc3b772	[chore](fe code style)add suppressions to fe check style (#9429 ) Current fe check style check all files. But some rules should be only applied on production files. Add suppressions to suppress some rules on test files.	2022-05-12 12:16:55 +08:00
leo65535	d1b85d51a0	[code style](fe) Include test sources (#9366 ) Include test sources, we also need to check them.	2022-05-09 09:40:44 +08:00
Shuo Wang	1746f61388	[refactor](test) Refactor FE unit test framework that starts a FE server. (#9388 ) Currently, we use `UtFrameUtils` to start a FE server in the FE unit test. Each test class has to do some initialization and clean up stuff with the JUnit4 `@BeforeClass` and `@AfterClass` annotation. It's redundant and boring. Besides, almost all the APIs in `UtFrameUtils` has a `ConnectContext` parameter, which is not easy to use. This PR proposes to use an inherit-manner, i.e., wrap all the common logic in base class `TestWithFeService`, leveraging the JUnit5 `@BeforeAll` and `@AfterAll` annotation to narrow down the setup and cleanup lifecycle to each test class instance. At the same time, the derived concrete test class could directly use utility methods inherited from the base class, without calling a util class and passing a `ConnectContext` argument. `UtFrameUtils` and `DorisAssert` are marked as deprecated. We could remove these two classes if this refactor works well for a time.	2022-05-07 21:28:42 +08:00
morrySnow	784681f106	[FE Code Style][step 0]add github action to check incremental code in pr (#9328 ) 1. add rules to checkstyle 2. add github action to check incremental code in pr	2022-05-01 17:30:29 +08:00
Zhengguo Yang	b6b6e17eb7	[chore] (workflow)add sonarcloud workflow to check code quality and security (#9252 )	2022-04-28 11:09:56 +08:00
Stalary	af2295f971	MOD: remove <scope>provided</scope> (#9177 )	2022-04-25 10:00:57 +08:00
jiafeng.zhang	13f1f94f86	[chore] upgrade log4j version to 2.17.2 (#8774 ) upgrade log4j version to 2.17.2	2022-04-02 21:29:25 +08:00
Mingyu Chen	b98da02611	[chore][fix](httpv2) Use mariadb-java-client for http query api (#8716 ) In #8319, I remove mysql-connector-java dependency because of license incompatibility. But we need a mysql compatible driver for http query api. So I choose mariadb-java-client, which is under LGPL.	2022-03-30 09:59:45 +08:00
Mingyu Chen	22cf6ea17c	[chore] Modify build.sh and refactor dependency of FE submodules (#8732 ) This PR fixes the #8731 and refactor the `build.sh` script. The build.sh script is currently responsible for the compilation of the following Doris components. 1. FE - fe-common - fe-core - spark-dpp - hive-udf - java-udf - ui 2. BE - palo_be - meta_tool 3. broker In the FE module. - The 4 submodules `fe-common, fe-core, spark-dpp and ui` together form Frontend. - `spark-dpp, hive-udf and java-udf` can be compiled separately to produce jar packages for individual use. In the BE module. - `palo_be` can start the BE process separately. - `meta_tool` can be compiled separately to produce binaries. The modified build.sh script has the following changes: 1. there is no longer an option to compile `ui` separately, build together with `--fe`. 2. `fe/be/spark-dpp/hive-udf/java-udf/palo_be/meta_tool` can be compiled separately. 3. all components except `java-udf` will be compiled by default (`java-udf` is in development) Remaining issues: Several submodules of FE have messy dependencies. For example, `java-udf` depends on `fe-core`, and `fe-core` depends on `spark-dpp`, resulting in a large binary jar of `java-udf`. It needs to be reorganized afterwards.	2022-03-30 00:13:24 +08:00
Zhengguo Yang	b2861f36c4	[chore] optimize aws thirdparty package download. (#8637 )	2022-03-28 09:35:51 +08:00
Gabriel	b89e4c7bba	[feature-wip](java-udf) support java UDF with fixed-length input and output (#8516 ) This feature is propsoed in [DSIP-1](https://cwiki.apache.org/confluence/display/DORIS/DSIP-001%3A+Java+UDF). This PR support fixed-length input and output Java UDF. Phase I in DIP-1 is done after this PR. To support Java UDF effeciently, I use no data copy in JNI call and all compute operations are off-heap in Java. To achieve that, I use a UdfExecutor instead. For users, a UDF class must have a public evaluate method.	2022-03-23 10:32:50 +08:00
Zhengguo Yang	f3c44bcd75	[chore][fix](librdkafka) disable librdkafka assert and update some thirdparty (#8425 ) 1. comment librdkafka `rd_assert(thrd_is_current(rkb->rkb_thread));` to avoid core dump 2. upgrade arrow to 7.0.0 3. upgrade aws sdk to 1.9 4. upgrade orc to 1.7.2	2022-03-12 22:09:06 +08:00
Mingyu Chen	50a59f3f86	[license] Organize third-party dependent licenses for bianry releases (#8350 )	2022-03-07 23:18:58 +08:00
Mingyu Chen	9961b2c860	[refactor] Remove mysql-connector and replace org.json with com.googlecode.json-simple (#8319 ) 1. mysql-connector-java mysql-connector-java is under GLPv2 license, which is not compatible with APLv2, and Doris does not use it. 2. org.json org.json is under JSON license, which is not compatible with APLv2. I use `json-simple` to replace it.	2022-03-05 14:41:04 +08:00
Mingyu Chen	315bfe2d0e	Revert "[chore](dependency) upgrade-grpc-version (#8218 )" (#8250 ) This reverts commit df7e848cbbc8170c7bd83d812d7cac58b5574570. Reverts apache/incubator-doris#8218 Because when using grpc 1.44.1, the corresponding `protoc-gen-grpc-java` plugin requried GLIBC_2.14, which is not found in CentOS 6. So I suggest to revert this commit this time. And considering upgrading this component after most systems have reached glibc version 2.14. And for Mac M1, you may have to change this version manually for now	2022-03-02 10:16:25 +08:00
Mingyu Chen	93c638f3a2	[fix][chore](insert)(fe) Fix analysis error of insert stmt and modify grpc-netty dependency (#8265 ) This bug is introduced from #8112. Also , I change the `grpc-netty` dependency to `grpc-netty-shaded`, to avoid dependency conflict: ``` java.lang.NoSuchMethodError: io.netty.buffer.PooledByteBufAllocator. ```	2022-03-01 11:12:10 +08:00
qiye	87b96cfcd6	[feature](iceberg) Step3: Support query iceberg external table (#8179 ) 1. Add Iceberg scan node 2. Add Iceberg/Hive table type in thrift 3. Support querying Iceberg tables of format types `parquet` and `orc`	2022-02-26 17:04:11 +08:00
wunan1210	df7e848cbb	[chore](dependency) upgrade-grpc-version (#8218 ) upgrade grpc.version, so macos with M1 chip can build Fe correctly. 1.30.0 -> 1.44.1	2022-02-24 23:17:32 +08:00
lihuigang	264f38471c	[feature](spark-load) add Hive Bitmap UDFs (#8036 ) Hive Bitmap UDF provides UDFs for generating bitmap and bitmap operations in hive tables. The bitmap in Hive is exactly the same as the Doris bitmap. The bitmap in Hive can be imported into Doris through spark bitmap load.	2022-02-17 10:45:20 +08:00
qiye	3b8d48f08b	[feature-wip](iceberg) Step1: Support create Iceberg external table (#7391 ) Close related #7389 Support create Iceberg external table in Doris. This is the first step to support Iceberg external table. ### Create Iceberg external table This pr describes two ways to create Iceberg external tables. Both ways do not require explicitly specifying column definitions, Doris automatically converts them based on Iceberg's column definitions. 1. Create an Iceberg external table directly ```sql CREATE [EXTERNAL] TABLE table_name ENGINE = ICEBERG [COMMENT "comment"] PROPERTIES ( "iceberg.database" = "iceberg_db_name", "iceberg.table" = "icberg_table_name", "iceberg.hive.metastore.uris" = "thrift://192.168.0.1:9083", "iceberg.catalog.type" = "HIVE_CATALOG" ); ``` 2. Create an Iceberg database and automatically create all the tables under that db. ```sql CREATE DATABASE db_name [COMMENT "comment"] PROPERTIES ( "iceberg.database" = "iceberg_db_name", "iceberg.hive.metastore.uris" = "thrift://192.168.0.1:9083", "iceberg.catalog.type" = "HIVE_CATALOG" ); ``` ### Show table creation 1. For individual tables you can view them with `help show create table`. ```sql mysql> show create table iceberg_db.logs_1; +--------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ \| Table \| Create Table \| +--------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ \| logs_1 \| CREATE TABLE `logs_1` ( `level` varchar(-1) NOT NULL COMMENT "null", `event_time` datetime NOT NULL COMMENT "null", `message` varchar(-1) NOT NULL COMMENT "null" ) ENGINE=ICEBERG COMMENT "ICEBERG" PROPERTIES ( "iceberg.database" = "doris", "iceberg.table" = "logs_1", "iceberg.hive.metastore.uris" = "thrift://10.10.10.10:9087", "iceberg.catalog.type" = "HIVE_CATALOG" ) \| +--------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ ``` 2. For Iceberg database, you can view it with `help show table creation`. ```sql mysql> show table creation from iceberg_db; +--------+---------+---------------------+---------------------------------------------------------+ \| Table \| Status \| Create Time \| Error Msg \| +--------+---------+---------------------+---------------------------------------------------------+ \| logs \| fail \| 2021-12-14 13:50:10 \| Cannot convert unknown type to Doris type: list<string> \| \| logs_1 \| success \| 2021-12-14 13:50:10 \| \| +--------+---------+---------------------+---------------------------------------------------------+ 2 rows in set (0.00 sec) ``` This is a new syntax. Show table creation records in Iceberg database: Syntax: ```sql SHOW TABLE CREATION [FROM db] [LIKE mask] ```	2022-01-27 10:22:47 +08:00
Zhengguo Yang	4bdeef3b64	[chore][fix][doc](fe-plugin)(mysqldump) fix build auditlog plugin error (#7804 ) 1. fix problems when build fe_plugins 2. format 3. add docs about dump data using mysql dump	2022-01-26 09:11:23 +08:00
Mingyu Chen	4ac8b3c9a9	[fix][s3] Fix bug that can not visit aliyun oss with aws s3 sdk (#7691 ) Close #7690 1. Exclude httpclient and httpcore dependencies from thrift@0.13 Explicitly use httpclient@4.5.13 and httpcore@4.4.15 https://stackoverflow.com/questions/59265959/java-lang-bootstrapmethoderror-call-site-initialization-exception-from-athena-j 2. Exclude aws-java-sdk-s3 dependency from hadoop-aws Explicitly use aws-java-sdk-s3@1.11.95 https://github.com/aws/aws-sdk-java/issues/1032	2022-01-11 15:00:31 +08:00
Zhengguo Yang	ad35067a2a	[chore][docs] add deploy spark/flink connectors to maven release repo docs (#7616 )	2022-01-06 23:23:33 +08:00
Zhengguo Yang	738d2d2e07	[refactor] update parent pom version and optimize build scripts (#7548 )	2022-01-05 10:45:11 +08:00
jiafeng.zhang	85c30fc720	[deps] Upgrade Log4j to 2.7.1 to solve the CVE-2021-44832 security vulnerability (#7536 ) Upgrade Log4j to 2.7.1 to solve the CVE-2021-44832 security vulnerability Co-authored-by: Zhengguo Yang <yangzhgg@gmail.com>	2021-12-30 10:21:37 +08:00
Zhengguo Yang	2872dbfeb8	[refactor] Standardize the writing of pom files, prepare for deployment to maven (#7477 )	2021-12-30 10:16:37 +08:00
jiafeng.zhang	7a1bb5b335	log4j upgrade to 2.17.0 (#7440 ) Solved the third security vulnerability CVE-2021-45105 that was discovered	2021-12-21 09:28:02 +08:00
jiafeng.zhang	e64da03866	[deps](log4j) Upgrade log4j 2 to 2.16.0 (#7394 ) Upgrade log4j 2 to 2.16.0, the official strongly recommends upgrading to this version	2021-12-14 15:57:16 +08:00
jiafeng.zhang	568f6611df	[deps](log4j) upgrade log4j (#7364 ) to 2.15.0	2021-12-10 23:19:11 +08:00
Zhengguo Yang	200210e708	[fix] (ut) fix fe unit test failed, this is because we fix the MAX_PHYSICAL_PACKET_LENGTH to 0xffffff	2021-12-06 11:13:01 +08:00
caiconghui	07296a301b	[chore](fe) Fix build error caused by Inaccessible pentaho-aggdesigner-algorithm jar (#7161 )	2021-11-20 21:48:26 +08:00
Zhengguo Yang	52ebb3d8f5	[feat](mysql-compatibility) Increase compatibility with mysql (#7041 ) Increase compatibility with mysql 1. Added two system tables files and partitions 2. Improved the return logic of mysql error code to make the error code more compatible with mysql 3. Added lock/unlock tables statement and show columns statement for compatibility with mysql dump 4. Compatible with mysqldump tool, now you can use mysql dump to dump data and table structure from doris now use mysqldump may print error message like ``` $ mysqldump -h127.0.0.1 -P9130 -uroot test_query_qa > a mysqldump: Error: 'errCode = 2, detailMessage = select list expression not produced by aggregation output (missing from GROUP BY clause?): `EXTRA`' when trying to dump tablespaces ``` This error message not effect the export file, you can add `--no-tablespaces` to avoid this error	2021-11-20 21:39:37 +08:00
qiye	5b01f7bba2	[Feature] Support query hive table (#6569 ) Users can directly query the data in the hive table in Doris, and can use join to perform complex queries without laboriously importing data from hive. Main changes list below: FE: Extend HiveScanNode from BrokerScanNode HiveMetaStoreClientHelper communicate with HIVE and HDFS. BE: Treate HiveScanNode as BrokerScanNode, treate HiveTable as BrokerTable. broker_scanner.cpp: suppot read column from HDFS path. orc_scanner.cpp: support read hdfs file. POM: Add hive.version=2.3.7, hive-metastore and hive-exec Add hadoop.version=2.8.0, hadoop-hdfs Upgrade commons-lang to fix incompatiblity of Java 9 and later. Thrift: Add THiveTable Add read_by_column_def in TBrokerRangeDesc	2021-11-16 11:59:07 +08:00
zh0122	974a894688	Update Spring version to fix CVE-2020-5421 (#7023 )	2021-11-06 13:29:24 +08:00
Mingyu Chen	e8cabfff27	[S3] Support path style endpoint (#6962 ) Add a use_path_style property for S3 Upgrade hadoop-common and hadoop-aws to 2.8.0 to support path style property Fix some S3 URI bugs Add some logs for tracing load process.	2021-11-01 10:48:10 +08:00
Zhengguo Yang	24d38614a0	[Dependency] Upgrade thirdparty libs (#6766 ) Upgrade the following dependecies: libevent -> 2.1.12 OpenSSL 1.0.2k -> 1.1.1l thrift 0.9.3 -> 0.13.0 protobuf 3.5.1 -> 3.14.0 gflags 2.2.0 -> 2.2.2 glog 0.3.3 -> 0.4.0 googletest 1.8.0 -> 1.10.0 snappy 1.1.7 -> 1.1.8 gperftools 2.7 -> 2.9.1 lz4 1.7.5 -> 1.9.3 curl 7.54.1 -> 7.79.0 re2 2017-05-01 -> 2021-02-02 zstd 1.3.7 -> 1.5.0 brotli 1.0.7 -> 1.0.9 flatbuffers 1.10.0 -> 2.0.0 apache-arrow 0.15.1 -> 5.0.0 CRoaring 0.2.60 -> 0.3.4 orc 1.5.8 -> 1.6.6 libdivide 4.0.0 -> 5.0 brpc 0.97 -> 1.0.0-rc02 librdkafka 1.7.0 -> 1.8.0 after this pr compile doris should use build-env:1.4.0	2021-10-15 13:03:04 +08:00
Mingyu Chen	fa290383dc	[Doc] Modify README to add some statistical indicators (#6486 ) 1. Add license/total line/release badegs. 2. Add monthly active contributor and contributor growth graph 3. fix a pom.xml bug 4. Modify some routine load log on BE side	2021-08-25 09:36:26 +08:00
Kuncle	7a8837c962	[Maven][Dependency][Bug][DOE] fix sync es metadata issue on jdk 13 (#6250 )	2021-07-18 22:16:38 +08:00
Mingyu Chen	a4b1622ceb	[HttpV2] Add more httpv2 APIs (#6210 ) 1. /api/cluster_overview to view some statistic info of the cluster 2. /api/meta/ to view the database/table schema 3. /api/import/file_review to review the file content with format CSV or PARQUET.	2021-07-18 22:14:42 +08:00

1 2

95 Commits