Commit Graph

64 Commits

Author SHA1 Message Date
b27122df8b [Docs] add rpc function document (#7984)
2022-02-20 23:26:52 +08:00
13d217b947 [community](doc) Refactor the release and verify doc (#8136)
Because some code was recently moved to a new repository, the documentation for release and verification
needs to be reorganized. There are 5 relevant documents, as follows.

1. release-prepare.md
    General instructions for the release and related preparation work.
2. release-doris-core.md
    The Doris Core release process
3. release-doris-connectors.md
    The Doris Connectors release process
4. release-complete.md
    Release completion steps after the release vote has passed.
5. release-verify.md
    Release verification methods.
2022-02-20 16:31:26 +08:00
486a0586ac [community] modify the doc of verifying apache release (#8084) 2022-02-17 10:53:31 +08:00
264f38471c [feature](spark-load) add Hive Bitmap UDFs (#8036)
Hive Bitmap UDF provides UDFs for generating bitmaps and performing bitmap operations in Hive tables.
The bitmap format in Hive is exactly the same as the Doris bitmap, so
bitmaps built in Hive can be imported into Doris through Spark bitmap load (see the sketch below).
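A hypothetical registration-and-use sketch in Hive SQL (the jar path and UDF class names are assumptions, not taken from this commit):

```sql
-- register the bitmap UDFs in Hive (class names and jar path are assumed)
ADD JAR hdfs:///user/doris/hive-udf-jar-with-dependencies.jar;
CREATE TEMPORARY FUNCTION to_bitmap AS 'org.apache.doris.udf.ToBitmapUDAF';
CREATE TEMPORARY FUNCTION bitmap_union AS 'org.apache.doris.udf.BitmapUnionUDAF';

-- build a bitmap column in a Hive table, ready for Spark bitmap load into Doris
CREATE TABLE hive_bitmap_table (k INT, user_bitmap BINARY) STORED AS ORC;
INSERT INTO TABLE hive_bitmap_table
SELECT k, to_bitmap(user_id) FROM source_table GROUP BY k;
```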
2022-02-17 10:45:20 +08:00
92b690f3eb [feature-wip](iceberg) Step2: add table creation strict mode and support refresh iceberg table or db. (#7981)
1. Add `iceberg_table_creation_strict_mode` in `fe.conf` to control Iceberg external table creation when a data type is not supported in Doris.
2. Add `REFRESH` syntax to synchronize the Iceberg table and database (see the sketch below).
3. Support creating Iceberg external tables with specific column definitions.
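A short sketch of the new `REFRESH` syntax (the table and database names are illustrative):

```sql
-- re-sync metadata for a single Iceberg external table
REFRESH TABLE iceberg_db.logs_1;

-- re-sync a whole Iceberg database, picking up newly created tables
REFRESH DATABASE iceberg_db;
```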
2022-02-10 15:08:04 +08:00
c3b010b277 [refactor] Remove flink/spark connectors (#8004)
As we discussed on dev@doris [1],
the Flink/Spark connectors have been moved to a new repo: https://github.com/apache/incubator-doris-connectors

[1] https://lists.apache.org/thread/hnb7bf0l6y6rzb9pr6lhxz3jjoo04skl
2022-02-10 15:00:36 +08:00
4c22f3c6e1 (docs) Improve Flink Connector documentation description (#7946) 2022-02-08 09:50:38 +08:00
3b8d48f08b [feature-wip](iceberg) Step1: Support create Iceberg external table (#7391)
Closes related issue #7389.

Support creating Iceberg external tables in Doris.

This is the first step toward supporting Iceberg external tables.

### Create Iceberg external table
This PR adds two ways to create Iceberg external tables. Neither requires explicitly specifying column definitions; Doris converts them automatically based on Iceberg's column definitions.

1. Create an Iceberg external table directly

```sql
    CREATE [EXTERNAL] TABLE table_name 
    ENGINE = ICEBERG
    [COMMENT "comment"]
    PROPERTIES (
    "iceberg.database" = "iceberg_db_name",
    "iceberg.table" = "icberg_table_name",
    "iceberg.hive.metastore.uris"  =  "thrift://192.168.0.1:9083",
    "iceberg.catalog.type"  =  "HIVE_CATALOG"
    );
```

2. Create an Iceberg database and automatically create all the tables under that db.

```sql
    CREATE DATABASE db_name 
    [COMMENT "comment"]
    PROPERTIES (
    "iceberg.database" = "iceberg_db_name",
    "iceberg.hive.metastore.uris" = "thrift://192.168.0.1:9083",
    "iceberg.catalog.type" = "HIVE_CATALOG"
    );
```

### Show table creation

1. For an individual table, you can view its definition with `SHOW CREATE TABLE` (see `help show create table`).

```sql 
mysql> show create table iceberg_db.logs_1;
+--------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table  | Create Table                                                                                                                                                                                                                                                                                                                                                 |
+--------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| logs_1 | CREATE TABLE `logs_1` (
  `level` varchar(-1) NOT NULL COMMENT "null",
  `event_time` datetime NOT NULL COMMENT "null",
  `message` varchar(-1) NOT NULL COMMENT "null"
) ENGINE=ICEBERG
COMMENT "ICEBERG"
PROPERTIES (
"iceberg.database" = "doris",
"iceberg.table" = "logs_1",
"iceberg.hive.metastore.uris"  =  "thrift://10.10.10.10:9087",
"iceberg.catalog.type"  =  "HIVE_CATALOG"
) |
+--------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
```

2. For an Iceberg database, you can view the table creation records with `SHOW TABLE CREATION` (see `help show table creation`).

```sql
mysql> show table creation from iceberg_db;
+--------+---------+---------------------+---------------------------------------------------------+
| Table  | Status  | Create Time         | Error Msg                                               |
+--------+---------+---------------------+---------------------------------------------------------+
| logs   | fail    | 2021-12-14 13:50:10 | Cannot convert unknown type to Doris type: list<string> |
| logs_1 | success | 2021-12-14 13:50:10 |                                                         |
+--------+---------+---------------------+---------------------------------------------------------+
2 rows in set (0.00 sec)
```

  `SHOW TABLE CREATION` is a new syntax that shows the table creation records in an Iceberg database.

  Syntax:
  ```sql
      SHOW TABLE CREATION [FROM db] [LIKE mask]
  ```
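  For example, to list creation records whose table name matches a mask (illustrative):
  ```sql
      SHOW TABLE CREATION FROM iceberg_db LIKE '%logs%';
  ```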
2022-01-27 10:22:47 +08:00
38660d5095 [docs] correct SeaTunnel website hyperlink and add new PPMC member to the Member List (#7886)
* [docs] correct SeaTunnel website hyperlink
* [docs] Add the new PPMC member to the Member List

* Update members.md
2022-01-26 15:24:52 +08:00
40f993ca15 [docs][seatunnel] add seatunnel flink doris sink doc (#7844) 2022-01-24 21:13:06 +08:00
60c6bb4f92 [Feature][flink-connector] support flink delete option (#7457)
* Flink Connector supports the delete option on Unique models (see the sketch below)
Co-authored-by: wudi <wud3@shuhaisc.com>
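A hedged Flink SQL sketch of enabling deletes on a Unique-model table (the `sink.enable-delete` property name and the connection values are assumptions, not taken from this commit):

```sql
-- a sketch: a Doris sink table with the delete option switched on
CREATE TABLE doris_sink (
  name STRING,
  age  INT
) WITH (
  'connector' = 'doris',
  'fenodes' = 'FE_IP:8030',            -- assumed FE address
  'table.identifier' = 'db.table',
  'username' = 'root',
  'password' = '',
  'sink.enable-delete' = 'true'        -- assumption: the delete switch added here
);
```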
2022-01-23 20:24:41 +08:00
4c17c370e7 [Doc] Modify the audit log plugin to record the SQL statement field type as String (#7798)
2022-01-20 21:36:02 +08:00
4768dd4efa [docs] Documentation corrections (#7787) 2022-01-20 16:18:48 +08:00
db2649525f [docs](website) Add Database ODBC version correspondence (#7675) 2022-01-13 15:28:02 +08:00
2de79832fc [docs](hive)(function) fix Hive type error and optimize alias function example (#7694)
1. fix Hive type error 
2. optimize alias function example
2022-01-11 15:07:32 +08:00
3a8a85b739 [Optimize][Extension] optimize the extension DataX doriswriter: remove importing into Doris via CSV, support JSON only (#7568)
* 1. Remove importing into Doris via CSV in DataXWriter; only JSON is supported.
2. Format the DataXWriter code.
3. Optimize exception handling and reduce repeated logging of exceptions.
4. Update the DataXWriter documentation.

* Delete DorisCsvCodec.java

delete unused file extension/DataX/doriswriter/src/main/java/com/alibaba/datax/plugin/writer/doriswriter/DorisCsvCodec.java

* 1. Remove the `format` config key.
2. Optimize the serialization code in the DorisJsonCodec class.
2022-01-09 13:27:52 +08:00
738d2d2e07 [refactor] update parent pom version and optimize build scripts (#7548) 2022-01-05 10:45:11 +08:00
b2c5f25ef4 [docs] add more faq and FE debugging method (#7422)
1. Add more FAQs and an FE debugging method.
2. Add a security document.
2021-12-31 09:55:04 +08:00
80587e7ac2 [improvement](spark-connector)(flink-connector) Modify the max num of batch written by Spark/Flink connector each time. (#7485)
Increase the default batch size and flush interval
2021-12-26 11:13:47 +08:00
889e33d53d [docs](seatunnel) Seatunnel Supports Doris connector (#7453) 2021-12-22 23:29:02 +08:00
5fed8a94ae [docs](flink-connector) Add instructions for flink doris connector (#7384) 2021-12-16 10:43:21 +08:00
62d12067aa [feature](udf) make orthogonal bitmap UDAFs built-in functions (#7211)
Move the orthogonal bitmap UDAFs into the built-in functions and
add three built-in bitmap functions (usage sketch below):

- orthogonal_bitmap_intersect
- orthogonal_bitmap_intersect_count
- orthogonal_bitmap_union_count
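A brief usage sketch (the table and columns below are illustrative, not from this commit):

```sql
-- assuming a table user_tags(tag_id INT, user_ids BITMAP)
-- count the distinct users across all tags
SELECT orthogonal_bitmap_union_count(user_ids) FROM user_tags;

-- intersect the bitmaps of tags 1 and 2, then count the result
SELECT orthogonal_bitmap_intersect_count(user_ids, tag_id, 1, 2) FROM user_tags;
```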
2021-12-07 09:57:26 +08:00
6e0664bdf8 [enhancement](audit) Enable fe audit plugin to audit more infos for query (#7300) 2021-12-06 10:33:15 +08:00
19a3c393a9 [Improvement](spark-connector) Add 'sink.batch.size' and 'sink.max-retries' options in spark-connector (#7281)
Add the `sink.batch.size` and `sink.max-retries` options to the `Doris Spark-connector`,
consistent with the `flink-connector` options.
eg:
```scala
   df.write
      .format("doris")
      // specify the maximum number of rows in a single flush
      .option("sink.batch.size",2048)
      // specify the number of retries after a write failure
      .option("sink.max-retries",3)
      .save()
```
2021-12-06 10:29:33 +08:00
5b01f7bba2 [Feature] Support query hive table (#6569)
Users can directly query data in Hive tables from Doris and use joins for complex queries, without laboriously importing the data from Hive first.
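A minimal sketch of how such a table might be declared and joined (table names, columns, and the metastore URI are illustrative):

```sql
CREATE EXTERNAL TABLE hive_orders (
  order_id BIGINT,
  amount   DECIMAL(10, 2)
) ENGINE = HIVE
PROPERTIES (
  "database" = "hive_db",
  "table" = "orders",
  "hive.metastore.uris" = "thrift://192.168.0.1:9083"
);

-- join the Hive external table with a local Doris table
SELECT o.order_id, u.name
FROM hive_orders o JOIN users u ON o.order_id = u.order_id;
```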

Main changes are listed below:

FE:
- Extend HiveScanNode from BrokerScanNode.
- Add HiveMetaStoreClientHelper to communicate with Hive and HDFS.

BE:
- Treat HiveScanNode as BrokerScanNode, and HiveTable as BrokerTable.
- broker_scanner.cpp: support reading columns from the HDFS path.
- orc_scanner.cpp: support reading HDFS files.

POM:
- Add hive.version=2.3.7, hive-metastore and hive-exec.
- Add hadoop.version=2.8.0, hadoop-hdfs.
- Upgrade commons-lang to fix incompatibility with Java 9 and later.

Thrift:
- Add THiveTable.
- Add read_by_column_def in TBrokerRangeDesc.
2021-11-16 11:59:07 +08:00
f39a5bc1d0 [Feature] Spark connector supports specifying the fields to write (#6973)
1. By default, the Spark connector writes all field values to the `Doris` table.
With this feature, users can write only some of the fields, and even specify the order of the fields to write.

For example, given a table named `student` with three columns (name, gender, age),
created with the following SQL:
```sql
create table student (name varchar(255), gender varchar(10), age int) duplicate key (name) distributed by hash(name) buckets 2;
```
Now, to write values to just two columns, name and gender, the code is as follows:
```scala
    val df = spark.createDataFrame(Seq(
      ("m", "zhangsan"),
      ("f", "lisi"),
      ("m", "wangwu")
    ))
    df.write
      .format("doris")
      .option("doris.fenodes", dorisFeNodes)
      .option("doris.table.identifier", dorisTable)
      .option("user", dorisUser)
      .option("password", dorisPwd)
      // specify the fields to write and their order
      .option("doris.write.field", "gender,name")
      .save()
```
2021-11-02 16:35:29 +08:00
addfff74c4 support using chars like \x01 as the flink-doris-sink column & line delimiter (#6937)
* support using chars like \x01 as the flink-doris-sink column & line delimiter

* extend imports

* add docs
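A hedged Flink SQL sketch of passing such delimiters through the sink's stream-load properties (the `sink.properties.*` keys and connection values are assumptions, not taken from this commit):

```sql
CREATE TABLE doris_sink (
  id   INT,
  name STRING
) WITH (
  'connector' = 'doris',
  'fenodes' = 'FE_IP:8030',
  'table.identifier' = 'db.table',
  'username' = 'root',
  'password' = '',
  'sink.properties.column_separator' = '\x01',   -- assumed pass-through key
  'sink.properties.line_delimiter'   = '\x02'
);
```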
2021-10-29 13:56:52 +08:00
ebb4c282b1 [Flink] Simplify the use of the flink connector (#6892)
1. Simplify the use of the flink connector, like other stream sinks, via GenericDorisSinkFunction.
2. Add use cases for the flink connector.

## Use case
```java
env.fromElements("{\"longitude\": \"116.405419\", \"city\": \"北京\", \"latitude\": \"39.916927\"}")
     .addSink(
          DorisSink.sink(
             DorisOptions.builder()
                   .setFenodes("FE_IP:8030")
                   .setTableIdentifier("db.table")
                   .setUsername("root")
                   .setPassword("").build()
                ));
```
2021-10-23 18:10:47 +08:00
30bf6c0d1d [DOC] minor update (#6820) 2021-10-13 09:14:56 +08:00
237a8ae948 [Feature] support spark connector sink data using sql (#6796)
Co-authored-by: wei.zhao <wei.zhao@aispeech.com>
2021-10-09 15:47:36 +08:00
ad3c9390a2 [Bug] Fix bdbje getDatabaseNames() bug and scan node close bug (#6769)
1. This bug was introduced in #6582.
2. Optimize the 'Address already in use' error message.
3. Add some documentation about compilation:
    1. Add a custom thirdparty download url.
    2. Add a custom com.alibaba maven jar package for DataX.
4. Fix a bug where BE crashed when closing the scan node, introduced in #6622.
2021-09-29 11:11:28 +08:00
8d471007a6 [Feature] support spark connector sink stream data to doris (#6761)
* [Feature] support spark connector sink stream data to doris

* [Doc] Add spark-connector batch/stream writing instructions

* add license and remove meaningless blank code

Co-authored-by: wei.zhao <wei.zhao@aispeech.com>
2021-09-28 17:46:19 +08:00
df5ba6b5a2 [Fix] Flink connector supports JSON import and uses HttpClient for stream load (#6740)
* [Bug]: fix NullPointerException thrown when data is null

* [Bug]: Distinguish between null and empty string

* [Feature]: flink-connector supports stream load parameters

* [Fix]: code style

* [Fix]: support JSON format import and use HttpClient for stream load

* [Fix]: remove System.out calls

* [Fix]: upgrade httpclient version

* [Doc]: add JSON format import doc

Co-authored-by: wudi <wud3@shuhaisc.com>
2021-09-28 17:37:03 +08:00
4ea2fcefbc [Improve] The connectors support Spark 3.0 and Flink 1.13 (#6449)
Modify the flink/spark compilation documentation accordingly.
2021-08-18 15:57:50 +08:00
1a5b03167a [Doc] Add document for datax and sample codes (#6389)
Add documents for DataX in the extension catalog.
Add documents for samples in the best-practice catalog.
2021-08-11 11:51:13 +08:00
d9fc1bf3ca [Feature]: Flink-connector supports streamload parameters (#6243)
#6199
2021-08-09 22:12:46 +08:00
8ae129967e [ODBC] Support ODBC external table of SQLServer and revise the doc. (#6223)
Support ODBC external table of SQLServer
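A hedged sketch of an ODBC external table definition (the driver name and all connection details are illustrative assumptions):

```sql
CREATE EXTERNAL TABLE sqlserver_orders (
  order_id INT,
  amount   DECIMAL(10, 2)
) ENGINE = ODBC
PROPERTIES (
  "host" = "192.168.0.2",
  "port" = "1433",
  "user" = "sa",
  "password" = "your_password",
  "database" = "sales",
  "table" = "orders",
  "driver" = "SQL Server",
  "odbc_type" = "sqlserver"
);
```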
2021-07-21 10:50:56 +08:00
28e7d01ef7 [FlinkConnector] Support time interval for flink connector (#5934) 2021-06-30 09:27:12 +08:00
fc1389240f [Doc] Modify the flink connector document's instructions on enabling http v2 (#5883)
2021-05-26 10:01:31 +08:00
d784cc06f6 [Doc] Flink doris connector document modification (#5769) 2021-05-12 10:59:35 +08:00
7eea811f6b [Feature] Flink Doris Connector (#5372) (#5375) 2021-04-23 09:43:48 +08:00
75db273b93 [Doris On ES][WIP] Support external ES tables with SSL security and configurable node sniffing (#5325)
2021-04-12 11:23:49 +08:00
64fa305c06 [Doc] correct format errors in English doc (#5487)
Fix some format errors in the English docs.
They are straightforward and should not break any existing build.
2021-03-11 22:34:54 +08:00
8855782aab [Doc] Fix page links (#5454) 2021-03-06 16:13:56 +08:00
4e1b6b3eef [ODBC] Report the failing column when a type conversion fails in a query on a MySQL ODBC external table (#5422)
2021-03-04 22:23:37 +08:00
e93a6da0e5 [Doc] correct format errors in English doc (#5321)
Fix some English doc format errors
2021-02-26 11:32:14 +08:00
67b0631257 [Enhancement] Fix bug in the auditloader plugin where audit events could not be processed in time (#5194)
* [Enhancement] Fix bug where audit events could not be processed in time

Co-authored-by: caiconghui [蔡聪辉] <caiconghui@xiaomi.com>
2021-01-28 10:53:32 +08:00
115d4332aa [ODBC] Support ODBC sink for inserting data into ODBC external tables (#5033)
issue: #5031

1. Support ODBC sink for inserting data into ODBC external tables.
2. Support transactions for the ODBC sink to make sure inserted data is atomic.
3. The documentation about the ODBC sink has been updated.
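With the sink in place, writing is a plain INSERT; a small sketch (table names are illustrative):

```sql
-- the ODBC sink wraps the write in a transaction to keep it atomic
INSERT INTO odbc_ext_table SELECT k1, v1 FROM doris_internal_table;
```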
2020-12-13 21:53:27 +08:00
54fa76359b [ODBC] Support ODBC external table of PostgreSQL and revise the doc. (#4798) 2020-10-29 14:31:23 +08:00
751aa05cc0 fix docs typo (#4725) 2020-10-14 09:27:50 +08:00