Commit Graph

32 Commits

Author SHA1 Message Date
4ff6eb55d0 [FlinkConnector] Make flink datastream source parameterized (#6473)
make flink datastream source parameterized as List<?> instead of Object.
2021-08-22 22:03:32 +08:00
4ea2fcefbc [Improve]The connector supports spark 3.0, flink 1.13 (#6449)
Modify the flink/spark compilation documentation
2021-08-18 15:57:50 +08:00
2f90aaab8e [Doc] flink/spark connector: add sources/javadoc plugins (#6435)
spark-doris-connector/flink-doris-connect add plugins to generate javadoc and sources jar,
so can be easy to distribute and debug.
2021-08-16 22:41:24 +08:00
b13e512a65 [Feature] Support spark connector sink data to Doris (#6256)
support spark conector write dataframe to doris
2021-08-16 22:40:43 +08:00
1a5b03167a [Doc] Add document for datax and sample codes (#6389)
Add documents for datax in extension catalog.
Add documents for sampes in best-practice catalog.
2021-08-11 11:51:13 +08:00
929b33ac0a [DataX] doriswriter support csv (#6373)
make doriswriter of DataX support format csv.  Format csv is more simple and faster than
format json when data is simple

add property format: csv/json
add property column_separator: effect when format is csv, for example "\x01" , "^", etc...
2021-08-10 10:14:21 +08:00
d9fc1bf3ca [Feature]:Flink-connector supports streamload parameters (#6243)
Flink-connector supports streamload parameters
#6199
2021-08-09 22:12:46 +08:00
8fe5c75877 [DataX] Refactor doriswriter (#6188)
1. Use `read_json_by_line` to load data
2. Use FE http server as the target host of stream load
2021-07-13 11:36:40 +08:00
fcd31f29b6 [Bug][Flink] Fix when data null , flink-connector throw NullPointerException (#6165) 2021-07-08 09:55:50 +08:00
c33321ff42 [Feature][DataX] Implementation Datax doriswriter plugin (#6107) 2021-07-08 09:33:02 +08:00
b69ebc3ec4 [Extension] Add DataX doriswriter extension directory (#6111)
This CL only add the script for building DataX development environment
2021-06-30 09:55:19 +08:00
28e7d01ef7 [FlinkConnector] Support time interval for flink connector (#5934) 2021-06-30 09:27:12 +08:00
04cc6eaadc [Log] Fix a mistake in DorisDynamicOutputFormat.java (#5963)
Fix a mistake DorisDynamicOutputFormat.java
2021-06-06 22:06:57 +08:00
7ef9aa13d4 [Bug] Modify spark, flink doris connector to send request to FE, fix the problem of POST method, it should be the same as the method when sending the request (#5788)
Modify spark, flink doris connector to send request to FE, fix the problem of POST method,
it should be the same as the method when sending the request
2021-05-19 09:28:21 +08:00
7eea811f6b [Feature] Flink Doris Connector (#5372) (#5375) 2021-04-23 09:43:48 +08:00
39136011c2 [Spark-Doris-Connector][Bug-Fix] Resolve deserialize exception when Spark Doris Connector in aync deserialize mode (#5336)
Resolve deserialize exception when Spark Doris Connector in aync deserialize mode 
Co-authored-by: lanhuajian <lanhuajian@sankuai.com>
2021-03-04 17:48:59 +08:00
5781d67afe Fix file licences (#5414)
Add license to files
For Doris 0.14
2021-02-24 16:37:17 +08:00
9c022e3764 [Bug] Spark doris connector http v2 authentication fails, and HTTP v2 interface returns json nesting problem (#5366)
1. Deal with the problem of inconsistent data format returned by http v1 and v2
2. Deal with user authentication failure
2021-02-07 09:28:55 +08:00
56d0cc3f54 [Spark on Doris] fix the encode of varchar when convertArrowToRowBatch (#5202)
`convertArrowToRowBatch` use the default charset to encode String.
Set it to UTF_8, because we use `arrow::utf8` on the Backends.
2021-01-10 20:48:46 +08:00
86d235a76a [Extension] Logstash Doris output plugin (#3800)
This plugin is used to output data to Doris for logstash
Use the HTTP protocol to interact with the Doris FE Http interface
Load data through Doris's stream load
2020-06-11 08:54:51 +08:00
4cbcae1574 [Spark on Doris] Shade and provide the thrift lib in spark-doris-connector (#3631)
Mainly changes:
1. Shade and provide the thrift lib in spark-doris-connector
2. Add a `build.sh` for spark-doris-connector
3. Move the README.md of spark-doris-connector to `docs/`
4. Change the line delimiter of `fe/src/test/java/org/apache/doris/analysis/AggregateTest.java`
2020-05-19 14:20:21 +08:00
c9c58342b2 [License] Add License to codes (#3272) 2020-04-07 16:35:13 +08:00
16b61b62f5 [Spark] Support convert Arrow data to RowBatch asynchronously in Spark-Doris-Connector (#3186)
Currently, in the Spark-Doris-Connector, when Spark iteratively obtains each row of data,
it needs to synchronously convert the Arrow format data into the row format required by Spark.
In order to speed up the conversion process, we can add an asynchronous thread in the Connector,
which is responsible for obtaining the Arrow format data from BE and converting it into the row
format required by Spark calculation

In our test environment, Doris cluster used 1 fe and 7 be (32C+128G). When using Spark-Doris-Connector
to query a table containing 67 columns, the original query returned 69 million rows of data
took about 2.5min, but after improvement, it reduced to about 1.6min, which reduced the time by about 30%
2020-03-26 21:34:37 +08:00
e20d905d70 Remove unused KUDU codes (#3175)
KUDU table is no longer supported long time ago. Remove code related to it.
2020-03-24 13:54:05 +08:00
1550401d4b Support param exec_mem_limit for spark-doris-connctor (#2775) 2020-01-18 00:14:39 +08:00
8ea5907252 Update arrow's version to 0.15.1 and shaded it in spark-doris-connector (#2769) 2020-01-15 21:08:34 +08:00
18a11f5663 Convert from arrow to rowbatch (#2723)
For #2722
In our test environment, Doris cluster used 1 fe and 7 be (32C+128G). When using spakr-doris connecter to query a table containing 67 columns, it took about 1 hour for the query to return 69 million rows of data. After the improvement, the same query condition took 2.5 minutes and the query performance was significantly improved
2020-01-10 14:11:15 +08:00
feda66f99f Spark return error to users when spark on doris query failed (#2531) 2019-12-30 21:58:13 +08:00
435fdd236e Fix npe in spark-doris-connector when query is complex (#2503) 2019-12-19 14:53:29 +08:00
48f559600f Fix bug when spark on doris run long time (#2485) 2019-12-18 13:08:21 +08:00
0e84a88c1a Fix document bugs in spark-doris-connector (#2275) 2019-11-22 18:05:36 +08:00
732c473043 Add spark-doris-connector extension (#2228) 2019-11-22 15:38:05 +08:00