20 Commits

Author SHA1 Message Date
acc24df10a [fix](datax)doris writer url decoder fix #22401
When the imported data contains certain special characters, the import fails.
The following error message appears:

2023-07-28 15:15:28.960  INFO 21756 --- [-interval-flush] c.a.d.p.w.d.DorisWriterManager           : Doris interval Sinking triggered: label[datax_doris_writer_7aa415e6-5a9c-4070-a699-70b4a627ae64].
2023-07-28 15:15:29.015  INFO 21756 --- [       Thread-3] c.a.d.p.w.d.DorisStreamLoadObserver      : Start to join batch data: rows[95968] bytes[3815834] label[datax_doris_writer_7aa415e6-5a9c-4070-a699-70b4a627ae64].
2023-07-28 15:15:29.038  INFO 21756 --- [       Thread-3] c.a.d.p.w.d.DorisStreamLoadObserver      : Executing stream load to: 'http://10.38.60.218:8030/api/ods_prod/ods_pexweb_online_product/_stream_load', size: '3911802'
2023-07-28 15:15:31.559  WARN 21756 --- [       Thread-3] c.a.d.p.w.d.DorisStreamLoadObserver      : Request failed with code:500
2023-07-28 15:15:31.561  INFO 21756 --- [       Thread-3] c.a.d.p.w.d.DorisStreamLoadObserver      : StreamLoad response :null
2023-07-28 15:15:31.564  WARN 21756 --- [       Thread-3] c.a.d.p.w.d.DorisWriterManager           : Failed to flush batch data to Doris, retry times = 0

java.io.IOException: Unable to flush data to Doris: unknown result status.
	at com.alibaba.datax.plugin.writer.doriswriter.DorisStreamLoadObserver.streamLoad(DorisStreamLoadObserver.java:66) ~[doriswriter-0.0.1-SNAPSHOT.jar:na]
	at com.alibaba.datax.plugin.writer.doriswriter.DorisWriterManager.asyncFlush(DorisWriterManager.java:163) [doriswriter-0.0.1-SNAPSHOT.jar:na]
	at com.alibaba.datax.plugin.writer.doriswriter.DorisWriterManager.access$000(DorisWriterManager.java:19) [doriswriter-0.0.1-SNAPSHOT.jar:na]
	at com.alibaba.datax.plugin.writer.doriswriter.DorisWriterManager$1.run(DorisWriterManager.java:134) [doriswriter-0.0.1-SNAPSHOT.jar:na]
	at java.lang.Thread.run(Thread.java:748) [na:1.8.0_221]

The following error is found in fe.log:

java.lang.IllegalArgumentException: URLDecoder: Illegal hex characters in escape (%) pattern - For input string: " l"
        at java.net.URLDecoder.decode(URLDecoder.java:194) ~[?:1.8.0_221]
        at org.springframework.http.converter.FormHttpMessageConverter.read(FormHttpMessageConverter.java:352) ~[spring-web-5.3.22.jar:5.3.22]
        at org.springframework.web.filter.FormContentFilter.parseIfNecessary(FormContentFilter.java:109) ~[spring-web-5.3.22.jar:5.3.22]
        at org.springframework.web.filter.FormContentFilter.doFilterInternal(FormContentFilter.java:88) ~[spring-web-5.3.22.jar:5.3.22]
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:117) ~[spring-web-5.3.22.jar:5.3.22]
        at org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:193) ~[jetty-servlet-9.4.48.v20220622.jar:9.4.48.v20220622]
        at org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1626) ~[jetty-servlet-9.4.48.v20220622.jar:9.4.48.v20220622]
        at org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:201) ~[spring-web-5.3.22.jar:5.3.22]
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:117) ~[spring-web-5.3.22.jar:5.3.22]
        at org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:193) ~[jetty-servlet-9.4.48.v20220622.jar:9.4.48.v20220622]
        at org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1626) ~[jetty-servlet-9.4.48.v20220622.jar:9.4.48.v20220622]
        at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:552) ~[jetty-servlet-9.4.48.v20220622.jar:9.4.48.v20220622]
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143) ~[jetty-server-9.4.48.v20220622.jar:9.4.48.v20220622]
        at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:600) ~[jetty-security-9.4.48.v20220622.jar:9.4.48.v20220622]
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127) ~[jetty-server-9.4.48.v20220622.jar:9.4.48.v20220622]
        at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandle
2023-07-31 12:57:10 +08:00
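The fe.log trace in the commit above ends in `java.net.URLDecoder.decode`, reached through Spring's form-content handling, so the request body was evidently being URL-decoded as form data. The following is a minimal sketch of that failure mode only, not the fix applied by the commit; the class name `UrlDecoderDemo`, the helper `isDecodable`, and the sample strings are assumptions made for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class UrlDecoderDemo {
    /**
     * Hypothetical helper: returns true if the text survives URL decoding,
     * false if URLDecoder rejects it as a malformed escape sequence.
     */
    static boolean isDecodable(String text) {
        try {
            URLDecoder.decode(text, "UTF-8");
            return true;
        } catch (IllegalArgumentException e) {
            // Thrown when '%' is not followed by two hexadecimal digits.
            return false;
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available, so this is not expected.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // A literal '%' followed by " l" is not a valid escape, as in the log above.
        System.out.println(isDecodable("discount 10% live"));
        // The same text with '%' escaped as %25 decodes without error.
        System.out.println(isDecodable("discount 10%25 live"));
    }
}
```

Raw row data that happens to contain a `%` is therefore enough to trigger the error whenever the body is treated as URL-encoded form content.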
bd5882d08a [fix](datax)doris writer write error (#14276)
* doris writer write error
2022-11-18 18:20:13 +08:00
ed19562cb3 And ali datax unified configuration naming, modify maxBatchSize to batchSize(#13278)
Unify configuration naming with Alibaba DataX: rename maxBatchSize to batchSize.
2022-10-11 14:51:19 +08:00
6ee150755a [refactor](datax)Refactoring doris writer code (#13226)
* Refactoring doris writer code
2022-10-11 08:47:05 +08:00
29fc167548 [Bug](Datax)Fix bug that the dataxwriter will drop column when convert map to json (#13042)
* Fix a bug where toJSONString drops a key when its value is null.
2022-09-29 11:37:10 +08:00
a4f9628576 [improvement](datax) improvement json import and support csv writing
1. Currently, JSON writing uses read_json_by_line and fuzzy_parse, which slows down stream load writes. It is changed to use strip_outer_array and fuzzy_parse, which makes writing about 3 times faster.

2. Add CSV writing, with the column separator set to \x01 and the row separator set to \x02. Performance is about 5 times higher than before.
2022-08-09 11:50:24 +08:00
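The CSV writing described in the commit above joins fields with \x01 and rows with \x02, control characters that rarely occur in ordinary text. A minimal sketch of that encoding follows; the class name `CsvBatchSketch` and the method `encode` are hypothetical and are not taken from the doriswriter source, and null handling and escaping are left out.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class CsvBatchSketch {
    // Separators named in the commit message: \x01 between columns, \x02 between rows.
    static final String COLUMN_SEPARATOR = "\u0001";
    static final String ROW_SEPARATOR = "\u0002";

    /** Joins each row's fields with \x01, then joins the rows with \x02. */
    static String encode(List<List<String>> rows) {
        return rows.stream()
                .map(row -> String.join(COLUMN_SEPARATOR, row))
                .collect(Collectors.joining(ROW_SEPARATOR));
    }

    public static void main(String[] args) {
        List<List<String>> rows = Arrays.asList(
                Arrays.asList("1", "a,b"),   // commas inside a field need no quoting
                Arrays.asList("2", "c"));
        String batch = encode(rows);
        System.out.println(batch.length()); // 2 rows of 2 fields plus 3 separators
    }
}
```

Because the separators are non-printing characters, fields containing commas or newlines pass through unchanged, which avoids quoting logic on the writer side.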
65dd8eb885 Update init-env.sh (#11111)
This script is missing "!"
2022-07-22 21:55:12 +08:00
468040974e [compile]Update init-env.sh (#10451) 2022-06-30 11:28:06 +08:00
87e3904cc6 Fix some typos for docs. (#9680) 2022-05-19 20:55:21 +08:00
c1707ca388 [feature][datax]doriswriter support timeZone (#9327) 2022-05-06 18:39:10 +08:00
3dd6b42781 [fix](datax) Fix the problem of keyword error when importing datax (#8893) 2022-04-08 09:20:54 +08:00
3b159a9820 support doriswriter build in macos (#8330)
2022-03-07 09:53:16 +08:00
4bdeef3b64 [chore][fix][doc](fe-plugin)(mysqldump) fix build auditlog plugin error (#7804)
1. Fix problems when building fe_plugins
2. Format code
3. Add docs about dumping data using mysqldump
2022-01-26 09:11:23 +08:00
3a8a85b739 [Optimize][Extension] optimize extension datax doriswriter,Remove import doris via csv in Dataxwriter, only support via json (#7568)
* 1. Remove CSV import to Doris from Dataxwriter; only JSON is supported;
2. Format Dataxwriter code;
3. Optimize exception handling and reduce duplicate output of exception logs;
4. Update the Dataxwriter documentation;

* Delete DorisCsvCodec.java

delete unused file extension/DataX/doriswriter/src/main/java/com/alibaba/datax/plugin/writer/doriswriter/DorisCsvCodec.java

* 1.remove `format` config key;
2.Optimize serialization code in DorisJsonCodec class
2022-01-09 13:27:52 +08:00
dcad6ff5e5 [License] Add License header for missing files (#7130)
1. Add License header for missing files
2. Modify the spark pom.xml to correct the location of `thrift`
2021-11-16 18:37:54 +08:00
1a5b03167a [Doc] Add document for datax and sample codes (#6389)
Add documents for datax in extension catalog.
Add documents for samples in best-practice catalog.
2021-08-11 11:51:13 +08:00
929b33ac0a [DataX] doriswriter support csv (#6373)
Make the DataX doriswriter support the csv format. csv is simpler and faster than
json when the data is simple.

Add property format: csv/json
Add property column_separator: takes effect when format is csv, for example "\x01", "^", etc.
2021-08-10 10:14:21 +08:00
8fe5c75877 [DataX] Refactor doriswriter (#6188)
1. Use `read_json_by_line` to load data
2. Use FE http server as the target host of stream load
2021-07-13 11:36:40 +08:00
c33321ff42 [Feature][DataX] Implementation Datax doriswriter plugin (#6107) 2021-07-08 09:33:02 +08:00
b69ebc3ec4 [Extension] Add DataX doriswriter extension directory (#6111)
This CL only adds the script for building the DataX development environment
2021-06-30 09:55:19 +08:00