Commit Graph

5 Commits

Author SHA1 Message Date
aae942b982 [Spark Load][Bug] Keep the column splitting in spark load consistent with broker load / mini load (#4532) 2020-09-06 20:33:26 +08:00
f5ee854b6f [Spark load][Bug] Fix column terminator for spark load (#4491)
Support specifying column separator without back slash.
2020-09-02 10:54:03 +08:00
wyb
82940a4905 [Spark Load] Fix spark load bugs (#4464)
1. fix write dpp result when dpp throw exception
2. boolean value:true, false(IgnoreCase), 0, 1
3. wrong dest column for source data check
4. support * in source file path 
5. if job state is cancelled or finished, submitPushTasks would throw all partitions have no load data exception,
    because tableToLoadPartitions was already cleaned up

#3433
2020-08-27 23:40:33 +08:00
790779fb6f [SparkLoad]remove unncessary convert from dataframe to rdd (#4304) 2020-08-13 23:37:38 +08:00
0e79f6908b [CodeRefactor] Modify FE modules (#4146)
This CL mainly changes:

1. Add 2 new FE modules

    1. fe-common

        save all common classes for other modules, currently only `jmockit`
        
    2. spark-dpp

        The Spark DPP application for Spark Load. And I removed all dpp related classes to this module, including unit tests.
        
2. Change the `build.sh`

    Add a new param `--spark-dpp` to compile the `spark-dpp` alone. And `--fe` will compile all FE modules.
    
    the output of `spark-dpp` module is `spark-dpp-1.0.0-jar-with-dependencies.jar`, and it will be installed to `output/fe/spark-dpp/`.

3. Modify some bugs of spark load
2020-07-29 16:18:05 +08:00