[improve](routine load) adjust default values to make routine load more convenient to use (#42491) (#44377)

pick (#42491)

A routine load job is divided into many tasks, each of which is a
transaction. Currently, the default maximum time a task spends consuming
data (max_batch_interval) is 10 seconds; this change raises it to 60
seconds. The benefits of increasing this value are:
1. Larger batches improve load performance.
2. Fewer transactions reduce compaction pressure and conflicts between
concurrently submitted transactions.
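
To make the second point concrete, here is a rough back-of-the-envelope sketch. It is not Doris code: the class and method names are hypothetical, and it assumes a single consumer that is always busy and whose tasks are ended by max_batch_interval rather than by the row or size limits.

```java
// Hypothetical sketch, not part of Doris: estimates how many task
// transactions one always-busy consumer commits per hour, assuming each
// task runs for the full max_batch_interval before it commits.
public final class RoutineLoadIntervalSketch {
    private RoutineLoadIntervalSketch() {}

    /** Transactions per hour if every task commits exactly once per interval. */
    static long transactionsPerHour(long maxBatchIntervalSeconds) {
        if (maxBatchIntervalSeconds <= 0) {
            throw new IllegalArgumentException("interval must be positive");
        }
        return 3600 / maxBatchIntervalSeconds;
    }

    public static void main(String[] args) {
        // Old default (10 s) versus new default (60 s).
        System.out.println("10s interval: " + transactionsPerHour(10) + " txns/hour");
        System.out.println("60s interval: " + transactionsPerHour(60) + " txns/hour");
    }
}
```

Under these assumptions the transaction count falls from 360 to 60 per hour per consumer. Real jobs may commit more often when a task reaches max_batch_rows or max_batch_size first.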

related doc: https://github.com/apache/doris-website/pull/1236/files
hui lai
2024-11-21 23:05:11 +08:00
committed by GitHub
parent a258915e22
commit 346b89e683
3 changed files with 6 additions and 5 deletions


@@ -108,7 +108,7 @@ public abstract class RoutineLoadJob extends AbstractTxnStateChangeCallback impl
     public static final long DEFAULT_MAX_ERROR_NUM = 0;
     public static final double DEFAULT_MAX_FILTER_RATIO = 1.0;
-    public static final long DEFAULT_MAX_INTERVAL_SECOND = 10;
+    public static final long DEFAULT_MAX_INTERVAL_SECOND = 60;
     public static final long DEFAULT_MAX_BATCH_ROWS = 20000000;
     public static final long DEFAULT_MAX_BATCH_SIZE = 1024 * 1024 * 1024; // 1GB
     public static final long DEFAULT_EXEC_MEM_LIMIT = 2 * 1024 * 1024 * 1024L;


@@ -362,7 +362,7 @@ public class RoutineLoadJobTest {
         + "\"desired_concurrent_number\" = \"0\",\n"
         + "\"max_error_number\" = \"10\",\n"
         + "\"max_filter_ratio\" = \"1.0\",\n"
-        + "\"max_batch_interval\" = \"10\",\n"
+        + "\"max_batch_interval\" = \"60\",\n"
         + "\"max_batch_rows\" = \"10\",\n"
         + "\"max_batch_size\" = \"1073741824\",\n"
         + "\"format\" = \"csv\",\n"