Sample analyzing need to get row count by using table.getRowCount(). This method is not updated in real time, which may cause the sample task to scan whole table.
This pr is to fix this. Set the flag that indicate the analyze job is for an empty table and skip scan the table. Meanwhile, don't reset updatedRows in this case.
Set hugeTableAutoAnalyzeIntervalInMillis = 0 because all default huge table size has been set to 0.
We change memtable size from 200MB to 100MB to achieve smoother flush
performance. We change loadStreamPerNode from 20 to 60 to avoid stream
rpc to be the bottleneck when enable memtable_on_sink_node. We change
default s3&broker load parallelsim to make the most of CPUs on moderm
multi-core systems.
Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>
In the load process, if there are problems with the original data, we will store the error data in an error_log file on the disk for subsequent debugging. However, if there are many error data, it will occupy a lot of disk space. Now we want to limit the number of error data that is saved to the disk.
Be familiar with the usage of doris' import function and internal implementation process
Add a new be configuration item load_error_log_limit_bytes = default value 200MB
Use the newly added threshold to limit the amount of data that RuntimeState::append_error_msg_to_file writes to disk
Write regression cases for testing and verification
Co-authored-by: xy720 <22125576+xy720@users.noreply.github.com>
If we specific target partition(s) when inserting overwrite an auto partition table,
before:
could create new partition
now:
behalf just like non-auto partition table
When string is too long, clucene will throw an error.
And the string is too long to analyze. So we ignore the string in index process when the string is longer than 256 bytes by default.
We add an poperty `ignore_above` for user to customize.
In some cases, users need to get the data size of single replica of a table, and evaluate certain actions based on this, such as estimating the precise backup size.
Signed-off-by: nextdreamblue <zxw520blue1@163.com>