fa0ad56817
[exec](compress) use FragmentTransmissionCompressionCodec to control the exchange compress behavior ( #28818 )
2023-12-22 19:50:57 +08:00
aca8406e31
[refactor](executor)remove scan group #28847
2023-12-22 17:05:50 +08:00
453e3c18f4
[refactor](buffer) remove download buffer since it is no longer useful ( #28832 )
2023-12-22 11:53:31 +08:00
172f68480b
[Enhancement](load) Limit the number of incorrect data drops and add documents ( #27727 )
...
During a load, rows that fail because of problems in the source data are written to an error_log file on disk for later debugging. When there are many bad rows, this file can take up a lot of disk space, so this change limits how much error data is saved to disk.
Get familiar with how Doris' load (import) function is used and how it is implemented internally
Add a new BE configuration item load_error_log_limit_bytes, with a default value of 200MB
Use the new threshold to limit the amount of data that RuntimeState::append_error_msg_to_file writes to disk
Write regression cases for testing and verification
Co-authored-by: xy720 <22125576+xy720@users.noreply.github.com>
2023-12-22 10:43:18 +08:00
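The byte limit described in the entry above can be illustrated with a minimal sketch. This is not the code from the commit: the class `CappedErrorLog`, its methods, and the constant `kLoadErrorLogLimitBytes` are hypothetical names chosen for illustration. Only the config name `load_error_log_limit_bytes` and its 200MB default come from the commit message. The sketch assumes a message that would push the file past the limit is simply dropped.

```cpp
#include <cassert>
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for the BE config item load_error_log_limit_bytes
// (200MB default, per the commit message).
constexpr int64_t kLoadErrorLogLimitBytes = 200LL * 1024 * 1024;

// Hypothetical helper: appends error messages to a stream until a byte
// budget is used up. Messages that do not fit are counted and dropped.
class CappedErrorLog {
public:
    CappedErrorLog(std::ostream& out, int64_t limit_bytes)
            : _out(out), _limit_bytes(limit_bytes) {}

    // Returns true if the message was written, false if it was dropped.
    bool append(const std::string& msg) {
        // +1 accounts for the trailing newline written after each message.
        const int64_t needed = static_cast<int64_t>(msg.size()) + 1;
        if (_written_bytes + needed > _limit_bytes) {
            ++_dropped;
            return false;
        }
        _out << msg << '\n';
        _written_bytes += needed;
        return true;
    }

    int64_t written_bytes() const { return _written_bytes; }
    int64_t dropped() const { return _dropped; }

private:
    std::ostream& _out;
    int64_t _limit_bytes;
    int64_t _written_bytes = 0;
    int64_t _dropped = 0;
};
```

In this sketch a caller would construct the log once per load with `kLoadErrorLogLimitBytes` and check the return value of `append` to decide whether to keep formatting further error rows. Whether the real implementation stops entirely after the first overflow or keeps accepting smaller messages is not stated in the commit message.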
0070909d30
[fix](group commit)Fix the issue of duplicate addition of wal path when encounter exception ( #28691 )
2023-12-21 20:27:33 +08:00
db523dafcb
[improve](move-memtable) limit task num in load stream flush token ( #28748 )
2023-12-21 12:19:58 +08:00
1253ed006e
[fix](memtable-limiter) do not block write if load mem usage is low ( #28602 )
...
Co-authored-by: Yongqiang YANG <98214048+dataroaring@users.noreply.github.com>
2023-12-19 13:28:17 +08:00
9434ee5710
[fix](load) fix memtracking orphan too large ( #28600 )
2023-12-19 12:41:19 +08:00
66fbb22ad7
[fix](group commit) Fix some wal problems on group commit ( #28554 )
2023-12-19 09:51:03 +08:00
469edbdd3d
[feature](executor)make scan task wait timeout config #28467
2023-12-16 11:36:15 +08:00
82a91380e6
[enhancement](compaction) Add support for limiting low priority compaction scheduling ( #27648 )
2023-12-14 18:31:23 +08:00
e6e8632167
[improvement](merge-on-write) Optimize publish when there are missing versions ( #28012 )
...
1. Do not retry publishing on BE when there are too many missing versions; just add an async publish task.
2. To reduce memory consumption, clean up the tasks when there are too many async publish tasks.
2023-12-13 16:59:25 +08:00
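The two rules in the entry above can be sketched as follows. This is an illustration under stated assumptions, not the commit's code: `PublishAction`, `PublishPolicy`, `decide`, `AsyncPublishQueue`, and both threshold values are hypothetical. In particular, the commit message does not say which tasks are cleaned up when there are too many, so evicting the lowest versions first is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

enum class PublishAction { kPublishNow, kRetryLater, kDeferAsync };

// Hypothetical thresholds; the real config names and defaults are not
// given in the commit message.
struct PublishPolicy {
    int64_t max_missing_versions_for_retry = 10;
};

// Decide what to do with a publish request for `publish_version` when the
// tablet's highest continuous version is `max_continuous_version`.
PublishAction decide(int64_t max_continuous_version, int64_t publish_version,
                     const PublishPolicy& policy) {
    const int64_t missing = publish_version - max_continuous_version - 1;
    if (missing <= 0) {
        return PublishAction::kPublishNow;
    }
    if (missing > policy.max_missing_versions_for_retry) {
        // Too many gaps: skip retrying and hand off to the async path.
        return PublishAction::kDeferAsync;
    }
    return PublishAction::kRetryLater;
}

// Pending async publish tasks keyed by version, with a cap on how many are
// kept in memory. Assumption: the lowest versions are evicted first.
class AsyncPublishQueue {
public:
    explicit AsyncPublishQueue(std::size_t max_tasks) : _max_tasks(max_tasks) {}

    void add(int64_t version, int64_t txn_id) {
        _tasks[version] = txn_id;
        while (_tasks.size() > _max_tasks) {
            _tasks.erase(_tasks.begin());
        }
    }

    bool contains(int64_t version) const { return _tasks.count(version) > 0; }
    std::size_t size() const { return _tasks.size(); }

private:
    std::size_t _max_tasks;
    std::map<int64_t, int64_t> _tasks; // version -> transaction id
};
```

Keeping the decision in a pure function separates the policy from the queue bookkeeping, which makes the thresholds easy to exercise in isolation.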
b901800963
[Fix] Support BE log sys_log_level modify to take effect dynamically (apache#26060) ( #28203 )
2023-12-13 11:08:48 +08:00
45b2dbab6a
[improve](group commit) Group commit supports max filter ratio when the row count is less than the value in config ( #28139 )
2023-12-12 16:33:36 +08:00
e49ed3d885
[regression test](memtable) add case for aggregation memtable ( #28056 )
...
1. create aggregation table
2. insert some data
3. drop the table and create again
4. modify some parameters for some branch
5. insert some data
6. change the parameters back to their defaults
2023-12-12 11:14:59 +08:00
7fba3fcb91
[pipelineX](improvement) block local shuffle sink by mem usage ( #28224 )
2023-12-11 21:25:31 +08:00
593cc92501
[chore] Change default max segment size to 1GB ( #28201 )
2023-12-11 14:30:57 +08:00
59ec3da899
open workload group in PR pipeline ( #27744 )
2023-12-08 11:56:03 +08:00
394b420180
[Update](inverted index) use session variable for inverted index try query threshold ( #28052 )
...
* [Update](inverted index) use session variable for inverted index try query threshold
* remove unused config
* update clucene
2023-12-07 17:54:44 +08:00
a27c068a9d
[improve](move-memtable) make StreamWait time configurable ( #28086 )
2023-12-07 17:27:43 +08:00
84a651d976
[improve](load) rewrite memtable memory limiter rules ( #27759 )
2023-12-07 17:26:26 +08:00
48935c14e2
[Improvement](variant) limit the column size on tablet schema ( #27399 ) ( #27785 )
...
1. limit the column count to default 2048
2. fix get_inverted_index returning nullptr when the variant's unique id is -1, by using its parent's unique id instead
3. avoid adding a subcolumn with the same path to the tablet schema more than once
4. make extracted column unique id -1
2023-12-04 14:47:36 +08:00
db8e56b9f2
[improve](move-memtable) increase open load stream timeout ( #26909 )
2023-11-30 15:27:29 +08:00
112ae59aa4
[fix](move-memtable) add timeout for load stream close wait ( #27439 )
2023-11-30 12:00:06 +08:00
d96e2dfefb
[feature-wip](arrow-flight)(step5) Support JDBC and PreparedStatement and Fix Bug ( #27661 )
2023-11-29 21:17:20 +08:00
9daa7dc6b5
[refactor](http) disable snapshot and get_log_file api ( #27724 )
...
Disable 2 http api by default:
1. BE's `/api/snapshot`
2. FE's `/get_log_file`
2023-11-29 16:11:51 +08:00
7398c3daf1
[Feature-Variant](Variant Type) support variant type query and index ( #27676 )
2023-11-29 10:37:28 +08:00
0a8e3d2199
[enhance](PrefetchReader) Make the prefetch timeout one config ( #27371 )
2023-11-24 10:24:15 +08:00
df628e1538
[chore](merge-on-write) disable rowid conversion check for mow table by default ( #27482 )
2023-11-23 23:39:01 +08:00
2ea33518b0
[Opt](load) use batching to optimize auto partition ( #26915 )
2023-11-23 19:12:28 +08:00
7a75f8c380
[improve](move-memtable) set brpc streaming params in config ( #27442 )
2023-11-23 14:14:43 +08:00
1cd1c58eee
[Feature](group commit) move group_commit_interval_ms from be.conf to table property ( #27116 )
2023-11-21 20:50:02 +08:00
be7273da83
[refactor](executor)Refactor workload meta update to be #26710
2023-11-18 11:19:38 +08:00
5d548935e0
[improvement](insert) support schema change and decommission for group commit ( #26359 )
2023-11-17 21:41:38 +08:00
54989175fb
[case] Load json data with enable_simdjson_reader=false ( #26601 )
2023-11-16 14:40:59 +08:00
dbac12bae8
[fix](memory)Modify the default conf values of mem_limit and cache_last_version_interval_second ( #26945 )
...
mem_limit from 80% to 90%
cache_last_version_interval_second from 900 to 30
2023-11-15 14:02:58 +08:00
89215306d3
[improve](load) add switch for vertical segment writer ( #26996 )
2023-11-15 08:19:12 +08:00
c0fda8c5c2
[improve](group commit) Add a switch to wait internal group commit lo… ( #26734 )
...
* [improve](group commit) Add a switch to make internal group commit load finish
* modify group commit tvf plan
2023-11-13 10:35:35 +08:00
8b33b0c4a4
[Fix](row store) cache invalidate key should not include sequence column ( #26771 )
2023-11-11 01:30:32 -06:00
d767804815
[feature](merge-cloud) Decouple rowset id generator and local rowsets gc implementation ( #25921 )
2023-11-10 10:07:02 +08:00
a5565f68b2
[Refactor](opentelemetry) Remove opentelemetry ( #26605 )
2023-11-09 18:05:34 +08:00
5f62a4462d
[Enhancement](wal) Add wal space back pressure ( #26483 )
2023-11-09 12:29:05 +08:00
33e46ee13d
[enhancement](config) enable single_replica_load by default in BE ( #26619 )
...
Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>
2023-11-09 12:14:37 +08:00
58bf79f79e
[fix](move-memtable) pass load stream num to backends ( #26198 )
2023-11-08 16:16:33 +08:00
6637f9c15f
Add enable_cgroup_cpu_soft_limit ( #26510 )
2023-11-08 15:52:13 +08:00
44b51bf0b9
[Feature](Variant) support variant load ( #26572 )
2023-11-08 00:37:57 -06:00
2cc68381ec
[feature](binlog) Add ingest_binlog/http_get_snapshot limit download speed && Add async ingest_binlog ( #26323 )
2023-11-06 11:14:44 +08:00
be7a10162a
[enhance](S3) Add timeout for s3 buffer allocation and corresponding observability ( #26125 )
2023-11-01 17:55:11 +08:00
387e33fa34
[enhancement](group commit)Add group commit block queues memory back pressure ( #26045 )
2023-11-01 16:29:45 +08:00
6a85f46ff3
[refactor](move-memtable) rename open_stream_sink rpc to open_load_stream ( #25883 )
2023-10-29 10:07:14 +08:00