Commit Graph

64 Commits

Author SHA1 Message Date
c2db01037a [refactor](config) rename segcompaction_max_threads (#22468) 2023-08-02 22:35:14 +08:00
bc87002028 [opt](conf) remote scanner thread num is changed to core num * 10 (#22427) 2023-08-01 23:09:49 +08:00
19d1f49fbe [improvement](compaction) compaction policy and options in the properties of a table (#22461) 2023-08-01 22:02:23 +08:00
ff0fda460c [be](parameter) change default fragment_pool_thread_num_max from 512 to 2048 (#22448)
Change the default values of some parameters (restated in the config sketch after this entry):
brpc_num_threads from -1 to 256
compaction_task_num_per_disk from 2 to 4
compaction_task_num_per_fast_disk from 4 to 8
fragment_pool_thread_num_max from 512 to 2048
fragment_pool_queue_size from 2048 to 4096

---------

Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-08-01 20:33:41 +08:00
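A minimal be.conf-style sketch restating the new defaults from #22448; it is illustrative only, since these values now ship as the defaults:
```
# brpc worker threads (previously -1, i.e. auto)
brpc_num_threads = 256
# compaction task slots per disk
compaction_task_num_per_disk = 4
compaction_task_num_per_fast_disk = 8
# fragment execution thread pool
fragment_pool_thread_num_max = 2048
fragment_pool_queue_size = 4096
```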
a371e1d4c5 [fix](window_funnel_function) fix upgrade compatibility due to the added field in WindowFunnelState (#22416) 2023-08-01 12:08:55 +08:00
5f25b924b3 [opt](conf) Modify brpc eovercrowded conf (#22407)
Make the data stream sender and exchange sink buffer ignore brpc EOVERCROWDED errors.
Modify the default value of brpc_socket_max_unwritten_bytes.
2023-08-01 08:47:55 +08:00
c25b9071ad [opt](conf) Modify brpc work pool conf default value #22406
By default, if the machine has 32 cores or fewer, the following configs are 128, 128, 10240, and 10240 respectively;
if it has more than 32 cores, they are core num * 4, core num * 4, core num * 320, and core num * 320 respectively (worked example after this entry):

brpc_heavy_work_pool_threads
brpc_light_work_pool_threads
brpc_heavy_work_pool_max_queue_size
brpc_light_work_pool_max_queue_size
2023-07-31 20:38:34 +08:00
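As a worked example of the sizing rule above, a hypothetical 64-core machine would end up with the following effective values, shown here as an explicit be.conf-style sketch (64 * 4 = 256 threads, 64 * 320 = 20480 queue slots):
```
# more than 32 cores: threads = core num * 4, queue size = core num * 320
brpc_heavy_work_pool_threads = 256
brpc_light_work_pool_threads = 256
brpc_heavy_work_pool_max_queue_size = 20480
brpc_light_work_pool_max_queue_size = 20480
```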
ee754307bb [refactor](load) refactor memtable flush actively (#21634) 2023-07-30 21:31:54 +08:00
0cc3232d6f [Improve](topn opt) modify fetch rpc timeout from 20s to 30s, since fetch is quite heavy sometimes (#22163) 2023-07-28 17:56:18 +08:00
5584d7a5ba [Improve](point query) Improve lookup connection cache from DoubleBuffer to LRU cache for better item pruning (#22041) 2023-07-27 22:22:50 +08:00
687d97e648 [improvement][default_config] enlarge default values of compaction related configs (#22286)

1. Because vertical compaction is enabled by default and consumes less
memory, we can enlarge the default values of the compaction-related configs.
2. Enlarge the default values of the lock-related shard sizes.
2023-07-27 20:17:43 +08:00
Pxl 05be45bd35 [Improvement](brpc) adjust brpc_light_work_pool_threads/brpc_heavy_work_pool_threads (#22241)
adjust brpc_light_work_pool_threads/brpc_heavy_work_pool_threads
2023-07-27 14:03:46 +08:00
31c856351a [enhancement](default_config) change default value of rpc related configs (#22149)

The bdbje election timeout is 30 seconds, so we enlarge thrift_rpc_timeout_ms
and txn_commit_rpc_timeout_ms to 60s.

Also enlarge bdbje_lock_timeout_second from 1 to 5 (config sketch after this entry).
2023-07-27 11:12:26 +08:00
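A minimal fe.conf-style sketch reflecting the new defaults described above; setting them explicitly should be optional on versions where these are already the defaults:
```
# bdbje election can take up to 30s, so RPC timeouts are raised to 60s
thrift_rpc_timeout_ms = 60000
txn_commit_rpc_timeout_ms = 60000
# bdbje lock timeout raised from 1 to 5 seconds
bdbje_lock_timeout_second = 5
```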
9e16c69925 [improvement](compression) support LZ4_HC algorithm and parse LZ4_RAW (#22165) 2023-07-26 18:23:39 +08:00
1f3de0eae3 [fix](memory) fix invalid large memory check && fix memory info thread safety (#22027)
fix invalid large memory check
fix memory info thread safety
2023-07-26 12:18:31 +08:00
30c21789c8 [opt](filecache) use weak_ptr to cache the file handle of file segment (#21975)
Use weak_ptr to cache the file handle of a file segment. The maximum number of cached file handles can be configured via `file_cache_max_file_reader_cache_size`, default `1000000` (config sketch after this entry).
Users can inspect the number of cached file handles by requesting the BE metrics endpoint `http://be_host:be_webserver_port/metrics`:
```
# TYPE doris_be_file_cache_segment_reader_cache_size gauge
doris_be_file_cache_segment_reader_cache_size{path="/mnt/datadisk1/gaoxin/file_cache"} 2500
```
2023-07-24 19:09:27 +08:00
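A minimal be.conf-style sketch for the cache-size knob mentioned above; the value is simply the stated default:
```
# max number of cached file-segment readers (default 1000000)
file_cache_max_file_reader_cache_size = 1000000
```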
e146969376 [Fix](config) delete unuse lazy open config #22136 2023-07-24 15:02:34 +08:00
367ad9164a [feature-wip](auto-inc)(step-2) support auto-increment column for duplicate table (#19917) 2023-07-20 18:03:39 +08:00
c31e826756 [opt](config) rename alter_inverted_index_worker_count to alter_index_worker_count, and add docs (#21985) 2023-07-20 17:50:04 +08:00
d180ed418d [fix](stacktrace) Speed up stack trace (#21755)
Introduce libunwind to collect stack traces; the cost is negligible and line numbers are included.
Use StackTraceCache and PHDRCache to speed it up; it is customizable and has some optimizations.
Other stack trace tools (glog, boost, glibc) remain available in case they are needed.

TODO:

Currently supports Linux __x86_64__, __arm__, __powerpc__; __FreeBSD__ and APPLE are not supported.
Note: __arm__ and __powerpc__ have not been verified.
Support signal handlers.
libunwind support for unw_backtrace with jemalloc.
Handle the undefined compile option USE_MUSL later.
2023-07-19 15:43:14 +08:00
4b30485d62 [improvement](memory) Refactor doris cache GC (#21522)
Abstract CachePolicy, which controls the GC of all caches.
Add a stale sweep to all LRU caches, including page caches, etc.
I0710 18:32:35.729460 2945318 mem_info.cpp:172] End Full GC Free, Memory 3866389992 Bytes. cost(us): 112165339, details: FullGC:
  FreeTopMemoryQuery:
     - CancelCostTime: 1m51s
     - CancelTasksNum: 1
     - FindCostTime: 0.000ns
     - FreedMemory: 2.93 GB
  WorkloadGroup:
  Cache name=DataPageCache:
     - CostTime: 15.283ms
     - FreedEntrys: 9.56K
     - FreedMemory: 691.97 MB
     - PruneAllNumber: 1
     - PruneStaleNumber: 1
2023-07-11 20:21:31 +08:00
38c8657e5e [improve](memory) more grace logging for memory exceed limit (#21311)
More graceful logging for Allocator and MemTracker when memory exceeds the limit.
Fix bthread graceful exit.
2023-07-05 14:59:06 +08:00
13fb69550a [improvement](kerberos) disable hdfs fs handle cache to renew kerberos ticket at fix interval (#21265)
Add a new BE config `kerberos_ticket_lifetime_seconds`, default 86400 (config sketch after this entry).
It is best to set it to the same value as `ticket_lifetime` in `krb5.conf`.
If an HDFS fs handle in the cache has lived longer than HALF of this time, it is marked invalid and recreated,
and the Kerberos ticket is renewed.
2023-07-04 17:13:34 +08:00
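A hedged be.conf-style sketch for the new config, assuming a ticket_lifetime of 24h in krb5.conf:
```
# keep in sync with ticket_lifetime in krb5.conf (24h = 86400s here)
kerberos_ticket_lifetime_seconds = 86400
```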
a3d34e1e08 [decimalv2](compatibility) add config to allow invalid decimalv2 literal (#21327) 2023-07-03 10:55:27 +08:00
0396f78590 [fix](memory) Remove ChunkAllocator & fix Allocator no use mmap (#21259) 2023-06-28 16:10:24 +08:00
5d1fb33f2d [enhancement](merge-on-write) increasing the max_write_buffer_number parameter to improve save meta performance (#21243) 2023-06-28 11:32:11 +08:00
50c1d55769 [Improve](dynamic schema) support filtering invalid data (#21160)
* [Improve](dynamic schema) support filtering invalid data

1. Support dynamic schema filtering of illegal data.
2. Expand the regular expression for ColumnName to support more column names.
3. Be compatible with PropertyAnalyzer and support legacy tables.
4. Disable parsing multi-dimensional arrays by default, since some bugs remain unresolved.
2023-06-26 19:32:43 +08:00
76bdcf1d26 [improvement](pipeline) task group scan entity (#19924) 2023-06-25 14:43:35 +08:00
2c9bdd64fa [fix](memory) arena support memory reuse after clear() (#21033) 2023-06-21 23:27:21 +08:00
18a0824eb3 [fix](compaction)Modify time series compaction policy default config (#21079) 2023-06-21 20:29:58 +08:00
442a734ef5 [improvement](config) update be config max_runnings_transactions_per_txn_map default value (#21060) 2023-06-21 20:29:13 +08:00
564b3533cf [enhancement](merge-on-write) update publish/streamload/compaction co… (#21040) 2023-06-21 14:49:51 +08:00
9eade148dd [enhancement](merge-on-write) add primary key data page size config (#20961) 2023-06-20 19:51:02 +08:00
cc3f9ed9b7 [Fix](fd) fix fd limit over 100% (#20778) 2023-06-17 19:54:10 +08:00
2e295a1ee9 [Enhancement](http) unify http auth config (#20864) 2023-06-16 16:55:46 +08:00
f1af09ef87 [Enhancement](merge-on-write) parallel calculate delete bitmap when tablet has multi segments (#20706) 2023-06-15 21:11:39 +08:00
2a2e485456 [Enhancement](compaction) time-series scenario cumulative compaction policy (#20715)
A new compaction policy for log and time-series scenarios.
2023-06-14 23:48:44 +08:00
4b15185e25 [improvement](hdfs) add parquet footer cache and hdfs file handle cache (#20544)
1. Add an HDFS file handle cache for the HDFS file reader

    Copied from Impala, `https://github.com/apache/impala/blob/master/be/src/util/lru-multi-cache.h`. (Thanks to the Impala team.)
    This is an LRU cache that can store multiple entries with the same key.
    The key is built from {file name + modification time}.
    The value is the hdfsFile pointer that points to a certain HDFS file.

    This cache avoids reopening the same HDFS file multiple times, which saves
    query time.

    Add a BE config `max_hdfs_file_handle_cache_num` to limit the max number
    of cached file handles, default 20000 (config sketch after this entry).

2. Add a file meta cache

	The file meta cache is an LRU cache. The key is {file name + modification time},
	the value is the parsed file meta info of the certain file, which saves
	the time of re-parsing the file meta every time.
	Currently, it is only used for caching the parquet file footer.

Tests show that when this cache is hit, `FileOpenTime` and `ParseFooterTime` drop to almost 0
in the query profile, which saves time when there are lots of files to read.
2023-06-13 15:13:57 +08:00
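A minimal be.conf-style sketch for the handle-cache limit introduced above; 20000 is the stated default:
```
# upper bound on cached HDFS file handles (default 20000)
max_hdfs_file_handle_cache_num = 20000
```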
Pxl e010fa8d4f [Chore](runtime filter) remove runtime filter ready_for_publish/publish_finally (#20593) 2023-06-13 11:20:49 +08:00
bd5a26f240 [improvement](recover) Default disable check tablet path (#20565)
Change the default value of the tablet path check interval to -1.
2023-06-09 08:47:39 +08:00
92577f45d3 [fix] (recover) fix can not recover a BE's tablet after deleting its data directory manual (#20273) (#20274) 2023-06-07 22:27:50 +08:00
09344eaab5 [feature](load) introduce single-stream-multi-table load (#20006)
For routine load (Kafka load), users can produce data for different
tables into a single topic, and Doris will dispatch each record to the
corresponding table.

Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>
2023-06-07 17:55:25 +08:00
3e186a8821 [opt](MergedIO) optimize merge small IO, prevent amplified read (#20305)
Optimize the strategy of merging small IO to prevent severe read amplification, and turn off merged IO when the file cache is enabled.
Adjustable parameters:
```
// the max amplified read ratio when merging small IO
max_amplified_read_ratio=0.8
// the min segment size
file_cache_min_file_segment_size = 1048576
```
2023-06-03 10:51:24 +08:00
c03a19ea23 [improvement](bitmap) Using set to store a small number of elements to improve performance (#19973)
Test on SSB 100g:

select lo_suppkey, count(distinct lo_linenumber) from lineorder group by lo_suppkey;
exec time: 4.388s

create materialized view:

create materialized view customer_uv as select lo_suppkey, bitmap_union(to_bitmap(lo_linenumber)) from lineorder group by lo_suppkey;
select lo_suppkey, count(distinct lo_linenumber) from lineorder group by lo_suppkey;
exec time: 12.908s

With the patch applied, exec time: 5.790s.
2023-05-31 16:13:42 +08:00
accaff1026 [Feature](compaction) wip: single replica compaction (#19237)
Currently, compaction is executed separately for each backend, and the reconstruction of the index during compaction leads to high CPU usage. To address this, we are introducing single replica compaction, where a specific primary replica is selected to perform compaction, and the remaining replicas fetch the compaction results from the primary replica.

The Backend (BE) requests replica information for all peers corresponding to a tablet from the Frontend (FE). This information includes the host where the replica is located and the replica_id. By calculating hash(replica_id), the replica with the smallest hash value is responsible for executing compaction, while the remaining replicas are responsible for fetching the compaction results from this replica.
The compaction task producer thread, before submitting a compaction task, checks whether the local replica should fetch from its peer. If it should, the task is then submitted to the single replica compaction thread pool.
When performing single replica compaction, the process begins by requesting rowset versions from the target replica. These rowset_versions are then compared with the local rowset versions. The first version that can be fetched is selected.
2023-05-30 21:12:48 +08:00
ab8125d56f [Improve](performance) introduce SchemaCache to cache TabletSchema & Schema (#20037)
* [Improve](performance) introduce SchemaCache to cache TabletSchema & Schema

1. When the system is under a high-concurrency load of wide-table point queries, the frequent memory allocation and deallocation of Schema becomes an evident system bottleneck. Additionally, the initialization of TabletSchema and Schema also becomes a CPU hotspot. Therefore, a SchemaCache is introduced to cache these resources for reuse.

2. Wrap some variables in std::unique_ptr.

Performance:
| Status               | QPS | Avg response time | P99 response time |
|----------------------|-----|-------------------|-------------------|
| SchemaCache enabled  | 501 | 20ms              | 34ms              |
| SchemaCache disabled | 321 | 31ms              | 61ms              |

* handle schema change with schema version

* remove useless header

* rebase
2023-05-29 17:34:53 +08:00
93933308e6 [Feature-WIP](CCR): Add ccr doris interface (WIP) (#17881) 2023-05-26 23:40:49 +08:00
9e70a9ef84 [opt](compaction) add pick rowset to compact interval config (#19868) 2023-05-26 17:39:02 +08:00
1c950d6930 [fix](config) fix memory config enable_query_memroy_overcommit spell problem #19898 2023-05-22 00:32:20 +08:00
76c358b3e3 [revert](memory) revert page no use Allocator && default disable ChunkAllocator (#19905)
The default chunk allocator reserve is 0; in that case enabling the chunk allocator is meaningless and only wastes memory (see the hedged sketch after this entry).
2023-05-21 22:16:41 +08:00
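For reference, a hedged be.conf-style sketch of what a zero reserve looks like as an explicit setting; the config name chunk_reserved_bytes_limit is an assumption based on earlier Doris releases and may not exist in this version:
```
# assumed config name; a reserve of 0 effectively leaves the chunk allocator unused
chunk_reserved_bytes_limit = 0
```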