doris

Author	SHA1	Message	Date
Luwei	cb312cabb2	[Fix](tablet-meta) limit the data size of tablet meta (#39455 ) (#39974 ) pick master #39455	2024-08-27 20:10:17 +08:00
Xinyi Zou	ae4d747c13	[branch-2.1](memory) Modify memory gc conf and add `crash_in_alloc_large_memory_bytes` (#39834 ) pick #39611	2024-08-24 09:21:35 +08:00
Xinyi Zou	1367f74e7a	[branch-2.1](memory) Optimize ClearCacheAction implementation (#39796 ) pick #38438	2024-08-23 01:51:14 +08:00
Xinyi Zou	8ce8887b75	[branch-2.1](memory) Refactor refresh workload groups weighted memory ratio and record refresh interval memory growth (#39760 ) pick #38168 overwrites changes in #37221 on workload_group_manager.cpp. If need to pick 37221, ignore it.	2024-08-22 17:33:11 +08:00
Yongqiang YANG	610f69432a	[improvement](segmentcache) limit segment cache by fd limit or memory… (#39689 ) … (#39658) remove a useless config.	2024-08-21 15:19:52 +08:00
zhiqiang	830f250a80	[opt](query cancel) cancel query if it has pipeline task leakage #39223 (#39537 ) pick #39223 with some modifications. Optimization will only be applied to pipeline x.	2024-08-19 14:33:59 +08:00
苏小刚	0680c8d314	[improve](cache) File cache async init (#39036 ) Do `load_cache_info_into_memory()` asynchronously in a background thread in `LRUFileCache::initialize()`. When the cache is not ready, `LRUFileCache::get_or_set()` will return a FileBlock whose state is SKIP_CACHE.	2024-08-15 16:27:51 +08:00
qiye	8678fcea32	[config](inverted index)Make inverted_index_ram_dir enable by default(#35094 ) (#39120 ) bp #35094 Co-authored-by: Luennng <luennng@gmail.com>	2024-08-09 01:38:14 +08:00
Xr Ling	2543b569bb	[Optimize](Row store) pick #37145 , #38236 (#38932 )	2024-08-07 09:55:42 +08:00
Mingyu Chen	e9bf0776d7	[fix](parquet) disable parquet page index by default #38691 (#38901 ) bp #38691	2024-08-06 08:51:39 +08:00
Luwei	0603ec1d9d	[enhancement](compaction) optimizing memory usage for compaction (#37099 ) (#37486 )	2024-08-04 10:49:18 +08:00
qiye	b3f335ba5f	[enhancement](index compaction) Enable index compaction by default (#36812 ) (#38676 ) bp #36812	2024-08-02 12:03:57 +08:00
Kaijie Chen	0152a4e86f	[config](be) add be config migration_lock_timeout_ms (#38000 ) (#38337 ) backport #38000	2024-07-25 17:36:34 +08:00
Xinyi Zou	10c5c336d8	[branch-2.1](arrow-flight-sql) Add config arrow_flight_result_sink_buffer_size_rows (#38223 ) pick #38221	2024-07-24 15:15:39 +08:00
wangbo	7b141ffde7	[pick]add min scan thread num for workload group's scan thread (#38123 ) pick #38096	2024-07-19 18:43:05 +08:00
lihangyu	b15ccdbe98	[Pick](Variant) pick some fix (#37922 ) #37674 #37839 #37883 #37857 #37794	2024-07-16 21:38:47 +08:00
Xinyi Zou	9861f81630	[branch-2.1](memory) Fix Jemalloc Cache Memory Tracker (#37905 ) pick #37464	2024-07-16 19:01:31 +08:00
Pxl	010d9d88f8	[Feature](rpc) support set brpc_idle_timeout_sec and enable thrift so… (#37808 ) pick from #37333	2024-07-15 21:12:25 +08:00
Mingyu Chen	a4d37d96ca	[opt](file-scanner) add not found file number in profile (#37042 ) (#37764 ) bp #37042	2024-07-15 17:11:06 +08:00
Kaijie Chen	232202b71f	[improve](load) reduce memory reserved in memtable limiter (#37511 ) (#37699 ) cherry-pick #37511	2024-07-15 11:09:09 +08:00
zclllyybb	2759383365	[branch-2.1](timezone) refactor tzdata load to accelerate and unify timezone parsing (#37062 ) (#37269 ) pick https://github.com/apache/doris/pull/37062 1. revert https://github.com/apache/doris/pull/25097. we decide to rely on OS. not maintain independent tzdata anymore to keep result consistency 2. refactor timezone load. removed rwlock. before: ```sql mysql [optest]>select count(convert_tz(d, 'Asia/Shanghai', 'America/Los_Angeles')), count(convert_tz(dt, 'America/Los_Angeles', '+00:00')) from dates; +-------------------------------------------------------------------------------------+--------------------------------------------------------+ | count(convert_tz(cast(d as DATETIMEV2(6)), 'Asia/Shanghai', 'America/Los_Angeles')) | count(convert_tz(dt, 'America/Los_Angeles', '+00:00')) | +-------------------------------------------------------------------------------------+--------------------------------------------------------+ | 16000000 | 16000000 | +-------------------------------------------------------------------------------------+--------------------------------------------------------+ 1 row in set (6.88 sec) ``` now: ```sql mysql [optest]>select count(convert_tz(d, 'Asia/Shanghai', 'America/Los_Angeles')), count(convert_tz(dt, 'America/Los_Angeles', '+00:00')) from dates; +-------------------------------------------------------------------------------------+--------------------------------------------------------+ | count(convert_tz(cast(d as DATETIMEV2(6)), 'Asia/Shanghai', 'America/Los_Angeles')) | count(convert_tz(dt, 'America/Los_Angeles', '+00:00')) | +-------------------------------------------------------------------------------------+--------------------------------------------------------+ | 16000000 | 16000000 | +-------------------------------------------------------------------------------------+--------------------------------------------------------+ 1 row in set (2.61 sec) ``` 3. now don't support timezone offset format string like 'UTC+8', like we already said in https://doris.apache.org/docs/dev/query/query-variables/time-zone/#usage 4. support case-insensitive timezone parsing in nereids. 5. a bug when parse timezone using nereids. should check DST by input, but wrongly by now before. now fixed. doc pr: https://github.com/apache/doris-website/pull/810	2024-07-15 10:56:48 +08:00
Xinyi Zou	747172237a	[branch-2.1](memory) Pick some memory GC patch (#37725 ) pick #36768 #37164 #37174 #37525	2024-07-14 15:19:40 +08:00
Xinyi Zou	cf2fb6945a	[branch-2.1](memory) Refactor LRU cache policy memory tracking (#37658 ) pick #36235 #35965	2024-07-11 21:04:01 +08:00
Luwei	3337c1bbe3	[enhancement](compaction) adjust compaction concurrency based on compaction score and workload (#37491 ) adjust compaction concurrency based on compaction score and workload #36672 fix null pointer when retrieving CPU load average #37171	2024-07-09 09:56:35 +08:00
zhannngchen	494b54a5a5	[enhancement](trash) support skip trash, update trash default expire time (#37170 ) (#37409 ) cherry-pick #37170	2024-07-08 15:33:02 +08:00
wangbo	b272247a57	[pick]log thread num (#37258 ) pick #37159	2024-07-04 15:27:52 +08:00
Pxl	70e1c563b3	[Chore](runtime-filter) enlarge sync filter size rpc timeout limit (#37103 ) (#37225 ) pick from #37103	2024-07-03 21:02:26 +08:00
Mingyu Chen	e25717458e	[opt](catalog) add some profile for parquet reader and change meta cache config (#37040 ) (#37146 ) bp #37040	2024-07-02 20:58:43 +08:00
wangbo	f5572ac732	[pick]reset memtable flush thread num (#37092 ) pick #37028	2024-07-02 19:20:17 +08:00
camby	798d9d6fc6	[pick21][opt](mow) reduce memory usage for mow table compaction (#36865 ) (#36968 ) cherry-pick https://github.com/apache/doris/pull/36865 to branch-2.1	2024-07-01 15:33:18 +08:00
Yongqiang YANG	07278e9dcb	[improvement](segmentcache) limit segment cache by memory or segment … (#37035 ) …num (#37026) pick ##37026	2024-06-30 20:34:13 +08:00
yujun	22cb7b8fcb	[improvement](compaction) be do not compact invisible version to avoid query error -230 #28082 (#36222 ) cherry pick from #28082	2024-06-27 13:45:21 +08:00
walter	a79b56ac23	[chore](be) Support config max message size for be thrift server (#36595 ) Cherry-pick #36467	2024-06-20 20:15:43 +08:00
Ashin Gau	f59dc4fb37	[opt](split) generate and get split batch concurrently (#36044 ) bp #36045, and turn on batch split, which is turn off in #36109 Generate and get split batch concurrently. `SplitSource.getNextBatch` remove the synchronization, and make each get their splits concurrently, and `SplitAssignment` generates splits asynchronously.	2024-06-19 16:16:02 +08:00
Tiewei Fang	c84b56140c	[Fix](outfile) Add a configuration for exporting data in Parquet format using `select into outfile` (#36143 ) backport: #36142	2024-06-13 11:49:46 +08:00
lihangyu	0b28420e1c	[pick](Variant) make remote schema fetch rpc timeout configurable (#35296 ) (#36174 )	2024-06-12 19:51:53 +08:00
Xin Liao	d1eb917076	[fix](rpc) fix transfer large data and enable transfer_large_data_by_brpc by default #35770 (#36169 ) cherry pick from #35770	2024-06-12 19:39:07 +08:00
Mingyu Chen	fbc82e0253	[opt](log) refine the BE logger (#35942 ) (#35988 ) bp #35942	2024-06-06 22:25:22 +08:00
plat1ko	c2b830e1e7	Pick "[Fix](Tablet) Fix the issue of redundant loading of stale rowset (#35768 )" (#35882 )	2024-06-05 07:55:04 +08:00
Mingyu Chen	e755d64e62	[feature](be jvm monitor)append enable_jvm_monitor in be.conf to control jvm monitor. (#35608 ) (#35764 ) bp #35608 Co-authored-by: daidai <2017501503@qq.com>	2024-06-02 00:18:44 +08:00
Qi Chen	b91d2caab8	[Feature](iceberg-writer) Implements iceberg sink basic functionality for inserting into table. (#35587 ) backport #34929	2024-05-29 16:40:54 +08:00
Mingyu Chen	5c40e87667	[opt](s3) auto retry when meeting 429 error (#35397 ) - Add 2 new BE configs - `s3_read_base_wait_time_ms` and `s3_read_max_wait_time_ms` When an S3 429 error is met, the "get" request will sleep `s3_read_base_wait_time_ms (1, 2, 3, 4)` ms and try again. The max sleep time is `s3_read_max_wait_time_ms` and the max number of retries is `max_s3_client_retry` - Add more metrics for s3 file reader - `s3_file_reader_too_many_request`: counter of 429 errors. - `s3_file_reader_s3_get_request`: the QPS of s3 get requests. - `TotalGetRequest`: Get request counter in profile - `TooManyRequestErr`: 429 error counter in profile - `TooManyRequestSleepTime`: Sum of sleep time after 429 errors in profile - `TotalBytesRead`: Total bytes read from s3 in profile	2024-05-28 23:00:31 +08:00
TengJianPing	eefeb4d80c	[fix](spill) fix wrong disk usage of spill (#35423 )	2024-05-28 18:53:55 +08:00
Xinyi Zou	b6eaf95720	[fix](memory) Fix BE memory info compatible with Cgroup (#35412 ) (#35425 ) 1. `memory.usage_in_bytes ~= free.used + free.(buff/cache) - (buff)`; the free cache can be reused, so modify cgroup_memory_usage = memory.usage_in_bytes - memory.meminfo["Cached"]. 2. If the system is not configured with cgroup, finding the cgroup file path will fail; refactor the refresh of cgroup memory info so that it tolerates this failure.	2024-05-27 12:31:44 +08:00
HHoflittlefish777	c6c90ff63e	[chore](routine-load) make routine_load_consumer_pool_size can update using HTTP API (#35315 )	2024-05-25 17:46:29 +08:00
Kaijie Chen	a6f7747d29	[feature](datatype) add BE config to allow zero date (#34961 ) Co-authored-by: Gabriel <gabrielleebuaa@gmail.com>	2024-05-23 19:12:39 +08:00
Ashin Gau	98f8eb5c43	[opt](split) get file splits in batch mode (#34032 ) (#35107 ) bp #34032	2024-05-21 22:27:07 +08:00
Xin Liao	5019aa03e9	[enhancement](be-meta) disable sync rocksdb by default for better performance (#32714 ) (#35122 )	2024-05-21 15:30:49 +08:00
HHoflittlefish777	1a24895257	[opt](routine-load) optimize routine load task thread pool and related param(#32282 ) (#34896 )	2024-05-15 12:42:02 +08:00
Mingyu Chen	cadbbdd2c0	[fix](config) for compatibility issue of log dir config (#34734 )	2024-05-12 09:44:50 +08:00
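
The retry behaviour described in commit 5c40e87667 above (sleep after an S3 429 error, capped by `s3_read_max_wait_time_ms`, limited by `max_s3_client_retry`) can be sketched roughly as follows. This is only an illustration in Python, not the BE code: it assumes the "(1, 2, 3, 4)" in the commit message means the base wait is multiplied by the attempt number, and `TooManyRequests`, `backoff_schedule` and `get_with_retry` are hypothetical names introduced here.

```python
import time


class TooManyRequests(Exception):
    """Hypothetical stand-in for an HTTP 429 response (not a Doris type)."""


def backoff_schedule(base_wait_ms, max_wait_ms, max_retries):
    # Assumption: wait = base * attempt number (1, 2, 3, ...), capped at max_wait_ms.
    return [min(base_wait_ms * n, max_wait_ms) for n in range(1, max_retries + 1)]


def get_with_retry(do_get, base_wait_ms, max_wait_ms, max_retries, sleep=time.sleep):
    """Call do_get(); on TooManyRequests, sleep and retry up to max_retries times.

    Returns (result, total_sleep_ms). The final attempt lets the error propagate.
    """
    total_sleep_ms = 0
    for wait_ms in backoff_schedule(base_wait_ms, max_wait_ms, max_retries):
        try:
            return do_get(), total_sleep_ms
        except TooManyRequests:
            sleep(wait_ms / 1000.0)  # time.sleep takes seconds
            total_sleep_ms += wait_ms
    return do_get(), total_sleep_ms
```

The accumulated `total_sleep_ms` plays the role that the commit message assigns to the `TooManyRequestSleepTime` profile counter; passing a no-op `sleep` makes the sketch easy to exercise without waiting.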

304 Commits