doris

Author	SHA1	Message	Date
HHoflittlefish777	e0ec2da29b	[fix](routine-load) fix get kafka offset timeout may too long (#33502 )	2024-04-17 23:42:12 +08:00
Pxl	e85a2c8866	[Chore](status) change unknown filter error to internal error (#33633 )	2024-04-17 23:42:12 +08:00
wangbo	8ee8de7857	[Fix](executor)reset remote scan thread num #33579	2024-04-17 23:42:11 +08:00
Gabriel	01f333086d	[pipelineX](fix) Fix data pooling judgement for bucket join (#33533 )	2024-04-17 23:42:00 +08:00
zhangstar333	b2b385a4ff	[improve](fold) support complex type for constant folding (#32867 )	2024-04-17 23:41:59 +08:00
Kaijie Chen	fefbde8927	[log](move-memtable) improve logs in vtablet_writer_v2 and load_stream (#33103 )	2024-04-12 15:09:25 +08:00
yiguolei	a4924dabb7	[enhancement](exception) enable exception logic in pipeline execute thread (#33437 ) Co-authored-by: yiguolei <yiguolei@gmail.com>	2024-04-12 15:09:25 +08:00
Pxl	5f30463bb3	[Chore](descriptors) remove unused codes for descriptors (#33408 ) remove unused codes for descriptors	2024-04-12 15:09:25 +08:00
Qi Chen	e841d82ffb	[Enhancement](hive-writer) Adjust table sink exchange rebalancer params. (#33397 ) Issue Number: #31442 Change table sink exchange rebalancer params to node level and adjust these params to improve write performance by better balance. rebalancer params: ``` DEFINE_mInt64(table_sink_partition_write_min_data_processed_rebalance_threshold, "26214400"); // 25MB // Minimum partition data processed to rebalance writers in exchange when partition writing DEFINE_mInt64(table_sink_partition_write_min_partition_data_processed_rebalance_threshold, "15728640"); // 15MB ```	2024-04-12 13:09:56 +08:00
zclllyybb	3d66723214	[branch-2.1](auto-partition) pick auto partition and some more prs (#33523 )	2024-04-11 17:12:17 +08:00
Pxl	5688c28364	[Bug](runtime-filter) try to fix heap use after free on runtime filter send filter size (#33465 ) (#33522 )	2024-04-11 13:10:24 +08:00
Pxl	3081fc584d	[Improvement](runtime-filter) support sync join node build side's size to init bloom runtime filter (#32180 ) support sync join node build side's size to init bloom runtime filter	2024-04-11 09:31:50 +08:00
Xinyi Zou	6bef95eb4f	[fix](memory) Fix memory tracker destructor deadlock (#33497 )	2024-04-10 22:46:53 +08:00
HHoflittlefish777	f8d1fa2be3	[chore](multi-table-load) add context info in log when using single-stream-multi-table load (#33317 )	2024-04-10 16:03:05 +08:00
Xinyi Zou	2b1ab89b5b	[fix](memory) Fix memory log compile by ASAN (#33162 ) ASAN compiles BE, add markers in memory logs	2024-04-10 15:26:09 +08:00
Pxl	8fd6d4c41b	[Chore](build) add -Wconversion and remove some unused code (#33127 ) add -Wconversion and remove some unused code	2024-04-10 15:26:08 +08:00
924060929	cc363f26c2	[fix](Nereids) fix group concat (#33091 ) Fix the failure in regression_test/suites/query_p0/group_concat/test_group_concat.groovy: select group_concat( distinct b1, '?'), group_concat( distinct b3, '?') from table_group_concat group by b2 fails with the exception: lowestCostPlans with physicalProperties(GATHER) doesn't exist in root group. The root cause is that '?' is pushed down to a slot by NormalizeAggregate; AggregateStrategies then treats the slot as a distinct parameter and generates an invalid PhysicalHashAggregate, which is rejected by ChildOutputPropertyDeriver. The fix avoids pushing literals down to slots in NormalizeAggregate and forbids generating a stream aggregate node when the group-by slots are empty.	2024-04-10 14:59:46 +08:00
TengJianPing	517c12478f	[improvement](spill) spill trigger improvement (#32641 )	2024-04-10 14:52:46 +08:00
Xinyi Zou	cf7595d423	[opt](memory) Optimize mem tracker accuracy (#32039 ) (#33140 )	2024-04-10 11:42:19 +08:00
amory	28e2d89ce3	[Improve](inverted_index) update clucene and improve array inverted index writer (#32436 )	2024-04-10 11:34:29 +08:00
yiguolei	005f7af21f	[bugfix](deadlock) should not use query cancelled in fragment mgr	2024-04-09 16:09:01 +08:00
yiguolei	bfc9260507	[bugfix](deadlock) avoid deadlock in memtracker cancel query (#33400 ) get_query_ctx(hold query ctx map lock) ---> QueryCtx ---> runtime statistics mgr ---> runtime statistics mgr ---> allocate block memory ---> cancel query. The memtracker tries to cancel the query when memory is not available during allocation. But the allocator is a fundamental API; if it calls an upper-level API, it may deadlock. The allocator should not call any such API during allocation.	2024-04-09 12:20:54 +08:00
Gabriel	a8232c67f9	[pipelineX](runtime filter) Fix task timeout caused by runtime filter (#33332 ) (#33369 )	2024-04-08 16:30:32 +08:00
Xinyi Zou	d60d804d9c	[fix](memory) Fix task repeat attach task DCHECK failed #32784 (#33343 ) [branch-2.1](memory) Fix CCR task repeat attach task DCHECK failed3 #33366	2024-04-08 16:15:04 +08:00
Mingyu Chen	466972926e	[fix](dns-cache) do not detach the refresh thread (#33182 )	2024-04-07 22:18:56 +08:00
Mingyu Chen	c758a25dd8	[opt](fqdn) Add DNS Cache for FE and BE (#32869 ) Previously, when FQDN was enabled, Doris called the DNS resolver to get the IP from the hostname each time 1) FE gets BE's grpc client. 2) BE gets other BE's brpc client. Under high concurrency, the DNS resolver could be overloaded and fail to resolve hostnames. This PR mainly changes: 1. Add DNSCache for both FE and BE. The DNSCache runs on every FE and BE node. It holds a cache whose key is the hostname and whose value is the IP. Callers get the IP by hostname from this cache; if the hostname does not exist, the cache tries to resolve it and updates itself. In addition, DNSCache has a daemon thread that refreshes the cache every 1 min, in case the IP changes. There are other implementations of this DNS cache: 1. `36fed13997` This is for the BE side, but it does not handle the IP change case. 2. https://github.com/apache/doris/pull/28479 This is for the FE side, but it only works with the Master FE; other FE nodes will not be aware of the IP change. Also, there are a bunch of BackendServiceProxy, and this PR only handles the cache in one of them.	2024-04-07 22:16:04 +08:00
Gabriel	77349ca71a	[pipelineX](fix) Fix coredump by incorrect cancel order (#33294 )	2024-04-07 12:06:12 +08:00
Xin Liao	950ca68fac	[fix](move-memtable) fix timeout to get tablet schema (#33256 ) (#33260 )	2024-04-04 21:45:55 +08:00
yiguolei	7675383c40	[bugfix](deadlock) fix dead lock in cancel fragment (#33181 ) Co-authored-by: yiguolei <yiguolei@gmail.com>	2024-04-03 13:41:24 +08:00
airborne12	0122b8a6b4	[Update](inverted index) add config for inverted index query cache shards (#32666 )	2024-03-26 20:27:33 +08:00
yiguolei	7b94cfdba1	Revert "[Fix](tests) add regression tests for trino-connector (#32552 )" This reverts commit 3fc3a4650681cb519405730899a2f22f268b38c1.	2024-03-25 22:38:21 +08:00
Tiewei Fang	3fc3a46506	[Fix](tests) add regression tests for trino-connector (#32552 )	2024-03-25 22:31:55 +08:00
lihangyu	62c7d0a421	[Fix](point query) add query options for short circuit queries (#32530 ) (#32684 ) Some options like `be_exec_version` are needed for functions	2024-03-22 18:03:18 +08:00
wangbo	326a264fcd	[Improvement](executor)Add spill property for workload group #32554	2024-03-22 16:38:19 +08:00
zhangstar333	e41311d77d	[bug](fold) fix fold constant core dump with variant type (#32265 ) 1. The variant type core dumps when get_data_at is called, because the function is not implemented. 2. Some cases can't pass with the old planner and fold_constant_by_be = on. 3. Turn on enable_fold_constant_by_be = true.	2024-03-22 16:37:33 +08:00
yiguolei	6b54171778	[bugfix](deadlock) pipelinex map lock should only scope in map not about pipelinectx's cancel method (#32622 ) Both global locks in fragment mgr should only protect the map logic and must not be used to protect the cancel method. The fragment ctx cancel method should be protected by a lock. query ctx cancel --> pipelinex fragment cancel ---> query ctx cancel will deadlock.	2024-03-22 08:52:38 +08:00
Mryange	a40463617e	[feature](cpu cores) get the cores when running within a cgroup. (#32370 ) get the cores when running within a cgroup	2024-03-21 14:07:49 +08:00
HappenLee	b6a35d68b0	[code](Refactor) Delete useless filter id in runtime filter func (#32502 ) Delete useless filter id in runtime filter func	2024-03-21 14:07:49 +08:00
walter	50c247e08c	[fix](snapshot-loader) Fix be crash caused by deref end() iterator (#32489 ) The standard says that the input parameter `pos` of std::vector::erase must be valid and dereferenceable, so the `end()` iterator cannot be used as a value of `pos`. I did some tests and the crash only occurs when the vector is empty. Fortunately `local_files` is usually not empty.	2024-03-21 14:07:24 +08:00
huanghaibin	2196c534e8	[fix](group commit) Fix compatibility issues on serializing and deserializing wal file (#32299 )	2024-03-21 14:07:24 +08:00
Gabriel	ab512f935c	[pipelineX](api) Add api for long-running tasks (#32459 )	2024-03-21 14:07:24 +08:00
Gabriel	4bf5a21ba3	[pipelineX](cancel) Remove lock for mapping query ctx to fragment (#32346 )	2024-03-21 14:07:23 +08:00
Mingyu Chen	e99b33c274	[opt](file-meta-cache) reduce file meta cache size and disable cache for some cases (#32340 ) The file meta cache on BE caches the meta of external table files, such as the parquet footer. This cache is limited by entry count, not memory consumption. So if a cached object is big (e.g., a large parquet footer), the total memory consumption of this cache can be large and cause OOM. This PR mainly changes: 1. Add a new method `exceed_prune_limit()` for `CachePolicy`. For `ObjLRUCache`, it always returns true so that the minor or full gc on BE prunes the cache each time. 2. Reduce the default capacity of the file meta cache from 20000 to 1000. Also change the default capacity of the hdfs file handle cache from 20000 to 1000. 3. Change the judgement of whether to enable the file meta cache when querying. If the number of files to be read is larger than 1/3 of the file meta cache's capacity, the file meta cache is disabled for this query, because the cache is useless if there are too many files.	2024-03-21 14:07:22 +08:00
airborne12	9eb2f90e27	[Optimize](inverted index) optimize inverted index bitmap copy (#32279 ) (#32469 )	2024-03-19 17:28:59 +08:00
Mingyu Chen	ef2151ae66	[Feature-WIP](multi-catalog) Add Hive sink on BE side. (#32306 ) (#32364 ) bp #32306 Co-authored-by: Qi Chen <kaka11.chen@gmail.com>	2024-03-18 11:23:01 +08:00
Pxl	5e4da61df9	[Bug](top-n) do not get runtime predicate when predicate not initialized (#32208 )	2024-03-15 18:06:15 +08:00
HappenLee	c8f3643890	[exec](runtimefilter) support null aware in runtime filter (#32152 ) null aware in runtime filter	2024-03-15 18:05:13 +08:00
yiguolei	62023d705d	[refactor](rename) rename task group to workload group in be (#32204 ) --------- Co-authored-by: yiguolei <yiguolei@gmail.com>	2024-03-15 18:04:02 +08:00
HHoflittlefish777	56a14c912a	[fix](routineload) fix consume data too slow in partial partitions (#32126 )	2024-03-15 18:01:22 +08:00
Xinyi Zou	7b74b199a5	[fix](memory) Fix LRU cache deleter and memory tracking (#32080 ) In order to add common code to the value deleter of the LRU cache, let all LRU cache values inherit from the LRUCacheValueBase class and track memory in the destructor.	2024-03-15 17:57:58 +08:00
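
Note on 50c247e08c: the message points out that `std::vector::erase` requires a dereferenceable `pos`, so `end()` is not a valid argument. The sketch below illustrates that rule only; it is not the snapshot-loader code. The helper name `erase_first_match` and the file names are hypothetical, and it assumes the iterator comes from a lookup that may find nothing.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper (not from the Doris source): removes the first element
// equal to `name` and reports whether anything was removed.
bool erase_first_match(std::vector<std::string>& files, const std::string& name) {
    auto it = std::find(files.begin(), files.end(), name);
    if (it == files.end()) {
        // No match (this includes the empty-vector case). Calling
        // files.erase(it) here would be undefined behavior, because
        // end() is not dereferenceable.
        return false;
    }
    files.erase(it);  // safe: `it` refers to an existing element
    return true;
}
```

Called on an empty vector, or with a name that is not present, the helper returns false and leaves the vector untouched instead of passing `end()` to `erase`.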


1579 Commits