Commit Graph

3030 Commits

Author SHA1 Message Date
2196c534e8 [fix](group commit) Fix compatibility issues on serializing and deserializing wal file (#32299) 2024-03-21 14:07:24 +08:00
14c9537679 [fix](decimal) fix Arithmetic Overflow error of converting string to decimal (#32246) 2024-03-21 14:07:24 +08:00
f99db38998 [fix](ParquetReader) Fix Parquet Reader to read int96 parquet type problem (#32394)
`hi - JULIAN_EPOCH_OFFSET_DAYS` could be negative, so we can't all use unsigned int.
2024-03-21 14:07:24 +08:00
7422f185da [Fix](smooth-upgrade) Fix incompatibility when upgrade from 2.0 to 2.1 (#32444) 2024-03-21 14:07:24 +08:00
715eed0748 [opt](like) opt LIKE and REGEXP clause with concat(col, pattern_str) (#32333)
opt LIKE and REGEXP clause with concat(col, pattern_str)
2024-03-21 14:07:24 +08:00
a5f3611b88 [Fix](Regression) DCHECK failed in runtime filter wrapper (#32446) 2024-03-21 14:07:23 +08:00
7a0b591b8f [FIX](array_agg) fix array agg with other agg function (#32387)
fix array agg with other agg function
2024-03-21 14:07:23 +08:00
4bf5a21ba3 [pipelineX](cancel) Remove lock for mapping query ctx to fragment (#32346) 2024-03-21 14:07:23 +08:00
e952b5ef5b [opt](jdbc catalog) Refine the jdbc_connector close logic and actively clear the jvm occupied by jdbcexecutor (#32300) 2024-03-21 14:07:23 +08:00
163007a665 [fix](grouping sets) fix grouping sets have multiple empty sets (#32317)
in this #32112, handling empty sets (empty expression cases) has been addressed. However, multiple empty sets in grouping sets have different grouping IDs
2024-03-21 14:07:22 +08:00
e99b33c274 [opt](file-meta-cache) reduce file meta cache size and disable cache for some cases (#32340)
File meta cache on BE is used to cache the meta for external table's file such as parquet footer.
This cache is counted by number, not memory consumption.
So if the cache object is big(eg, a large parquet footer), the total memory consumption of this cache
will be large and causing OOM.

This PR mainly changes:

1. Add a new method `exceed_prune_limit()` for `CachePolicy`
    For `ObjLRUCache`, it always return true so that the minor of full gc on BE will prune the cache each time.

2. Reduce the default capability of file meta cache, from 20000 to 1000

    Also change the default capability of hdfs file handle cache, from 20000 to 1000

4. Change judgement of whether enable file meta cache when querying

    If the number of file need to be read is larger than the 1/3 of the file meta cache's capability, file meta cache
    will be disabled for this query. Because cache is useless if there are too many files.
2024-03-21 14:07:22 +08:00
2e564036ef [fix](profile) avoid update profile in deconstructor (#32131)
In previous, the counter in `profile` may be updated when close the file reader.
And the file reader may be closed when the object being deconstruted.
But at that time, the `profile` object may already be deleted, causing NPE and BE will crash.

This PR try to fix this issue:

1. Remove the "profile counter update" logic from all `close()` method.

2. Add a new interface `ProfileCollector`

	It has 2 methods:
	
	- `collect_profile_at_runtime()`

		It can be called at runtime, eg, in every `get_next_block()` method.
		So that the counter in profile can be updated at runtime.
		
	- `collect_profile_before_close()`

		Should be called before the object call `close()`. And it will only be called once.
		
3. Derived from `ProfileCollector`

	All classes which may update the profile counter in `close()` method should extends
	the `ProfileCollector`. Such as `GenericReader`, etc. And implement `collect_profile_before_close()`
	
	And `collect_profile_before_close()` will be called in `scanner->mark_to_need_to_close()`.
2024-03-21 14:07:22 +08:00
724bc82362 [refactor](chore) replace HashMapWithStackMemory with std::unordered_map (#32309) 2024-03-21 14:07:19 +08:00
fd1345bef0 fix load channel may memory leak (#32277) 2024-03-21 14:07:19 +08:00
0990014e94 [fix](datetime) fix datetime rounding on BE (#32075) 2024-03-21 14:07:19 +08:00
85b2c42f76 [Enhancement](jdbc catalog) Add a property to test the connection when creating a Jdbc catalog (#32125) (#32531) 2024-03-21 14:05:59 +08:00
ef2151ae66 [Feature-WIP](multi-catalog) Add Hive sink on BE side. (#32306) (#32364)
bp #32306
Co-authored-by: Qi Chen <kaka11.chen@gmail.com>
2024-03-18 11:23:01 +08:00
5ceccb5ba5 [fix](compatibility) should enable windown funnel mode from 2.0 (#32284) 2024-03-16 20:56:16 +08:00
258dcfca97 [Refactor](executor)Add information_schema.workload_groups (#32195) (#32314) 2024-03-15 20:46:54 +08:00
e3bb499cc6 [fix](function)revert function REPEAT nullable mode #32226 2024-03-15 18:06:28 +08:00
97b35d6830 [fix](nereids)AssertNumRow node's output should be nullable (#32136)
Co-authored-by: Co-Author Jerry Hu <mrhhsg@gmail.com>
2024-03-15 18:06:28 +08:00
9c1888e7ec [RuntimeFilter](exec) support min max runtime filter and do refactor (#32210) 2024-03-15 18:06:20 +08:00
04a59d6071 [improve](distinct agg) add check of hash table to decide whether emplace value (#32063)
* [improve](distinct agg) add check of hash table to emplace value
2024-03-15 18:06:15 +08:00
687ab1a3e1 [fix](ip) conversion of ipv4 or ipv6 to arrow type #32240 2024-03-15 18:05:35 +08:00
c8f3643890 [exec](runtimefilter) support null aware in runtime filter (#32152)
null aware in runtime filter
2024-03-15 18:05:13 +08:00
aca7328109 [bugfix]json_length() BE crash fix (#32145)
Co-authored-by: Rohit Satardekar <rohitrs1983@gmail.com>
2024-03-15 18:04:49 +08:00
62023d705d [refactor](rename) rename task group to workload group in be (#32204)
---------

Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-03-15 18:04:02 +08:00
0578b28d54 [fix](function) fixed the get_json_string function (#32150) 2024-03-15 18:04:02 +08:00
4534300030 [fix](Operator) RepeatNode does not handle empty expressions. (#32112)
In the past, RepeatNode did not handle empty expressions.
It used DCHECK to check if the expression was non-empty.
In non-debug mode, this caused _child_block to remain unprocessed, resulting in a deadlock.
Now, if the expression is empty, the output block directly outputs _child_block
2024-03-15 18:02:33 +08:00
1db57c0667 [Optimization][Scanner] Skip _init_variant_columns when there are no variant columns, and ensure inherit_tablet_index is called only once (#32174) 2024-03-15 18:01:19 +08:00
94a75c27e7 [feature](pipelineX) support paritition tablet sink shuffle (#31689) 2024-03-15 17:58:01 +08:00
7b74b199a5 [fix](memory) Fix LRU cache deleter and memory tracking (#32080)
In order to add common code to the value deleter of LRU cache, let all lru cache values inherit from LRUCacheValueBase class and tracking memory in destructor.
2024-03-15 17:57:58 +08:00
bbdce3eb5e [fix](jdbc catalog) fix jdbc-connector coredump as get env return nullptr (#32217) 2024-03-14 16:05:08 +08:00
20d6698c27 [bugfix](arm compile) could not compile on arm because -Werror=maybe-uninitialized 2024-03-14 12:11:25 +08:00
847ec368be [Fix](smooth-upgrade) Fix incompatibility when upgrade from 2.0 to 2.1 (#32220) 2024-03-14 11:23:05 +08:00
b031c95324 [Opt](exec) use libbase64 to replace base64 code in doris (#32078)
* [Opt](exec) use libbase64 to replace base64 code in doris
2024-03-14 09:20:50 +08:00
Pxl
2f4401189a [Bug](top-n) do not update topn filter when sort node and scan node are not in the… (#32159) 2024-03-13 16:21:36 +08:00
5da7cd0fba [bugfix](becore) has to use value to capture lambda value to avoid core during callback (#32132)
Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-03-12 22:51:44 +08:00
0159a75ced [bugfix](becore) be will core when stop because the map is modified during iterator (#32105)
Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-03-12 18:50:26 +08:00
473bd3ee64 [fix](function) incorrect result of eq_for_null (#32103) 2024-03-12 18:50:26 +08:00
61928f7df5 [pipelineX](scanner) Use the actual instances num when ignore data distribution (#32081) 2024-03-12 14:20:39 +08:00
b41b17ad0a [fix](spill) fix storage engine type cast error (#32071) 2024-03-12 14:20:18 +08:00
aea9ddc3cb [Fix](Outfiel) fix be core when the open method of vfile_result_writer failed #32042 2024-03-12 14:20:18 +08:00
4268634115 [fix](memory) Fix Allocator cancel pipelinex query #32048 2024-03-12 14:20:18 +08:00
ccd21a6ea4 [Improve](InPredict) enhance in predict with array type (#31828) 2024-03-12 14:19:14 +08:00
68a5319da3 [fix](pipelineX) _local_channel_dependency is null in non pipelineX (#32054) 2024-03-12 14:19:04 +08:00
c0f2d0188b [feature](pipelineX) add mem control in local exchange sink (#31982) 2024-03-12 14:17:48 +08:00
b0b7161ad0 [feature](rf) add filter info profile when rf run as expr (#31822) 2024-03-12 14:17:48 +08:00
2470634859 [RuntimeFilter] fix <=> runtime filter failed bug (#32003) 2024-03-12 14:13:13 +08:00
3358f76a7f [feature](spill) Implement spill to disk for hash join, aggregation and sort for pipelineX (#31910)
Co-authored-by: Jerry Hu <mrhhsg@gmail.com>
2024-03-12 14:12:09 +08:00