Commit Graph

7282 Commits

Author SHA1 Message Date
ea71472d64 [fix](build index) fix core when build index for a new column which without data (#32550) (#32669)
Co-authored-by: Luennng <luennng@gmail.com>
Co-authored-by: Tanya-W <tanya1218w@163.com>
2024-03-22 15:05:19 +08:00
a4a191fe56 [fix](index compaction)Fix MOW index compaction core (#32121) (#32657) 2024-03-22 14:20:19 +08:00
23c12fd68f [fix](join) core caused by null-safe-equal join (#32623) 2024-03-22 08:53:47 +08:00
921fab2196 [fix](memory) Fix thread context not initialized in MacOS (#32570) 2024-03-22 08:53:47 +08:00
6b54171778 [bugfix](deadlock) pipelinex map lock should only scope in map not about pipelinectx's cancel method (#32622)
Both global locks in the fragment mgr should only protect the map logic; they must not be used to protect the cancel method.
The fragment ctx cancel method should be protected by a lock.
Otherwise, query ctx cancel --> pipelinex fragment cancel --> query ctx cancel will deadlock.
2024-03-22 08:52:38 +08:00
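The locking rule in #32622 can be illustrated with a minimal sketch. It assumes a registry that maps ids to fragment contexts; `FragmentRegistry`, `FragmentCtx`, and their members are hypothetical names for illustration, not the actual Doris classes. The map lock is held only for the lookup, and `cancel()` runs after it is released, so a cancel path that re-enters the registry cannot deadlock on the map lock.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <unordered_map>
#include <utility>

// Hypothetical fragment context: cancel() is guarded by its own lock.
struct FragmentCtx {
    std::mutex cancel_lock;
    bool cancelled = false;

    void cancel() {
        std::lock_guard<std::mutex> guard(cancel_lock);
        cancelled = true;
    }
};

// Hypothetical registry: _map_lock protects only the map itself.
class FragmentRegistry {
public:
    void add(int id, std::shared_ptr<FragmentCtx> ctx) {
        std::lock_guard<std::mutex> guard(_map_lock);
        _map[id] = std::move(ctx);
    }

    // Returns false if no context is registered under `id`.
    bool cancel(int id) {
        std::shared_ptr<FragmentCtx> ctx;
        {
            // Hold the map lock just long enough to copy the pointer.
            std::lock_guard<std::mutex> guard(_map_lock);
            auto it = _map.find(id);
            if (it == _map.end()) {
                return false;
            }
            ctx = it->second;
        }
        // The map lock is released here; cancel() takes only the ctx lock.
        ctx->cancel();
        return true;
    }

private:
    std::mutex _map_lock;
    std::unordered_map<int, std::shared_ptr<FragmentCtx>> _map;
};
```

Copying the `shared_ptr` out of the map keeps the context alive even if another thread erases the entry between the lookup and the call to `cancel()`.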
Pxl
6462d913ca [Improvement](brpc) log error message when AutoReleaseClosure meet brpc error or response… (#32628)
Log an error message when AutoReleaseClosure meets a brpc error or a response with an error status.
2024-03-22 08:52:38 +08:00
d3bdda6071 [fix](partial update) fix data correctness risk when load delete sign data into a table with sequence col (#32574) 2024-03-22 08:52:38 +08:00
55b7f7f019 [fix](inverted index) skip read index column data only for DUP and MOW table (#32594) 2024-03-22 08:52:16 +08:00
2cb652a7fa [FIX](compile)fix for gcc compile (#32508)
* fix for gcc compile
2024-03-22 08:52:16 +08:00
d7a3ff1ddf [Fix](Outfile) Fix the column type mapping in the orc/parquet file format (#32281)
| Doris Type | Orc Type                       | Parquet Type          |
|------------|--------------------------------|-----------------------|
| Date       | Long (logical: DATE)           | int32 (Logical: Date) |
| DateTime   | TIMESTAMP (logical: TIMESTAMP) | int96                 |
2024-03-22 08:52:16 +08:00
fd0bc720e9 [opt](information_schema) Add DEFAULT_ENCRYPTION column to schemata table (#32501) 2024-03-22 08:52:16 +08:00
6888e52365 [pipelineX](fix) Fix illegal memory access (#32602) 2024-03-22 08:52:16 +08:00
844dd8b2ce [fix](spill) should wait for merging done before read agg result (#32537) 2024-03-22 08:52:16 +08:00
fd62af82d2 [enhancement](mow) Add bvar for bloom filter and segment (#32355) 2024-03-22 08:52:12 +08:00
0cde0cbf19 (invert index) modify of time series compaction policy 2024-03-22 08:16:30 +08:00
4c8aaa156a [fix](jni) remove 'push_down_predicates' and fix BE crash with decimal predicate (#32253) (#32599) 2024-03-21 14:07:50 +08:00
617cc667fe [Fix](Variant) fix variant serialize root node (#31769) 2024-03-21 14:07:50 +08:00
02ef02402a [pipelineX](debug) Add debug logs for long-running load task (#32534) 2024-03-21 14:07:50 +08:00
02430e6e53 [enhance](S3) Print the oss request id for each error s3 request (#32499) 2024-03-21 14:07:50 +08:00
7486e96b12 [improve](function) add error msg if exceeded maximum default value in repeat function (#32219)
Add an error message to the repeat function so the user can tell that the count is greater than the default maximum value.
2024-03-21 14:07:49 +08:00
6d076f9947 [improvement](group_comit) Add bvar to monitor the total wal count on disk (#31646) 2024-03-21 14:07:49 +08:00
09be4dc7ee [fix](random-bucket) tabletindex when there is no cached value in memory (#32336)
1. In cloud mode, getting the visible version is an RPC to the meta service, while
loads would get the visible version for all partitions.
2. VunionNode should follow the batch size.
2024-03-21 14:07:49 +08:00
06bf5541f2 [pipelineX](fix) Fix running tasks API core dump (#32503) 2024-03-21 14:07:49 +08:00
0db402e154 [expr](fix) Not to throw exception when close failed (#32287) 2024-03-21 14:07:49 +08:00
a40463617e [feature](cpu cores) get the cores when running within a cgroup. (#32370)
get the cores when running within a cgroup
2024-03-21 14:07:49 +08:00
b92a764665 [feature](function) Support for aggregate function foreach combiner for some error function (#31913)
Support for aggregate function foreach combiner for some error function
2024-03-21 14:07:49 +08:00
b6a35d68b0 [code](Refactor) Del unless filter id in runtime filter func (#32502)
Del unless filter id in runtime filter func
2024-03-21 14:07:49 +08:00
6871c964af [fix](nereids)NullSafeEqualToEqual rule only change to equal if both children are not nullable (#32374)
NullSafeEqualToEqual rule only change to equal if both children are not nullable
2024-03-21 14:07:49 +08:00
4efeb6618a [Fix](inverted index) fix inappropriate use of macro in inverted index fs directory error process (#32472) 2024-03-21 14:07:24 +08:00
50c247e08c [fix](snapshot-loader) Fix be crash caused by deref end() iterator (#32489)
The standard says that the input parameter `pos` of std::vector::erase
must be valid and dereferenceable, so the `end()` iterator cannot be used
as a value of `pos`. I did some tests and the crash only occurs when the
vector is empty. Fortunately, `local_files` is usually not empty.
2024-03-21 14:07:24 +08:00
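The iterator rule cited in #32489 can be shown with a small sketch: check the result of the search against `end()` before passing it to `erase`. The helper name `erase_file` is hypothetical and is not taken from the snapshot loader code.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: removes the first occurrence of `name` from `files`.
// Returns true if an element was removed.
bool erase_file(std::vector<std::string>& files, const std::string& name) {
    auto it = std::find(files.begin(), files.end(), name);
    if (it == files.end()) {
        // end() is not dereferenceable, so erase(end()) is undefined behavior.
        return false;
    }
    files.erase(it);
    return true;
}
```

The guard costs one comparison and makes the not-found case explicit instead of relying on how a particular standard library happens to behave.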
612d3595e4 [improvement](spill) optimize the spilling logic of hash join operator (#32202) 2024-03-21 14:07:24 +08:00
e892774c9a [improvement](agg) streaming agg should not take too much memory when spilling enabled (#32426) 2024-03-21 14:07:24 +08:00
2196c534e8 [fix](group commit) Fix compatibility issues on serializing and deserializing wal file (#32299) 2024-03-21 14:07:24 +08:00
14c9537679 [fix](decimal) fix Arithmetic Overflow error of converting string to decimal (#32246) 2024-03-21 14:07:24 +08:00
ab512f935c [pipelineX](api) Add api for long-running tasks (#32459) 2024-03-21 14:07:24 +08:00
f99db38998 [fix](ParquetReader) Fix Parquet Reader to read int96 parquet type problem (#32394)
`hi - JULIAN_EPOCH_OFFSET_DAYS` could be negative, so we can't use unsigned int everywhere.
2024-03-21 14:07:24 +08:00
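A minimal sketch of why the subtraction in #32394 needs a signed type, assuming the usual INT96 timestamp layout (nanoseconds within the day plus a Julian day number). The function name, the `lo` parameter, and the microsecond output unit are assumptions for illustration; only `hi` and the epoch offset come from the commit message. The value 2440588 is the Julian day number of 1970-01-01.

```cpp
#include <cassert>
#include <cstdint>

// Julian day number of the Unix epoch (1970-01-01).
constexpr int64_t kJulianEpochOffsetDays = 2440588;
constexpr int64_t kMicrosPerDay = 86400LL * 1000000LL;

// lo: nanoseconds within the day (assumed layout), hi: Julian day number.
// Returns microseconds since the Unix epoch; negative for dates before 1970.
int64_t int96_to_unix_micros(uint64_t lo, uint32_t hi) {
    // Widen to a signed type BEFORE subtracting: in unsigned arithmetic,
    // a pre-1970 day would wrap around to a huge positive value.
    int64_t days = static_cast<int64_t>(hi) - kJulianEpochOffsetDays;
    return days * kMicrosPerDay + static_cast<int64_t>(lo / 1000);
}
```

For example, a Julian day of 2440587 (1969-12-31) yields a negative day count, which an unsigned type cannot represent.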
0635a8716c [improve](group commit) Group commit support chunked stream load in flink (#32135) 2024-03-21 14:07:24 +08:00
7422f185da [Fix](smooth-upgrade) Fix incompatibility when upgrade from 2.0 to 2.1 (#32444) 2024-03-21 14:07:24 +08:00
715eed0748 [opt](like) opt LIKE and REGEXP clause with concat(col, pattern_str) (#32333)
opt LIKE and REGEXP clause with concat(col, pattern_str)
2024-03-21 14:07:24 +08:00
6ea8e51261 [Performance](join) speed up the colocate and bucket shuffle join by change rf size (#32421) 2024-03-21 14:07:24 +08:00
a5f3611b88 [Fix](Regression) DCHECK failed in runtime filter wrapper (#32446) 2024-03-21 14:07:23 +08:00
7a0b591b8f [FIX](array_agg) fix array agg with other agg function (#32387)
fix array agg with other agg function
2024-03-21 14:07:23 +08:00
a0a3a2a2ce [Fix](Variant) fix variant with not null (#32248)
ignore null bitmap for not null and make subcolumn access slots always nullable
2024-03-21 14:07:23 +08:00
590e1d52ec [pipelineX](streaming agg) Fix wrong columns produced by streaming agg (#32411)
* [pipelineX](streaming agg) Fix wrong columns produced by streaming agg

* update
2024-03-21 14:07:23 +08:00
4bf5a21ba3 [pipelineX](cancel) Remove lock for mapping query ctx to fragment (#32346) 2024-03-21 14:07:23 +08:00
b66840efd7 [Fix](regression test) Fix <=> rf cause regresion test failed (#32377) 2024-03-21 14:07:23 +08:00
fdcf5b7d34 [enhancement](dict) check valid of offset in page (#32349) 2024-03-21 14:07:23 +08:00
e952b5ef5b [opt](jdbc catalog) Refine the jdbc_connector close logic and actively clear the jvm occupied by jdbcexecutor (#32300) 2024-03-21 14:07:23 +08:00
163007a665 [fix](grouping sets) fix grouping sets have multiple empty sets (#32317)
Handling of empty sets (empty expression cases) was addressed in #32112. However, multiple empty sets in grouping sets have different grouping IDs.
2024-03-21 14:07:22 +08:00
e99b33c274 [opt](file-meta-cache) reduce file meta cache size and disable cache for some cases (#32340)
File meta cache on BE is used to cache the meta of external table files, such as the parquet footer.
This cache is limited by entry count, not memory consumption.
So if the cached objects are big (e.g., a large parquet footer), the total memory consumption of this cache
will be large and may cause OOM.

This PR mainly changes:

1. Add a new method `exceed_prune_limit()` for `CachePolicy`
    For `ObjLRUCache`, it always returns true so that the minor or full gc on BE will prune the cache each time.

2. Reduce the default capacity of file meta cache, from 20000 to 1000

    Also change the default capacity of hdfs file handle cache, from 20000 to 1000

3. Change the judgement of whether to enable file meta cache when querying

    If the number of files to be read is larger than 1/3 of the file meta cache's capacity, file meta cache
    will be disabled for this query, because the cache is useless if there are too many files.
2024-03-21 14:07:22 +08:00
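The one-third rule from #32340 can be sketched as a single predicate. The function and parameter names are hypothetical and the real check in BE may be written differently; this only shows the arithmetic described in the commit message.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical predicate: should this query use the file meta cache?
// The cache is skipped when the number of files to read is larger than
// one third of the cache's entry limit.
bool should_use_file_meta_cache(std::size_t num_files, std::size_t cache_capacity) {
    // Multiply instead of dividing to avoid integer-division rounding.
    return num_files * 3 <= cache_capacity;
}
```

With the new default limit of 1000 entries, a query reading up to 333 files would still use the cache under this sketch, and one reading 334 or more would bypass it.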