* fix bug, add remote meta for compaction
Logic in the function VCollectIterator::build_heap is not robust and may cause a memory leak:
```
Level1Iterator* cumu_iter = new Level1Iterator(
        cumu_children, _reader, cumu_children.size() > 1, _is_reverse, _skip_same);
RETURN_IF_NOT_EOF_AND_OK(cumu_iter->init());
std::list<LevelIterator*> children;
children.push_back(*base_reader_child);
children.push_back(cumu_iter);
_inner_iter.reset(
        new Level1Iterator(children, _reader, _merge, _is_reverse, _skip_same));
```
`cumu_iter` is leaked if `cumu_iter->init()` does not succeed: the macro returns early and the heap allocation is never released.
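A minimal sketch of one way to plug the leak, reusing the names from the snippet above (an illustration, not the actual patch): hold the iterator in a `std::unique_ptr` until ownership is handed off, so the early return from the init check destroys it automatically.
```
#include <list>
#include <memory>

// Sketch only: Level1Iterator, LevelIterator, and the surrounding members
// come from the snippet above.
std::unique_ptr<Level1Iterator> cumu_iter(new Level1Iterator(
        cumu_children, _reader, cumu_children.size() > 1, _is_reverse, _skip_same));
// If init() fails, the macro returns early and unique_ptr frees the iterator.
RETURN_IF_NOT_EOF_AND_OK(cumu_iter->init());
std::list<LevelIterator*> children;
children.push_back(*base_reader_child);
children.push_back(cumu_iter.release()); // ownership passes to children
_inner_iter.reset(
        new Level1Iterator(children, _reader, _merge, _is_reverse, _skip_same));
```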
The elements of InvertedIndexSearcherCache are inverted index searchers, each holding an open file descriptor of an inverted index file, so InvertedIndexSearcherCache effectively caches file descriptors of inverted index files.
If the open-file-descriptor limit of the Linux system is set too low and the config inverted_index_searcher_cache_limit is too large, a high-pressure load may fail with "Too many open files".
So when inserting an inverted index searcher into InvertedIndexSearcherCache, we also need to check whether the file descriptor number limit for inverted index files has been reached.
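A minimal sketch of the insert-time check; the names below (`g_searcher_fd_count`, `try_reserve_searcher_fd`, `fd_limit`) are hypothetical, not the real Doris APIs:
```
#include <atomic>
#include <cstdint>

// Counts file descriptors currently held by cached searchers (hypothetical).
static std::atomic<int64_t> g_searcher_fd_count{0};

// Called before inserting a searcher into the cache. fd_limit is the budget
// reserved for inverted index files, e.g. a fraction of the process's
// RLIMIT_NOFILE.
bool try_reserve_searcher_fd(int64_t fd_limit) {
    int64_t prev = g_searcher_fd_count.fetch_add(1, std::memory_order_relaxed);
    if (prev >= fd_limit) {
        // Over budget: undo the reservation; the caller can evict an entry
        // or skip caching this searcher.
        g_searcher_fd_count.fetch_sub(1, std::memory_order_relaxed);
        return false;
    }
    return true;
}
```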
There is a bug in inverted_index_writer when adding an index for multiple rows of array values.
This bug can produce wrong results when a schema change adds an index.
Under the original logic, a query like `select * from a where exists (select * from b order by 1) order by 1 limit 1` contains a subquery, but the top-level query still passes `checkEnableTwoPhaseRead` and sets `isTwoPhaseOptEnabled=true`. So we need to check that the whole plan is a plain top-n query plan, and roll back the needMaterialize flag set by the previous `analyze`.
* [improve](dynamic table) refine SegmentWriter column writer generation
```
A dynamic Block consists of two parts: a static part of columns and a dynamic part of columns.

| static | dynamic |
| ------ | ------- |

The static ones are the original _tablet_schema columns;
the dynamic ones are auto-generated, extended from the file scan.
```
**We should only consider using Block info to generate columns during a dynamic table load procedure,**
and separate the static columns from the dynamic ones.
* test
The property `hive.metastore.kerberos.principal` is essential when the principal of the HMS you are connecting to is not the
default value `hive-metastore/_HOST@your_realms`.
Otherwise you will get the error: Failure unspecified at GSS-API level (Mechanism level: Checksum failed).
To avoid irrecoverable data loss due to a delete bitmap calculation error, do compaction with merge-on-read. This way, even if the delete bitmap calculation is wrong, the data can be recovered by a full compaction.
Save cached file segments into a path like `cache_path / hash(filepath).substr(0, 3) / hash(filepath) / offset`
to prevent too many directories directly under `cache_path`.
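A minimal sketch of the layout, assuming `std::hash` as the hash function and `std::filesystem` paths (the real cache likely uses its own hash and path handling):
```
#include <cstdint>
#include <filesystem>
#include <functional>
#include <string>

// Build the on-disk location for one cached segment:
// cache_path / hash(filepath).substr(0, 3) / hash(filepath) / offset
std::filesystem::path segment_cache_path(const std::filesystem::path& cache_path,
                                         const std::string& filepath,
                                         uint64_t offset) {
    std::string h = std::to_string(std::hash<std::string>{}(filepath));
    // The 3-character prefix fans segments out over a bounded number of
    // top-level directories, keeping cache_path itself small.
    return cache_path / h.substr(0, 3) / h / std::to_string(offset);
}
```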
Set column names parsed from the path to lower case in the case-insensitive case.
This is for Iceberg columns taken from the path: Iceberg columns are case sensitive,
which may cause errors for tables with partitions.
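A minimal sketch of the normalization using the standard library (where exactly the real reader applies it is not shown here):
```
#include <algorithm>
#include <cctype>
#include <string>

// Lower-case a column name parsed from the partition path so it matches
// case-insensitive lookups.
std::string normalize_path_column(std::string name) {
    std::transform(name.begin(), name.end(), name.begin(),
                   [](unsigned char c) { return std::tolower(c); });
    return name;
}
```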
Currently we send one thrift message per fragment instance. However, many of these messages are identical across the instances of a fragment, so this PR extracts the shared part so that it only needs to be sent once per fragment.
The previous logic returned as many cn (compute nodes) as possible. Instead,
if the number of cn is less than expectBeNum, we need to fill in with mix nodes,
until the number of selected nodes equals expectBeNum or the mix nodes are also used up, as sketched below.
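A hypothetical sketch of the selection; `Node` stands in for the real backend descriptor type and `expect_be_num` mirrors expectBeNum:
```
#include <cstddef>
#include <vector>

// Pick compute nodes first, then top up with mix nodes until the target
// count is met or both lists are exhausted.
template <typename Node>
std::vector<Node> select_nodes(const std::vector<Node>& cn_nodes,
                               const std::vector<Node>& mix_nodes,
                               size_t expect_be_num) {
    std::vector<Node> picked;
    for (const auto& n : cn_nodes) {
        if (picked.size() >= expect_be_num) return picked;
        picked.push_back(n);
    }
    // Not enough cn: fill in with mix nodes.
    for (const auto& n : mix_nodes) {
        if (picked.size() >= expect_be_num) break;
        picked.push_back(n);
    }
    return picked;
}
```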
To support schema evolution, Iceberg adds schema information to Parquet file metadata.
But early Iceberg versions do not write any schema information to the Parquet file.
This PR supports reading Parquet files without schema information.
Currently we reuse a buffer pool for broadcast shuffle on the pipeline engine. This PR ensures that a pipeline whose sink is a broadcast shuffle will not be scheduled if there are no available buffers in the buffer pool.
1. change mv rewrite from bottom-up to top-down
2. compatible with old version mv
3. restore some ut codes (but disabled)
4. fix some ut introduced by [fix](planner)fix bug for missing slot #16601 and [Feature](Materialized-View) support multiple slot on one column in materialized view #16378
Currently, inserting {1, 'a'} into struct<f1:tinyint, f2:varchar(20)> is not supported.
This commit adds implicit casts for the struct type, so the char type inside a struct is implicitly cast to varchar.
In the `Build Third Party Libraries (Linux)` job, some errors occur due to package conflicts. This PR fixes these errors by skipping the command `apt upgrade`.
```
Unpacking odbcinst1debian2:amd64 (2.3.11) ...
dpkg: error processing archive /tmp/apt-dpkg-install-SY6NPA/43-odbcinst1debian2_2.3.11_amd64.deb (--unpack):
trying to overwrite '/usr/lib/x86_64-linux-gnu/libodbcinst.so.2.0.0', which is also in package libodbcinst2:amd64 2.3.9-5
dpkg-deb: error: paste subprocess was killed by signal (Broken pipe)
```