To support schema evolution, Iceberg add schema information to Parquet file metadata.
But for early iceberg version, it doesn't write any schema information to Parquet file.
This PR is to support read parquet without schema information.
Now we reuse buffer pool for broadcast shuffle on pipeline engine. This PR ensures that a pipeline with a broadcast shuffle sink will not be scheduled if there are no available buffer in the buffer pool
1.change mv rewrite from bottom up to up bottom
2.compatible with old version mv
3.restore some ut codes (but disable)
4. fix some ut introduced by [fix](planner)fix bug for missing slot #16601 and [Feature](Materialized-View) support multiple slot on one column in materialized view #16378
Currently not support insert {1, 'a'} into struct<f1:tinyint, f2:varchar(20)>
This commit will support implicitly cast the char type in the struct to varchar.
Add implicitly cast for struct-type.
In `Build Third Party Libraries (Linux)` job, some errors occur due to the package conflicts. This PR fixes these errors by skipping the command `apt upgrade`.
```
Unpacking odbcinst1debian2:amd64 (2.3.11) ...
dpkg: error processing archive /tmp/apt-dpkg-install-SY6NPA/43-odbcinst1debian2_2.3.11_amd64.deb (--unpack):
trying to overwrite '/usr/lib/x86_64-linux-gnu/libodbcinst.so.2.0.0', which is also in package libodbcinst2:amd64 2.3.9-5
dpkg-deb: error: paste subprocess was killed by signal (Broken pipe)1
```
colocated join is depended on if the both side of the join conjuncts are simple column with same distribution policy etc. So the key is to figure out the original source column in scan node if there is one. To do that, we should check the slot from both lhs and rhs of outputSmap in join node.
In current implementation, the class Auth is used for:
Manager all authentication and authorization info such as user, role, password, privileges.
Provide an interface for privilege checking
Some user may want to integrate external access management system such as Apache Ranger.
So we should provide a way to let user set their own access controller.
This PR mainly changes:
A new class SystemAccessController
This access controller is used to check the global level privileges and resource privileges.
A new interface CatalogAccessController
This interface is used to check catalog/database/tbl level privileges.
It has a default implements InternalCatalogAccessController.
All privilege checking methods are moved from Auth to either SystemAccessController or
InternalCatalogAccessController
A new class AccessControllerManager
This is the entry point of privilege authentication. All methods previously called from Auth
now are called from AccessControllerManager
Now, user can implement the interface CatalogAccessController to use their own access controller.
And when creating external catalog, user can specified the access controller class name, so that
different external catalog can use different access controller.
Support IPV6 in Apache Doris, the main changes are:
1. enable binding to IPV6 address if network priority in config file contains an IPV6 CIDR string
2. BRPC and HTTP support binding to IPV6 address
3. BRPC and HTTP support visiting IPV6 Services
Hive 1.x may write orc file with internal column name (_col0, _col1, _col2...).
This will cause query result be NULL because column name in orc file doesn't match
with column name in Doris table schema. This pr is to support query Hive orc files with internal column names.
For now, we haven't see any problem in Parquet file, will send new pr to fix parquet if any problem show up in the future.
* [improvement](block exception safe) make block queue exception safe
This is part of exception safe: #16366.
---------
Co-authored-by: yiguolei <yiguolei@gmail.com>
In our current logic, index page will be pre-decoded but it will return OK
as index page use BinaryPlainPageBuilder and first 4 bytes of the page is a offset
so it's high probablility not equal to EncodingTypePB::DICT_ENCODING which
is 5.
Code in bitshuffle_page_pre_decode.h
```
if constexpr (USED_IN_DICT_ENCODING) {
auto type = decode_fixed32_le((const uint8_t*)&data.data[0]);
if (static_cast<EncodingTypePB>(type) != EncodingTypePB::DICT_ENCODING) {
return Status::OK();
}
size_of_dict_header = BINARY_DICT_PAGE_HEADER_SIZE;
data.remove_prefix(4);
}
```
But if type just equal to EncodingTypePB::DICT_ENCODING and then it will use
BitShuffle to decode BinaryPlainPage, which will leads to an fatal error.