The following error occurs when using routine load:
```
[INTERNAL_ERROR]Message at offset XXX might be too large to fetch, try increasing receive.message.max.bytes
```
According to https://github.com/confluentinc/librdkafka/issues/2993, we should upgrade librdkafka to avoid this bug.
When `ENABLE_PCH = false`, the `BOOL` definition in `include/sqltypes.h` conflicts with `BOOL` in `include/arrow/type_fwd.h`.
The ODBC table will be deprecated in 2.1, so I simply undefined `BOOL` in `include/sqltypes.h`
so that the build succeeds.
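A minimal sketch of why the `#undef` helps. Everything here is a stand-in: the macro body, the `demo` namespace, and `bool_kind_id` are assumptions for illustration, not the real contents of `sqltypes.h` or `type_fwd.h`.

```cpp
// Stand-in for include/sqltypes.h. ASSUMPTION: the header defines BOOL as a
// preprocessor macro; the exact definition is not shown in this PR.
#define BOOL int

// The workaround: remove the macro before any later header is parsed.
#undef BOOL

// Stand-in for an Arrow-style header that uses BOOL as an enumerator name.
// Without the #undef above, the preprocessor would rewrite this line to
// "enum Kind { NA, int, INT32 };", which does not compile.
namespace demo {
struct Type {
    enum Kind { NA, BOOL, INT32 };
};
}  // namespace demo

// Hypothetical helper so the enumerator is actually used.
int bool_kind_id() { return static_cast<int>(demo::Type::BOOL); }
```

Removing the macro is only safe because nothing in Doris is expected to rely on the ODBC `BOOL` after the header is included.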
Why upgrade? Is anything wrong?
This tries to fix the problem with `opentelemetry::v1::ext::http::client::curl::HttpOperation::Send()`. I have updated the PR description.
If we use Clang-16 to build the third-party libraries and build doris_be_test against them, doris_be_test cannot run successfully: errors occur in BRPC.
I tested this on Linux (x86_64) and macOS (x86_64/arm64), and the errors were always raised.
This PR ports the codebase to Clang-16.
Upgrade some third-party libraries:
1. Apache BRPC: 1.2.0 -> 1.4.0 (Some bugs are fixed and all patches for 1.2.0 can be removed.)
2. Boost: 1.73.0 -> 1.81.0 (Porting to Clang-16)
3. libclucene: 2.4.6 -> 2.4.8 (Porting to Clang-16)
Fix the conflicting name TCHAR in ODBC's sqltypes.h and CLucene's clucene-config.h.
Change TCHAR to TWCHAR in ODBC's sqltypes.h, because ODBC's TCHAR is not used anywhere in Doris,
whereas CLucene's TCHAR is used in too many places to rename.
thirdparty/installed/include/sqltypes.h:
`typedef char TCHAR;`
thirdparty/installed/include/CLucene/clucene-config.h:
`typedef wchar_t TCHAR;`
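The effect of the rename can be sketched in a single translation unit. The two typedefs below stand in for the patched ODBC header and the CLucene header; `tchar_length` is a hypothetical helper added only to show `TCHAR` in use.

```cpp
#include <cstddef>
#include <type_traits>

// Stand-in for the patched sqltypes.h: the ODBC alias is renamed to TWCHAR.
typedef char TWCHAR;

// Stand-in for CLucene/clucene-config.h: TCHAR keeps its wide-character meaning.
typedef wchar_t TCHAR;

// Before the rename, both headers declared TCHAR with different types, which
// is a redefinition error. After it, the two aliases coexist.
static_assert(std::is_same<TCHAR, wchar_t>::value, "TCHAR stays wchar_t for CLucene");
static_assert(std::is_same<TWCHAR, char>::value, "TWCHAR carries the old ODBC type");

// Hypothetical helper: length of a NUL-terminated wide string typed as TCHAR.
std::size_t tchar_length(const TCHAR* s) {
    std::size_t n = 0;
    while (s[n] != L'\0') {
        ++n;
    }
    return n;
}
```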
Upgrade simdjson from 1.0.2 to the latest version, 3.0.1, so that the -mlzcnt compiler flag no longer causes BE UT failures on macOS.
simdjson is currently used only by VJsonScanner and is disabled by default, so the impact of the upgrade is limited.
ORC NextStripeReader currently supports reading columns only by index, but it is hard to obtain column indices for complex types.
We patch the ORC adapter to support reading columns by column name.
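To illustrate what selecting by name saves the caller, here is a rough sketch of a name-to-index lookup over a flat list of top-level fields. `resolve_column_indices` is hypothetical and is not the patched adapter API; it also ignores nested fields, which are exactly the case where computing indices by hand becomes difficult.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper: translate requested column names into positions in a
// flat list of top-level field names. Returns false if any name is unknown.
bool resolve_column_indices(const std::vector<std::string>& schema_fields,
                            const std::vector<std::string>& wanted,
                            std::vector<int>* indices) {
    std::unordered_map<std::string, int> position;
    for (std::size_t i = 0; i < schema_fields.size(); ++i) {
        position.emplace(schema_fields[i], static_cast<int>(i));
    }
    indices->clear();
    for (const std::string& name : wanted) {
        auto it = position.find(name);
        if (it == position.end()) {
            return false;  // the caller asked for a column the schema lacks
        }
        indices->push_back(it->second);
    }
    return true;
}
```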
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
hyperscan is a high-performance regular expression matching library, but it cannot be used on ARM.
vectorscan is an ARM port of hyperscan and can be used as a drop-in replacement.
Since hyperscan was originally created by Intel and is popular and mature on x86, we use vectorscan
only for aarch64 when building the third-party libraries.
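The per-architecture choice amounts to the following rule, sketched here as a small function. `regex_library_for` and its string argument are hypothetical; the real selection is made while building the third-party libraries, not in C++ code.

```cpp
#include <string>

// Hypothetical helper: pick the regex library for a given machine name
// (for example, the value reported by `uname -m`).
// vectorscan is chosen only for aarch64; everything else keeps hyperscan.
std::string regex_library_for(const std::string& machine) {
    if (machine == "aarch64") {
        return "vectorscan";
    }
    return "hyperscan";
}
```

Because vectorscan is a drop-in replacement, code that uses the library should not need to change with the architecture.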
Add a new function to the Arrow adapter that returns the raw ORC reader, from which we can get more information,
such as offsets or min/max values.
This will be used in #1046.
This modification is inspired by ClickHouse.
Currently, the libhdfs3 library integrated into the Doris BE does not support accessing clusters with Kerberos authentication
enabled, and we found that the Kerberos-related dependencies (gsasl and krb5) were not added when building libhdfs3.
This PR therefore enables Kerberos support and rebuilds libhdfs3 with the gsasl and krb5 dependencies:
- gsasl version: 1.8.0
- krb5 version: 1.19