This PR refactors how data is written to JDBC External Table & JDBC Catalog. It mainly includes the following tasks:
1. Continuing the work of @BePPPower's PR #18594: instead of splicing INSERT SQL strings, data is now written by operating on off-heap memory and calling `preparedStatement.set`.
2. Add write support for the largeint type, mainly by adapting to `java.math.BigInteger`, which uses binary operations.
3. Delete the SQL-splicing logic from the write-related code of JDBC External Table & JDBC Catalog.
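As a rough illustration of the largeint handling in item 2, the sketch below decodes a 128-bit value from raw bytes and reorders it for a big-endian consumer. It assumes the off-heap buffer holds the value as 16 little-endian two's-complement bytes, which this PR does not state; the function names are hypothetical. `java.math.BigInteger(byte[])` expects big-endian two's-complement bytes, which is why a byte reversal appears here.

```python
LARGEINT_BYTES = 16  # a largeint is a 128-bit integer


def decode_largeint(raw: bytes) -> int:
    """Decode 16 little-endian two's-complement bytes (assumed layout) into an int."""
    if len(raw) != LARGEINT_BYTES:
        raise ValueError(f"expected {LARGEINT_BYTES} bytes, got {len(raw)}")
    return int.from_bytes(raw, byteorder="little", signed=True)


def to_big_endian(raw: bytes) -> bytes:
    """Reverse the byte order to get the big-endian layout BigInteger(byte[]) reads."""
    if len(raw) != LARGEINT_BYTES:
        raise ValueError(f"expected {LARGEINT_BYTES} bytes, got {len(raw)}")
    return raw[::-1]
```

The decoded value can then be bound as a statement parameter instead of being formatted into the SQL text.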
TODO: binary types, such as bit, binary, blob...
Finally, special thanks to @BePPPower and @AshinGau for their work.
Co-authored-by: Tiewei Fang <43782773+BePPPower@users.noreply.github.com>
For an external Hive table, `required_slots` in `TFileScanRangeParams params` may be updated after `FileQueryScanNode` is finalized. For text files, we need to use the original `required_slots` in `params` so that the list can still be updated later. Otherwise, querying a text file may fail with the following error:
[INTERNAL_ERROR]Unknown source slot descriptor, slot_id=3
The test query covers the conversion of string types to other types and the processing of materialized columns in nested subqueries. It serves as the regression test for the bug fix (#18783).
Currently, compaction is executed separately for each backend, and the reconstruction of the index during compaction leads to high CPU usage. To address this, we are introducing single replica compaction, where a specific primary replica is selected to perform compaction, and the remaining replicas fetch the compaction results from the primary replica.
The Backend (BE) requests replica information for all peers corresponding to a tablet from the Frontend (FE). This information includes the host where the replica is located and the replica_id. By calculating hash(replica_id), the replica with the smallest hash value is responsible for executing compaction, while the remaining replicas are responsible for fetching the compaction results from this replica.
The compaction task producer thread, before submitting a compaction task, checks whether the local replica should fetch from its peer. If it should, the task is then submitted to the single replica compaction thread pool.
When performing single replica compaction, the process begins by requesting rowset versions from the target replica. These rowset_versions are then compared with the local rowset versions. The first version that can be fetched is selected.
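The replica selection described above can be sketched as follows. This is only an illustration: the hash function the BE actually uses is not given in this description, so SHA-256 is a stand-in, and all function names are hypothetical. The point is that every peer computes the same answer from the same replica list without coordinating.

```python
import hashlib


def stable_hash(replica_id: int) -> int:
    # Assumption: any deterministic hash works for illustration;
    # the real hash used by the BE is not stated in this description.
    digest = hashlib.sha256(str(replica_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def pick_compaction_replica(replica_ids):
    """Return the replica with the smallest hash; it runs the compaction."""
    if not replica_ids:
        raise ValueError("tablet has no replicas")
    # Tie-break on the id itself so the result is fully deterministic.
    return min(replica_ids, key=lambda rid: (stable_hash(rid), rid))


def should_fetch_from_peer(local_replica_id, replica_ids):
    """True if the local replica should fetch results instead of compacting."""
    return pick_compaction_replica(replica_ids) != local_replica_id
```

A producer thread could call `should_fetch_from_peer` before submitting a task and route the task to the single replica compaction pool when it returns true.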
After discussion in the Doris community (@apache/doris-committers), a PR can be merged only after at least two people approve it.
We can run this on a trial basis first, and if the feedback is good, we can make it a mandatory check.
Since a merge must already be approved by at least one committer, we only need to check whether there are two approvals, and we don't need to care about the identity of the approvers.
When changes are requested by a committer, that committer must dismiss the request before the PR can be merged. GitHub enforces this, so we don't need to handle it.
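The two-approval rule can be sketched as below. The input shape (a list of dicts with `user` and `state`, ordered oldest first) is a hypothetical simplification, not the check's real implementation; the state names follow those GitHub uses for pull request reviews. Treating a comment-only review as not changing a reviewer's earlier verdict is an assumption of this sketch.

```python
def count_approvals(reviews):
    """Count reviewers whose latest verdict is APPROVED.

    `reviews` is assumed to be ordered oldest first.
    """
    latest = {}
    for review in reviews:
        if review["state"] == "COMMENTED":
            # Assumption: a comment-only review keeps the earlier verdict.
            continue
        latest[review["user"]] = review["state"]
    return sum(1 for state in latest.values() if state == "APPROVED")


def can_merge(reviews, required=2):
    """True once at least `required` distinct reviewers approve."""
    return count_approvals(reviews) >= required
```

Counting per reviewer rather than per review means one person approving twice still counts once.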
before:
F0530 11:02:41.989699 1154607 assert_cast.h:54] Bad cast from type:doris::vectorized::IDataType const* to doris::vectorized::DataTypeAggState const*
after:
F0530 11:24:28.390286 1292475 assert_cast.h:46] Bad cast from type:doris::vectorized::DataTypeNullable* to doris::vectorized::DataTypeAggState const*
## Problem summary
When pushing a filter through a union, we should check whether any of the union's children is a `OneRowRelation`. If so, the filter must not be pushed down into those children.
Before this PR
```
mysql> select * from (select 1 as a, 2 as b union all select 3, 3) t where a = 1;
+------+------+
| a | b |
+------+------+
| 1 | 2 |
| 3 | 3 |
+------+------+
2 rows in set (0.01 sec)
```
After this PR
```
mysql> select * from (select 1 as a, 2 as b union all select 3, 3) t where a = 1;
+------+------+
| a | b |
+------+------+
| 1 | 2 |
+------+------+
1 row in set (0.38 sec)
```
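The rule from the problem summary can be sketched with a toy plan representation. None of the classes below are the planner's real types; they are hypothetical stand-ins. The sketch assumes that when a `OneRowRelation` child is present, the original filter is kept above the union so those rows are still filtered, while other children receive a pushed-down copy.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class OneRowRelation:
    row: dict


@dataclass
class Scan:
    rows: list


@dataclass
class Filter:
    predicate: Callable
    child: object


@dataclass
class UnionAll:
    children: list


def push_filter_through_union(plan):
    """Push a filter into union children, skipping OneRowRelation children."""
    if not isinstance(plan, Filter) or not isinstance(plan.child, UnionAll):
        return plan
    children = plan.child.children
    keep_top_filter = any(isinstance(c, OneRowRelation) for c in children)
    pushed = [
        c if isinstance(c, OneRowRelation) else Filter(plan.predicate, c)
        for c in children
    ]
    new_union = UnionAll(pushed)
    # Assumption: the filter stays on top when some child was skipped.
    return Filter(plan.predicate, new_union) if keep_top_filter else new_union


def execute(plan):
    """Evaluate a toy plan into a list of rows."""
    if isinstance(plan, OneRowRelation):
        return [plan.row]
    if isinstance(plan, Scan):
        return list(plan.rows)
    if isinstance(plan, Filter):
        return [r for r in execute(plan.child) if plan.predicate(r)]
    if isinstance(plan, UnionAll):
        return [r for c in plan.children for r in execute(c)]
    raise TypeError(f"unknown plan node: {plan!r}")
```

Running the rewritten plan for the query above (two one-row children, predicate `a = 1`) yields only the row with `a = 1`, matching the "After this PR" output.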
1. Fix a replay bug when creating a catalog with a resource.
If a user creates a catalog using `create catalog hive with resource xxx`, then when the edit log is replayed,
the resource may already have been dropped, causing an NPE, and the FE will fail to start.
This PR adds a new FE config `disallow_create_catalog_with_resource`, which defaults to true,
so that `with resource` is no longer allowed. It will be deprecated later.
The replay bug is also fixed to avoid the NPE.
2. Fix an issue when creating 2 hive catalogs, one connecting with and one without kerberos authentication.
When a user creates 2 hive catalogs, one using simple auth and the other using kerberos auth,
the query may fail with an error like: `Server asks us to fall back to SIMPLE auth, but this client is configured to only allow secure connections.`
So this PR adds a default property for hive catalogs: `"ipc.client.fallback-to-simple-auth-allowed" = "true"`.
This property is added automatically when a user creates a hive catalog, to avoid this problem.
3. Fix an issue with calling `hdfsExists()`
When `hdfsExists()` returns a non-zero code, we should check whether it hit an error or the file was simply not found.
4. Some code refactoring
Avoid importing `org.apache.parquet.Strings`
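The default property from item 2 amounts to merging a set of defaults under the user's own properties. A minimal sketch, with hypothetical names, and assuming that an explicit user value takes precedence over the default (the description does not say which one wins):

```python
HIVE_CATALOG_DEFAULTS = {
    "ipc.client.fallback-to-simple-auth-allowed": "true",
}


def with_hive_defaults(user_props):
    """Return catalog properties with defaults filled in for missing keys."""
    merged = dict(HIVE_CATALOG_DEFAULTS)
    # Assumption: values the user set explicitly override the defaults.
    merged.update(user_props)
    return merged
```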
Doris updated the array type output format to use double quotes for strings.
Previously it used single quotes, so the case out files need to be updated to use double quotes.
before
mysql [test]>select cast(1 as DECIMALV3(16, 2)) / cast(3 as DECIMALV3(16, 2));
+-----------------------------------------------------------+
| CAST(1 AS DECIMALV3(16, 2)) / CAST(3 AS DECIMALV3(16, 2)) |
+-----------------------------------------------------------+
| 0.00 |
+-----------------------------------------------------------+
mysql [test]>select * from divtest;
+------+------+
| id | val |
+------+------+
| 3 | 5.00 |
| 2 | 4.00 |
| 1 | 3.00 |
+------+------+
mysql [test]>select cast(1 as decimalv3(16,2)) / val from divtest;
+-------------------------------------+
| CAST(1 AS DECIMALV3(16, 2)) / `val` |
+-------------------------------------+
| 0 |
| 0 |
| 0 |
+-------------------------------------+
after
mysql [test]>select cast(1 as DECIMALV3(16, 2)) / cast(3 as DECIMALV3(16, 2));
+-----------------------------------------------------------+
| CAST(1 AS DECIMALV3(16, 2)) / CAST(3 AS DECIMALV3(16, 2)) |
+-----------------------------------------------------------+
| 0.33 |
+-----------------------------------------------------------+
mysql [test]>select cast(1 as decimalv3(16,2)) / val from divtest;
+-------------------------------------+
| CAST(1 AS DECIMALV3(16, 2)) / `val` |
+-------------------------------------+
| 0.250000 |
| 0.200000 |
| 0.333333 |
+-------------------------------------+
This is because in the previous code, the constant 1.000 would be transformed into 1.
remove "ReduceType
reproduce:
./run-regression-test.sh --run -suiteParallel 1 -actionParallel 1 -parallel 1 -d query_p0/sql_functions/window_functions
select /*+ SET_VAR(query_timeout = 600) */ subq_0.`c1` as c0 from (select ref_1.`s_name` as c0, ref_1.`s_suppkey` as c1, ref_1.`s_address` as c2, ref_1.`s_address` as c3 from regression_test_query_p0_sql_functions_window_functions.tpch_tiny_supplier as ref_1 where (ref_1.`s_name` is NULL) or (ref_1.`s_acctbal` is not NULL)) as subq_0 where (subq_0.`c3` is NULL) or (subq_0.`c2` is not NULL)
reason:
`FunctionIsNull` and `FunctionIsNotNull` return a const column from `execute`, but `VectorizedFnCall::is_constant` returns false for them, which breaks const handling in `VCompoundPred::execute`.
This PR converts the const column to a full column in `VCompoundPred::execute`. A more thorough solution to this class of problems is planned for the future.
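The idea of the workaround can be sketched as follows: expand any const column to a full column before combining operands row by row. The types below are hypothetical stand-ins for the BE's column classes, and the sketch ignores SQL NULL handling for brevity.

```python
from dataclasses import dataclass


@dataclass
class ConstColumn:
    """One value logically repeated `size` times."""
    value: bool
    size: int


def to_full_column(column):
    """Materialize a const column; copy a full column unchanged."""
    if isinstance(column, ConstColumn):
        return [column.value] * column.size
    return list(column)


def compound_or(left, right):
    """Row-wise OR that no longer cares whether an input was const."""
    lhs = to_full_column(left)
    rhs = to_full_column(right)
    if len(lhs) != len(rhs):
        raise ValueError("operand columns differ in row count")
    return [a or b for a, b in zip(lhs, rhs)]
```

Materializing costs extra memory per block, which is consistent with the note that a more thorough fix is intended later.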
* [Improve](performance) introduce SchemaCache to cache TabletSchema & Schema
1. When the system is under high-concurrency load with wide-table point queries, the frequent memory allocation and deallocation of Schema becomes an evident system bottleneck. Additionally, the initialization of TabletSchema and Schema becomes a CPU hotspot. Therefore, a SchemaCache is introduced to cache these objects for reuse.
2. Wrap some variables with std::unique_ptr
Performance:
| Status | QPS | Average response time (avg) | P99 response time |
|------------------|-----|------------------|-------------|
| SchemaCache enabled | 501 | 20ms | 34ms |
| SchemaCache disabled | 321 | 31ms | 61ms |
* handle schema change with schema version
* remove useless header
* rebase
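
The SchemaCache idea from the first item can be sketched as a small cache that builds a schema once and reuses it. Everything here is an assumption for illustration: the key `(index_id, schema_version)`, the LRU eviction policy, and the class and method names are not taken from the actual implementation.

```python
from collections import OrderedDict


class SchemaCache:
    """Toy LRU cache that reuses built schema objects."""

    def __init__(self, capacity, build_schema):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._capacity = capacity
        self._build_schema = build_schema  # the expensive construction step
        self._entries = OrderedDict()
        self.builds = 0

    def get(self, index_id, schema_version):
        # Assumption: including the version in the key means a schema
        # change produces a new entry instead of reusing a stale one.
        key = (index_id, schema_version)
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as most recently used
            return self._entries[key]
        schema = self._build_schema(index_id, schema_version)
        self.builds += 1
        self._entries[key] = schema
        if len(self._entries) > self._capacity:
            self._entries.popitem(last=False)  # evict least recently used
        return schema
```

Repeated point queries against the same table version then share one schema object instead of allocating and initializing a new one per query.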