pick (#35692)
In the initial version, JdbcExecutor directly used UdfRuntimeException,
which could lead to the exception being misinterpreted. Therefore, I
created a dedicated exception class for JdbcExecutor so that the source
of an exception can be identified more clearly.
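As a hedged sketch (the actual class name and constructors in the PR may differ), such a dedicated exception could look like this:
```java
// Hypothetical sketch; the PR's actual class name and constructors may differ.
public class JdbcExecutorException extends RuntimeException {
    public JdbcExecutorException(String message) {
        super(message);
    }

    public JdbcExecutorException(String message, Throwable cause) {
        super(message, cause);
    }
}
```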
pick (#37608)
Catch AbstractMethodError in the getColumnValue method and provide a
clear error message suggesting the use of ojdbc8 or a higher version to
avoid compatibility issues.
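A minimal sketch of the catch site (the real method signature and message in JdbcExecutor may differ; this only illustrates the technique):
```java
import java.sql.ResultSet;
import java.sql.SQLException;

public class ColumnValueSketch {
    // Sketch only: the real getColumnValue in JdbcExecutor has a different shape.
    public static Object getColumnValue(ResultSet rs, int columnIndex) throws SQLException {
        try {
            return rs.getObject(columnIndex);
        } catch (AbstractMethodError e) {
            // Older Oracle drivers do not implement some newer JDBC methods,
            // which surfaces as AbstractMethodError rather than SQLException.
            throw new SQLException(
                    "Driver does not implement this method; please use ojdbc8 or higher.", e);
        }
    }
}
```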
pick (#36720)
In many cases, we found that users run a large number of queries through
the JDBC Catalog, for which the previous maximum of 10 connections was
insufficient, so I raised the default to 30, which covers most needs.
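For illustration, overriding this default per catalog might look like the following (hedged: the property name `connection_pool_max_size` is an assumption about the parameter involved; verify against your Doris version's JDBC Catalog documentation):
```java
import java.util.HashMap;
import java.util.Map;

public class CatalogPropsSketch {
    // Assumption: the pool-size property is named connection_pool_max_size.
    public static Map<String, String> props() {
        Map<String, String> p = new HashMap<>();
        p.put("connection_pool_max_size", "30"); // the new default described above
        return p;
    }
}
```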
backport: #35690
`PropertyConverter.setS3FsAccess` has added customized S3 credential providers:
```java
public static final List<String> AWS_CREDENTIALS_PROVIDERS = Arrays.asList(
        DataLakeAWSCredentialsProvider.class.getName(),
        TemporaryAWSCredentialsProvider.class.getName(),
        SimpleAWSCredentialsProvider.class.getName(),
        EnvironmentVariableCredentialsProvider.class.getName(),
        IAMInstanceCredentialsProvider.class.getName());
```
These providers are set as the value of
`fs.s3a.aws.credentials.provider`, which is used as the configuration
to build the S3 reader in the JNI readers. However,
`DataLakeAWSCredentialsProvider` lives in `fe-core`, which the JNI
readers do not depend on, so we have to move the S3 providers to `fe-common`.
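A minimal sketch of the wiring described above (the comma-joined list and the `fs.s3a.aws.credentials.provider` key are standard S3A usage; the surrounding method is an assumption, not the PR's actual code):
```java
import org.apache.hadoop.conf.Configuration;

public class S3ConfSketch {
    // Sketch: join the provider class names and set them on the Hadoop
    // configuration handed to the JNI readers when building the S3 reader.
    public static Configuration buildS3Conf() {
        Configuration conf = new Configuration();
        conf.set("fs.s3a.aws.credentials.provider",
                String.join(",", PropertyConverter.AWS_CREDENTIALS_PROVIDERS));
        return conf;
    }
}
```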
Issue Number: close #35024
This bug occurs because the FE incorrectly sets the update time of the
Paimon catalog, so the BE cannot refresh Paimon's schema in time. The
relevant cache-invalidation logic:
```java
private void initTable() {
    PaimonTableCacheKey key = new PaimonTableCacheKey(ctlId, dbId, tblId,
            paimonOptionParams, dbName, tblName);
    TableExt tableExt = PaimonTableCache.getTable(key);
    // If the cached table predates the catalog's last update time,
    // invalidate it and reload so schema changes are picked up.
    if (tableExt.getCreateTime() < lastUpdateTime) {
        LOG.warn("invalidate cache table:{}, localTime:{}, remoteTime:{}", key,
                tableExt.getCreateTime(), lastUpdateTime);
        PaimonTableCache.invalidateTableCache(key);
        tableExt = PaimonTableCache.getTable(key);
    }
    this.table = tableExt.getTable();
    paimonAllFieldNames = PaimonScannerUtils.fieldNames(this.table.rowType());
    if (LOG.isDebugEnabled()) {
        LOG.debug("paimonAllFieldNames:{}", paimonAllFieldNames);
    }
}
```
* Adapt paimon 0.6.0 (#33943)
Version 2.0.0 of the shade package eliminates potential jar conflicts, resolves dependency component issues, and significantly reduces package size.
Use the directly-dependent Guava library instead of relying on transitively included libraries.
* [chore](dependencies) Upgrade paimon to 0.7.0 (#33987)
---------
Co-authored-by: Calvin Kirs <kirs@apache.org>
In order to support Paimon with Hive 2, we need to modify the original HiveMetastoreClient.java
to make it compatible with both Hive 2 and Hive 3.
This modified HiveMetastoreClient must be at the front of the CLASSPATH, so that
it overrides the HiveMetastoreClient in the Hadoop jar.
This PR mainly changes:
1. Copy HiveMetastoreClient.java in FE to BE's preload jar.
2. Split the original `preload-extensions-jar-with-dependencies.jar` into 2 jars:
   1. `preload-extensions-project.jar`, which contains the modified HiveMetastoreClient.
   2. `preload-extensions-jar-with-dependencies.jar`, which contains the other dependency jars.
3. Modify `start_be.sh` so that `preload-extensions-project.jar` is loaded first.
4. Change the way the JNI scanner jar is assembled.
   Only the project jar needs to be assembled, without other dependencies,
   because we actually only use classes under the `org.apache.doris` package.
   Removing the unused dependency jars also reduces the BE output size.
5. Fix a bug where the prefix of Paimon properties should be `paimon.`, not `paimon` (see the sketch after this list).
6. Support Paimon with Hive 2.
   Users can set `hive.version` in the Paimon catalog properties to specify the Hive version.
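For item 5, a hedged sketch of the prefix handling (class and method names are illustrative, not the PR's actual code):
```java
import java.util.HashMap;
import java.util.Map;

public class PaimonPropertySketch {
    private static final String PAIMON_PREFIX = "paimon."; // the trailing dot matters

    // Pass through only keys carrying the `paimon.` prefix, stripped of it.
    // With the buggy prefix "paimon" (no dot), an unrelated key such as
    // "paimonize" would have matched and been forwarded incorrectly.
    public static Map<String, String> extractPaimonProps(Map<String, String> catalogProps) {
        Map<String, String> out = new HashMap<>();
        for (Map.Entry<String, String> e : catalogProps.entrySet()) {
            if (e.getKey().startsWith(PAIMON_PREFIX)) {
                out.put(e.getKey().substring(PAIMON_PREFIX.length()), e.getValue());
            }
        }
        return out;
    }
}
```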
* [improvement](jdbc catalog) Optimize the closing logic of Jdbc connection after abort
* fix
This PR makes the following changes to the connection pool of the JDBC Catalog (a hedged sketch of these knobs follows the list):
1. Set the maximum connection survival time; the default is 30 minutes.
   - One-half of the maximum survival time is the recycle threshold.
   - One-tenth is the check interval for recycling connections.
2. Keepalive only takes effect on the connection pool on the BE, and is triggered at one-fifth of the maximum survival time.
3. The maximum number of connections is changed from 100 to 10.
4. Add a connection cache recycling thread on the BE, plus a parameter to control the recycling time; the default is 28800 seconds (8 hours).
5. Add the catalog ID to the key of the connection pool cache for better isolation; a catalog refresh is required for this to take effect.
6. Upgrade the Druid connection pool to version 1.2.20.
7. Add default-parameter handling to JdbcResource when upgrading the FE version, to avoid errors caused by unset parameters.
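A minimal Druid sketch of how these settings map onto standard pool knobs (the setters are standard Druid API, but treating this as the PR's exact wiring is an assumption; Doris-side parameter names are not shown):
```java
import com.alibaba.druid.pool.DruidDataSource;

public class PoolSketch {
    // Sketch only: maps the list above onto standard Druid knobs.
    public static DruidDataSource build(String jdbcUrl) {
        long maxLifeMs = 30L * 60 * 1000;                    // 1. 30-minute max survival time
        DruidDataSource ds = new DruidDataSource();
        ds.setUrl(jdbcUrl);
        ds.setMaxActive(10);                                 // 3. max connections: 100 -> 10
        ds.setPhyTimeoutMillis(maxLifeMs);                   // hard cap on a connection's lifetime
        ds.setMinEvictableIdleTimeMillis(maxLifeMs / 2);     // recyclable after half the lifetime
        ds.setTimeBetweenEvictionRunsMillis(maxLifeMs / 10); // eviction check every tenth
        ds.setKeepAlive(true);                               // 2. keepalive (BE-side pool only)
        ds.setKeepAliveBetweenTimeMillis(maxLifeMs / 5);     // triggered at a fifth of the lifetime
        return ds;
    }
}
```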
This PR proposes mapping external catalog JSON types to String instead of JsonB in Apache Doris. This change is motivated by the realization that JDBC retrieves JSON data as a JSON string, regardless of its storage format (Json(String) or Json(Binary)). Mapping to String streamlines data retrieval, simplifies write-backs, and ensures compatibility with all JSON(String) and JSON(Binary) functions, despite potentially misleading displays of JSON data as Strings in Doris. This approach avoids the performance overhead and complexity of converting each row of data from JsonB to String, making the process more efficient and elegant.
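A minimal sketch of why the String mapping is cheap (this is standard JDBC usage; the table and column names are hypothetical):
```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class JsonReadSketch {
    // Whether the source stores Json(String) or Json(Binary), the JDBC driver
    // hands the value back as a string, so mapping the external type to String
    // needs no per-row JsonB conversion.
    static void readJson(Connection conn) throws SQLException {
        try (Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT json_col FROM example_table")) {
            while (rs.next()) {
                String json = rs.getString("json_col");
                System.out.println(json);
            }
        }
    }
}
```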
About Upgrade
To ensure query compatibility with existing catalogs after the upgrade, we currently still retain the ability to query external JSON types as JSONB. However, once you upgrade to the new version and either refresh the catalog or create a new one, all external JSON types will be treated as Strings. To ensure consistent behavior, and because support for querying JSON as JSONB may be removed in the future, it is highly recommended that you manually refresh your catalogs as soon as possible after upgrading to the new version.