Commit Graph

54 Commits

Author SHA1 Message Date
7bd6818350 [branch-2.1][improvement](jdbc catalog) Added support for Oracle Raw type (#37776)
pick (#37078)
In previous versions, we adopted the strategy of reading the object
address for Oracle's raw type, which would lead to unstable and
meaningless results. Here I changed it to read hexadecimal or UTF8
2024-07-15 14:43:05 +08:00
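
A minimal sketch of the two renderings the message above mentions for a RAW value, hexadecimal and UTF-8. The class and method names are hypothetical and this is not the Doris implementation; how the actual code chooses between the two forms is not stated here.

```java
import java.nio.charset.StandardCharsets;

public class RawValueSketch {
    // Render raw bytes as an uppercase hexadecimal string.
    public static String toHex(byte[] raw) {
        StringBuilder sb = new StringBuilder(raw.length * 2);
        for (byte b : raw) {
            sb.append(String.format("%02X", b));
        }
        return sb.toString();
    }

    // Decode raw bytes as UTF-8 text.
    public static String toUtf8(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] raw = {0x44, 0x6F, 0x72, 0x69, 0x73};
        System.out.println(toHex(raw));  // prints 446F726973
        System.out.println(toUtf8(raw)); // prints Doris
    }
}
```

Either form is stable across reads, unlike an object address, which is the problem the commit describes.
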
39ded1f649 [branch-2.1][improvement](jdbc catalog) Change JdbcExecutor's error reporting from UDF to JDBC (#37635)
pick (#35692)

In the initial version, JdbcExecutor directly used UdfRuntimeException,
which could lead to misunderstanding of the exception. Therefore, I
created a separate Exception for JdbcExecutor to help us view the
exception more clearly.
2024-07-11 15:11:41 +08:00
ef754487d9 [branch-2.1][improvement](jdbc catalog) Catch AbstractMethodError in getColumnValue Method and Suggest Updating to ojdbc8+ (#37634)
pick (#37608)

Catch AbstractMethodError in getColumnValue method. Provide a clear
error message suggesting the use of ojdbc8 or higher versions to avoid
compatibility issues.
2024-07-11 15:10:47 +08:00
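
A hedged sketch of the pattern described above: catch `AbstractMethodError` around a column read and rethrow with a message that points at the driver version. `DriverCompatSketch`, `readColumn`, and the use of `IllegalStateException` are assumptions for illustration, not the names used in JdbcExecutor.

```java
import java.util.function.Supplier;

public class DriverCompatSketch {
    // Run a column read and translate AbstractMethodError, which an old
    // driver can raise for JDBC methods it does not implement, into a
    // clearer error that suggests upgrading the driver.
    public static <T> T readColumn(Supplier<T> columnReader) {
        try {
            return columnReader.get();
        } catch (AbstractMethodError e) {
            throw new IllegalStateException(
                "The JDBC driver does not implement a required method; "
                    + "please use ojdbc8 or a higher version.", e);
        }
    }

    public static void main(String[] args) {
        try {
            // Simulate an old driver that lacks the method being called.
            readColumn(() -> { throw new AbstractMethodError("getObject"); });
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Keeping the original error as the cause preserves the stack trace for debugging while the message tells the user what to change.
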
62c4451c97 [branch-2.1][improvement](jdbc catalog) Modify the maximum number of connections in the connection pool to 30 by default (#37023)
pick (#36720)

In many cases, we found that users would use JDBC Catalog to perform a
large number of queries, which resulted in the maximum of 10 connections
being insufficient, so I adjusted it to 30, which covered most needs.
2024-07-01 12:22:20 +08:00
e25b0d7c37 [branch-2.1][improvement](mysql catalog) disable mysql AbandonedConnectionCleanupThread (#36970)
pick (#36655)
2024-06-29 18:35:41 +08:00
2e20e38523 [improvement](jdbc catalog) remove useless jdbc catalog code (#34986) (#35418) 2024-05-27 14:25:26 +08:00
3a5fb6265a [refactor](jdbc catalog) split trino jdbc executor (#34932) (#35176)
pick #34932
2024-05-22 19:09:57 +08:00
05a390e050 [refactor](jdbc catalog) split oceanbase jdbc executor (#34869) (#35175)
pick #34869
2024-05-22 19:09:35 +08:00
24990383ff [refactor](jdbc catalog) split clickhouse jdbc executor (#34794) (#35174)
pick master #34794
2024-05-22 19:09:05 +08:00
22d4543346 [refactor](jdbc catalog) split sap_hana jdbc executor (#34772) 2024-05-13 22:36:52 +08:00
fcbdd77f78 [fix](jdbc catalog) Fix ClassLoader Scope in JdbcExecutor Initialization (#34620) 2024-05-10 11:24:39 +08:00
3495ed58e0 [Enhancement](jdbc catalog) Change Jdbc connection pool to hikari (#34045) (#34310) 2024-04-29 20:22:48 +08:00
f6af79c0ed [fix](catalog) Remove unexpected cleanup when reading jdbc data (#33529) 2024-04-17 23:42:12 +08:00
9c6180d9ba [revert](jni) revert part of #32455 #32904 2024-03-27 20:45:44 +08:00
c0d7a5660e [fix](paimon) support paimon with hive2 (#32455)
In order to support paimon with hive2, we need to modify the original HiveMetastoreClient.java
to make it compatible with both hive2 and hive3.
This modified HiveMetastoreClient should be at the front of the CLASSPATH, so that
it takes precedence over the HiveMetastoreClient in the hadoop jar.

This PR mainly changes:

1. Copy HiveMetastoreClient.java in FE to BE's preload jar.

2. Split the original `preload-extensions-jar-with-dependencies.jar` into 2 jars
    1. `preload-extensions-project.jar`, which contains the modified HiveMetastoreClient.
    2. `preload-extensions-jar-with-dependencies.jar`, which contains other dependency jars.

3. Modify the `start_be.sh`, to let `preload-extensions-project.jar` be loaded first.

4. Change the way the jni scanner jar is assembled
    Only the project jar needs to be assembled, without other dependencies,
    because we only use classes under the `org.apache.doris` package.
    Removing the other unused dependency jars also reduces the output size of BE.

5. Fix a bug: the prefix of paimon properties should be `paimon.`, not `paimon`

6. Support paimon with hive2
    Users can set `hive.version` in paimon catalog properties to specify the hive version.
2024-03-26 15:31:07 +08:00
a10466598b [fix](jdbc catalog) Fix query errors without jdbc pool default value on only BE upgrade (#32618) 2024-03-22 16:36:22 +08:00
e952b5ef5b [opt](jdbc catalog) Refine the jdbc_connector close logic and actively clear the jvm occupied by jdbcexecutor (#32300) 2024-03-21 14:07:23 +08:00
85b2c42f76 [Enhancement](jdbc catalog) Add a property to test the connection when creating a Jdbc catalog (#32125) (#32531) 2024-03-21 14:05:59 +08:00
cf6b22c621 [fix](jdbc catalog) fix type conversion error in MySQL JDBC Driver 5.x (#31880) 2024-03-12 14:07:57 +08:00
0f1cbcc86a [refactor](jdbc catalog) split postgresql jdbc Executor (#31730) 2024-03-06 13:08:04 +08:00
2e9bd268cd [improvement](jdbc catalog) support sqlserver timestamp type read (#31805) 2024-03-06 13:08:04 +08:00
11903d29a1 [fix](jdbc catalog) fix close abort in sqlserver (#31718) 2024-03-06 13:07:49 +08:00
a5b9127656 [refactor](jdbc catalog) split sqlserver jdbc executor (#31679) 2024-03-06 13:04:29 +08:00
07224686ef [feature](jdbc catalog) support db2 jdbc catalog (#31627) 2024-03-01 14:19:28 +08:00
3c37fb085c [refactor](jdbc catalog) split jdbc executor for different data sources (step-1) (#31406) 2024-02-29 12:38:03 +08:00
18f9ca9242 [fix](jdbc catalog) Fix Resource Closing Logic After Connection Abort (#31219)
* [improvement](jdbc catalog) Optimize the closing logic of Jdbc connection after abort

* fix
2024-02-22 20:35:10 +08:00
4f8730d092 [improvement](jdbc catalog) Optimize connection pool parameter settings (#30588)
This PR makes the following changes to the connection pool of JDBC Catalog
1. Set the maximum connection survival time; the default is 30 minutes.

-   One-half of the maximum survival time is the recyclable time.
-   One-tenth is the check interval for recycling connections.

2. Keepalive only takes effect on the connection pool on BE, and will be activated based on one-fifth of the maximum survival time.
3. The maximum number of existing connections is changed from 100 to 10
4. Add the connection cache recycling thread on BE, and add a parameter to control the recycling time, the default is 28800 (8 hours)
5. Add CatalogID to the key of the connection pool cache to achieve better isolation; this requires refreshing the catalog to take effect
6. Upgrade druid connection pool to version 1.2.20
7. Add default parameter settings to JdbcResource when upgrading the FE version, to avoid errors due to unset parameters.
2024-02-03 20:26:03 +08:00
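
The timing ratios listed in this commit can be worked through as plain arithmetic. This is a sketch under the assumption that each derived value is a simple fraction of the maximum survival time in milliseconds; the class and method names are hypothetical and do not come from the Doris code.

```java
public class PoolTimingSketch {
    // Default maximum connection survival time: 30 minutes.
    public static final long DEFAULT_MAX_LIFETIME_MS = 30L * 60 * 1000;

    // One-half of the maximum survival time is the recyclable time.
    public static long recyclableTimeMs(long maxLifetimeMs) {
        return maxLifetimeMs / 2;
    }

    // One-tenth is the check interval for recycling connections.
    public static long checkIntervalMs(long maxLifetimeMs) {
        return maxLifetimeMs / 10;
    }

    // Keepalive is based on one-fifth of the maximum survival time.
    public static long keepaliveMs(long maxLifetimeMs) {
        return maxLifetimeMs / 5;
    }

    public static void main(String[] args) {
        long max = DEFAULT_MAX_LIFETIME_MS;
        System.out.println(recyclableTimeMs(max)); // prints 900000
        System.out.println(checkIntervalMs(max));  // prints 180000
        System.out.println(keepaliveMs(max));      // prints 360000
    }
}
```

With the 30-minute default this gives a 15-minute recyclable time, a 3-minute check interval, and a 6-minute keepalive basis.
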
bc03354be8 [improvement](jdbc catalog) Optimize the Close logic of JDBC client (#30236)
Optimize the Close logic of the JDBC client so that the Jdbc Catalog can correctly cancel the running query when the query is cancelled.
2024-01-23 13:22:14 +08:00
0ccd706a30 [Enhancement](Jdbc Catalog) Map Jdbc Catalog JSON Type to String for Improved Performance and Compatibility (#30035)
This PR proposes mapping external catalog JSON types to String instead of JsonB in Apache Doris. This change is motivated by the realization that JDBC retrieves JSON data as a JSON string, regardless of its storage format (Json(String) or Json(Binary)). Mapping to String streamlines data retrieval, simplifies write-backs, and ensures compatibility with all JSON(String) and JSON(Binary) functions, despite potentially misleading displays of JSON data as Strings in Doris. This approach avoids the performance overhead and complexity of converting each row of data from JsonB to String, making the process more efficient and elegant.

About Upgrade
To ensure query compatibility with existing Catalogs in the upgraded version, we currently still retain the capability to query external JSON types as JSONB. However, once you upgrade to the new version and either refresh the Catalog or create a new one, all external JSON types will be treated as Strings. To ensure consistent behavior, and because support for querying JSON as JSONB may be removed in the future, it is highly recommended that you manually refresh your Catalog as soon as possible after upgrading to the new version.
2024-01-18 12:03:07 +08:00
8fc9c18c85 [improvement](jdbc catalog) Put the jdbc connection pool parameters into catalog properties (#29195) 2024-01-12 11:40:28 +08:00
10623ad671 [improvement](jdbc catalog) Optimize connection pool caching logic (#28859)
In the old caching logic, we only used jdbcurl, user, and password as cache keys. This may cause the old connection to still be used when the jar package is replaced, so we should concatenate all the parameters required for the connection pool as the key.
2023-12-26 14:12:37 +08:00
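
A minimal sketch of the idea in this commit: build the pool cache key from every parameter that affects the pool, so that changing the driver jar yields a different key. The parameter names `driverUrl` and `driverClass` and the `|` separator are assumptions for illustration; the actual set of parameters used in Doris is not listed here.

```java
public class PoolCacheKeySketch {
    // Join all pool-relevant parameters into one key. With only jdbcUrl,
    // user, and password, two configurations that differ in the driver
    // jar would share a pool.
    public static String cacheKey(String jdbcUrl, String user, String password,
                                  String driverUrl, String driverClass) {
        return String.join("|", jdbcUrl, user, password, driverUrl, driverClass);
    }

    public static void main(String[] args) {
        String before = cacheKey("jdbc:mysql://host:3306/db", "u", "p",
                "mysql-connector-5.jar", "com.mysql.jdbc.Driver");
        String after = cacheKey("jdbc:mysql://host:3306/db", "u", "p",
                "mysql-connector-8.jar", "com.mysql.cj.jdbc.Driver");
        // Different driver jars now map to different cache entries.
        System.out.println(before.equals(after)); // prints false
    }
}
```
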
3e1e8d2ebe [fix](jdbc catalog) Fixed data conversion problem when all data is null (#28230) 2023-12-11 17:57:57 +08:00
df867a1531 [fix](catalog) Fix ClickHouse DataTime64 precision parsing (#26977) 2023-11-15 10:23:21 +08:00
2f32a721ee [refactor](jni) unified jni framework for jdbc catalog (#26317)
This commit overhauls the JDBC connector logic within our project, transitioning from the previous mechanism of fetching data through JNI calls for individual ResultSet items to a more efficient and unified approach using the VectorTable data structure.
2023-11-13 14:28:15 +08:00
8434389358 [fix](jdbc) fix clickhouse catalog arr nullable and add case (#26639) 2023-11-09 19:32:05 +08:00
47689fd452 [refactor](jni) unified jni framework for java udf (#25302)
Use the unified jni framework to refactor java udf.
The unified jni framework uses VectorTable as the container to transfer data between C++ and Java, and hides the details of data format conversion.
In addition, the unified framework supports complex and nested types.
The performance of basic types remains consistent, with a 30% improvement in string types and an order of magnitude improvement in complex types.
2023-10-18 09:27:54 +08:00
dbfacdc4af [improvement](jdbc catalog) Optimize Loop Performance by Caching isNebula Method Result (#24260) 2023-09-13 21:40:28 +08:00
f85da7d942 [improvement](jdbc) add profile for jdbc read and convert phase (#23962)
Add 2 metrics in jdbc scan node profile:
- `CallJniNextTime`: call get next from jdbc result set
- `ConvertBatchTime`: call convert jobject to column block

Also fix a potential concurrency issue when initializing the jdbc connection cache pool
2023-09-10 21:42:06 +08:00
228f0ac5bb [Feature](Multi-Catalog) support query doris bitmap column in external jdbc catalog (#23021) 2023-09-02 12:46:33 +08:00
5ba505ebf4 [fix](multi-catalog)fix avro and jdbc scanner dependency (#23015)
add preload-extensions module, put all conflicting dependencies into pom.xml in preload-extensions
2023-08-20 19:28:17 +08:00
221e7bdd17 [test](jdbc external) fix mysql and pg external regression test (#22998) 2023-08-16 10:44:47 +08:00
6f1c03c766 [fix](jdbc_catalog) fix int and bigint in mysql view when use doris catalog (#22251) 2023-07-27 16:50:42 +08:00
4f6a3c5bf0 [feature](catalog) support clob type in oracle jdbc catalog (#21532) 2023-07-27 15:49:15 +08:00
619a2857e1 [improvement](jdbc catalog) improve mysql jdbc catalog read bytea`s types & else improve (#22233) 2023-07-27 10:18:37 +08:00
9abf32324b [improvement](jdbc) add timestamp put to datev2 (#21680) 2023-07-26 09:10:34 +08:00
e8f4323e0f [Fix](jdbcCatalog) fix typo of some variable #22214 2023-07-26 08:34:45 +08:00
cf677b327b [fix](jdbc catalog) Fixed mappings with type errors for bool and tinyint(1) (#22089)
MySQL does not have a boolean type; its boolean type is actually tinyint(1). In the previous logic, we forced tinyint(1) to be a boolean by passing tinyInt1isBit=true, which causes an error if a tinyint(1) value is not 0 or 1. Therefore, we need to map tinyint(1) as tinyint instead of boolean. This change does not affect the correctness of where k = 1 or where k = true queries.
2023-07-25 22:45:22 +08:00
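
A small sketch of why a two-valued boolean cannot carry every tinyint(1) value. It only illustrates the information loss and does not reproduce the error raised by the MySQL driver or the Doris mapping code; the class and method names are hypothetical.

```java
public class TinyIntMappingSketch {
    // Pass a tinyint value through a boolean and back, as a boolean
    // mapping would have to do.
    public static byte roundTripThroughBoolean(byte value) {
        boolean asBool = value != 0;
        return (byte) (asBool ? 1 : 0);
    }

    // Keep the value as a tinyint, which is what the commit switches to.
    public static byte asTinyInt(byte value) {
        return value;
    }

    public static void main(String[] args) {
        byte stored = 2; // a legal tinyint(1) value that is neither 0 nor 1
        System.out.println(roundTripThroughBoolean(stored)); // prints 1
        System.out.println(asTinyInt(stored));               // prints 2
    }
}
```

Values 0 and 1 survive both paths unchanged, which is consistent with the note that `where k = 1` and `where k = true` queries are unaffected.
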
999fbdc802 [improvement](jdbc) add new type 'object' of int (#21681) 2023-07-25 21:29:46 +08:00
0be349e250 [feature](jdbc) Support jdbc catalog to read json types (#21341) 2023-07-10 16:21:00 +08:00
e4c0a0ac24 [improve](dependency)Upgrade dependency version (#21431)
exclude old netty version
upgrade spring-boot version to 2.7.13
use ojdbc8 to replace ojdbc6
upgrade jackson version to 2.15.2
upgrade fabric8 version to 6.7.2
2023-07-04 11:29:21 +08:00