doris

Author	SHA1	Message	Date
yongkang.zhong	3a22af836e	[fix](jdbc catalog) fix error to clickhouse uint64 type Conversion (#19463 ) * [fix](jdbc catalog) fix error to clickhouse uint64 type Conversion * add test case	2023-05-10 21:53:30 +08:00
yongkang.zhong	1bc405c06f	[fix](catalog) fix doris jdbc catalog largeint select error (#19407 ) When a Doris JDBC catalog is created with mysql-jdbc 5.1.47, a largeint column cannot be selected. When mysql-jdbc reads a largeint, it converts the value to a string because it is too long: mysql> select `largeint` from type3; ERROR 1105 (HY000): errCode = 2, detailMessage = (127.0.0.1)[INTERNAL_ERROR]Fail to convert jdbc type of java.lang.String to doris type LARGEINT on column: largeint. You need to check this column type between external table and doris table.	2023-05-09 17:34:48 +08:00
chenlinzhong	aeb3450151	[feature](graph)Support querying data from the Nebula graph database (#19209 ) Support querying data from the Nebula graph database. This feature comes from the needs of commercial customers who use both Doris and Nebula and want to connect the two databases. The changes mainly include: * Add a new graph database JDBC type * Adapt the types and map the graph to the Doris type	2023-05-09 15:30:11 +08:00
Calvin Kirs	5459cd9c30	[Improve](fe)Upgrade dependencies and optimize jar package management (#18882 ) bind netty-version to 4.1.89-final bind jettison to 1.5.4 upgrade hadoop version to 3.3.5 upgrade range-plugins-common to 2.4.0 bind bcprov-jdk15on to 2.4.0 upgrade and bind woodstox to 6.5.1 upgrade and bind kerby to 2.0.3 upgrade hudi to 0.13.0 upgrade parquet to 1.13.0 upgrade maven-source-plugin to 3.2.1 upgrade maven-assembly-plugin to 3.3.0 upgrade maven-javadoc-plugin to 3.3.2 upgrade maven-shade-plugin to 3.3.4 upgrade maven-clean-plugin to 3.1.0 Remove meaningless plugins Optimize doris maven path Unify the Java modules for management in fe	2023-05-04 10:07:37 +08:00
Tiewei Fang	8864266a42	[fix](Jdbc Catalog) fix Druid Pool parameter and set `testWhileIdle = true` (#19049 ) Set `testWhileIdle` for the druid pool to true	2023-04-26 11:44:45 +08:00
zhangstar333	fd905b66b0	[refactor](jdbc) close datasource if no need to maintain the cache (#18724 ) After PR #18670, JVM parameters can be used to init the jdbc datasource. When JDBC_MIN_POOL=0 is set, the datasource can be closed immediately; there is no need to wait for the recycling timer.	2023-04-22 22:07:34 +08:00
Tiewei Fang	13894ae790	[fix](jdbc catalog) Use default value if the user does not set the pool parameter in be.conf #18919	2023-04-22 08:39:26 +08:00
Calvin Kirs	575c1620c2	[Improve](fe)Use commons-lang3 uniformly and refactor PatternGenerator#generateTypePattern (#18666 ) `commons-lang` (1 and 2) has not been maintained since 2011, and the official recommendation is `commons-lang3`, which is a smooth upgrade that stays compatible with `commons-lang`. We use both dependencies in `fe`, and they can be completely unified. `PatternGenerator#generateTypePattern` has many meaningless loops, and IntegerRange is introduced for them, which is unnecessary. So I refactored it.	2023-04-17 20:15:17 +08:00
Ashin Gau	ddbff2aa39	[feature](jni) map c++ block to java vector table (#18566 ) PR(#17960) has introduced vector table which can map java table to c++ block. In some cases (java udf & jdbc executor), we should map c++ block to java table. This PR implements this function. The memory structure of java vector table and c++ block is consistent, so the implementation doesn't copy the block, just passes the memory address.	2023-04-17 00:04:53 +08:00
yongkang.zhong	afdac1204d	[improve](postgresql catalog) support postgresql bytea type to doris string (#18623 ) * [improve](postgresql catalog) support postgresql bytea type to doris string * modify function name * add case	2023-04-16 18:14:42 +08:00
zhangstar333	e1b3955e05	[refactor](jdbc) using jvm parameters to init jdbc datasource (#18670 ) Use JVM parameters to init the jdbc datasource connection pool. Anyone who does not need to keep connections alive can set JDBC_MIN_POOL=0.	2023-04-14 18:45:29 +08:00
zhangstar333	1d3699a70c	[refactor](jdbc) refactor jdbc connection num in datasource (#18563 ) Jdbc may currently hold too many connections without releasing them, so change the datasource properties: init = 1, min = 1, max = 100, and idle time is 10 minutes.	2023-04-13 22:08:08 +08:00
Calvin Kirs	5f981b0b1f	[fix](catalog)Use hive-catalog-shade to solve thrift version compatibility issues (#18504 ) `Hive 3` uses the `thrift-0.9.3` package, and `Doris` uses the `thrift-0.16.0` package. These two packages are not compatible, so we use the `hive-shade` package to manage hive dependencies in a unified way. This jar package renames the `thrift` classes, so the conflict is resolved.	2023-04-11 13:19:39 +08:00
morrySnow	e29fc3b46b	[fix](chore) fix compile failed in JdbcExecutor and revert #18306 since be crash randomly (#18371 ) fix 2 problems: 1. PR #18187 use the api resizeColumn in JNINativeMethod has been removed by #17960 2. revert PR #18306 to fix pipeline core when load	2023-04-04 20:04:28 +08:00
zhangstar333	54dbb4af67	[vectorzied](jdbc) refactor jdbc table read array type (#18187 ) When jdbc reads an array type, the result from Doris is a string, from PG is java.sql.Array, and from CK is java.lang.Object. This makes the code difficult to maintain and read, so convert every database's array result to a string, then add a cast function from string to the doris array type.	2023-04-04 11:57:04 +08:00
yongkang.zhong	fe9d2b00fc	[test](jdbc catalog) add clickhouse jdbc catalog base type test (#18007 )	2023-04-03 20:18:36 +08:00
yongkang.zhong	1c2f95b887	[improve](clickhouse jdbc) support clickhouse jdbc 4.x version (#18258 ) In clickhouse's 4.x version of jdbc, some UInt types use special Java types, so I adapted Doris's ClickHouse JDBC External ``` com.clickhouse.data.value.UnsignedByte; com.clickhouse.data.value.UnsignedInteger; com.clickhouse.data.value.UnsignedLong; com.clickhouse.data.value.UnsignedShort; ```	2023-03-31 13:40:10 +08:00
Ashin Gau	d6b0fe9072	[feature](jni) jni table scanner framework (#17960 ) A framework that read data from jni scanner, which can support the data source from java ecosystem(java API). ## Java Interface Java scanner should extends `org.apache.doris.jni.JniScanner`, implements the following methods: ``` // Initialize JniScanner public abstract void open() throws IOException; // Close JniScanner and release resources public abstract void close() throws IOException; // Scan data and save as vector table public abstract int getNext() throws IOException; ``` See demo usage in `org.apache.doris.jni.MockJniScanner` ## c++ interface C++ reader should use `doris::JniConnector` to get data from `org.apache.doris.jni.JniScanner`. See demo usage in `doris::MockJniReader`. ## Pushed-down predicates Java scanner can get pushed-down predicates by `org.apache.doris.jni.vec.ScanPredicate`. ## Remaining works: 1. Implement complex nested types. 2. Read hudi MOR table as the end-to-end demo usage.	2023-03-30 23:47:45 +08:00
Tiewei Fang	3e8b3d68fc	[BugFix](jdbc catalog) fix OOM when jdbc catalog querys large data from doris #18067 When using JDBC Catalog to query the Doris data, because Doris does not provide the cursor reading method (that is, fetchBatchSize is invalid), Doris will send the data to the client at one time, resulting in client OOM. The MySQL protocol provides a stream reading method. Doris can use this method to avoid OOM. The requirements of using the stream method are setting fetchbatchsize = Integer.MIN_VALUE and setting ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY	2023-03-26 20:02:03 +08:00
yongkang.zhong	e2e806a5e7	[improve](clickhouse jdbc) support clickhouse array type (#17993 ) In this PR, I match the array type of ClickHouse to the array type of Doris's jdbc external.	2023-03-22 19:42:32 +08:00
zhangstar333	e359e412e1	[vectorized](udaf) fix java udaf meet error of std::bad_alloc (#17848 ) If the user code of a java udaf throws an exception, nothing in the c++ code of the agg function handles it, so the query may fail with std::bad_alloc.	2023-03-19 11:52:15 +08:00
Pxl	1a549edac2	[Chore](third-party) upgrade thrift from 0.13 to 0.16 (#17202 ) upgrade thrift from 0.13 to 0.16 There is thrift's release notes https://github.com/apache/thrift/blob/master/CHANGES.md	2023-03-10 11:33:16 +08:00
zhangstar333	4ef46159ae	[vectorized](udaf) support array type for java-udaf (#17351 )	2023-03-09 11:30:07 +08:00
Calvin Kirs	d908d5fe01	[dependency](fe)Dependency Upgrade (#17377 ) * Upgrade log4j to 2.X - binding log4j version to 2.18.0 - use log4j-1.2-api to complete a smooth upgrade * Upgrade fileupload to 1.5 * Upgrade commons-io to 2.7 * Upgrade commons-compress to 1.22 * Upgrade gson to 2.8.9 * Upgrade guava to 30.0-jre * Binding jackson version to 2.14.2 * Upgrade netty-all to 4.1.89.final * Upgrade protobuf to 3.21.12 * Upgrade kafka-clients to 3.4.0 * Upgrade calcite version to 1.33.0 * Upgrade aws-java-sdk to 1.12.302 * Upgrade hadoop to 3.3.4 * Upgrade zookeeper to 3.4.14 * Binding tomcat-embed-core to 8.5.86 * Upgrade apache parent pom to 25 * Use hive-exec-core as a hive dependency, add the missing jar-hive-serde separately * Basic public dependencies are extracted to parent dependencies * Use jackson uniformly as the basic json tool * Remove springloaded, spring-boot-devtools has the same functionality * Modify the spark-related dependency scope to provided, which should be provided at runtime	2023-03-08 14:28:40 +08:00
Tiewei Fang	48c2d806d7	[enhencement](jdbc catalog) Use Druid instead of HikariCP in JdbcClient (#17395 ) This pr does three things: 1. Use Druid instead of HikariCP in JdbcClient 2. when download udf jar, add the name of the jar package after the local file name. 3. refactor some jdbcResource code	2023-03-07 08:51:10 +08:00
Tiewei Fang	c2cc75d741	[BugFix](Jdbc Catalog) Fix null pointer exception in JdbcExecutor (#16958 ) This PR does two things: 1. Fix: JdbcExecutor uses `column[0]` to judge the class type, but `column[0]` may be null. 2. Enhancement: in the original logic, all fields in a jdbc catalog table are set to Nullable. However, nullable fields are inefficient. We can actually learn through jdbc whether the fields in the data source table are nullable, so we can set the corresponding fields in the Doris jdbc catalog to nullable or not accordingly.	2023-02-23 14:04:54 +08:00
zhangstar333	dc3dab5a23	[vectorized](jdbc) fix jdbc connect sql server error (#16929 )	2023-02-22 19:36:27 +08:00
jakevin	54bf40b6e7	[feature](Nereids): Eliminate duplicate join condition. (#16910 )	2023-02-21 19:40:44 +08:00
zhangstar333	5291f14aff	[vectorized](udf) java udf support array type (#16841 )	2023-02-20 10:00:25 +08:00
zhangstar333	af5dc7565e	[bug](udf) fix udf return type of decimal check scale must is 9 (#16497 )	2023-02-14 10:53:53 +08:00
zhangstar333	b99e2dc727	[bug](jdbc) fix jdbc can't get object of PGobject (#16496 ) When a pg table has unsupported column types such as point, polygon, or jsonb, the jdbc catalog converts them to the string type in doris, but the object returned from the result set in java is org.postgresql.util.PGobject. Some tests need this pr: #16442	2023-02-10 16:19:02 +08:00
zhangstar333	458adf6c91	[improvement](jdbc) refator jdbc of copy result set by batch (#16337 ) Tested reads on a jdbc external table: 10%+ performance improvement after the optimization.	2023-02-04 22:51:55 +08:00
zhangstar333	253445ca46	[vectorzied](jdbc) fix jdbc executor for get result by batch and memo… (#15843 ) 1. result set should be fetched by batch size 2. fix memory leak	2023-01-21 08:22:22 +08:00
Gabriel	01c001e2ac	[refactor](javaudf) simplify UdfExecutor and UdafExecutor (#16050 ) * [refactor](javaudf) simplify UdfExecutor and UdafExecutor * update * update	2023-01-21 08:07:28 +08:00
Tiewei Fang	7814d2b651	[Fix](Oracle External Table) fix that oracle external table can not insert batch values (#16117 ) Issue Number: close #xxx This PR fixes two bugs: `_jdbc_scanner` may be nullptr in vjdbc_connector.cpp, so we use another method to count jdbc statistics. close [Enhencement](jdbc scanner) add profile for jdbc scanner #15914 In the batch insertion scenario, the oracle database does not support the syntax `insert into tables values (...),(...);`. What it supports is: `insert all into table(col1,col2) values(c1v1, c2v1) into table(col1,col2) values(c1v2, c2v2) SELECT 1 FROM DUAL;`	2023-01-21 07:57:12 +08:00
Tiewei Fang	1638936e3f	[fix](oracle catalog) oracle catalog support `TIMESTAMP` dateType of oracle (#16113 ) `TIMESTAMP` dateType of Oracle will map to `DateTime` dateType of Doris	2023-01-20 14:47:58 +08:00
Mingyu Chen	4035bd83c3	[fix](jdbc) fix jdbc driver bug and external datasource p2 test case issue (#16033 ) Fix the bug that creating a jdbc resource with only the jdbc driver file name fails the checksum. This is because we forgot to pass the full driver url to JdbcClient. Add ResultSet.FETCH_FORWARD and set AutoCommit to false on the jdbc connection to avoid OOM when fetching a large amount of data. Set useCursorFetch in the jdbc url for both MySQL and PostgreSQL. Fix some p2 external datasource bugs.	2023-01-18 17:48:06 +08:00
Mingyu Chen	4b49d05e97	[refactor](fe) remove type related class to fe-common to reduce java-udf jar size (#15808 )	2023-01-17 00:01:15 +08:00
Gabriel	2c9c7c48ac	[improvement](decimalv3) Java UDF and array type support DECIMALV3 (#15674 )	2023-01-09 15:13:16 +08:00
Tiewei Fang	df2da89b89	[feature](multi-catalog) support postgresql jdbc catalog (#15570 ) support postgresql jdbc catalog	2023-01-06 11:00:59 +08:00
zhangstar333	85c7c531f1	[vectorized](jdbc) support array type in jdbc external table (#15303 )	2022-12-30 00:29:08 +08:00
jiafeng.zhang	d48abd91df	[deps](fe)upgrade deps version (#15262 ) upgrade hadoop version to 2.10.2 jackson-databind to 2.14.1	2022-12-24 22:18:10 +08:00
jiafeng.zhang	e8bac706d3	[deps](FE)Upgrade the velocity version that hive-exec depends on to 2.3 (#15067 )	2022-12-19 14:20:11 +08:00
zhangstar333	17e14e9a63	[bug](udaf) fix java udaf incorrect get null value with row (#15151 )	2022-12-19 10:07:12 +08:00
zhangstar333	962810b973	[Vectorized](jdbc) add check type for jdbc table (#14501 )	2022-12-08 10:27:47 +08:00
zhangstar333	9d2cb133f2	[fix](jdbc) fix logger error of statusLogger unrecognized (#14854 ) * [fix](jdbc) fix logger error of statusLogger unrecognized * update	2022-12-07 11:43:05 +08:00
Tiewei Fang	9272680d00	[feature](multi-catalog) support Jdbc catalog (#14527 ) Issue Number: close #xxx This PR adds a jdbc catalog to the doris multi-catalog feature. Currently, the jdbc catalog only supports the MySQL DBMS. TODO: support PostgreSQL; support other databases. Problem summary: a jdbc catalog can be created like this: CREATE CATALOG jdbc4 PROPERTIES ( "type"="jdbc", "jdbc.user"="root", "jdbc.password"="123456", "jdbc.jdbc_url" = "jdbc:mysql://127.0.0.1:13396/demo?yearIsDateType=false", "jdbc.driver_url" = "file:/mnt/disk2/ftw/tools/jar/mysql-connector-java-5.1.47/mysql-connector-java-5.1.47.jar", "jdbc.driver_class" = "com.mysql.jdbc.Driver" ); Note: yearIsDateType is a jdbc parameter. If the yearIsDateType configuration property is set to false, then the returned object type is java.sql.Short. If set to true (the default), then the returned object is of type java.sql.Date with the date set to January 1st, at midnight. For compatibility with mysql, FE forces yearIsDateType=false: if the user sets yearIsDateType=true, doris FE changes it to yearIsDateType=false.	2022-11-30 11:28:08 +08:00
Tiewei Fang	36419fae48	[fix](JdbcExecutor) fix that JdbcExecutor did not load the class jar (#14598 ) JdbcExecutor did not load jdbc driver jar, so add classloader to load jdbc jar.	2022-11-26 23:53:05 +08:00
Gabriel	496a92b668	[JavaUDF](loader) Fix compatible problem for JAVA 11 (#14519 )	2022-11-23 23:36:39 +08:00
zy-kkk	ce489cf723	[Feature](JDBC)support clickhouse jdbc external table (#14244 )	2022-11-21 10:33:53 +08:00

74 Commits