Commit Graph

70 Commits

SHA1 Message Date
3e1e8d2ebe [fix](jdbc catalog) Fixed data conversion problem when all data is null (#28230) 2023-12-11 17:57:57 +08:00
1706699e7e [fix](multi-catalog)support the max compute partition prune (#27154)
1. Max Compute partition prune: previously we only supported filtering MC partitions by '=', which can match just one partition. To support multiple partition filters and range operators ('>', '<', '>=', ...), partition pruning needs to be supported.

2. Add Max Compute row count cache and partitionValues cache.

3. Add Max Compute regression case.
2023-12-01 22:28:26 +08:00
cd6c61347d [Feature](tvf)(avro-jni) avro-jni add projection push down (#26885) 2023-11-27 10:33:27 +08:00
add6bdb240 [fix](multi-catalog)add the max compute fe ut and fix download expired (#27007)
1. Add the Max Compute FE UT and fix download expiration.
2. Fix memory leak when the allocator closes.
3. Add correct partition row counts.
2023-11-20 10:42:07 +08:00
c459408580 [fix](jni) avoid BE crash and NPE when close paimon reader (#27129)
1. Do not use FATAL log when JNI encounters an error, to avoid crashing.
2. Fix NPE when closing PaimonReader; the reader may not be assigned if opening the PaimonReader failed.
2023-11-17 20:01:08 +08:00
df867a1531 [fix](catalog) Fix ClickHouse DataTime64 precision parsing (#26977) 2023-11-15 10:23:21 +08:00
2f32a721ee [refactor](jni) unified jni framework for jdbc catalog (#26317)
This commit overhauls the JDBC connector logic, transitioning from the previous mechanism of fetching data through JNI calls for individual ResultSet items to a more efficient and unified approach using the VectorTable data structure.
2023-11-13 14:28:15 +08:00
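To illustrate the batched transfer described in #26317, here is a minimal sketch in plain JDBC; the per-column buffering only shows the idea of handing over one columnar batch per crossing instead of one JNI call per ResultSet value, and is not the actual VectorTable implementation.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: collect rows into fixed-size per-column buffers so the
// native side can fetch a whole batch at once instead of value by value.
public class JdbcBatchFetcher {
    static final int BATCH_SIZE = 4096;

    public static List<long[]> fetchLongColumnBatches(String url, String sql) throws Exception {
        List<long[]> batches = new ArrayList<>();
        try (Connection conn = DriverManager.getConnection(url);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(sql)) {
            long[] buffer = new long[BATCH_SIZE];
            int n = 0;
            while (rs.next()) {
                buffer[n++] = rs.getLong(1);
                if (n == BATCH_SIZE) {          // batch full: hand it over as one unit
                    batches.add(buffer.clone());
                    n = 0;
                }
            }
            if (n > 0) {                        // flush the partial tail batch
                long[] tail = new long[n];
                System.arraycopy(buffer, 0, tail, 0, n);
                batches.add(tail);
            }
        }
        return batches;
    }
}
```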
8434389358 [fix](jdbc) fix clickhouse catalog arr nullable and add case (#26639) 2023-11-09 19:32:05 +08:00
22bf2889e5 [feature](tvf)(jni-avro)jni-avro scanner add complex data types (#26236)
Support Avro's enum, record, and union data types.
2023-11-09 13:58:49 +08:00
809510f8b2 [bug](udf) Fix method invoking (#26131) 2023-10-31 11:46:14 +08:00
267c11207b [feature](paimon)paimon catalog supports complex types (#25364) 2023-10-23 17:32:13 +08:00
a2ceea5951 [refactor](jni) unified jni framework for java udaf (#25591)
Follow https://github.com/apache/doris/pull/25302, and use the unified jni framework to refactor java udaf.
This PR has removed the old interfaces to run java udf/udaf. Thanks to the ease of use of the new framework, the core code for modifying UDAF does not exceed 100 lines, and the logic is similar to that of UDF.
2023-10-20 16:13:40 +08:00
47689fd452 [refactor](jni) unified jni framework for java udf (#25302)
Use the unified jni framework to refactor java udf.
The unified JNI framework uses VectorTable as the container to transfer data between C++ and Java, and hides the details of data format conversion.
In addition, the unified framework supports complex and nested types.
The performance of basic types remains consistent, with a 30% improvement in string types and an order of magnitude improvement in complex types.
2023-10-18 09:27:54 +08:00
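For context, a Java UDF itself stays small; a minimal sketch, assuming the usual single `evaluate`-method convention, is below. The framework described above moves its arguments and results through VectorTable batches rather than per-row JNI calls.

```java
// Minimal Java UDF sketch (class and semantics are illustrative only).
public class AddOneUdf {
    public Integer evaluate(Integer value) {
        if (value == null) {
            return null;          // NULL in, NULL out
        }
        return value + 1;
    }
}
```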
18c2a13e09 [fix](multi-catalog)fix maxcompute partition filter and session creation (#24911)
Add MaxCompute partition support.
Fix MaxCompute partition filter.
Modify MaxCompute session creation method.
2023-10-17 22:36:10 +08:00
ce18f1148a [improvement](catalog)compatible with paimon 0.5 (#24985)
Compatible with Paimon 0.5.
Add P0 cases for Paimon; need to set enablePaimonTest=true.
2023-10-17 22:07:13 +08:00
522faa8cd2 [fix](jni) the offset in map type is int64 (#25394)
The offset in a map type column is int64, but #24810 put it as int32, causing errors like:
2023-10-13 14:23:17 +08:00
4e8cde127c [Enhance](catalog)add table cache in paimon jni (#25014)
- Fix getting a stale schema after refreshing a Paimon table
- Add table cache in Paimon JNI
2023-10-08 10:36:18 +08:00
26818de9c8 [feature](jni) support complex types in jni framework (#24810)
Support complex types in the JNI framework, and successfully run end-to-end on Hudi.
### How to Use
Other scanners only need to implement three interfaces in `ColumnValue`:
```java
// Get array elements and append into values
void unpackArray(List<ColumnValue> values);

// Get map key array&value array, and append into keys&values
void unpackMap(List<ColumnValue> keys, List<ColumnValue> values);

// Get the struct fields specified by `structFieldIndex`, and append into values
void unpackStruct(List<Integer> structFieldIndex, List<ColumnValue> values);
```
Developers can take `HudiColumnValue` as an example.
2023-09-27 14:47:41 +08:00
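As a rough illustration of the three interfaces above, a value wrapper backed by plain Java collections might look like the following; the class and the way a struct is modeled are hypothetical, and `HudiColumnValue` remains the authoritative example.

```java
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: shows the shape of the three unpack* callbacks
// over plain Java objects; the real scanner implements ColumnValue.
public class SimpleColumnValue {
    private final Object inner;

    public SimpleColumnValue(Object inner) {
        this.inner = inner;
    }

    // Get array elements and append into values
    public void unpackArray(List<SimpleColumnValue> values) {
        for (Object element : (List<?>) inner) {
            values.add(new SimpleColumnValue(element));
        }
    }

    // Get map key array & value array, and append into keys & values
    public void unpackMap(List<SimpleColumnValue> keys, List<SimpleColumnValue> values) {
        for (Map.Entry<?, ?> entry : ((Map<?, ?>) inner).entrySet()) {
            keys.add(new SimpleColumnValue(entry.getKey()));
            values.add(new SimpleColumnValue(entry.getValue()));
        }
    }

    // Get the struct fields specified by structFieldIndex, and append into values
    public void unpackStruct(List<Integer> structFieldIndex, List<SimpleColumnValue> values) {
        List<?> fields = (List<?>) inner;   // struct modeled here as an ordered field list
        for (int idx : structFieldIndex) {
            values.add(new SimpleColumnValue(fields.get(idx)));
        }
    }
}
```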
1f8e0b48bc [fix](S3)delete main function because hardcoded ip is not safe (#24872) 2023-09-26 10:49:16 +08:00
c832e018d0 [Dependence](Fe)Upgrade Fe dependencies (#24606)
* BE scanner
  - Upgrade avro to 1.11.2
* FE
  - Upgrade quartz to 2.5.0-rc1
  - Upgrade maxcompute to 0.45-2-publish
  - Bind avro-ipc to 1.11.2
  - Bind hbase version to 2.5.5
  - Bind nimbusds version to 9.35
2023-09-22 10:14:42 +08:00
ee56783629 [fix](Java UDF) Do not use enum as the data type for JavaUdfDataType. (#24460) 2023-09-19 14:06:02 +08:00
4816ca6679 [fix](multi-catalog)fix mc decimal type parse, fix wrong obj location (#24242)
1. MC decimal type needs to be parsed correctly via the Arrow vector method.
2. Fix wrong object location when using OSS, OBS, or COSN.

Will add test case in another PR
2023-09-15 17:44:56 +08:00
dbfacdc4af [improvement](jdbc catalog) Optimize Loop Performance by Caching isNebula Method Result (#24260) 2023-09-13 21:40:28 +08:00
fca34ec337 [fix](multi-catalog)support bit type and hidden mc secret key (#24124)
Support Max Compute bit type and mask the MC secret key.
Bool type will use the bit Arrow vector.
The secret key should be masked: close #24019.
2023-09-12 10:36:48 +08:00
6e28d878b5 [fix](hudi) compatible with hudi spark configuration and support skip merge (#24067)
Fix three bugs:
1. A Hudi slice may have log files only, so `new Path(filePath)` will throw errors.
2. Hive column names are lowercase only, so column names are matched case-insensitively.
3. Compatible with [Spark Datasource Configs](https://hudi.apache.org/docs/configurations/#Read-Options), so users can add `hoodie.datasource.merge.type=skip_merge` in catalog properties to skip merging log files.
2023-09-11 19:54:59 +08:00
f85da7d942 [improvement](jdbc) add profile for jdbc read and convert phase (#23962)
Add 2 metrics in jdbc scan node profile:
- `CallJniNextTime`: time spent calling get-next on the JDBC result set
- `ConvertBatchTime`: time spent converting jobject to column block

Also fix a potential concurrency issue when initializing the JDBC connection cache pool.
2023-09-10 21:42:06 +08:00
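The two phases can be measured with plain timers; the sketch below is illustrative only (names and structure are assumptions, not the actual profile code), accumulating time for the get-next and convert steps separately.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

// Illustrative only: accumulate time spent in the "get next" and "convert"
// phases, mirroring what CallJniNextTime / ConvertBatchTime report.
public class JdbcScanTimers {
    private final AtomicLong callJniNextNanos = new AtomicLong();
    private final AtomicLong convertBatchNanos = new AtomicLong();

    public <T> T timeGetNext(Supplier<T> getNext) {
        long start = System.nanoTime();
        try {
            return getNext.get();
        } finally {
            callJniNextNanos.addAndGet(System.nanoTime() - start);
        }
    }

    public void timeConvert(Runnable convert) {
        long start = System.nanoTime();
        try {
            convert.run();
        } finally {
            convertBatchNanos.addAndGet(System.nanoTime() - start);
        }
    }

    public String report() {
        return String.format("CallJniNextTime=%dms, ConvertBatchTime=%dms",
                TimeUnit.NANOSECONDS.toMillis(callJniNextNanos.get()),
                TimeUnit.NANOSECONDS.toMillis(convertBatchNanos.get()));
    }
}
```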
13c9c41c1f [opt](hudi) reduce the memory usage of avro reader (#23745)
1. Reduce the number of threads reading avro logs and keep the readers in a fixed thread pool.
2. Regularly clean the cached resolvers in the thread-local map via reflection.
2023-09-05 23:59:23 +08:00
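A fixed pool like the one described in #23745 can be sketched with the standard executor API; the pool size and task type below are assumptions for illustration, not the values used in Doris.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch: cap the number of threads reading Avro log files with a fixed pool,
// instead of spawning a new reader thread per log file.
public class AvroLogReadPool {
    // Pool size is an assumption for illustration; the real value is a tuned config.
    private static final ExecutorService POOL = Executors.newFixedThreadPool(4);

    public static <T> List<Future<T>> submitAll(List<Callable<T>> readTasks) throws InterruptedException {
        // invokeAll blocks until every log-read task finishes or fails
        return POOL.invokeAll(readTasks);
    }
}
```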
228f0ac5bb [Feature](Multi-Catalog) support query doris bitmap column in external jdbc catalog (#23021) 2023-09-02 12:46:33 +08:00
96c4471b4a [feature](udf) udf array/map support decimal and update doc (#23560)
* update
* decimal
* update table name
* remove log
* add log
2023-08-31 07:44:18 +08:00
aef162ad4c [test](log) add some log in udf function when thrown exception (#23651)
2023-08-30 14:16:05 +08:00
f66f161017 [fix](multi-catalog)fix hive table with cosn location issue (#23409)
Sometimes the partitions of a Hive table may be on different storage, e.g., some on HDFS and others on object storage (COS, etc.).
This PR mainly changes:

1. Fix the bug of accessing files via COSN.
2. Add a new field `fs_name` in TFileRangeDesc.
    This is because, when accessing a file, the BE gets an HDFS client from the HDFS client cache, and different files in one query request may have different fs names, e.g., some are `hdfs://` and some are `cosn://`. So we need to specify the fs name for each file; otherwise it may return an error like:

`reason: IllegalArgumentException: Wrong FS: cosn://doris-build-1308700295/xxxx, expected: hdfs://172.xxxx:4007`
2023-08-26 00:16:00 +08:00
5ba505ebf4 [fix](multi-catalog)fix avro and jdbc scanner dependency (#23015)
Add a preload-extensions module and move all conflicting dependencies into the pom.xml of preload-extensions.
2023-08-20 19:28:17 +08:00
221e7bdd17 [test](jdbc external) fix mysql and pg external regression test (#22998) 2023-08-16 10:44:47 +08:00
fa6110accd [fix](catalog)paimon support more data type (#22899) 2023-08-14 13:48:33 +08:00
a089fe3e43 [Improve](jni-avro)Reduce the volume of the avro-scanner-jar package (#22276)
The avro-scanner-jar package is reduced from 204M to 160M.

Hadoop-related dependencies in the original Avro pom were packaged directly into the jar, resulting in a jar size of about 200M. Since a Hadoop jar environment already exists in BE lib, it can be referenced directly instead.
2023-08-11 17:26:14 +08:00
db69457576 [fix](avro)Fix S3 TVF avro format reading failure (#22199)
This PR fixes two issues:

1. When using the S3 TVF to query files in Avro format, due to a change of `TFileType`, the originally queried `FILE_S3` became `FILE_LOCAL`, causing the query to fail.
2. The parameters `s3.virtual.key` and `s3.virtual.bucket` are both removed, and a new `S3Utils` in jni-avro parses the bucket and key of S3.
The main purpose of this change is to unify the S3 parameters.
2023-08-11 17:22:48 +08:00
209f36f1bf [fix](multi-catalog)fix jdbc loader (#22814) 2023-08-11 14:36:19 +08:00
919bfd73f1 [improvement](multi-catalog)add scanner isolation class loader (#22247)
Add a scanner isolation class loader to make each plugin non-conflicting.
The BE gets scanner classes via JNI calls and uses JniClassLoader to load them.
In the previous version, we always got scanner classes from the system class path by default,
so the classes for each scanner could not be isolated.
2023-08-10 10:02:46 +08:00
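The isolation idea can be sketched with a per-plugin `URLClassLoader`; the directory layout and class names below are hypothetical and do not reflect the actual JniClassLoader implementation.

```java
import java.io.File;
import java.net.URL;
import java.net.URLClassLoader;

// Sketch: give each scanner plugin its own class loader rooted at its own jar
// directory, so two plugins can ship conflicting dependency versions.
public class ScannerClassLoaderFactory {
    public static ClassLoader forPlugin(File pluginDir) throws Exception {
        File[] jars = pluginDir.listFiles((dir, name) -> name.endsWith(".jar"));
        URL[] urls = new URL[jars == null ? 0 : jars.length];
        for (int i = 0; i < urls.length; i++) {
            urls[i] = jars[i].toURI().toURL();
        }
        // Parent is the platform loader, so plugin classes neither leak into
        // nor get shadowed by the system class path.
        return new URLClassLoader(urls, ClassLoader.getPlatformClassLoader());
    }

    public static Object newScanner(File pluginDir, String scannerClassName) throws Exception {
        Class<?> clazz = Class.forName(scannerClassName, true, forPlugin(pluginDir));
        return clazz.getDeclaredConstructor().newInstance();
    }
}
```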
768088c95e [refactor](udaf) refactor call udaf function and support map type in return (#22508) 2023-08-09 22:44:07 +08:00
ddd90855a9 [vectorized](udaf) java udaf support with map type (#22397)
* test
* remove some unused
* update
* add case
2023-08-02 15:03:44 +08:00
47c2cc5c74 [vectorized](udf) java udf support with return map type (#22300) 2023-07-29 12:52:27 +08:00
6f1c03c766 [fix](jdbc_catalog) fix int and bigint in mysql view when use doris catalog (#22251) 2023-07-27 16:50:42 +08:00
4f6a3c5bf0 [feature](catalog) support clob type in oracle jdbc catalog (#21532) 2023-07-27 15:49:15 +08:00
619a2857e1 [improvement](jdbc catalog) improve mysql jdbc catalog read bytea`s types & else improve (#22233) 2023-07-27 10:18:37 +08:00
4c4f08f805 [fix](hudi) the required fields are empty if only reading partition columns (#22187)
1. If only the partition columns are read, `JniConnector` will produce empty required fields, so `HudiJniScanner` should read at least the "_hoodie_record_key" field to know how many rows are in the current Hudi split. Even if `JniConnector` doesn't read this field, the call to `releaseTable` in `JniConnector` will reclaim the resources.

2. To prevent BE failure and exit, `JniConnector` should call release methods after `HudiJniScanner` is initialized. Note that `VectorTable` is created lazily in `JniScanner`, so we don't need to reclaim the resource when `HudiJniScanner` fails to initialize.

## Remaining work
Other JNI readers like `paimon` and `maxcompute` may encounter the same problems; each JNI reader needs to handle this abnormal situation on its own, and currently this fix can only ensure that BE will not exit.
2023-07-26 10:59:45 +08:00
9abf32324b [improvement](jdbc) add timestamp put to datev2 (#21680) 2023-07-26 09:10:34 +08:00
e8f4323e0f [Fix](jdbcCatalog) fix typo of some variable #22214 2023-07-26 08:34:45 +08:00
3414d1a61f [fix](hudi) table schema is not the same as parquet schema (#22186)
Upgrade hudi version from 0.13.0 to 0.13.1, and keep the hudi version of jni scanner the same as that of FE.
This may fix the bug where the table schema is not the same as the Parquet schema.
2023-07-26 00:29:53 +08:00
cf677b327b [fix](jdbc catalog) Fixed mappings with type errors for bool and tinyint(1) (#22089)
First of all, MySQL does not have a boolean type; its boolean type is actually tinyint(1). In the previous logic, we forced tinyint(1) to be a boolean by passing tinyInt1isBit=true, which causes an error if a tinyint(1) value is not 0 or 1. Therefore, we need to map tinyint(1) as tinyint instead of boolean, and this change will not affect the correctness of `where k = 1` or `where k = true` queries.
2023-07-25 22:45:22 +08:00
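For reference, `tinyInt1isBit` is a standard MySQL Connector/J URL property; the sketch below reads a tinyint(1) column as an integer with that flag disabled. The host, database, table, and column names are made up for illustration.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

// Sketch: with tinyInt1isBit=false, tinyint(1) values such as 2 or -3 come back
// as integers instead of being coerced to a boolean, matching the new mapping.
public class TinyIntExample {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:mysql://127.0.0.1:3306/demo?tinyInt1isBit=false";
        try (Connection conn = DriverManager.getConnection(url, "user", "pass");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT k FROM t_tiny")) {
            while (rs.next()) {
                int k = rs.getInt("k");   // read as tinyint, not boolean
                System.out.println(k);
            }
        }
    }
}
```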
999fbdc802 [improvement](jdbc) add new type 'object' of int (#21681) 2023-07-25 21:29:46 +08:00