Commit Graph

86 Commits

Author SHA1 Message Date
a3e2c6affe [fix](jdbc catalog) fix JdbcScanNode NOT CompoundPredicate filter expr handling errors (#28497) 2023-12-16 12:54:55 +08:00
01c94a554d [fix](autoinc) Fix broker load when target table has autoinc column (#28402) 2023-12-14 18:02:54 +08:00
c08ab9edc7 [feature](HiveCatalog) Support for getting hive meta data from relational databases under HMS (#28188) 2023-12-14 17:50:17 +08:00
dee89d2c4a [refactor](Nereids) let create table compatible with legacy planner (#28078) 2023-12-13 16:35:40 +08:00
3e1e8d2ebe [fix](jdbc catalog) Fixed data conversion problem when all data is null (#28230) 2023-12-11 17:57:57 +08:00
07336980f9 [fix](meta) show partitions with Limit for external HMS tables (27835) (#27835)
This enhancement shall extend existing logic for SHOW PARTITIONS FROM to include: -

Limit/Offset
Where [partition name only] [equal operator and like operator]
Order by [partition name only]
Issue Number: close #27834
2023-12-09 01:44:45 +08:00
baf85547ae [feature](jdbc) support call function to pass sql directly to jdbc catalog #26492
Support a new stmt in Nereids:
`CALL EXECUTE_STMT("jdbc", "stmt")`

So that we can pass the origin stmt directly to the datasource of a jdbc catalog.

show case:
```
mysql> select * from mysql_catalog.db1.tbl1;
+------+------+
| k1   | k2   |
+------+------+
|  111 | 222  |
+------+------+
1 row in set (0.63 sec)

mysql> call execute("mysql_catalog", "insert into db1.tbl1 values(1,'abc')");
Query OK, 0 rows affected (0.01 sec)

mysql> select * from mysql_catalog.db1.tbl1;
+------+------+
| k1   | k2   |
+------+------+
|  111 | 222  |
|    1 | abc  |
+------+------+
2 rows in set (0.03 sec)

mysql> call execute_stmt("mysql_catalog", "delete from db1.tbl1 where k1=111");
Query OK, 0 rows affected (0.01 sec)

mysql> select * from mysql_catalog.db1.tbl1;
+------+------+
| k1   | k2   |
+------+------+
|    1 | abc  |
+------+------+
1 row in set (0.03 sec)
```
2023-12-08 23:06:05 +08:00
81a0f8c041 [Feature](function) support generating const values from tvf numbers (#28051)
If specified, got a column of constant. otherwise an incremental series like it always be.

mysql> select * from numbers("number" = "5", "const_value" = "-123");
+--------+
| number |
+--------+
|   -123 |
|   -123 |
|   -123 |
|   -123 |
|   -123 |
+--------+
5 rows in set (0.11 sec)
2023-12-07 22:26:43 +08:00
8749e5208f [fix](jdbc catalog) fix insert into jdbc table column order (#27855) 2023-12-01 20:46:48 +08:00
60bc3be8a2 [Opt](Compression) Opt zstd block decompression by ZSTD_decompressDCtx(). (#27534)
Opt zstd block decompression by `ZSTD_decompressDCtx()` to replace streaming decompression.
It will improve performance but consume more memory. 

Test result: 
- env: 1 node(16 cores, 64G).
- parquet column: 100 million rows of char(255) column.
- result: 5.2 -> 4.6.
2023-12-01 09:10:32 +08:00
e4149c6e4c [Fix](parquet-reader) Fix null map issue in parquet reader. (#27777)
Fix null map issue in parquet reader which cause result incorrect such as `min()`, `max()`.

In order to share null map between parquet converted src column and dst column to avoid copying. It is very tricky that will call mutable function `doris_nullable_column->get_null_map_column_ptr()` which will set `_need_update_has_null = true`. Because some operations such as agg will call `has_null()` to set `_need_update_has_null = false`.
2023-11-30 13:55:37 +08:00
573f0eaad9 [fix](regression)fix parquet data page v2 unstable case (#27753) 2023-11-29 18:58:37 +08:00
d771f16b79 [fix](parquet)fix bug that can not read parquet data page v2 (#27655) 2023-11-28 22:43:46 +08:00
cd6c61347d [Feature](tvf)(avro-jni) avro-jni add projection push down (#26885) 2023-11-27 10:33:27 +08:00
cc395f5428 [Fix](hive-transactional-table) Fix NPE when query empty hive transactional table. (#27563) 2023-11-25 10:29:39 +08:00
b477839bce [enhancement](jdbc catalog) Add lowercase column name mapping to Jdbc data source & optimize database and table mapping (#27124)
This PR adds the processing of lowercase Column names in Oracle Jdbc Catalog. In the previous behavior, we changed all Oracle columns to uppercase queries by default, but could not handle the lowercase case. This PR can solve this situation and improve All Jdbc Catalog works
2023-11-17 23:51:47 +08:00
5dbc3cbba4 [test](information_schema)append information_schema external_table_p0 case. (#26846) 2023-11-15 14:30:16 +08:00
df867a1531 [fix](catalog) Fix ClickHouse DataTime64 precision parsing (#26977) 2023-11-15 10:23:21 +08:00
3585c7e216 [test](parquet)append parquet reader byte_array_decimal and rle_bool case (#26751) 2023-11-14 15:05:10 +08:00
a16517a061 [test](tvf) append tvf read hive_text file regression case. (#26790) 2023-11-14 15:03:19 +08:00
0a9d71ebd2 [Fix](Planner) fix varchar does not show real length (#25171)
Problem:
when we create table with datatype varchar(), we regard it to be max length by default. But when we desc, it does not show
real length but show varchar()
Reason:
when we upgrade version from 2.0.1 to 2.0.2, we support new feature of creating varchar(), and it shows the same way with
ddl schema. So user would confuse of the length of varchar
Solved:
change the showing of varchar() to varchar(65533), which in compatible with hive
2023-11-14 10:49:21 +08:00
2f32a721ee [refactor](jni) unified jni framework for jdbc catalog (#26317)
This commit overhauls the JDBC connector logic within our project, transitioning from the previous mechanism of fetching data through JNI calls for individual ResultSet items to a more efficient and unified approach using the VectorTable data structure.
2023-11-13 14:28:15 +08:00
7ce746654a [test](jdbc) add doris and sqlserver jdbc catalog test case (#26656) 2023-11-10 10:32:09 +08:00
49cffd0bc9 [fix](JdbcCatalog) fix that the predicate column name does not have back quote when querying the JDBC appearance (#26479) 2023-11-10 09:54:39 +08:00
8434389358 [fix](jdbc) fix clickhouse catalog arr nullable and add case (#26639) 2023-11-09 19:32:05 +08:00
22bf2889e5 [feature](tvf)(jni-avro)jni-avro scanner add complex data types (#26236)
Support avro's enum, record, union data types
2023-11-09 13:58:49 +08:00
57ed781bb6 [fix](regression-test) Add tvf regression tests (#26455) 2023-11-09 12:09:32 +08:00
d1438a8563 [Fix](orc-reader) Fix orc complex types when late materialization was turned on by disabling late materialization in this case. (#26548)
Fix orc complex types when late materialization was turned on in orc reader by disabling late materialization in this case.
2023-11-09 12:05:43 +08:00
f6b7046a6e [fix](regression-test) add tests for jdbc catalog (#26608) 2023-11-09 11:59:35 +08:00
5bcf6bfd46 [fix](jdbc catalog) fix mysql zero date (#26569) 2023-11-08 21:41:56 +08:00
7730a9025e [Fix](Regression-test) add test for tvf (#26322) 2023-11-03 19:07:07 +08:00
3e10e5af39 [Fix](Serde) Fix content displayed by complex types in MySQL Client (#25946)
This pr makes three changes to the display of complex types:
1. NULL value in complex types refers to being displayed as `null`, not `NULL`
2. struct type is displayed as "column_name": column_value
3. Time types such as `datetime` and `date`, are displayed with double quotes in complex types. like
    `{1, "2023-10-26 12:12:12"}`

This pr also do a code refactor:
1. nesting_level is set to a member variable of the `DataTypeSerDe`, rather than a parameter in methods.

What's more, this pr fix a bug that fileSize is not correct, introduced by this pr: #25854
2023-11-01 23:48:55 +08:00
fef520c617 [regression](catalog)Add test case of paimon complex type (#25834)
Add Paimon complex nested type regression case.
Related pr:#25364
2023-11-01 09:59:55 +08:00
bddb6b6ddc [test](jdbc) fix unstable group_concat distinct case (#26076)
The `group_concat` with `distinct` may return unstable result,
so I remove the distinct and add `order by` to make test case stable
2023-10-30 12:46:11 +08:00
99b45e1938 [fix](Outfile) Export DateTimev2 type of doris to ORC's TimeStamp type (#25470)
Previously,doris's `DateTimev2` was exported to orc as a `String` type.
Now, export doris's `DateTimev2` to orc timestamp type.
2023-10-29 15:59:38 +08:00
501c6096dd Revert "[Test](multi-catalog) Add tpcds sf100 hive shape. (#25639)" (#26069)
This reverts commit 3beba1764c01b6712b108556433c96429c59cc45.
2023-10-29 12:45:32 +08:00
3beba1764c [Test](multi-catalog) Add tpcds sf100 hive shape. (#25639)
Add tpcds sf100 hive shapes.

Disable query64 temporarily because it is not same with emr cluster after collecting metadata by analyze table xxx.
And the root cause need to analyze, will enable in future PR.
2023-10-27 18:39:29 +08:00
c86fad7cbd [Fix](orc-reader) Fix orc decimal128 scale issue. (#25977) 2023-10-26 08:50:18 -05:00
267c11207b [feature](paimon)paimon catalog supports complex types (#25364) 2023-10-23 17:32:13 +08:00
7de3d9882c [regresstion-test](jdbc catalog)Mariadb compatible test (#25664) 2023-10-23 11:51:03 +08:00
3225495233 [regression-test](export) Add some tests that use hive external table to read orc/parquet file exported by doris (#25431)
add some regression test:

1. Export Doris data to the orc/parquet file on HDFS with DORIS.
2. Create external table to read orc/parquet files on hive.
2023-10-18 09:59:15 +08:00
ce18f1148a [improvement](catalog)compatible with paimon 0.5 (#24985)
compatible with paimon 0.5
add p0 for paimon,need set enablePaimonTest=true
2023-10-17 22:07:13 +08:00
652d6c57c0 [fix](jdbc catalog) fix handle oracle date format (#25487) 2023-10-17 02:10:28 -05:00
a364a24ac2 [Enhance](regression) add hive out file check (#25475)
add hive out file check
fix hive sql state with " ; "
2023-10-17 10:11:57 +08:00
85b8497624 [fix](Tvf) return empty set when tvf queries an empty file or an error uri (#25280)
### Before:
return errors when tvf queries an empty file or an error uri:
1. get parsed schema failed, empty csv file
2. Can not get first file, please check uri.

### Now:
we just return empty set when tvf queries an empty file or an error uri.
```sql
mysql> select * from s3( 
"uri" = "https://error_uri/exp_1.csv", 
"s3.access_key"= "xx", 
"s3.secret_key" = "yy", 
"format" = "csv") limit 10;

Empty set (1.29 sec)
```
2023-10-17 09:52:53 +08:00
e94fbe169e [Enhance](regression) add hms catalog broker scan case (#25453) 2023-10-16 12:35:46 +08:00
c6824ce1ae [test](fix) unstable case test_jdbc_query_mysql (#25279) 2023-10-12 03:56:38 -05:00
ba87f7d3a3 [fix](pipelineX) add table sink and some fix in pipelineX (#25314) 2023-10-11 20:18:08 +08:00
977d119545 [fix](Insert select tvf) fix NPE because tvf do not have catalog name (#25149) 2023-10-09 18:02:43 +08:00
7e9ffad933 [fix](ES catalog)Doris cannot parse ES date field without time zone (#24864)
1. Add support for Doris to parse ES date field without time zone info. eg: `2023-04-17T23:01:18.151`, this time will be treated as UTC time, since ES assumes that the time zone for time fields without time zones is UTC.
2. Change local time zone convertion from system local time zone to session variable time zone.
2023-10-08 19:28:08 +08:00