70 Commits

Author SHA1 Message Date
07224686ef [feature](jdbc catalog) support db2 jdbc catalog (#31627) 2024-03-01 14:19:28 +08:00
44ba9e102c [feature](statistics)support statistics for iceberg/paimon/hudi table (#29868) 2024-01-18 12:03:07 +08:00
c08ab9edc7 [feature](HiveCatalog) Support for getting hive meta data from relational databases under HMS (#28188) 2023-12-14 17:50:17 +08:00
b096062680 [feature-wip](arrow-flight)(step6) Support regression test (#27847)
Design Documentation Linked to #25514

The regression test adds a new group, arrow_flight_sql:

./run-regression-test.sh -g arrow_flight_sql runs the regression test and uses jdbc:arrow-flight-sql to run all suites whose group contains arrow_flight_sql.
./run-regression-test.sh -g p0,arrow_flight_sql runs the regression test, uses jdbc:arrow-flight-sql to run all suites whose group contains arrow_flight_sql, and uses jdbc:mysql to run the other suites whose group contains p0 but not arrow_flight_sql.
Note that the query result formats of jdbc:arrow-flight-sql, jdbc:mysql, and the mysql client differ, for example:

Datetime field type: jdbc:mysql returns 2010-01-02T05:09:06, while the mysql client and jdbc:arrow-flight-sql both return 2010-01-02 05:09:06.
Array and Map field types: jdbc:mysql returns ["ab", "efg", null] and {"f1": 1, "f2": "a"}; jdbc:arrow-flight-sql returns ["ab","efg",null] and {"f1":1,"f2":"a"}, without the spaces.
Float field type: jdbc:mysql and the mysql client return 6.333, while jdbc:arrow-flight-sql returns 6.333000183105469, in query_p0/subquery/test_subquery.groovy.
If the query result is empty, jdbc:arrow-flight-sql returns empty and jdbc:mysql returns \N.
use database; and the query should be executed as two separate SQL statements where possible, otherwise the results may not be as expected. For example: USE information_schema; select cast ("0.0101031417" as datetime) returns 2000-01-01 03:14:1 (constant fold), while select cast ("0.0101031417" as datetime) on its own returns null (no constant fold).
In addition, Doris jdbc:arrow-flight-sql still has unfinished parts, such as:

Unsupported data type: Decimal256. INVALID_ARGUMENT: [INTERNAL_ERROR]Fail to convert block data to arrow data, error: [E3] write_column_to_arrow with type Decimal256
Unsupported null value of map key. INVALID_ARGUMENT: [INTERNAL_ERROR]Fail to convert block data to arrow data, error: [E33] Can not write null value of map key to arrow.
Unsupported data type: ARRAY<MAP<TEXT,TEXT>>
jdbc:arrow-flight-sql does not support connecting to a specific DB name, such as `jdbc:arrow-flight-sql://127.0.0.1:9090/{db_name}`. To stay compatible with regression-test, `use db_name` is added before all SQLs when `jdbc:arrow-flight-sql` runs the regression test.
select timediff("2010-01-01 01:00:00", "2010-01-02 01:00:00"); fails with java.lang.NumberFormatException: For input string: "-24:00:00"
2023-12-04 19:23:56 +08:00
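The result-format differences listed in b096062680 can be illustrated with a small Python sketch. It is not code from the commit or from the regression framework: the helper names are hypothetical, and the float helper assumes the extra digits come from a 32-bit float being widened to a double.

```python
import re
import struct


def normalize_datetime(value: str) -> str:
    """Rewrite 2010-01-02T05:09:06 (jdbc:mysql style) with a space separator."""
    return re.sub(r"^(\d{4}-\d{2}-\d{2})T(\d{2}:\d{2}:\d{2})", r"\1 \2", value)


def strip_separator_spaces(value: str) -> str:
    """Drop spaces that follow ',' or ':' outside double-quoted strings.

    Simplification: a quote preceded by a backslash is treated as escaped;
    more complex escaping is not handled.
    """
    out = []
    in_string = False
    prev = ""
    for ch in value:
        if ch == '"' and prev != "\\":
            in_string = not in_string
        if ch == " " and not in_string and prev in (",", ":"):
            continue  # skip the separator space, keep prev unchanged
        out.append(ch)
        prev = ch
    return "".join(out)


def widen_float32(value: float) -> float:
    """Round a value to 32-bit float precision, then read it back as a double."""
    return struct.unpack("f", struct.pack("f", value))[0]


if __name__ == "__main__":
    print(normalize_datetime("2010-01-02T05:09:06"))
    print(strip_separator_spaces('["ab", "efg", null]'))
    print(strip_separator_spaces('{"f1": 1, "f2": "a"}'))
    print(widen_float32(6.333))
```

Normalizing both sides to one text form before comparison is only one possible design; keeping separate expected-output files per driver would be another.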
6b8ec22436 exclude regression test workload_manager_p1 (#26736) 2023-11-14 17:55:42 +08:00
7b8709a944 [feature](doris compose) Support generate code coverage data (#26804) 2023-11-13 21:47:53 +08:00
40e430ca55 [regression](multi-catalog) add aliyun dlf hive on oss and huawei obs test case (#25650)
Add Aliyun DLF Hive on OSS and Huawei OBS test cases.
The OBS cases currently have some problems. They are not fixed in this PR, which only adds a comment.
2023-10-24 20:52:50 +08:00
10f1957379 [feature](docker)add docker-iceberg init tables (#25424)
Add some init tables for docker-iceberg.
2023-10-24 19:29:57 +08:00
7de3d9882c [regresstion-test](jdbc catalog)Mariadb compatible test (#25664) 2023-10-23 11:51:03 +08:00
18c2a13e09 [fix](multi-catalog)fix maxcompute partition filter and session creation (#24911)
add maxcompute partition support
fix maxcompute partition filter
modify maxcompute session create method
2023-10-17 22:36:10 +08:00
ce18f1148a [improvement](catalog)compatible with paimon 0.5 (#24985)
compatible with paimon 0.5
add p0 for paimon; need to set enablePaimonTest=true
2023-10-17 22:07:13 +08:00
73c3e3ab55 [Feature](x-load) support config min replica num for loading data (#21118) 2023-10-11 21:07:35 +08:00
b9496b2a8f [feature](docker) regression test support run suite in docker (#24508) 2023-09-24 21:44:16 +08:00
6f961ba0e9 [Enhance](external)add prepare hive data in case (#24703) 2023-09-22 11:19:46 +08:00
7e467c91d3 [test](regression) add routine load cases (#24194)
add routine load cases
2023-09-12 18:00:01 +08:00
5694f1b04b [fix](conf) revert changes in regression-conf.groovy in #23702 and fix BE compile error (#23788) 2023-09-02 22:31:49 +08:00
a6dff2faf0 [Feature](config) allow update multiple be configs in one request (#23702) 2023-09-02 14:26:54 +08:00
448b7755c6 [feature](jdbc catalog) support doris jdbc catalog array type (#23056) 2023-08-23 21:17:16 +08:00
fa6110accd [fix](catalog)paimon support more data type (#22899) 2023-08-14 13:48:33 +08:00
91b15183e7 [enhance][external]enhance and fix external cases 0807 (#22689)
enhance and fix external cases 0807
2023-08-08 10:53:08 +08:00
3a787b6684 [improvement](regression) syncer regression test (#22490) 2023-08-02 20:09:27 +08:00
4d84cd8ca1 Revert "Revert "[Test](regression) CCR syncer thrift interface regression test (#20935)" (#20990)" (#21022)
This reverts commit 2a294801f1324a999570158eea3224239eefbb29.
2023-06-21 15:20:21 +08:00
2a294801f1 Revert "[Test](regression) CCR syncer thrift interface regression test (#20935)" (#20990)
This reverts commit dd482b74c849b022862e7cfb1f1d0b933a84e3d2.
2023-06-19 21:38:03 +08:00
dd482b74c8 [Test](regression) CCR syncer thrift interface regression test (#20935) 2023-06-18 00:13:09 +08:00
e78149cb65 [Enhencement](Export) add property for outfile/export and add test (#18997)
This pr does three things:
1. add `delete_existing_files` property for outfile/export. If `delete_existing_files = true`, export/outfile will delete all files under file_path first.
2. add p2 test for export
3. modify docs
2023-05-08 14:02:20 +08:00
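The `delete_existing_files` behaviour described in e78149cb65 can be pictured with the following Python sketch. It only models the stated semantics on a local directory. The function name is hypothetical, and removing subdirectories as well as plain files is an assumption, since the commit message says only "all files under file_path".

```python
import shutil
import tempfile
from pathlib import Path


def prepare_export_dir(file_path: Path, delete_existing_files: bool) -> None:
    """Ensure the target directory exists and optionally clear it first."""
    file_path.mkdir(parents=True, exist_ok=True)
    if not delete_existing_files:
        return
    for entry in file_path.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)  # assumption: nested directories are cleared too
        else:
            entry.unlink()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "export"
        prepare_export_dir(target, delete_existing_files=False)
        (target / "old_result.csv").write_text("1,2\n")
        prepare_export_dir(target, delete_existing_files=True)
        print(sorted(p.name for p in target.iterdir()))
```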
0c9fb7297e [fix](regression) mv segcompaction_p1 to segcompaction_p2 (#18806)
segcompaction_p1 contains fairly large load jobs, which will exceed
memlimit or timeout in pipeline under such heavy loads.

Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>
2023-04-26 15:34:46 +08:00
72632b1e32 [improvement](regression-test) add max_failure_num to skip tests when too much failure #19003 2023-04-25 09:03:36 +08:00
3007cd49f2 [enhancement](mysql) enable two-way ssl authentication (#18530)
According to the mysql-ssl, enable two-way SSL authentication.
2023-04-21 14:39:14 +08:00
fe9d2b00fc [test](jdbc catalog) add clickhouse jdbc catalog base type test (#18007) 2023-04-03 20:18:36 +08:00
62ec74f4e7 segcompaction featuring verticalcompaction (#16731)
This patchset applies the following changes:

using vertical compaction mechanism to do segcompaction
basic (WIP) refactoring to separate segcompaction logic from BetaRowsetWriter
add segcompaction specific ut and regression tests
2023-03-01 10:55:40 +08:00
851a3575ae [fix](regression case) exclude test_broker_load suite, reopen after bug fix (#16554)
There is something wrong with the `test_broker_load` suite (an S3 auth problem),
so I am ignoring this case temporarily.
cc @wsjz, please help to solve it and add it back
2023-02-09 15:51:32 +08:00
81dbed70c2 [fix](Nereids) back off on tpch p1 (#16478)
adjust nullable on empty set should apply after unnested sub-query
some function should propagate nullable when args are datev2 or datetimev2
add back tpch sf0.1 nereids regression test
2023-02-08 10:43:13 +08:00
1973b3a86f [test](regression) add tvf regression to test the remove of eof check (#16342)
Add a regression test for #16302. This regression test will fail if an EOF check is added for non-predicate columns.
2023-02-02 10:06:36 +08:00
c6bc0a03a4 [feature](Load)Suppot MySQL Load Data (#15511)
Main subtask of [DSIP-28](https://cwiki.apache.org/confluence/display/DORIS/DSIP-028%3A+Suppot+MySQL+Load+Data)

## Problem summary
Support mysql load syntax as below: 
```sql
LOAD DATA
    [LOCAL]
    INFILE 'file_name'
    INTO TABLE tbl_name
    [PARTITION (partition_name [, partition_name] ...)]
    [COLUMNS TERMINATED BY 'string']
    [LINES TERMINATED BY 'string']
    [IGNORE number {LINES | ROWS}]
    [(col_name_or_user_var [, col_name_or_user_var] ...)]
    [SET (col_name={expr | DEFAULT} [, col_name={expr | DEFAULT}] ...)]
    [PROPERTIES (key1 = value1 [, key2=value2]) ]
```

For example, 
```sql
            LOAD DATA 
            LOCAL
            INFILE 'local_test.file'
            INTO TABLE db1.table1
            PARTITION (partition_a, partition_b, partition_c, partition_d)
            COLUMNS TERMINATED BY '\t'
            (k1, k2, v2, v10, v11)
            set (c1=k1,c2=k2,c3=v10,c4=v11)
            PROPERTIES ("auth" = "root:", "strict_mode"="true")
```

Note that in this PR the property named `auth` must be set, since stream load needs auth. I will optimize it later.
2023-01-29 14:44:59 +08:00
ba71516eba [feature](jdbc catalog) support SQLServer jdbc catalog (#16093) 2023-01-20 12:37:38 +08:00
2580c88c1b [feature](multi-catalog) support oracle jdbc catalog (#15862) 2023-01-14 00:01:33 +08:00
732417258c [Bug](pipeline) Fix bugs to pass TPCDS cases (#15194) 2022-12-20 22:29:55 +08:00
8c02f19302 [chore](regression) use correct bucket path in regression conf (#14960) 2022-12-09 16:20:27 +08:00
be3f3978c8 [enhancement](test) remove sf1DataPath conf from regression-conf.groovy (#13861) 2022-12-08 11:24:25 +08:00
0a33508d3b [chore](regression) add S3 config in conf file and change sf1DataPath #14815 2022-12-06 10:00:50 +08:00
07e8af7808 [regression](test) add external regression-test base on emr environment 1.0 11-29 (#14666)
* add external regression-test base on emr environment 1.0 11-29

* delete ak sk info from regression-conf.groovy
2022-12-02 11:30:07 +08:00
f3cf83a933 (fix)[test] add some logs (#14695) 2022-11-30 12:45:12 +08:00
ca90253b09 [config](storage-policy) add a FE config to disable storage policy by default (#14655)
The cold-hot separation feature is still under development, and some problems remain unsolved.
So I add an FE config enable_storage_policy, false by default, to disable the creation and usage of storage policies by default.

This makes users aware that they are opting in to an experimental feature, which will not be formally released in v1.2.0.

Disable storage policy by default: users cannot use or create a storage policy. Configured by enable_storage_policy.

Remove property remote_storage_policy, which duplicates storage_policy.

Change the persist field in DataProperty.java.
And remove remoteCooldownTime from DataProperty, because it can be obtained from StoragePolicy.
2022-11-30 10:04:33 +08:00
23a8c7eeb6 (fix)(multi-catalog)(es) Fix error result because not used fields_context (#14229)
Fix incorrect results caused by not using fields_context
2022-11-14 14:00:55 +08:00
7b4c2cabb4 [feature](new-scan) support transactional insert in new scan framework (#13858)
Support running transactional insert operations with the new scan framework, e.g.:

admin set frontend config("enable_new_load_scan_node" = "true");
begin;
insert into tbl1 values(1,2);
insert into tbl1 values(3,4);
insert into tbl1 values(5,6);
commit;
Add some limitations to transactional insert

Non-literal values are not supported in the insert stmt
Fix some issues with the array type:

Forbid casting non-array types to a NESTED array type, which may cause BE to crash.
Add getStringValueForArray() method for Expr, to get a valid string-formatted array type value.
Add useLocalSessionState=true in regression-test jdbc url
Without this config, the jdbc driver sends some init commands each time it connects to the server, such as
select @@session.tx_read_only.
But when we use transactional insert, after the begin command, Doris does not support any type of
stmt other than insert, commit or rollback.
So this config is added to stop the jdbc driver from sending commands when connecting.
2022-11-03 08:36:07 +08:00
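As a sketch of the `useLocalSessionState=true` change in 7b4c2cabb4, the helper below appends the parameter to a JDBC URL only when it is missing. The function name, host, port and database in the example are hypothetical placeholders, and this is not the code used in regression-conf.groovy.

```python
def with_local_session_state(jdbc_url: str) -> str:
    """Append useLocalSessionState=true unless the URL already sets it."""
    if "useLocalSessionState=" in jdbc_url:
        return jdbc_url  # respect whatever value is already configured
    if jdbc_url.endswith(("?", "&")):
        separator = ""
    elif "?" in jdbc_url:
        separator = "&"
    else:
        separator = "?"
    return f"{jdbc_url}{separator}useLocalSessionState=true"


if __name__ == "__main__":
    # Placeholder endpoint, used only for illustration.
    print(with_local_session_state("jdbc:mysql://127.0.0.1:9030/demo_db"))
    print(with_local_session_state("jdbc:mysql://127.0.0.1:9030/demo_db?useSSL=false"))
```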
c418bbd2d1 [feature-wip](new-scan) support Json reader (#13546)
Issue Number: close #12574
This PR adds `NewJsonReader`, which implements the GenericReader interface to support reading JSON format files.

TODO:
1. modify `_scann_eof` later.
2. Rename `NewJsonReader` to `JsonReader` when `JsonReader` is deleted.
2022-10-26 12:52:21 +08:00
847b80ebfa [test](jdbc) add jdbc and hive regression test (#13143)
1. Modify default behavior of `build.sh`
    `BUILD_JAVA_UDF` is now ON by default, so a JVM is needed for compilation and at runtime.

2. Add docker-compose for MySQL 5.7, PostgreSQL 14 and Hive 2
   See `docker/thirdparties/docker-compose`.

3. Add some regression test cases for jdbc query on MySQL, PG and Hive Catalog
   The default is `false`; if set to true, you first need to start docker for MySQL/PG/Hive.

4. Support `if not exists` and `if exists` for create/drop resource and create/drop encryptkey
2022-10-21 15:29:27 +08:00
3e168c87c6 [improvement](regression-test) wait for publish timeout of stream load (#13531) 2022-10-21 10:11:03 +08:00
dbf71ed3be [feature-wip](new-scan) Support stream load with csv in new scan framework (#13354)
1. Refactor the file reader creation in FileFactory, for simplicity.
    Previously, FileFactory had too many `create_file_reader` interfaces.
    Now unified into two categories: the interface used by the previous BrokerScanNode,
    and the interface used by the new FileScanNode.
    And separate the creation methods of readers that read `StreamLoadPipe` and other readers that read files.

2. Modify the StreamLoadPlanner on FE side to support using ExternalFileScanNode

3. Now for generic reader, the file reader will be created inside the reader, not passed from the outside.

4. Add some test cases for csv stream load; the behavior is the same as with the old broker scanner.
2022-10-17 23:33:41 +08:00
7c0695c793 [regression](load)Open broker load regression test (#13163) 2022-10-10 18:49:44 +08:00