Commit Graph

81 Commits

Author SHA1 Message Date
05771e8a14 [Enhancement](Load) stream Load using SQL (#23362)
Use stream load in SQL mode.

For example, given example.csv:

10000,北京
10001,天津
curl -v --location-trusted -u root: -H "sql: insert into test.t1(c1, c2) select c1,c2 from stream(\"format\" = \"CSV\", \"column_separator\" = \",\")" -T example.csv http://127.0.0.1:8030/api/_stream_load_with_sql
curl -v --location-trusted -u root: -H "sql: insert into test.t2(c1, c2, c3) select c1,c2, 'aaa' from stream(\"format\" = \"CSV\", \"column_separator\" = \",\")" -T example.csv http://127.0.0.1:8030/api/_stream_load_with_sql
curl -v --location-trusted -u root: -H "sql: insert into test.t3(c1, c2) select c1, count(1) from stream(\"format\" = \"CSV\", \"column_separator\" = \",\") group by c1" -T example.csv http://127.0.0.1:8030/api/_stream_load_with_sql
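
The three requests above differ only in the statement passed through the `sql` header. As a minimal sketch, assuming a hypothetical helper named `build_stream_load_command` (it is not part of Doris), the curl argument list could be assembled as follows. The endpoint, header name, and flags are copied from the examples above; the helper only builds the command and does not run it.

```python
import shlex

# Endpoint taken from the curl examples above.
STREAM_LOAD_SQL_URL = "http://127.0.0.1:8030/api/_stream_load_with_sql"


def build_stream_load_command(sql, data_file, user="root", password="",
                              url=STREAM_LOAD_SQL_URL):
    """Return the argv list for a curl call that sends `sql` in the `sql` header.

    Hypothetical helper for illustration only.
    """
    return [
        "curl", "-v", "--location-trusted",
        "-u", f"{user}:{password}",
        "-H", f"sql: {sql}",
        "-T", data_file,
        url,
    ]


sql = (
    'insert into test.t1(c1, c2) select c1,c2 '
    'from stream("format" = "CSV", "column_separator" = ",")'
)
argv = build_stream_load_command(sql, "example.csv")
# shlex.join quotes each argument so the line can be pasted into a POSIX shell.
print(shlex.join(argv))
```

Building the argument list first and quoting it with `shlex.join` avoids hand-escaping the double quotes inside the `stream(...)` properties.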
2023-08-30 19:02:48 +08:00
c41179b8e9 [fix](regression) Improve the robustness when close target connection (#23012) 2023-08-16 11:42:58 +08:00
66784cef71 [Enhancement](Load) Stream Load using SQL (#22509)
This PR was originally #16940, but that PR has not been updated by the original author @Cai-Yao for a long time. For now, we will merge some of the code into master first.

Thanks @Cai-Yao @yiguolei
2023-08-08 13:49:04 +08:00
3a787b6684 [improvement](regression) syncer regression test (#22490) 2023-08-02 20:09:27 +08:00
Pxl
ae809fbeba [Bug](storage )fix dead lock when create_tablet need lock two tablet && update mv_p0… (#21969)
Fix a deadlock when create_tablet needs to lock two tablets, and update the mv_p0/ssb case.
2023-07-22 15:27:05 +08:00
48bfb8e9cf [Enhancement](regression-test)Add regression test for MoW backup and restore (#21223) 2023-07-05 15:16:04 +08:00
15ec191a77 [Fix](CCR) Use tableId as the credential for CCR syncer instead of tableName (#21466) 2023-07-05 10:16:09 +08:00
8c532e8808 [fix](restore) work around, ingest binlog after backup/restore which local_tablet.partition_id is not correct, use req.partition_id (#21288)
* Workaround: when ingesting binlog after backup/restore, local_tablet.partition_id is not correct, so use req.partition_id instead

Signed-off-by: Jack Drogon <jack.xsuperman@gmail.com>
2023-06-29 17:19:02 +08:00
4b94d34ec2 [fix](regression) Add get master token into regression framework (#21198) 2023-06-27 11:54:31 +08:00
4d84cd8ca1 Revert "Revert "[Test](regression) CCR syncer thrift interface regression test (#20935)" (#20990)" (#21022)
This reverts commit 2a294801f1324a999570158eea3224239eefbb29.
2023-06-21 15:20:21 +08:00
2a294801f1 Revert "[Test](regression) CCR syncer thrift interface regression test (#20935)" (#20990)
This reverts commit dd482b74c849b022862e7cfb1f1d0b933a84e3d2.
2023-06-19 21:38:03 +08:00
dd482b74c8 [Test](regression) CCR syncer thrift interface regression test (#20935) 2023-06-18 00:13:09 +08:00
731ce5802e [Test][Framework] add enableCacheData option for test framework (#20874) 2023-06-16 14:12:21 +08:00
61d9bd2ba1 [fix](regression) fix export file test cases (#20463) 2023-06-06 20:07:31 +08:00
492154ee55 [fix](regression-test) add jdbc timeout (#20228)
In some cases (or because of bugs), Doris may return a query result to JDBC that JDBC cannot recognize, so the connection hangs. To fix this, add a 30-minute timeout to the JDBC connection.
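
The actual change sets a timeout on the framework's JDBC connection. The snippet below only illustrates the general idea of bounding a call that may never return, as a minimal Python sketch; `call_with_timeout` and `stuck_query` are hypothetical names, and the short timeout stands in for the 30 minutes mentioned above.

```python
import concurrent.futures
import threading


def call_with_timeout(fn, timeout_s):
    """Run fn() in a worker thread and stop waiting after timeout_s seconds."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(fn)
    try:
        return future.result(timeout=timeout_s)
    finally:
        # Do not block here on a worker that may still be stuck.
        pool.shutdown(wait=False)


release = threading.Event()


def stuck_query():
    # Simulates a client that waits forever for a reply it can understand.
    release.wait()
    return "late"


try:
    call_with_timeout(stuck_query, 0.1)
    outcome = "returned"
except concurrent.futures.TimeoutError:
    outcome = "timed out"
finally:
    # Python cannot forcibly stop the worker thread, so release it explicitly.
    release.set()

print(outcome)
```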
2023-06-01 10:50:17 +08:00
a7f3bfec89 [refactor](cluster)(step-2) remove cluster related to Backend (#19842) 2023-05-21 09:00:35 +08:00
72632b1e32 [improvement](regression-test) add max_failure_num to skip tests when too much failure #19003 2023-04-25 09:03:36 +08:00
3007cd49f2 [enhancement](mysql) enable two-way ssl authentication (#18530)
According to the mysql-ssl, enable two-way SSL authentication.
2023-04-21 14:39:14 +08:00
918a244068 [chore](pom) update apache pom to 29 (#18843) 2023-04-20 16:57:05 +08:00
ccb3541fa5 [chore](regression) print exception along with error sql when run sql file (#18374) 2023-04-07 14:19:47 +08:00
8011bdb30d [improvement](test) print exception when streamload fails (#18315) 2023-04-03 08:56:54 +08:00
ff66efd7d0 [improvement](test) print response of streamload (#18313)
We need the response text to diagnose stream load failures.
2023-04-02 20:08:28 +08:00
238223fb8b [regression-test](log) add log for malforamt response of stream load (#18173) 2023-03-29 15:52:44 +08:00
a65616a5cd [enhancement](MTMV) Add a timeout for regression tests (#18048)
MTMV regression tests may loop forever due to potential bugs, so we add a timeout to avoid an endless loop. The timeout is currently hard-coded to 30 minutes.
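
A bounded polling loop of this kind can be sketched as follows. This is an illustration under assumptions, not the framework's code: `wait_until` and `job_finished` are hypothetical names, and only the 30-minute value comes from the description above.

```python
import time

MTMV_TIMEOUT_S = 30 * 60  # the hard-coded 30 minutes mentioned above


def wait_until(predicate, timeout_s, interval_s=0.01):
    """Poll predicate() until it returns True or timeout_s seconds have passed."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(interval_s)
    return False  # gave up instead of looping forever


polls = {"count": 0}


def job_finished():
    # Stand-in for checking a job's state; succeeds on the third poll.
    polls["count"] += 1
    return polls["count"] >= 3


finished = wait_until(job_finished, timeout_s=1.0)
never = wait_until(lambda: False, timeout_s=0.05)
print(finished, never)
```

Using `time.monotonic()` for the deadline keeps the loop unaffected by wall-clock adjustments.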
2023-03-24 10:39:42 +08:00
d3e7f12ada [refactor](Nereids) refactor column pruning (#17579)
This PR refactors column pruning to use a visitor. The benefits:
1. It is easy to add column pruning for a new plan: implement the `OutputPrunable` interface if the plan contains an output field, or do nothing if it does not. There is no need to add a new rule like `PruneXxxChildColumns`, and only a few scenarios, such as pruning the LogicalSetOperation and Aggregate, need to override the visit function with special logic.
2. It supports shrinking the output field in some plans, which skips useless operations and so improves performance.

example:
```sql
select id 
from (
  select id, sum(age)
  from student
  group by id
)a
```

We should prune the useless `sum(age)` in the aggregate.
Before the refactor:
```
LogicalProject ( distinct=false, projects=[id#0], excepts=[], canEliminate=true )
+--LogicalSubQueryAlias ( qualifier=[a] )
   +--LogicalAggregate ( groupByExpr=[id#0], outputExpr=[id#0, sum(age#2) AS `sum(age)`#4], hasRepeat=false )
      +--LogicalProject ( distinct=false, projects=[id#0, age#2], excepts=[], canEliminate=true )
         +--LogicalOlapScan ( qualified=default_cluster:test.student, indexName=<index_not_selected>, selectedIndexId=10007, preAgg=ON )
```

After the refactor:
```
LogicalProject ( distinct=false, projects=[id#0], excepts=[], canEliminate=true )
+--LogicalSubQueryAlias ( qualifier=[a] )
   +--LogicalAggregate ( groupByExpr=[id#0], outputExpr=[id#0], hasRepeat=false )
      +--LogicalProject ( distinct=false, projects=[id#0], excepts=[], canEliminate=true )
         +--LogicalOlapScan ( qualified=default_cluster:test.student, indexName=<index_not_selected>, selectedIndexId=10007, preAgg=ON )
```
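
The pruning shown in the two plans can be imitated on a toy plan tree. The following is a simplified sketch, not the Nereids implementation: the `Scan`, `Project`, and `Aggregate` classes and the `prune` function are hypothetical, and the scan's column list is invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Scan:
    table: str
    columns: list


@dataclass
class Project:
    child: object
    projects: list


@dataclass
class Aggregate:
    child: object
    group_by: list
    aggs: dict  # output name -> input column


def prune(plan, required):
    """Return a copy of plan that only produces the columns in required."""
    if isinstance(plan, Scan):
        return Scan(plan.table, [c for c in plan.columns if c in required])
    if isinstance(plan, Project):
        kept = [c for c in plan.projects if c in required]
        return Project(prune(plan.child, set(kept)), kept)
    if isinstance(plan, Aggregate):
        # Drop aggregate outputs nobody reads; group-by keys are always kept.
        kept_aggs = {n: c for n, c in plan.aggs.items() if n in required}
        child_required = set(plan.group_by) | set(kept_aggs.values())
        return Aggregate(prune(plan.child, child_required), plan.group_by, kept_aggs)
    raise TypeError(f"unsupported plan node: {plan!r}")


plan = Project(
    Aggregate(
        Project(Scan("student", ["id", "name", "age"]), ["id", "age"]),
        group_by=["id"],
        aggs={"sum(age)": "age"},
    ),
    ["id"],
)
pruned = prune(plan, {"id"})
print(pruned)
```

Each node passes down only the columns it still needs, so removing `sum(age)` from the aggregate also removes `age` from the project beneath it.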
2023-03-24 09:00:48 +08:00
a73524af49 [fix](regression-test) print real and expect rows when fail in exception (#17949) 2023-03-21 08:52:04 +08:00
62a03ec24c [feature](regression) add http test action (#17567) 2023-03-09 15:13:04 +08:00
82df2ae9d8 [feature](mysql) Support secure MySQL connection to FE (#17138)
Background:
Doris currently does not support SSL connections from MySQL clients, which is not secure enough in some cases, especially when accessing Doris via the public internet.

Solution:
- Use TLS1.2 protocol to encrypt information.
- Implementation details
  * server <--- connect <--- client
  * if enable SSL: {
  * server <--- SSL connection request packet <--- client
  * server <--- SSL Exchange ---> client } (we will add this `if` logic part in this PR)
  * server ---> handshake request packet ---> client
  * server <--- encrypted data ---> client (this part will be realized in this PR)
- reference1 https://dev.mysql.com/doc/dev/mysql-server/latest/page_protocol_connection_phase.html#sect_protocol_connection_phase_initial_handshake_ssl_handshake
- reference2 https://www.rfc-editor.org/rfc/rfc5246
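
To make the "SSL connection request packet" step concrete, here is a minimal sketch of how such a packet is laid out according to the MySQL protocol documentation in reference1. It is not the Doris implementation: the function names are hypothetical, and the default sequence id, maximum packet size, and character set are assumptions chosen for the example.

```python
import struct

# Capability flags from the MySQL client/server protocol.
CLIENT_PROTOCOL_41 = 0x00000200
CLIENT_SSL = 0x00000800


def build_ssl_request(seq_id=1, max_packet_size=1 << 24, charset=33):
    """Build an SSLRequest packet: 4-byte header plus a 32-byte payload."""
    payload = struct.pack(
        "<IIB",
        CLIENT_PROTOCOL_41 | CLIENT_SSL,  # capability flags
        max_packet_size,
        charset,                          # 33 is utf8_general_ci
    ) + bytes(23)                         # reserved filler, all zero
    # Header: 3-byte little-endian payload length, then the sequence id.
    header = len(payload).to_bytes(3, "little") + bytes([seq_id])
    return header + payload


def wants_ssl(packet):
    """Server-side check: does the client ask to switch to TLS?"""
    flags = struct.unpack_from("<I", packet, 4)[0]
    return bool(flags & CLIENT_SSL)


packet = build_ssl_request()
print(len(packet), wants_ssl(packet))
```

A server that sees the `CLIENT_SSL` flag in this packet starts the TLS exchange before reading any further protocol data from the client.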

close #16313

Signed-off-by: Yukang Lian <yukang.lian2022@gmail.com>
Co-authored-by: Gavin Chou <gavineaglechou@gmail.com>
Co-authored-by: morningman <morningman@163.com>
2023-03-04 12:14:48 +08:00
Pxl
f26f0a1059 [Regression Test] modify expectRelativeError from 1e-10 to 1e-8 (#17162) 2023-02-27 14:23:28 +08:00
Pxl
0691586eb7 [Chore](regression-test) add createMV action && add some mv case from fe ut MaterializedViewFunctionTest (#16825)
1. add createMV action
2. add some mv case from fe ut MaterializedViewFunctionTest
3. reduce mv scheduler interval time from 10s to 0.3s
2023-02-24 16:35:37 +08:00
30915c8626 [Bug](regression-framework) fix regression framework throw strange exception (#16273)
fix regression framework throw strange exception
2023-01-31 16:52:19 +08:00
7d648a94d0 [fix](Nereids): fix scalar_function A-F. (#16209)
* [fix](Nereids): fix scalar_function A-F.

* [Fix](regression-test)fix regression test framework cannot compare double value nan and inf.

* revert dround()
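
A comparison that tolerates NaN and infinity can be sketched as below. This is an illustration, not the framework's code: `doubles_match` is a hypothetical name, and the default tolerance of 1e-8 simply echoes the `expectRelativeError` value set in #17162.

```python
import math


def doubles_match(expected, actual, rel_err=1e-8):
    """Compare two doubles, treating NaN == NaN and equal infinities as a match."""
    if math.isnan(expected) or math.isnan(actual):
        # NaN never compares equal with ==, so handle it explicitly.
        return math.isnan(expected) and math.isnan(actual)
    if math.isinf(expected) or math.isinf(actual):
        # Infinities must match exactly, including the sign.
        return expected == actual
    scale = max(abs(expected), abs(actual))
    return abs(expected - actual) <= rel_err * scale


print(doubles_match(float("nan"), float("nan")))
print(doubles_match(float("inf"), float("-inf")))
print(doubles_match(1.0, 1.0 + 1e-9))
```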
2023-01-30 00:37:34 +08:00
c6bc0a03a4 [feature](Load)Suppot MySQL Load Data (#15511)
Main subtask of [DSIP-28](https://cwiki.apache.org/confluence/display/DORIS/DSIP-028%3A+Suppot+MySQL+Load+Data)

## Problem summary
Support the MySQL load syntax below:
```sql
LOAD DATA
    [LOCAL]
    INFILE 'file_name'
    INTO TABLE tbl_name
    [PARTITION (partition_name [, partition_name] ...)]
    [COLUMNS TERMINATED BY 'string']
    [LINES TERMINATED BY 'string']
    [IGNORE number {LINES | ROWS}]
    [(col_name_or_user_var [, col_name_or_user_var] ...)]
    [SET (col_name={expr | DEFAULT} [, col_name={expr | DEFAULT}] ...)]
    [PROPERTIES (key1 = value1 [, key2=value2]) ]
```

For example:
```sql
LOAD DATA
LOCAL
INFILE 'local_test.file'
INTO TABLE db1.table1
PARTITION (partition_a, partition_b, partition_c, partition_d)
COLUMNS TERMINATED BY '\t'
(k1, k2, v2, v10, v11)
SET (c1=k1,c2=k2,c3=v10,c4=v11)
PROPERTIES ("auth" = "root:", "strict_mode"="true")
```

Note that in this PR the property named `auth` must be set, since stream load needs authentication. I will optimize it later.
2023-01-29 14:44:59 +08:00
116e17428b [Enhancement](point query optimize) improve performace of point query on primary keys (#15491)
1. Support a row format using the jsonb codec
2. Short-path optimization for point queries
3. Support prepared statements for point queries
4. Support the MySQL binary format
2023-01-20 13:33:01 +08:00
45b39c5aaf [enhancement](regression-test) Support BenchmarkAction (#16071)
Support benchmarkAction for regression tests. This action runs the benchmark queries and prints the result.

example:

benchmark {
    executeTimes 3
    warmUp true
    skipFailure true
    printResult true

    sqls(["select 1", "select 2"])
}
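
What such an action does can be approximated in a few lines. The following Python sketch is an assumption-laden illustration, not the Groovy implementation: the meaning of each option is inferred from its name, and `benchmark`, `run_sql`, and `fake_run_sql` are hypothetical.

```python
import time


def benchmark(sqls, run_sql, execute_times=3, warm_up=True,
              skip_failure=True, print_result=True):
    """Time each statement; run_sql is a caller-supplied function."""
    results = {}
    for sql in sqls:
        try:
            if warm_up:
                run_sql(sql)  # one untimed run before measuring
            timings = []
            for _ in range(execute_times):
                start = time.perf_counter()
                run_sql(sql)
                timings.append(time.perf_counter() - start)
            results[sql] = timings
        except Exception:
            if not skip_failure:
                raise
            results[sql] = None  # record the failure and move on
    if print_result:
        for sql, timings in results.items():
            if timings is None:
                print(f"{sql}: failed")
            else:
                avg = sum(timings) / len(timings)
                print(f"{sql}: avg {avg:.6f}s over {len(timings)} runs")
    return results


calls = []


def fake_run_sql(sql):
    # Stand-in for a real database call; fails on one statement on purpose.
    calls.append(sql)
    if sql == "select bad":
        raise RuntimeError("simulated failure")


results = benchmark(["select 1", "select 2", "select bad"], fake_run_sql)
```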
2023-01-19 08:02:05 +08:00
87756f5441 [regresstion](query) query with limit 0 regresstion test (#15245) 2022-12-22 14:06:44 +08:00
be3f3978c8 [enhancement](test) remove sf1DataPath conf from regression-conf.groovy (#13861) 2022-12-08 11:24:25 +08:00
494dba6c2b [improvement](fix) return only if all sqls inside one sql file run out (#14791) 2022-12-05 10:18:45 +08:00
6e3716e0ea [enhancement](regression) split ssb sf1 to sf0.1 to get smaller test data size (#14437) 2022-11-22 10:36:12 +08:00
034aa20b0a [fix](regression)when using regression-conf-custom.groovy, properties in regression-conf.groovy are missing #14458 2022-11-22 08:44:50 +08:00
6eea855e78 [feature](Nereids) Support lots of scalar function and fix some bug (#13764)
Proposed changes
1. function interfaces that can search the matched signature, say ComputeSignature. It's equal to the Function.CompareMode.
   - IdenticalSignature: equal to Function.CompareMode.IS_IDENTICAL
   - NullOrIdenticalSignature: equal to Function.CompareMode.IS_INDISTINGUISHABLE
   - ImplicitlyCastableSignature: equal to Function.CompareMode.IS_SUPERTYPE_OF
   - ExplicitlyCastableSignature: equal to Function.CompareMode.IS_NONSTRICT_SUPERTYPE_OF
3. generate lots of scalar functions
4. bug-fix: the disassembled avg function computed a wrong result because of the wrong input type; AggregateParam.inputTypesBeforeDissemble is used to save the original input type and pass it to the backend to find the correct global aggregate function.
5. bug-fix: a subquery with OneRowRelation crashed because of a wrong nullable property
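
The four signature-matching modes in item 1 can be pictured as a search from the strictest rule to the loosest. The sketch below is a toy model under stated assumptions, not the Nereids code: the type names, the widening order in `RANK`, the cast rules, and the strict-to-loose search order are all simplifications chosen for illustration.

```python
# Toy numeric types ordered from narrowest to widest (an assumption).
RANK = {"tinyint": 0, "int": 1, "bigint": 2, "double": 3}


def _all_args(sig, args, ok):
    return len(sig) == len(args) and all(ok(p, a) for p, a in zip(sig, args))


def identical(sig, args):
    return tuple(sig) == tuple(args)


def null_or_identical(sig, args):
    return _all_args(sig, args, lambda p, a: a == "null" or a == p)


def implicitly_castable(sig, args):
    # Allows widening only, for example int -> double.
    def ok(p, a):
        return a == "null" or a == p or (
            a in RANK and p in RANK and RANK[a] <= RANK[p])
    return _all_args(sig, args, ok)


def explicitly_castable(sig, args):
    # Also allows narrowing between numeric types, for example double -> int.
    def ok(p, a):
        return a == "null" or a == p or (a in RANK and p in RANK)
    return _all_args(sig, args, ok)


MODES = [
    ("identical", identical),
    ("null_or_identical", null_or_identical),
    ("implicitly_castable", implicitly_castable),
    ("explicitly_castable", explicitly_castable),
]


def search_signature(signatures, args):
    """Return (signature, mode) for the first match, strictest mode first."""
    for mode, matches in MODES:
        for sig in signatures:
            if matches(sig, args):
                return sig, mode
    return None


sigs = [("int", "int"), ("double", "double")]
print(search_signature(sigs, ("int", "bigint")))
```

Trying the strict modes first means an exact match always wins over a signature that would need a cast.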


Note:
1. Currently there are no further unit tests or regression tests for the scalar functions; I will add them once the aggregate functions are migrated, so that both can be processed in a unified way.
2. A known problem is that variable-length functions cannot be invoked; I will fix it later.
2022-11-02 18:01:08 +08:00
d2c5c1af3b [feature](regression) add custom config file for Regression: regression-conf-custom.groovy (#13783) 2022-10-31 22:49:06 +08:00
5bd66243ee [minor](log) remove some unused logs (#13689)
1. When running regression test with specific suites or group, do not print other suite name or file name
2. Remove unused alter table job log.
2022-10-27 09:37:32 +08:00
3e168c87c6 [improvement](regression-test) wait for publish timeout of stream load (#13531) 2022-10-21 10:11:03 +08:00
8637ac1ca3 [regression](framework)set random parallel_fragment_exec_instance_num… (#13383)
Some problems have been found when parallel_fragment_exec_instance_num > 1.
This change sets a random parallel_fragment_exec_instance_num value for each query to cover more situations.
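
As a rough illustration of the idea (not the framework's code), a random value can be drawn per query and turned into a `set` statement. The function name and the upper bound of 8 are assumptions made for this sketch.

```python
import random


def random_parallel_setting(rng, max_instances=8):
    """Return a set statement with a random instance count in [1, max_instances]."""
    n = rng.randint(1, max_instances)
    return f"set parallel_fragment_exec_instance_num = {n}"


rng = random.Random()
statements = [random_parallel_setting(rng) for _ in range(5)]
for stmt in statements:
    print(stmt)
```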
2022-10-20 10:02:27 +08:00
0e3522c088 [improvement](test) set default value of parallel config items to 10 (#13234) 2022-10-10 15:58:44 +08:00
984d387945 [Regression](load) Add broker load regression test. (#13062)
Add basic broker load regression test. It has been tested. But default
2022-10-04 21:29:05 +08:00
d10ab474f4 [fix](test) try to let cases run in parallel (#13114) 2022-10-04 20:56:22 +08:00
48d32de9ae [enhancement](test) add some cases from trino to p0 (#12699) 2022-09-30 21:35:30 +08:00
6b6d548df9 [enhancement](test) add more p0 cases (#12285) 2022-09-29 10:45:17 +08:00