Commit Graph

65 Commits

Author SHA1 Message Date
72632b1e32 [improvement](regression-test) add max_failure_num to skip tests when there are too many failures #19003 2023-04-25 09:03:36 +08:00
3007cd49f2 [enhancement](mysql) enable two-way ssl authentication (#18530)
Following mysql-ssl, enable two-way SSL authentication.
2023-04-21 14:39:14 +08:00
918a244068 [chore](pom) update apache pom to 29 (#18843) 2023-04-20 16:57:05 +08:00
ccb3541fa5 [chore](regression) print exception along with error sql when run sql file (#18374) 2023-04-07 14:19:47 +08:00
8011bdb30d [improvement](test) print exception when streamload fails (#18315) 2023-04-03 08:56:54 +08:00
ff66efd7d0 [improvement](test) print response of streamload (#18313)
We need the response text to reason about stream load failures.
2023-04-02 20:08:28 +08:00
238223fb8b [regression-test](log) add log for malformed response of stream load (#18173) 2023-03-29 15:52:44 +08:00
a65616a5cd [enhancement](MTMV) Add a timeout for regression tests (#18048)
MTMV regression tests may loop forever due to some potential bugs, so we add a timeout to avoid an endless loop. The timeout value is currently hard-coded to 30 minutes.
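The regression framework itself is Groovy, but a minimal Java sketch of the idea, bounding a potentially endless test with the hard-coded 30-minute limit described above (runTestSuite() here is a hypothetical stand-in for the real test body), might look like this:
```java
import java.util.concurrent.*;

public class MtmvTimeoutSketch {
    public static void main(String[] args) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        // Submit the (hypothetical) MTMV test body as an asynchronous task.
        Future<?> future = executor.submit(MtmvTimeoutSketch::runTestSuite);
        try {
            // Hard-coded 30-minute bound, mirroring the value described above.
            future.get(30, TimeUnit.MINUTES);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the looping test
            throw new AssertionError("MTMV regression test timed out after 30 minutes", e);
        } finally {
            executor.shutdownNow();
        }
    }

    // Hypothetical stand-in for the real MTMV regression test body.
    private static void runTestSuite() { /* ... */ }
}
```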
2023-03-24 10:39:42 +08:00
d3e7f12ada [refactor](Nereids) refactor column pruning (#17579)
This PR refactors column pruning using a visitor. The benefits:
1. It is easy to provide column pruning for a new plan: implement the `OutputPrunable` interface if the plan contains an output field, or do nothing if it does not. There is no need to add a new rule like `PruneXxxChildColumns`; only a few scenarios need to override the visit function with special logic, such as pruning `LogicalSetOperation` and `Aggregate` (a rough sketch follows after the plans below).
2. It supports shrinking the output field of some plans, which skips some useless operations and improves performance.

example:
```sql
select id 
from (
  select id, sum(age)
  from student
  group by id
)a
```

We should prune the useless `sum(age)` in the aggregate.
before refactor:
```
LogicalProject ( distinct=false, projects=[id#0], excepts=[], canEliminate=true )
+--LogicalSubQueryAlias ( qualifier=[a] )
   +--LogicalAggregate ( groupByExpr=[id#0], outputExpr=[id#0, sum(age#2) AS `sum(age)`#4], hasRepeat=false )
      +--LogicalProject ( distinct=false, projects=[id#0, age#2], excepts=[], canEliminate=true )
         +--LogicalOlapScan ( qualified=default_cluster:test.student, indexName=<index_not_selected>, selectedIndexId=10007, preAgg=ON )
```

after refactor:
```
LogicalProject ( distinct=false, projects=[id#0], excepts=[], canEliminate=true )
+--LogicalSubQueryAlias ( qualifier=[a] )
   +--LogicalAggregate ( groupByExpr=[id#0], outputExpr=[id#0], hasRepeat=false )
      +--LogicalProject ( distinct=false, projects=[id#0], excepts=[], canEliminate=true )
         +--LogicalOlapScan ( qualified=default_cluster:test.student, indexName=<index_not_selected>, selectedIndexId=10007, preAgg=ON )
```
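A rough, illustrative Java sketch of the visitor-based idea described above (the interfaces and method names here are hypothetical stand-ins, not the actual Nereids API): a plan that exposes an output field implements a prunable interface, and a single visitor shrinks the output list instead of a per-plan rule.
```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical stand-ins for the real Nereids plan types.
interface Plan {
    List<String> getOutput();
}

// Plans whose output list can be shrunk implement this interface.
interface OutputPrunable extends Plan {
    Plan withPrunedOutput(List<String> requiredOutput);
}

class ColumnPruningVisitor {
    // Keep only the columns required by the parent; plans without an output field are left untouched.
    Plan visit(Plan plan, Set<String> required) {
        if (plan instanceof OutputPrunable) {
            List<String> pruned = plan.getOutput().stream()
                    .filter(required::contains)
                    .collect(Collectors.toList());
            return ((OutputPrunable) plan).withPrunedOutput(pruned);
        }
        return plan;
    }
}
```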
2023-03-24 09:00:48 +08:00
a73524af49 [fix](regression-test) print real and expect rows when fail in exception (#17949) 2023-03-21 08:52:04 +08:00
62a03ec24c [feature](regression) add http test action (#17567) 2023-03-09 15:13:04 +08:00
82df2ae9d8 [feature](mysql) Support secure MySQL connection to FE (#17138)
Background:
Doris currently does not support SSL connections from MySQL clients, which is not secure enough in some cases, especially when accessing Doris over the public internet.

Solution:
- Use the TLS 1.2 protocol to encrypt information (see the sketch below).
- Implementation details
  * server <--- connect <--- client
  * if SSL is enabled: {
  * server <--- SSL connection request packet <--- client
  * server <--- SSL Exchange ---> client } (we add this `if` logic in this PR)
  * server ---> handshake request packet ---> client
  * server <--- encrypted data ---> client (this part is realized in this PR)
- reference1 https://dev.mysql.com/doc/dev/mysql-server/latest/page_protocol_connection_phase.html#sect_protocol_connection_phase_initial_handshake_ssl_handshake
- reference2 https://www.rfc-editor.org/rfc/rfc5246
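A hedged Java sketch of the server-side step described above: after the SSL connection request packet is read, the plain channel is upgraded to TLS 1.2 before any further packets are exchanged. The class and method names are illustrative, not the actual FE code.
```java
import javax.net.ssl.*;
import java.net.Socket;

public class SslUpgradeSketch {
    // Wrap an already-accepted plaintext socket in a server-side TLS 1.2 socket.
    static SSLSocket upgradeToTls(Socket plain, SSLContext ctx) throws Exception {
        SSLSocketFactory factory = ctx.getSocketFactory();
        SSLSocket ssl = (SSLSocket) factory.createSocket(
                plain, plain.getInetAddress().getHostAddress(), plain.getPort(), true);
        ssl.setUseClientMode(false);                      // the FE acts as the server
        ssl.setEnabledProtocols(new String[]{"TLSv1.2"}); // protocol chosen in this PR
        ssl.startHandshake();                             // the "SSL Exchange" step above
        return ssl;
    }
}
```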

close #16313

Signed-off-by: Yukang Lian <yukang.lian2022@gmail.com>
Co-authored-by: Gavin Chou <gavineaglechou@gmail.com>
Co-authored-by: morningman <morningman@163.com>
2023-03-04 12:14:48 +08:00
Pxl
f26f0a1059 [Regression Test] modify expectRelativeError from 1e-10 to 1e-8 (#17162) 2023-02-27 14:23:28 +08:00
Pxl
0691586eb7 [Chore](regression-test) add createMV action && add some mv case from fe ut MaterializedViewFunctionTest (#16825)
1. add a createMV action
2. add some MV cases from the FE UT MaterializedViewFunctionTest
3. reduce the MV scheduler interval from 10s to 0.3s
2023-02-24 16:35:37 +08:00
30915c8626 [Bug](regression-framework) fix regression framework throw strange exception (#16273)
Fix the regression framework throwing a strange exception.
2023-01-31 16:52:19 +08:00
7d648a94d0 [fix](Nereids): fix scalar_function A-F. (#16209)
* [fix](Nereids): fix scalar_function A-F.

* [Fix](regression-test) fix that the regression test framework cannot compare the double values NaN and Inf.

* revert dround()
2023-01-30 00:37:34 +08:00
c6bc0a03a4 [feature](Load) Support MySQL Load Data (#15511)
Main subtask of [DSIP-28](https://cwiki.apache.org/confluence/display/DORIS/DSIP-028%3A+Suppot+MySQL+Load+Data)

## Problem summary
Support mysql load syntax as below: 
```sql
LOAD DATA
    [LOCAL]
    INFILE 'file_name'
    INTO TABLE tbl_name
    [PARTITION (partition_name [, partition_name] ...)]
    [COLUMNS TERMINATED BY 'string']
    [LINES TERMINATED BY 'string']
    [IGNORE number {LINES | ROWS}]
    [(col_name_or_user_var [, col_name_or_user_var] ...)]
    [SET (col_name={expr | DEFAULT} [, col_name={expr | DEFAULT}] ...)]
    [PROPERTIES (key1 = value1 [, key2=value2]) ]
```

For example, 
```sql
            LOAD DATA 
            LOCAL
            INFILE 'local_test.file'
            INTO TABLE db1.table1
            PARTITION (partition_a, partition_b, partition_c, partition_d)
            COLUMNS TERMINATED BY '\t'
            (k1, k2, v2, v10, v11)
            set (c1=k1,c2=k2,c3=v10,c4=v11)
            PROPERTIES ("auth" = "root:", "strict_mode"="true")
```

Note that in this PR the property named `auth` must be set, since stream load needs auth. I will optimize it later.
2023-01-29 14:44:59 +08:00
116e17428b [Enhancement](point query optimize) improve performance of point query on primary keys (#15491)
1. support row format using the jsonb codec
2. short path optimization for point queries
3. support prepared statements for point queries (see the JDBC sketch below)
4. support the MySQL binary format
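A minimal client-side JDBC sketch of the prepared-statement point query on a primary key described above (the endpoint, table, and credentials are hypothetical):
```java
import java.sql.*;

public class PointQueryExample {
    public static void main(String[] args) throws Exception {
        // Hypothetical FE endpoint and table; useServerPrepStmts asks the driver to
        // use server-side prepared statements over the MySQL binary protocol.
        String url = "jdbc:mysql://127.0.0.1:9030/db1?useServerPrepStmts=true";
        try (Connection conn = DriverManager.getConnection(url, "root", "");
             PreparedStatement ps = conn.prepareStatement(
                     "SELECT * FROM table1 WHERE k1 = ?")) {
            ps.setLong(1, 42L); // point lookup on the primary key
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    System.out.println(rs.getString(1));
                }
            }
        }
    }
}
```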
2023-01-20 13:33:01 +08:00
45b39c5aaf [enhancement](regression-test) Support BenchmarkAction (#16071)
Support BenchmarkAction for regression tests; this action can help us run the benchmark queries and print the results.

example:

benchmark {
    executeTimes 3
    warmUp true
    skipFailure true
    printResult true

    sqls(["select 1", "select 2"])
}
2023-01-19 08:02:05 +08:00
87756f5441 [regression](query) query with limit 0 regression test (#15245) 2022-12-22 14:06:44 +08:00
be3f3978c8 [enhancement](test) remove sf1DataPath conf from regression-conf.groovy (#13861) 2022-12-08 11:24:25 +08:00
494dba6c2b [improvement](fix) return only if all sqls inside one sql file run out (#14791) 2022-12-05 10:18:45 +08:00
6e3716e0ea [enhancement](regression) split ssb sf1 to sf0.1 to get smaller test data size (#14437) 2022-11-22 10:36:12 +08:00
034aa20b0a [fix](regression)when using regression-conf-custom.groovy, properties in regression-conf.groovy are missing #14458 2022-11-22 08:44:50 +08:00
6eea855e78 [feature](Nereids) Support lots of scalar function and fix some bug (#13764)
Proposed changes
1. Function interfaces that can search for the matched signature, namely ComputeSignature, equivalent to Function.CompareMode (a rough sketch follows below):
   - IdenticalSignature: equal to Function.CompareMode.IS_IDENTICAL
   - NullOrIdenticalSignature: equal to Function.CompareMode.IS_INDISTINGUISHABLE
   - ImplicitlyCastableSignature: equal to Function.CompareMode.IS_SUPERTYPE_OF
   - ExplicitlyCastableSignature: equal to Function.CompareMode.IS_NONSTRICT_SUPERTYPE_OF
2. Generate lots of scalar functions.
3. Bug fix: the disassembled avg function computes a wrong result because of a wrong input type; AggregateParam.inputTypesBeforeDissemble is used to save the original input type and is passed to the backend to find the correct global aggregate function.
4. Bug fix: a subquery with OneRowRelation will crash because of a wrong nullable property.


Note:
1. Currently there are no unit tests/regression tests for the scalar functions; I will add them when migrating the aggregate functions for unified processing.
2. A known problem is that variable-length functions cannot be invoked; I will fix it later.
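A rough Java sketch of the signature-search idea from point 1 (the types below are illustrative stand-ins, not the actual Nereids interfaces): a function declares its candidate signatures, and the mixin decides how strictly the argument types must match.
```java
import java.util.List;

// Illustrative stand-in for a declared function signature.
interface FunctionSignature {
    List<Class<?>> argTypes();
}

// Sketch of a mixin mirroring Function.CompareMode.IS_SUPERTYPE_OF:
// arguments may be implicitly cast to the declared types.
interface ImplicitlyCastableSignature {
    List<FunctionSignature> getSignatures();

    default FunctionSignature searchSignature(List<Class<?>> actualArgTypes) {
        for (FunctionSignature sig : getSignatures()) {
            if (sig.argTypes().size() != actualArgTypes.size()) {
                continue;
            }
            boolean matched = true;
            for (int i = 0; i < actualArgTypes.size(); i++) {
                // "implicitly castable" is sketched here as plain assignability
                if (!sig.argTypes().get(i).isAssignableFrom(actualArgTypes.get(i))) {
                    matched = false;
                    break;
                }
            }
            if (matched) {
                return sig;
            }
        }
        throw new IllegalArgumentException("no matching signature");
    }
}
```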
2022-11-02 18:01:08 +08:00
d2c5c1af3b [feature](regression) add custom config file for Regression: regression-conf-custom.groovy (#13783) 2022-10-31 22:49:06 +08:00
5bd66243ee [minor](log) remove some unused logs (#13689)
1. When running regression tests with specific suites or a group, do not print other suite names or file names.
2. Remove unused alter table job log.
2022-10-27 09:37:32 +08:00
3e168c87c6 [improvement](regression-test) wait for publish timeout of stream load (#13531) 2022-10-21 10:11:03 +08:00
8637ac1ca3 [regression](framework)set random parallel_fragment_exec_instance_num… (#13383)
Some problems have been found when setting parallel_fragment_exec_instance_num > 1.
Try to set a random parallel_fragment_exec_instance_num value for each query to cover more situations (see the sketch below).
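A small, hedged sketch of the approach via JDBC: set a random value of the session variable before each query. The real framework does this in Groovy; the endpoint, table, and range below are hypothetical.
```java
import java.sql.*;
import java.util.concurrent.ThreadLocalRandom;

public class RandomParallelSketch {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:mysql://127.0.0.1:9030/db1"; // hypothetical FE endpoint
        try (Connection conn = DriverManager.getConnection(url, "root", "");
             Statement stmt = conn.createStatement()) {
            // Pick a random instance count per query to cover more execution paths.
            int num = ThreadLocalRandom.current().nextInt(1, 9);
            stmt.execute("SET parallel_fragment_exec_instance_num = " + num);
            try (ResultSet rs = stmt.executeQuery("SELECT count(*) FROM table1")) {
                while (rs.next()) {
                    System.out.println(rs.getLong(1));
                }
            }
        }
    }
}
```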
2022-10-20 10:02:27 +08:00
0e3522c088 [improvement](test) set default value of parallel config items to 10 (#13234) 2022-10-10 15:58:44 +08:00
984d387945 [Regression](load) Add broker load regression test. (#13062)
Add basic broker load regression test. It has been tested. But default
2022-10-04 21:29:05 +08:00
d10ab474f4 [fix](test) try to let cases run in parallel (#13114) 2022-10-04 20:56:22 +08:00
48d32de9ae [enhancement](test) add some cases from trino to p0 (#12699) 2022-09-30 21:35:30 +08:00
6b6d548df9 [enhancement](test) add more p0 cases (#12285) 2022-09-29 10:45:17 +08:00
e627d285e0 [chore](regression-test) add default group(p0) for regression-test (#12977) 2022-09-28 11:47:19 +08:00
a79d2e592b [improvement](test) cache data from s3 to cacheDataPath (#13018)
Currently, regression data is stored in sf1DataPath, which may be local or remote.
For performance reasons, we use a local directory for the community pipeline; however, we then need to prepare data on every machine,
and this process is error-prone. So we transparently cache data from S3 locally; thus, we only need to configure one data source.
2022-09-28 10:43:55 +08:00
3130a19fe9 [feature](regression) Enhancement regression frame, support http post… (#12565) 2022-09-14 15:31:59 +08:00
772e5907f2 [enhancement](test) add some p0 cases (#12240) 2022-09-07 09:10:42 +08:00
f3cb0c24ee [enhancement](test) add restore action and s3 helper methond (#12084)
Co-authored-by: morrySnow <morrysnow@126.com>
Co-authored-by: SWJTU-ZhangLei <1091517373@qq.com>
2022-08-31 23:08:23 +08:00
05da3d947f [feature-wip](new-scan) add scanner scheduling framework (#11582)
There are currently many types of ScanNodes in Doris, and most of the logic of these ScanNodes is the same, including:

Runtime filter
Predicate pushdown
Scanner generation and scheduling
So I intend to unify the common logic of all ScanNodes.
Different data sources only need to implement different Scanners for data access,
so that future scan optimizations can be applied to all data sources
while also reducing code duplication.

This PR mainly adds 4 new classes:

VScanner
The parent class of all Scanners. Subclasses inherit this class to implement specific data access methods.

VScanNode
The unified ScanNode, responsible for common logic including the RuntimeFilter, predicate pushdown, and Scanner generation and scheduling.

ScannerContext
ScannerContext is responsible for recording the execution status
of a group of Scanners corresponding to a ScanNode,
including how many scanners are being scheduled, and it maintains
a producer-consumer blocking queue between scanners and scan nodes.

ScannerContext is also the scheduling unit of ScannerScheduler.
ScannerScheduler schedules a ScannerContext at a time,
and submits the Scanners to the scanner thread pool for data scanning.

ScannerScheduler
Uniformly responsible for all Scanner scheduling tasks.
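An illustrative Java sketch of the producer-consumer arrangement described above (the real implementation lives in the BE and is C++; every name here is a stand-in): scanners produce row blocks into a bounded queue owned by the context, the scan node consumes them, and the scheduler submits a context's scanners to a shared thread pool.
```java
import java.util.List;
import java.util.concurrent.*;

// Illustrative stand-ins; the real classes are C++ in the BE.
class RowBlock { /* a batch of rows */ }

class ScannerContext {
    // Bounded producer-consumer queue between scanners (producers) and the scan node (consumer).
    private final BlockingQueue<RowBlock> blocks = new ArrayBlockingQueue<>(64);
    private final List<Runnable> pendingScanners;

    ScannerContext(List<Runnable> pendingScanners) {
        this.pendingScanners = pendingScanners;
    }

    List<Runnable> pendingScanners() {
        return pendingScanners;
    }

    void offerBlock(RowBlock block) throws InterruptedException {
        blocks.put(block);    // called from scanner threads (producers)
    }

    RowBlock nextBlock() throws InterruptedException {
        return blocks.take(); // called from the scan node (consumer)
    }
}

class ScannerScheduler {
    private final ExecutorService scannerPool = Executors.newFixedThreadPool(16);

    // The ScannerContext is the scheduling unit: submit its scanners to the shared pool.
    void schedule(ScannerContext ctx) {
        ctx.pendingScanners().forEach(scannerPool::submit);
    }
}
```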

Test:
This work is still in progress and is disabled by default.
I tested it with JMeter at 50 concurrency, but currently the scanner just returns without data.
The QPS can reach about 9000.
I can't compare it to the original implementation because no data is read for now. I will test it when the new olap scanner is ready.
Co-authored-by: morningman <morningman@apache.org>
2022-08-23 08:45:18 +08:00
ff1971f916 [improvement](test) add dryRun option and group all cases into either p0 or p1 (#11576)
1. add dryRun option to list tests
2. group all cases into p0 p1 p2
2022-08-17 22:45:53 +08:00
ee4d9d4347 [improvement](test) group some cases and group a case to p0 if it is not grouped (#11548) 2022-08-06 15:12:08 +08:00
93cb80c9cb [test] use suffix of directory as group name and use directory as dbname (#11142)
* use suffix of directory as group name and use directory as dbname

We can rename tpcds_sf1 to tpcds_sf1_p1, then tpcds_sf1 will be in group
p1.  We will group cases to p0, p1, p2, p3 in the future.

p0: function cases running in seconds.
p1: cases with expected output, running in minutes, like tpcds_sf1
p2: cases with expected output, running in hours, like tpcds_sf10 tpcds_sf100
p3: cases without expected output, to test for core dumps.
2022-07-25 12:10:31 +08:00
7c7852994c (fix)(Nereids) fix ssb and add regression test case (#11095)
Currently the Nereids planner runs into a dead loop and crashes the BE when executing SSB; this PR fixes the problem and adds some regression test cases to prevent SSB execution failures.
2022-07-23 12:41:47 +08:00
ae53a8a7e9 [regression] sf1DataPath can be url or local path (#11065) 2022-07-21 14:35:24 +08:00
2b6cdcf599 [improvement] add an option to let regression stop when a failure happens (#10939)
For the community pipeline, it is a waste of resources to run tests with errors.
2022-07-18 08:53:17 +08:00
41f71f3ade [regression] add ssb sf1 test (#10831)
Co-authored-by: stephen <hello-stephen@qq.com>
2022-07-14 15:03:40 +08:00
4719d4705f [regression] update test framework and fix cases (#10686)
and the regression test temporarily excludes the suite test_create_table_with_bloom_filter.

Co-authored-by: stephen <hello-stephen@qq.com>
2022-07-13 10:16:16 +08:00
2b2bf017f8 [enhancement](regression-test) add real data path for regression test. (#10577)
In some situations, we need to compare the real result with
a previous result for analysis.
2022-07-08 20:51:23 +08:00
43015f11a5 [Improvement] remove beHttpAddress in regression test (#10623) 2022-07-06 08:59:29 +08:00