Commit Graph

959 Commits

Author SHA1 Message Date
73621bdb18 [enhance](Nereids) process DELETE_SIGN_COLUMN of OlapTable(#16030)
1. add DELETE_SIGN_COLUMN in non-visible-columns in LogicalOlapScan
2. when the table has a delete sign, add a filter `delete_sign_column = 0`
3. use output slots and non-visible slots to bind slot
2023-01-20 11:27:35 +08:00
2018b49ef0 [opt](test) scalar_types_p0 use 100k lines dataset and scalar_types_p2 use 1000k (#16104) 2023-01-19 22:59:29 +08:00
dd869077f8 [fix](nereids) do not generate compare between Date to Date (#16061)
The BE storage engine has a bug in Date comparison, so pushing down predicates such as Date 'x' < Date 'y' returns wrong results.
This PR converts expressions such as Date 'x' < Date 'y' to DateTime 'x' < DateTime 'y'.

TODO:
Does the storage engine support comparing a date slot with a datetime?
If it does, we can avoid adding a cast on the slot,
and the expression can then be pushed down to the storage engine.
2023-01-19 15:56:51 +08:00
21b78cb820 [fix](nereids) Fix bind failed of the slots in the group by clause (#16077)
A child's slot with the same name as a slot in the output expressions was discarded. Binding then failed because the slots in the group by expressions could not find the corresponding bound slots in the child's output.
2023-01-19 15:36:13 +08:00
0144c51ddb [fix](nereids) fix bug in CaseWhen.getDataType and add some missing case for findTightestCommonType (#15776) 2023-01-19 15:30:25 +08:00
6e090e4daf [Bug](predicate) fix date predicate (#16053) 2023-01-19 14:14:48 +08:00
c5beab39c0 [fix](nereids) Bind slot in having to its direct child instead of grand child (#16047)
For example, in the query below, the `date` in the having clause should be bound to the alias with the same name, not to the `date` field of the relation:

SELECT date_format(date, '%x%v') AS `date` FROM `tb_holiday` WHERE `date` between 20221111 AND 20221116 HAVING date = 202245 ORDER BY date;
2023-01-19 13:19:16 +08:00
abdf56bfa5 [fix](Nereids) wrong result of group_concat with order by or null args (#16081)
1. signatures without an order element are wrong
2. the signature with one arg is missing
3. group_concat should be a NullableAggregateFunction
4. constant folding on FE should not fold a NullableAggregateFunction with a null arg

TODO
1. reorder rewrite rules, and then only forbid fold constant on NullableAggregateFunction with alwaysNullable == true
2023-01-19 11:22:30 +08:00
45b39c5aaf [enhancement](regression-test) Support BenchmarkAction (#16071)
Support benchmarkAction for regression tests. This action runs the benchmark queries and prints the results.

example:

benchmark {
    executeTimes 3
    warmUp true
    skipFailure true
    printResult true

    sqls(["select 1", "select 2"])
}
2023-01-19 08:02:05 +08:00
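The options in the example above suggest a simple timing loop. The following Python sketch illustrates one plausible reading of them; `run_benchmark` and the `execute` callback are hypothetical names, not part of the regression framework, and the semantics assumed here (warm-up as a single untimed run, skipFailure as "continue with the next statement") are guesses rather than the action's documented behavior.

```python
import time
from typing import Callable, Dict, List


def run_benchmark(
    sqls: List[str],
    execute: Callable[[str], object],
    execute_times: int = 3,
    warm_up: bool = True,
    skip_failure: bool = True,
) -> Dict[str, List[float]]:
    """Time each statement `execute_times` times; return elapsed seconds per statement."""
    timings: Dict[str, List[float]] = {}
    for sql in sqls:
        try:
            if warm_up:
                execute(sql)  # assumption: one untimed run before measuring
            runs: List[float] = []
            for _ in range(execute_times):
                start = time.perf_counter()
                execute(sql)
                runs.append(time.perf_counter() - start)
            timings[sql] = runs
        except Exception:
            if not skip_failure:
                raise
            # assumption: a failing statement is skipped and the rest still run
    return timings
```

A caller would pass a function that sends the statement to the cluster, for example `run_benchmark(["select 1", "select 2"], execute=my_client_call)`, where `my_client_call` is whatever client wrapper is available.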
d8f598eeab [enhancement](Nereids) add timestampadd, timestampdiff functions (#16072) 2023-01-19 01:05:25 +08:00
baf62b4418 [test](Nereids) add regression-test for running_difference and regexp_extract_all (#16049) 2023-01-18 22:24:52 +08:00
feeb69438b [opt](Nereids) optimize DistributeSpec generator of OlapScan (#15965)
Use the number of selected partitions, rather than the number of partitions in the OLAP table, to decide whether to generate a hashDistributeSpec.
2023-01-18 20:18:11 +08:00
34075368ec (improvement)[bucket] Add auto bucket implement (#15250) 2023-01-18 19:50:18 +08:00
0916cbcb10 [enhancement](nereids) Made the parse for named expression more complete (#16010)
After this PR, the following grammar is supported:

SELECT SUBSTRING("dddd编", 0, 3) AS "测试";
SELECT SUBSTRING("dddd编", 0, 3) "测试";
2023-01-18 19:44:51 +08:00
4035bd83c3 [fix](jdbc) fix jdbc driver bug and external datasource p2 test case issue (#16033)
Fix a bug where creating a JDBC resource with only the JDBC driver file name failed at the checksum step.
The cause was that the full driver URL was not passed to JdbcClient.

Set ResultSet.FETCH_FORWARD and disable AutoCommit on the JDBC connection to avoid OOM when fetching a large amount of data.

Set useCursorFetch in the JDBC URL for both MySQL and PostgreSQL.

Fix some p2 external datasource bugs.
2023-01-18 17:48:06 +08:00
1fa2b662cf [opt](Nereids) add date_add/sub function (#16048)
1. add week_add and week_diff functions
2. register all date_add/date_diff functions
2023-01-18 17:11:44 +08:00
94628f09e9 [regression-test](spark-connector) Add the regression case of the spark doris connector (#14877)
2023-01-18 16:41:41 +08:00
bd0d650c3d [fix](Nereids) prohibit cross join with on clause (#16035) 2023-01-18 16:21:01 +08:00
65d9293fa9 [testcase](bitmap index)bitmap index testcase (#15975)
* add bitmap index testcases for all scalar types
2023-01-18 14:17:24 +08:00
46ce97a190 [enhance](planner)convert 'or' into 'in-predicate' (#15737)
In the previous [PR 12872](https://github.com/apache/doris/pull/12872), we converted multiple equality predicates on the same slot into an `in predicate`. For example, `a = 1 or a = 2` => `a in (1, 2)`.

This PR makes 4 changes to the OR-to-IN conversion:
1. Fix a bug where `not in` was merged with equality predicates: `a = 1 or a not in (2, 3)` was wrongly rewritten to `a in (1, 2, 3)`.
2. Extend the rule to more cases:
  - merge across more than one slot: `a = 1 or a = 2 or b = 3 or b = 4` => `a in (1, 2) or b in (3, 4)`
  - skip not-equal and not-in when merging: `a = 1 or a = 2 or b != 3 or c not in (1, 2)` => `a in (1, 2) or b != 3 or c not in (1, 2)`
3. Rewrite recursively.
4. OrToIn is implemented in ExtractCommonFactorsRule. That rule generates new exprs, and OrToIn should also apply to them. For example, `(a=1 and b=2) or (a=3 and b=4)` => `(a=1 or a=3) and (b=2 or b=4) and [(a=1 and b=2) or (a=3 and b=4)]` => `a in (1, 3) and b in (2, 4) and [(a=1 and b=2) or (a=3 and b=4)]`

In addition, this PR adds toString() for some Exprs.
2023-01-18 12:33:20 +08:00
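The merge described in items 1 and 2 can be sketched in a few lines of Python. This is only an illustration over a flat list of disjuncts, not the Java code in ExtractCommonFactorsRule: the tuple representation, the function name `or_to_in`, and the choice to merge only when a slot has at least two equality disjuncts are all assumptions made for the example.

```python
from collections import OrderedDict
from typing import List, Tuple

# (slot, operator, literal or tuple of literals); a hypothetical representation
Predicate = Tuple[str, str, object]


def or_to_in(disjuncts: List[Predicate]) -> List[Predicate]:
    """Merge `slot = literal` disjuncts on the same slot into one `in` predicate."""
    equals = OrderedDict()  # slot -> literals seen in `=` disjuncts, in order
    for slot, op, value in disjuncts:
        if op == "=":
            equals.setdefault(slot, []).append(value)

    result: List[Predicate] = []
    emitted = set()
    for slot, op, value in disjuncts:
        # Only equality predicates are merged; `!=` and `not in` pass through.
        if op == "=" and len(equals[slot]) > 1:
            if slot not in emitted:
                emitted.add(slot)
                result.append((slot, "in", tuple(equals[slot])))
        else:
            result.append((slot, op, value))
    return result
```

With this sketch, `a = 1 or a not in (2, 3)` is left unchanged, which is the behavior the bug fix in item 1 asks for.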
121f4d6ac0 [fix](Nereids) cannot put two same table value function into one memo (#16026) 2023-01-18 11:32:09 +08:00
96b9115286 [fix](nereids) fix bug of invalid column in olap scan node when a materialized view is selected (#15976)
If a materialized view is selected, the olap scan node's NonUserVisibleOutput property may contain columns from other materialized views. This PR removes the invalid columns.
2023-01-18 01:02:12 +08:00
388d623506 [fix](MTMV) Refine the process of refreshing data (#16006)
1. Remove some redundant code.
2. Fix the issue with the state of MTMV task.
3. Fix the case - test_create_mtmv.

## Problem summary

1. We used a retry policy to re-run failed MTMV tasks, but we set the state to `FAILURE` while the tasks were still being re-run.
The state should be set to `FAILURE` only after all retries have failed.
2. Some redundant code can be removed.
3. In the case test_create_mtmv, we created many background tasks to refresh the data. Some tasks may fail due to concurrency and cause the test to fail. A single task is enough to verify the functionality.
2023-01-17 23:08:12 +08:00
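The retry fix in item 1 amounts to deciding the final state only after the last attempt. A minimal Python sketch of that idea follows; `run_with_retry`, the string states, and the default retry count are hypothetical and do not reflect the actual MTMV task code.

```python
from typing import Callable


def run_with_retry(task: Callable[[], None], max_retries: int = 3) -> str:
    """Return "SUCCESS" if any attempt succeeds, "FAILURE" only after all attempts fail."""
    attempts = max_retries + 1  # the first run plus the retries
    for _ in range(attempts):
        try:
            task()
            return "SUCCESS"
        except Exception:
            # Do not record FAILURE yet: retries may still succeed.
            continue
    return "FAILURE"
```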
0c8255d9b8 [fix](nereids)nest loop join should support filter conjuncts like hash join (#15979) 2023-01-17 20:38:38 +08:00
3d05ffb10e [fix](Nereids) add 'integer' as alias of int type (#15983) 2023-01-17 20:33:26 +08:00
4d863a18c3 [fix](regression-test) Fix the build for Java UDF Case (#15851)
Opening the project in IntelliJ IDEA shows the cause: Apache Maven 3.8.1 and newer block http repositories by default. We can therefore fix this issue by adding an https repository that contains this package to pom.xml.
2023-01-17 20:25:53 +08:00
e2d145cf5d [fix](fe)fix anti join bug (#15955)
* fix fe ut
2023-01-17 20:25:00 +08:00
02a7995171 [fix](planner)wrong result when has order by under join (#15974) 2023-01-17 20:20:56 +08:00
38663526b7 [fix](planner) Keep type of null literal expr when register conjuncts (#15878)
Currently, the type information of a child expr that is a NullLiteral gets lost in CastExpr#getResultValue. This produces a NullLiteral with Null type, which causes a BE core dump when doing the cast.
2023-01-17 16:48:02 +08:00
7e4bc1fee6 [fix](Nereids) add a rule to adjust nullable of all expressions (#15791)
Some rules change the nullability of outputs during the rewrite step, so we need a rule that adjusts nullability at the end of the rewrite step.

TODO
- remove the output slot map
- add nullable compare into slot reference
- use exprid to compare two slot if do not need to compare nullable
- merge all rules into one to adjust all type plans
2023-01-17 15:51:25 +08:00
d062ca2944 [refactor](vectorized) remove unnecessary vectorization check (#15984) 2023-01-17 12:21:46 +08:00
d98abb12f9 [fix](Nereids)set operation type coercion is diff with legacy planner (#15982) 2023-01-17 11:41:41 +08:00
ce1d19b373 [fix](Nereids) lateral view cannot bind function nested in generators (#15960) 2023-01-17 11:37:56 +08:00
6609eb804d [fix](regression) result of withUnionAll in query_p0/select_no_from is unstable (#15958) 2023-01-17 11:34:41 +08:00
8d25b156aa [fix](nereids) bind slot using exactly match (#15950)
Example:
unbound slot: k
bound slots: [k, t.k]

The previous binding algorithm found 2 candidate bindings here.
The bound slot k exactly matches the unbound slot k, so it now takes priority over t.k.
2023-01-17 11:25:08 +08:00
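The priority rule in the example can be sketched as follows. This Python snippet treats slots as plain strings and is only an illustration of "exact match wins"; the function `bind_slot` and the string-based matching are assumptions, not the Nereids binding code.

```python
from typing import List


def bind_slot(unbound: str, bound: List[str]) -> str:
    """Pick the bound slot for an unqualified name, preferring an exact match."""
    # Candidates are either the bare name or a qualified name ending in it.
    candidates = [s for s in bound if s == unbound or s.endswith("." + unbound)]
    exact = [s for s in candidates if s == unbound]
    chosen = exact if exact else candidates
    if not chosen:
        raise ValueError(f"cannot bind slot {unbound!r}")
    if len(chosen) > 1:
        raise ValueError(f"slot {unbound!r} is ambiguous: {chosen}")
    return chosen[0]
```

For the example in the commit message, `bind_slot("k", ["k", "t.k"])` selects `k` instead of reporting an ambiguity between the two candidates.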
b1caa68706 [Feature-WIP](inverted index) inverted index reader's implementation, and add mysql_fulltext regression case to test fulltext query (#15823)
Issue Number: Step2 of DSIP-023: Add inverted index for full text search
implementation of inverted index reader

dependency pr: #14211 #15807 #15821
2023-01-17 09:13:56 +08:00
806cd9fb3c [regression-test](topn)add test cases for nonkey topn query for each scalar type (#15790)
related to #15558 #15693
1. dup key table with 17 scalar datatypes
2. unique key table with mow enabled
3. unique key table with mow disabled
2023-01-16 16:49:59 +08:00
63d48564ed [fix](datetimev2) fix datetimev2 error with T (#15915)
Signed-off-by: nextdreamblue <zxw520blue1@163.com>
2023-01-16 15:30:48 +08:00
fa03c8a241 [feature](nereids) const folding for in-predicate with null literal (#15880)
select 1 in (2 , null)  => null
select 1 in (1 , null)  => true
select 1 not in (2 , null)  => null
select 1 not in (1 , null)  => false
2023-01-16 13:48:45 +08:00
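The four results above follow from SQL three-valued logic. A small Python sketch that reproduces them is shown here, with `None` standing in for SQL NULL; the function name `fold_in` is hypothetical and the code is not the constant-folding rule from the commit.

```python
from typing import Optional, Sequence


def fold_in(value: object, options: Sequence[object], negated: bool = False) -> Optional[bool]:
    """Evaluate `value [not] in (options)` under three-valued logic; None means NULL."""
    if value is None:
        return None
    if any(o is not None and o == value for o in options):
        result = True  # a definite match wins even if NULL is also present
    elif any(o is None for o in options):
        return None  # no match, but NULL might have matched: result is unknown
    else:
        result = False
    return (not result) if negated else result
```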
Pxl
81bab55d43 [Bug](function) catch function calculation error on aggregate node to avoid core dump (#15903) 2023-01-16 11:21:28 +08:00
0d61ca7cdd [chore](regression) remove redundant case (#15935) 2023-01-15 11:06:33 +08:00
5af7bcaa55 [Bug](decimalv3) Fix missing precision and scale in predicates (#15930) 2023-01-15 00:01:48 +08:00
429af016dd [fix](test) donot use same table name in a database (#15931) 2023-01-15 00:01:33 +08:00
a65044dbac [fix](nereids) unstable regression case in nereids_syntax_p0 (#15896) 2023-01-14 22:37:30 +08:00
29863112a4 (test) change remote_fragment_exec_timeout_ms in p0/p1 to 60 seconds (#15932)
The test_join case frequently failed due to send-fragment timeouts.
2023-01-14 22:07:35 +08:00
313e14d220 [Bugfix] (ROLLUP) fix the coredump when add rollup by link schema change (#15654)
Because the rollup has the same keys in the same order, BE performs a linked schema change: the base tablet's segments are linked to the new rollup tablet. However, column unique ids start from 0 in both the base tablet and the rollup tablet. In this case, unique id 4 is column 'city' in the base table but 'cost' in the rollup tablet, so BE decodes the varchar page as a bigint page and core dumps.

If a rollup can be added by linked schema change, the rollup is redundant: it brings no additional benefit and wastes storage space. So it needs to be rejected.
2023-01-14 10:20:07 +08:00
2580c88c1b [feature](multi-catalog) support oracle jdbc catalog (#15862) 2023-01-14 00:01:33 +08:00
a788623ee2 doris largeint type execute where query, the result is incorrect (#15034) 2023-01-13 23:12:02 +08:00
fbe68e7ec8 [regression-test](MTMV) Make the case test_create_mtmv more robust (addendum) (#15909) 2023-01-13 22:51:47 +08:00
514de605b6 [Bug](predicate) add double predicate creator (#15762)
Add a double predicate creator, analogous to the integer predicate creator.
2023-01-13 18:34:09 +08:00