Commit Graph

103 Commits

Author SHA1 Message Date
52e645abd2 [Feature](Nereids): support cte for update and delete statements of Nereids (#23384) 2023-08-31 23:36:27 +08:00
7379cdc995 [feature](nereids) support subquery in select list (#23271)
1. add scalar subquery's output to LogicalApply's output
2. for in and exists subquery's, add mark join slot into LogicalApply's output
3. forbid push down alias through join if the project list have any mark join slots.
4. move normalize aggregate rule to analysis phase
2023-08-31 15:51:32 +08:00
6f3e2a30e6 [Feat](Nereids) Add leading and ordered hint (#22057)
Add leading hint and ordered hint. Usage:
select /*+ ordered / * from a join b on xxx; which will limit join order to original order
select /+ leading ({b a}) */ from a join b on xxx; which will change join order to b join a.
2023-08-28 21:04:40 +08:00
f32efe5758 [Fix](Outfile) Fix that it does not report error when export table to S3 with an incorrect ak/sk/bucket (#23441)
Problem:
It will return a result although we use wrong ak/sk/bucket name, such as:
```sql
mysql> select * from demo.student
    -> into outfile "s3://xxxx/exp_"
    -> format as csv
    -> properties(
    ->   "s3.endpoint" = "https://cos.ap-beijing.myqcloud.com",
    ->   "s3.region" = "ap-beijing",
    ->   "s3.access_key"= "xxx",
    ->   "s3.secret_key" = "yyyy"
    -> );
+------------+-----------+----------+----------------------------------------------------------------------------------------------------+
| FileNumber | TotalRows | FileSize | URL                                                                                                |
+------------+-----------+----------+----------------------------------------------------------------------------------------------------+
|          1 |         3 |       26 | s3://xxxx/exp_2ae166e2981d4c08-b577290f93aa82ba_ |
+------------+-----------+----------+----------------------------------------------------------------------------------------------------+
1 row in set (0.15 sec)
```

The reason for this is that we did not catch the error returned by `close()` phase.
2023-08-26 00:19:30 +08:00
18094511e7 [fix](Outfile/Nereids) fix that csv_with_names and csv_with_names_and_types file format could not be exported on nereids (#23387)
This problem is casued by #21197

Fixed an issue that `csv_with_names` and `csv_with_names_and_types` file format could not be exported on nereids optimizer when using `select...into outfile`.
2023-08-25 11:12:04 +08:00
7cfb3cc0aa [fix](functions) fix function substitute for datetimeV1/V2 (#23344)
* fix

* function fe
2023-08-25 09:59:38 +08:00
f6c5c8f7b5 [Fix](Nereids) fix that select...from tablets() are invalidated when there exists predicates (#23365)
Problem: `select...from tablets()` are invalidated when there exists predicates, such as:
```sql
// The all data is:
mysql> select * from student3;
+------+------+------+
| id   | name | age  |
+------+------+------+
|    1 | ftw  |   18 |
|    3 | yy   |   19 |
|    4 | xx   |   21 |
|    2 | cyx  |   20 |
+------+------+------+

// when we specified tablet to read:
mysql> select * from student3 tablet(131131);
+------+------+------+
| id   | name | age  |
+------+------+------+
|    1 | ftw  |   18 |
|    3 | yy   |   19 |
+------+------+------+

// Howerver, when there exists predicates, the `tablet(131131)` is invalidated
mysql> select * from student3 tablet(131131) where id > 1;
+------+------+------+
| id   | name | age  |
+------+------+------+
|    4 | xx   |   21 |
|    3 | yy   |   19 |
|    2 | cyx  |   20 |
+------+------+------+
```

After the fix, we get promising data
```sql
mysql> select * from student3 tablet(131131) where id > 1;
+------+------+------+
| id   | name | age  |
+------+------+------+
|    3 | yy   |   19 |
+------+------+------+
```
2023-08-24 23:29:59 +08:00
51ac92f65c Revert "[fix](function) to_bitmap parameter parsing failure returns null instead of bitmap_empty (#21236)" (#23368)
This reverts commit 1c3cc77a54938ed948ad8186b8dea8385977d23c.
2023-08-23 18:27:35 +08:00
5d9678700c [feature](Nereids) support select tablets with nereids optimizer (#23164) 2023-08-22 10:14:27 +08:00
1c3cc77a54 [fix](function) to_bitmap parameter parsing failure returns null instead of bitmap_empty (#21236)
* [fix](function) to_bitmap parameter parsing failure returns null instead of bitmap_empty

* add ut

* fix nereids

* fix regression-test
2023-08-18 14:37:49 +08:00
7de362f646 [fix](Nereids): expand other join which has or condition (#22809) 2023-08-15 16:49:19 +08:00
41ff48f838 [regresstion][external]fix case test_show_where and es_query 0811 (#22898) 2023-08-12 19:41:55 +08:00
045843991a [Fix](Nereids) fix insert into table of random distribution for nereids (#22831)
currently insert into a table of random distribution info is not supported, we fix it by set physical properties to Any.
2023-08-11 19:26:39 +08:00
658d75c816 [feature](Nereids): normalize join condition after expanding or condition NLJ (#22555) 2023-08-04 13:37:37 +08:00
4322fdc96d [feature](Nereids): add or expansion in CBO(#22465) 2023-08-03 13:29:33 +08:00
3a1d678ca9 [Fix](Planner) fix parse error of view with group_concat order by (#22196)
Problem:
    When create view with projection group_concat(xxx, xxx order by orderkey). It will failed during second parse of inline view

For example:
    it works when doing 
    "SELECT id, group_concat(`name`, "," ORDER BY id) AS test_group_column FROM  test GROUP BY id"
    but when create view it does not work
    "create view test_view as SELECT id, group_concat(`name`, "," ORDER BY id) AS test_group_column FROM  test GROUP BY id"

Reason:
    when creating view, we will doing parse again of view.toSql() to check whether it has some syntax error. And when doing toSql() to group_concat with order by, it add seperate ', ' between second parameter and order by. So when parsing again, it
would failed because it is different semantic with original statement.
    group_concat(`name`, "," ORDER BY id)  ==> group_concat(`name`, "," , ORDER BY id)

Solved:
    Change toSql of group_concat and add order by statement analyze() of group_concat in Planner cause it would work if we get order by from view statement and do not analyze and binding slot reference to it
2023-07-31 17:20:23 +08:00
0c734a861e [Enhancement](delete) eliminate reading the old values of non-key columns for delete stmt (#22270) 2023-07-28 14:37:33 +08:00
be69025878 [opt](Nereids) add partial update support for delete stmt (#22184)
Currently, the new optimizer don't consider anything about partial update.
This PR add the ability to convert a delete statement to a partial update insert statement
for merge-on-write unique table
2023-07-26 17:34:31 +08:00
21a3593a9a [fix](Nereids) translate failed when enable topn two phase opt (#22197)
1. should not add rowid slot to reslovedTupleExprs
2. should set notMaterialize to sort's tuple when do two phase opt
2023-07-26 11:38:50 +08:00
732e0d14ff [Enhancement](window-funnel)add different modes for window_funnel() function (#20563) 2023-07-21 13:57:27 +08:00
ee65e0a6b1 [fix](Nereids) should not remove any limit from uncorrelated subquery (#21976)
We should not remove any limit from uncorrelated subquery. For Example
```sql
-- should return nothing, but return all tuple of t if we remove limit from exists
SELECT * FROM t WHERE EXISTS (SELECT * FROM t limit 0);

-- should return the tuple with smallest c1 in t,
-- but report error if we remove limit from scalar subquery
SELECT * FROM t WHERE c1 = (SELECT * FROM t ORDER BY c1 LIMIT 1);
```
2023-07-20 18:37:04 +08:00
86d7233b06 [fix](nereids) ExtractAndNormalizeWindowExpression rule should push down correct exprs to child (#21827)
consider the window function:
```sql
substr(
ref_1.cp_type,
sum(CASE WHEN ref_1.cp_type = 0 THEN 3 ELSE 2 END) OVER (),
1)
```
Before the pr, only "CASE WHEN ref_1.cp_type  = 0 THEN 3 ELSE 2 END" is pushed down.
But both "ref_1.cp_type" and "CASE WHEN ref_1.cp_type  = 0 THEN 3 ELSE 2 END"
should be pushed down.
This pr fix it
2023-07-20 11:47:55 +08:00
fff1983f40 [fix](planner)use tupleId of agg node to get its unsigned conjuncts (#21949) 2023-07-19 00:46:49 +08:00
c9a99ce171 [Feature](Nereids) support udf for Nereids (#18257)
Support alias function, Java UDF, Java UDAF for Nereids.
Implementation:
UDFs(alias function, Java UD(A)F) are saved in database object, we get it by FunctionDesc, which requires function name and arg types. So firstly we bind expressions of its children so that we can get the return type of args. Then we get the best selection.

Secondly:
For alias function:
The original function of the alias function is represented as original planner-style function, it's too hard to translate it to nereids-style expression hence we transfer it to the corresponding sql and parse it. Now we get the nereids-style function, and try to bind the function.
the bound function will also change the type by add cast node of its children to its expecting input types, so that if we travel a bound function more than one times, the cast node will be different. To solve the problem, we add a flag isAnalyzedFunction. it's set false by default and will be set true when return from the visitor function. If the flag is true, it will return immediately in visitor function.

Now we can ensure that the bound functions in children will be the same though we travel it more than one time. we can replace the alias function to its original function and bind the unbound functions.

For JavaUDF and JavaUDAF
JavaUDF and JavaUDAF can be recognized as a catalog function and hard to be entirely translated to Nereids-style function, we create a nereids expression object JavaUdf and JavaUdaf to wrap it.

All in all, now Nereids support UDFs and nesting them.
2023-07-14 17:02:01 +08:00
d4bdd6768c [Feature](Nereids) support select into outfile (#21197) 2023-07-13 17:01:47 +08:00
8973610543 [feature](datetime) "timediff" supports calculating microseconds (#21371) 2023-07-10 19:21:32 +08:00
0469c02202 [Test](regression) Temporarily disable quickTest for SHOW CREATE TABLE to adapt to enable_feature_binlog=true (#21247) 2023-07-05 10:12:02 +08:00
4d84cd8ca1 Revert "Revert "[Test](regression) CCR syncer thrift interface regression test (#20935)" (#20990)" (#21022)
This reverts commit 2a294801f1324a999570158eea3224239eefbb29.
2023-06-21 15:20:21 +08:00
824bc02603 [Function] Support date function: microsecond() (#20044) 2023-06-20 10:32:54 +08:00
2a294801f1 Revert "[Test](regression) CCR syncer thrift interface regression test (#20935)" (#20990)
This reverts commit dd482b74c849b022862e7cfb1f1d0b933a84e3d2.
2023-06-19 21:38:03 +08:00
5ae14549d1 [Feature](Nereids) support delete using syntax to delete data from unique key table (#20452) 2023-06-18 16:22:21 +08:00
dd482b74c8 [Test](regression) CCR syncer thrift interface regression test (#20935) 2023-06-18 00:13:09 +08:00
5573858cb4 [Enhance](regression-test) use another db than test_query_db for nereids_p0 (#19467)
replace test_query_db to nereids_test_query_db in nereids_p0 directory
split test_join into five files to run faster.
2023-06-16 16:11:14 +08:00
Pxl
a0d4f11667 [Bug](function) catch error state in function cast to avoid core dump (#20751)
catch error state in function cast to avoid core dump
2023-06-14 17:34:34 +08:00
feb21fc9e9 [fix](group_concat) use default seperator ',' instead of ', ' for group_concat, to be consistant with mysql (#20741) 2023-06-13 17:20:29 +08:00
c25c19bddc [test](regression) Add cases to test join condition push and not like (#20453)
Add testing cases to issue #19613
2023-06-12 18:26:23 +08:00
10134ea8c6 [fix](planner) fix RewriteInPredicateRule may be useless (#20668)
Issue Number: close #20669

RewriteInPredicateRule may cast InPredicate expr's two child to the same type, for example: where cast(age as char) in ('11'), the type of age is int, RewriteInPredicateRule will cast expr's two child type to int. As in the example above, child 0 will be such struct: 
```
child 0: type: int
    |---  child: type : char
            |-- child: type : int
```

Due to the RewriteInPredicateRule cast the type of the expr to int, it will reanalyze stmt, but it will reset stmt first before reanalyze the stmt, and reset opt will change child 0 to such struct:
```
child: type : char
    |-- child: type : int
```
It cause two child's type will be cast to varchar in func castAllToCompatibleType, the logic of RewriteInPredicateRule will be useless.

In 1.1-lts and 1.2-lts, such case  " where cast(age as char) in ('11')"  can't work well,  because func castAllToCompatibleType will cast int to char but int can't cast to char(master can work well because func castAllToCompatibleType will cast int to varchar in such case).
```
MySQL [test]> select user_id from test_cast where cast(age as char) in ('45');
ERROR 1105 (HY000): errCode = 2, detailMessage = type not match, originType=INT, targeType=CHAR(*)
```
2023-06-12 14:39:01 +08:00
d2d6ce5d0b [fix](nereids) add push down filter and project through cte anchor rules (#20547)
we should not plan any Filter or Project above CteAnchor, because there are project or filter under anchor sometimes.
and the whole plan can not translate to a valid plan for BE.
2023-06-08 10:34:42 +08:00
05bdbce8fc [Feature](Nereids) support update unique table statement (#20313) 2023-06-06 20:32:43 +08:00
ae428c29e2 [feature](planner)(nereids) support user defined variable (#20334)
Support user-defined variables.
After this PR, we can use `set @a = xx` to define a user variable and use it in the query like `select @a`.

the changes of this PR:
1. Support the grammar for `set user variable` in the parser.
2. Add the `userVars` in `VariableMgr` to store the user-defined variables.
3. For the `set @a = xx`, we will store the variable name and its value in the `userVars` in `VariableMgr`.
4. For the `select @a`, we will get the value for the variable name in `userVars`.
2023-06-06 14:35:16 +08:00
1b94b6368f [fix](load) in strict mode, return error for insert if datatype convert fails (#20378)
* [fix](load) in strict mode, return error for load and insert if datatype convert fails

Revert "[fix](MySQL) the way Doris handles boolean type is consistent with MySQL (#19416)"

This reverts commit 68eb420cabe5b26b09d6d4a2724ae12699bdee87.

Since it changed other behaviours, e.g. in strict mode insert into t_int values ("a"),
it will result 0 is inserted into table, but it should return error instead.

* fix be ut

* fix regression tests
2023-06-06 12:04:03 +08:00
c6387847aa [fix](nereids) change defaultConcreteType function's return value for decimal (#20380)
1. add default decimalv2 and decimalv3 for NullType
2. change defaultConcreteType of decimalv3 to this
2023-06-05 10:50:07 +08:00
a8a4da9b9e [fix](nereids)dphyper join reorder may cache wrong project list for project node (#20209)
* [fix](nereids)dphyper join reorder may cache wrong project list for project node
2023-06-02 09:35:28 +08:00
519f01133a [feature](decimal)support cast rounding half up and div precision increment in decimalv3. (#19811) 2023-06-01 13:09:58 +08:00
cc41cb0e7e [Fix](Nereids) fix some insert into select bugs (#20052)
fix 3 bugs:

1. failed to insert into a table with mv.
```sql
create table t (
    id int,
   c1 int,
   c2 int,
   c3 int
) duplicate key(id)
distributed by hash(id) buckets 4

create materialized view k12s3m as select id, sum(c1), max(c3) from t group by id;

insert into t select -4, -4, -4, 'd';
```
insert will rise exception because mv column is not handled. now we will add a target column and value as defineExpr.

2. failed to insert into a table with not all the columns.
```sql
insert into t(c1, c2) select c1, c2 from t
```
and t(id ukey, c1, c2, c3), will insert too many data, we fix it by change the output partitions.

3. failed to insert into a table with complex select.
the select statement has join or agg, fix the bug by the way similar to the one at 2nd bug.
2023-06-01 12:15:19 +08:00
68e593fbf1 [fix](nereids)(planner) case when should return NullLiteral when all case result is NullLiteral (#20280) 2023-06-01 11:11:41 +08:00
1f22aa6961 [fix](nereids) like function's nullable property should be PropagateNullable (#20237) 2023-05-31 12:13:38 +08:00
accaff1026 [Feature](compaction) wip: single replica compaction (#19237)
Currently, compaction is executed separately for each backend, and the reconstruction of the index during compaction leads to high CPU usage. To address this, we are introducing single replica compaction, where a specific primary replica is selected to perform compaction, and the remaining replicas fetch the compaction results from the primary replica.

The Backend (BE) requests replica information for all peers corresponding to a tablet from the Frontend (FE). This information includes the host where the replica is located and the replica_id. By calculating hash(replica_id), the replica with the smallest hash value is responsible for executing compaction, while the remaining replicas are responsible for fetching the compaction results from this replica.
The compaction task producer thread, before submitting a compaction task, checks whether the local replica should fetch from its peer. If it should, the task is then submitted to the single replica compaction thread pool.
When performing single replica compaction, the process begins by requesting rowset versions from the target replica. These rowset_versions are then compared with the local rowset versions. The first version that can be fetched is selected.
2023-05-30 21:12:48 +08:00
a855253543 [fix](Nereids) filter should not push through union to OneRowRelation (#20132)
## Problem summary
When we want to push the filter through the union. We should check whether the union's children are `OneRowRelation` or not. If there are some `OneRowRelation`, we shouldn't push down the filter to that part

Before this PR
```
mysql> select * from (select 1 as a, 2 as b union all select 3, 3) t where a = 1;
+------+------+
| a    | b    |
+------+------+
|    1 |    2 |
|    3 |    3 |
+------+------+
2 rows in set (0.01 sec)
```

After this PR
```
mysql> select * from (select 1 as a, 2 as b union all select 3, 3) t where a = 1;
+------+------+
| a    | b    |
+------+------+
|    1 |    2 |
+------+------+
1 row in set (0.38 sec)
```
2023-05-30 17:06:52 +08:00
55ccddb62c [Conf](decimalv3) enable decimalv3 by default 2023-05-29 15:38:31 +08:00