Commit Graph

204 Commits

Author SHA1 Message Date
f85f89f240 [fix](planner) Fix inconsistency between groupby expression and output of aggregation node (#17438) 2023-03-07 09:38:20 +08:00
aedbc5fcb1 [fix](planner) Slots in the conjuncts of table function node didn't get materialized #17460 2023-03-07 08:50:33 +08:00
ee1be6edd7 [chore](fe) enhance_mysql_data_type (#17429) 2023-03-06 10:42:01 +08:00
a8f20eb4ac [Enhancement](schema_scanner) Optimize the performance of reading information schema tables (#17371)
batch fill block
batch call rpc from FE to get table desc
For 340,000 columns:

SELECT COUNT( * ) FROM information_schema.columns;
time: 10.3s --> 0.4s
2023-03-06 09:53:01 +08:00
27352afdf6 [fix](fe)support multi distinct group_concat (#17237)
* [fix](fe)support multi distinct group_concat

* update based on comments
2023-03-02 17:53:13 +08:00
722755efe9 [fix](planner) change back legacy planner type coercion (#17070)
revert legacy planner change in #16844
2023-03-01 20:55:56 +08:00
979cf42d7a [Bug](decimalv3) Use correct decimal scale for function round (#17232)
Co-authored-by: maochongxin <maochongxin@gmail.com>
2023-03-01 12:28:41 +08:00
1dd2a41e38 [vectorized](bug) fix window function can't handle first row of beyond (#17084)
Issue Number: close #16845
2023-02-28 17:30:23 +08:00
3e40467ce6 [Bug](vec) Fix Chinese pinyin order by (#17152)
Bug: some Chinese characters were not sorted by pinyin under GBK encoding

CREATE TABLE `test_convert` (
                 `a` varchar(100) NULL
             ) ENGINE=OLAP
               DUPLICATE KEY(`a`)
               DISTRIBUTED BY HASH(`a`) BUCKETS 3
               PROPERTIES (
               "replication_allocation" = "tag.location.default: 1"
               );
insert into test_convert values("b"), ("a"), ("c"), ("睿"), ("多"), ("丝");
Query OK, 6 rows affected (0.03 sec)
{'label':'insert_ca73a6acc2194d5b_888218a3949355a6', 'status':'VISIBLE', 'txnId':'18068'}
mysql [test]>select * from test_convert;
+------+
| a    |
+------+
| a    |
| c    |
| 丝   |
| b    |
| 多   |
| 睿   |
+------+
6 rows in set (0.01 sec)
mysql [test]>select * from test_convert order by convert(a using gbk);          
+------+
| a    |
+------+
| a    |
| b    |
| c    |
| 多   |
| 丝   |
| 睿   |
+------+
6 rows in set (0.01 sec)
2023-02-28 14:29:56 +08:00
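The second result set in commit 3e40467ce6 matches a plain byte-wise comparison of GBK encodings. A minimal Python sketch of that ordering follows. It assumes (this is not taken from the Doris source) that `convert(a using gbk)` yields the GBK bytes of each value and that `order by` compares those bytes; `gbk_sort` is a hypothetical helper name.

```python
# Sketch only: models `order by convert(a using gbk)` as a byte-wise sort
# of GBK-encoded strings. This is an assumption about the semantics,
# not Doris code.
values = ["b", "a", "c", "睿", "多", "丝"]


def gbk_sort(items):
    """Return the items ordered by their GBK byte encoding."""
    return sorted(items, key=lambda s: s.encode("gbk"))


ordered = gbk_sort(values)
print(ordered)
```

ASCII letters encode to single low bytes, so they sort ahead of the two-byte Chinese characters.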
17c8123371 [test](regression) add some regression cases on constant evaluation. (#16599) 2023-02-28 10:57:37 +08:00
c0360f80bb [enhancement](aggregate-function) enhance aggregate function collect and add group_array aliases (#15339)
Enhance aggregate functions `collect_set` and `collect_list` to support an optional `max_size` param,
which limits the number of elements in the result array.
2023-02-27 14:22:30 +08:00
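The effect of the optional `max_size` param in commit c0360f80bb can be sketched in Python. This is an illustration of the described behaviour only. It assumes that collection simply stops once `max_size` elements are held, and the real aggregate functions give no guarantee about element order.

```python
# Sketch only: models the optional `max_size` argument of collect_list /
# collect_set. Assumption: elements are kept in arrival order and anything
# past `max_size` is dropped.
def collect_list(values, max_size=None):
    """Collect values into a list, keeping at most max_size elements."""
    result = []
    for v in values:
        if max_size is not None and len(result) >= max_size:
            break
        result.append(v)
    return result


def collect_set(values, max_size=None):
    """Collect distinct values, keeping at most max_size elements."""
    seen = []
    for v in values:
        if v in seen:
            continue
        if max_size is not None and len(seen) >= max_size:
            break
        seen.append(v)
    return seen
```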
b5d67781a2 [Fix](function) fix datetime-diff function's overflow (#16935) 2023-02-24 20:06:06 +08:00
cf5bc9594b [fix](planner) conjuncts of the outer query block didn't work when it's on the results expr of inline view (#17036)
Here is a case:

select id, name
from (select '123' as id, '1234' as name, age from test_insert ) a
where name != '1234';
2023-02-24 15:27:34 +08:00
479d57df88 [fix](planner) the project expr should be calculated in join node in some case (#17035)
Consider the SQL below:

select sum(cc.qlnm) as qlnm
FROM
  outerjoin_A
  left join (SELECT
      outerjoin_B.b,
      coalesce(outerjoin_C.c, 0) AS qlnm
    FROM
      outerjoin_B
      inner JOIN outerjoin_C ON outerjoin_B.b = outerjoin_C.c
  ) cc on outerjoin_A.a = cc.b
group by outerjoin_A.a;

The coalesce(outerjoin_C.c, 0) was calculated in the agg node, which is wrong.
This PR corrects this, and the expr is now calculated in the inner join node.
2023-02-24 15:20:05 +08:00
883f575cfe [fix](string function) fix wrong usage of iconv_open (#17048)
* [fix](string function) fix wrong usage of iconv_open

Also add test case for function convert

* fix test case
2023-02-24 09:13:10 +08:00
526a66e9fb [Function](array-type) support array_apply (#17020)
Filter array to match specific binary condition

```
mysql> select array_apply([1000000, 1000001, 1000002], '=', 1000002);
+-------------------------------------------------------------+
| array_apply(ARRAY(1000000, 1000001, 1000002), '=', 1000002) |
+-------------------------------------------------------------+
| [1000002]                                                   |
+-------------------------------------------------------------+
```
2023-02-23 17:38:16 +08:00
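The filtering described in commit 526a66e9fb can be modelled in a few lines of Python. This is a sketch of the semantics only. The commit demonstrates just `'='`, so the rest of the operator table below is an assumption.

```python
import operator

# Assumed operator set: the commit message only demonstrates '='.
_OPS = {
    "=": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def array_apply(arr, op, value):
    """Keep the elements of arr for which `element op value` holds."""
    try:
        compare = _OPS[op]
    except KeyError:
        raise ValueError(f"unsupported operator: {op!r}") from None
    return [x for x in arr if compare(x, value)]


# Mirrors the query shown in the commit message.
result = array_apply([1000000, 1000001, 1000002], "=", 1000002)
print(result)
```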
7956800df7 [refactor](Nereids) let type coercion same with legacy planner (#16844)
- change for Nereids
1. add a variable length parameter to the ctor of Count for a good error reporting of Count(a, b)
2. refactor StringRegexPredicate, let it inherit from ScalarFunction
3. remove useless class TypeCollection
4. use catalog.Type.Collection to check expression arguments type
5. change type coercion for TimestampArithmetic, divide, integral divide, comparison predicate, case when and in predicate. Let them same as legacy planner.

- change for legacy planner
1. change the common type of floating and Decimal from Decimal to Double
2023-02-22 17:29:37 +08:00
a95f47ac0a [enhancement](planner) Support filtering the output of set operation node (#16666) 2023-02-21 19:22:09 +08:00
8b70bfdc31 [Feature](map-type) Support stream load and fix some bugs for map type (#16776)
1. support stream load with json and csv formats for map
2. fix the olap convertor for compaction on map columns that contain nulls
3. support select outToFile for map
4. add some regression tests
2023-02-19 15:11:54 +08:00
d6a841409f [Enhancement](func)Introduce non_nullable extraction function. #16621
Introduced a new function non_nullable to BE, which extracts the concrete data column from a nullable column. If the input argument is not a nullable column, an error is raised.
2023-02-18 20:44:07 +08:00
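The contract of `non_nullable` from commit d6a841409f can be illustrated with a toy column model. The `Column` and `NullableColumn` classes below are hypothetical stand-ins, not the BE column types. The sketch only shows the extract-or-error behaviour described in the message.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Column:
    """Hypothetical concrete data column."""
    values: List[int]


@dataclass
class NullableColumn:
    """Hypothetical nullable wrapper: nested data plus a null map."""
    nested: Column
    null_map: List[bool]


def non_nullable(col):
    """Return the nested data column; reject input that is not nullable."""
    if not isinstance(col, NullableColumn):
        raise TypeError("non_nullable expects a nullable column")
    return col.nested
```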
861e4bc64a [fix](planner) Nullable of slot descriptor is mistaken and cause BE crash #16862 2023-02-18 20:39:56 +08:00
6acee1ce88 [Fix](topn opt) double check plan From OriginalPlanner to make sure optimized SQL is a general topn query (#16848)
With the original logic, a query like `select * from a where exists (select * from b order by 1) order by 1 limit 1` contains a subquery,
but the top query still passes `checkEnableTwoPhaseRead` and sets `isTwoPhaseOptEnabled=true`. So the plan needs to be double-checked to confirm it is a general topn query plan, and the needMaterialize flag set by the previous `analyze` needs to be rolled back.
2023-02-17 10:59:35 +08:00
de1337511c [Bug](Datetime) Fix date time function mem use after free (#16814) 2023-02-16 16:15:58 +08:00
0c56a4622c [Feature](struct-type) Add implicit cast for struct-type (#16613)
Inserting {1, 'a'} into struct<f1:tinyint, f2:varchar(20)> is currently not supported.
This commit adds an implicit cast of the char type in the struct to varchar.
Add implicit cast for struct-type.
2023-02-15 16:55:00 +08:00
0c20c607b2 fix stats (#16556) 2023-02-10 11:00:01 +08:00
f0b0eedbc5 [fix](planner)group_concat lost order by info in second phase merge agg (#16479) 2023-02-08 20:48:52 +08:00
41947c73eb [Feature](array-function) Support array functions for nested type datev2 and datetimev2 (#16382) 2023-02-08 12:51:07 +08:00
289a4b2ea4 [fix](func) fix truncate float type result error (#16468)
When the argument of the truncate function is of float type, it can match both truncate(DECIMALV3) and truncate(DOUBLE). If it matches truncate(DECIMALV3), precision is lost when converting float to DECIMALV3(38, 0).

Here I modify it to match truncate(DOUBLE) for now. We may still need to solve the problem of losing precision when converting float to DECIMALV3.
2023-02-08 08:57:43 +08:00
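The precision loss described in commit 289a4b2ea4 can be reproduced with a small Python sketch. The function names are hypothetical, and the rounding mode used for the scale-0 conversion is an assumption. The point is only that the fractional digits are gone before truncation runs.

```python
import math
from decimal import Decimal, ROUND_HALF_UP


def truncate_double(x, d):
    """Truncate x to d decimal places, working on a double."""
    factor = 10 ** d
    return math.trunc(x * factor) / factor


def truncate_via_decimal_scale0(x, d):
    """Model the buggy path: convert to a scale-0 decimal, then truncate."""
    # Scale 0 keeps no fractional digits (rounding mode is an assumption).
    as_decimal = Decimal(x).quantize(Decimal(1), rounding=ROUND_HALF_UP)
    return truncate_double(float(as_decimal), d)


print(truncate_double(1.25, 1))
print(truncate_via_decimal_scale0(1.25, 1))
```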
1d0fdff98a [Bug](sort) disable 2phase read for sort by expressions other than slotref (#16460)
```
create table tbl1 (k1 varchar(100), k2 string) distributed by hash(k1) buckets 1 properties("replication_num" = "1");

insert into tbl1 values(1, "alice");

select cast(k1 as INT) as id from tbl1 order by id limit 2;
```

The above query could pass `checkEnableTwoPhaseRead` because the order by element is a SlotRef, even though it actually refers to a function call expr.
2023-02-07 19:42:54 +08:00
dccd04a3ba [fix](fe)predicate is wrongly pushed through CUBE function (#15831) 2023-02-06 11:29:15 +08:00
09870098af [fix](func) fix core dump when the pattern of the regexp_extract_all function does not contain subpatterns (#16408) 2023-02-05 01:16:54 +08:00
ca7b2e27a8 [regression-test](function) add regression test for money_format with truncate (#16052) 2023-02-04 23:10:01 +08:00
918004c016 [Bug](date) Fix BE crash caused by function datediff (#16397)
* [Bug](date) Fix BE crash caused by function `datediff`

* update
2023-02-04 18:43:23 +08:00
5e232a30d8 [fix](planner) Doris returns empty sets when selecting from an inline view (#16370)
Doris always delays the execution of expressions as long as it can, including the expansion of constant expressions. Given the SQL below:

```sql
select i from (select 'abc' as i, sum(birth) as j from  subquerytest2) as tmp
```

The aggregation would be eliminated, since its output is not required by the outer block, but the expansion of the constant expression would be done in the final result expr. Since the aggregate output has been eliminated, the expansion would actually do nothing, finally causing an empty result.

To fix this, we materialize the result exprs in the inner block for such SQL. It may affect performance, but that is better than letting the system produce a wrong result.
2023-02-03 21:23:52 +08:00
941e192019 [enhancement](test) add function case date_sub(datetime,INTERVAL dayofmonth(datetime)-1 DAY) (#16306) 2023-02-02 09:56:01 +08:00
bb0d4ba787 [BugFix](sort) use correct agg function when using 2 phase sort for agg table (#16185) 2023-02-01 20:07:43 +08:00
e3c8fffd99 [function](round) fix decimal scale for scale not specified (#15541) 2023-02-01 14:58:48 +08:00
95d7c2de26 [Refactor](function) Rewrite the function elt (#16287) 2023-02-01 11:17:06 +08:00
ca7eb94f23 [improvement](agg-function) Increase the limit maximum number of agg function parameters (#15924) 2023-01-31 21:03:50 +08:00
e7c1d81419 [fix](planner) Pushdown constant predicate to all scan nodes in the literal view. #16217
Before this PR, the planner might push a constant FALSE predicate to the wrong scan nodes in the literal view, making the predicate useless.
2023-01-30 22:18:43 +08:00
6bebf92254 [fix][FE] fix BE coredump when children of FunctionCallExpr are folded (#16064)
Co-authored-by: shizhiqiang03 <shizhiqiang03@meituan.com>
fix BE coredump when children of FunctionCallExpr are folded
2023-01-30 15:25:00 +08:00
69e748b076 [fix](schema scanner)change schema_scanner::get_next_row to get_next_block (#15718) 2023-01-30 10:01:50 +08:00
eb7da1c0ee [fix](datatype) fix some bugs about data type array datetimev2 and decimalv3 (#16132) 2023-01-29 14:26:08 +08:00
578a855b3e [Bug](topn-opt) filter condition for analytic info for two phase read opt (#16173)
two phase read optimization should not be enabled when query has analytic info
2023-01-29 12:06:18 +08:00
b919cbe487 [enhancement](nereids) Enhancement for limit clause (#16114)
Support limit offset without order by.
The legacy planner supports this feature in PR #15218.
2023-01-28 11:04:03 +08:00
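The row selection that `limit ... offset ...` performs, as enabled without `order by` in commit b919cbe487, can be sketched in Python. `limit_offset` is a hypothetical helper. Without an ordering, which rows arrive first depends on the scan, so only the skip-then-take behaviour is modelled here.

```python
from itertools import islice


def limit_offset(rows, limit, offset=0):
    """Skip `offset` rows, then return at most `limit` rows."""
    return list(islice(rows, offset, offset + limit))


print(limit_offset(range(10), 3, 4))
```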
9ffd109b35 [fix](datetimev2) Fix BE datetimev2 type returning wrong result (#15885) 2023-01-20 22:25:20 +08:00
abdf56bfa5 [fix](Nereids) wrong result of group_concat with order by or null args (#16081)
1. signatures without order element are wrong
2. signature with one arg is missing
3. group_concat should be NullableAggregateFunction
4. fold constant on fe should not fold NullableAggregateFunction with null arg

TODO
1. reorder rewrite rules, and then only forbid fold constant on NullableAggregateFunction with alwaysNullable == true
2023-01-19 11:22:30 +08:00
feeb69438b [opt](Nereids) optimize DistributeSpec generator of OlapScan (#15965)
use the size of selected partitions instead of olap table partition size to decide whether to generate hashDistributeSpec
2023-01-18 20:18:11 +08:00
e2d145cf5d [fix](fe)fix anti join bug (#15955)
* [fix](fe)fix anti join bug

* fix fe ut
2023-01-17 20:25:00 +08:00
6609eb804d [fix](regression) result of withUnionAll in query_p0/select_no_from is unstable (#15958) 2023-01-17 11:34:41 +08:00