Commit Graph

1326 Commits

Author SHA1 Message Date
Pxl
5f0bb49d46 [Feature](materialized-view) support create mv contain aggstate column (#20812)
support create mv contain aggstate column
2023-06-21 13:06:52 +08:00
18beb822a3 [FIX](array-type) fix array string output with fe const expr (#21042)
fe foldconstRule make array() function expr with const literal , and would not pass this array literal to be . but we should make fe array string output format is same with be array string output
2023-06-21 11:52:02 +08:00
0cf9de8cef [fix](decimalv3) fix result error when cast a round decimalv3 to double (#20678) 2023-06-21 00:02:48 +08:00
2c11ce0a02 [bugfix](topn) fix key topn merge block conflict with index predicate result columns (#20820) 2023-06-20 21:23:00 +08:00
f10258577b [Fix](Planner) Fix group concat with multi distinct and segs (#20912)
Problem:
when use select group_concat(distinct a, 'seg1'), group_concat(distinct b, 'seg2') ... Error would rised
Reason:
Group_concat function regard 'seg' as arguments also, so multi distinct column error would rised
Solved:
let Multi Distinct group_concat function only get first argument as real argument
2023-06-20 21:00:18 +08:00
7e01f074e2 [improvement](jdbc mysql) support auto calculate the precision of timestamp/datetime (#20788) 2023-06-20 10:39:34 +08:00
824bc02603 [Function] Support date function: microsecond() (#20044) 2023-06-20 10:32:54 +08:00
d02ecef406 [fix](Nereids): revert push down alias into union (#20991)
revert #20543 to tmp avoid problem
2023-06-20 09:32:26 +08:00
5a28b6f9fc [fix](datetime) Fix the error in date calculation that includes constants (#20863)
before

```
mysql> select hours_add('2023-03-30 22:23:45.23452',8);
+-------------------------------------+
| hours_add('2023-03-30 22:23:45', 8) |
+-------------------------------------+
| 2023-03-31 06:23:45                 |
+-------------------------------------+

mysql> select date_add('2023-03-30 22:23:45.23452',8);
+------------------------------------+
| date_add('2023-03-30 22:23:45', 8) |
+------------------------------------+
| 2023-04-07 22:23:45                |
+------------------------------------+

mysql [test]>select hours_add('2023-03-30 22:23:45.23452',8);
+-------------------------------------------+
| hours_add('2023-03-30 22:23:45.23452', 8) |
+-------------------------------------------+
| 2023-03-31 06:23:45.000234                |
+-------------------------------------------+
```

after

```
mysql [test]>select hours_add('2023-03-30 22:23:45.23452',8);
+-------------------------------------------+
| hours_add('2023-03-30 22:23:45.23452', 8) |
+-------------------------------------------+
| 2023-03-31 06:23:45.23452                 |
+-------------------------------------------+
1 row in set (0.01 sec)

mysql [test]>select date_add('2023-03-30 22:23:45.23452',8);
+------------------------------------------+
| date_add('2023-03-30 22:23:45.23452', 8) |
+------------------------------------------+
| 2023-04-07 22:23:45.23452                |
+------------------------------------------+
1 row in set (0.00 sec)

mysql [test]>set enable_nereids_planner=true;
Query OK, 0 rows affected (0.00 sec)

mysql [test]>set enable_fallback_to_original_planner=false;
Query OK, 0 rows affected (0.00 sec)

mysql [test]>select hours_add('2023-03-30 22:23:45.23452',8);
+-------------------------------------------+
| hours_add('2023-03-30 22:23:45.23452', 8) |
+-------------------------------------------+
| 2023-03-31 06:23:45.23452                 |
+-------------------------------------------+
1 row in set (0.03 sec)

mysql [test]>select date_add('2023-03-30 22:23:45.23452',8);
+------------------------------------------+
| days_add('2023-03-30 22:23:45.23452', 8) |
+------------------------------------------+
| 2023-04-07 22:23:45.23452                |
+------------------------------------------+
1 row in set (0.00 sec)
```
2023-06-19 23:44:30 +08:00
e6f50c04f1 [fix](nereids)SubqueryToApply rule lost is null condition (#20971)
* [fix](nereids)SubqueryToApply rule lost is null condition
2023-06-19 23:43:40 +08:00
f20ef165fe [opt](Nereids) update join stats derive (#20895)
in hash join condition, some equals are trustable, some are not.
an equal is trustable if one side is almost unique, like primary key. for such equal condition we could estimate more accurate.
the problem is in rewriten q20, the are 2 equal condition, one is trustable, another is not. But we treat both of them as trustable.

Test result:
on tpch100, from 2.2 sec to 0.44 sec
no impact on tpch other queries
no performance impact on tpcds queries
2023-06-19 23:40:44 +08:00
2a294801f1 Revert "[Test](regression) CCR syncer thrift interface regression test (#20935)" (#20990)
This reverts commit dd482b74c849b022862e7cfb1f1d0b933a84e3d2.
2023-06-19 21:38:03 +08:00
dd5ecea36a [fix](compress) snappy does not work right (#20934) 2023-06-19 14:11:10 +08:00
fb9fcf460a [fix](leftjoin) fix bug of left and full join with other conjuncts (#20946)
Fix bug of left and full outer join with other conjuncts. When equal matched row count of a probe row exceed batch_size, some times the _join_node->_is_any_probe_match_row_output flag is not set correcty, which result in outputing extra rows for the probe row.
2023-06-19 12:27:06 +08:00
Pxl
85c5d7c6a9 [Chore](materialized-view) add ssb_flat mv test case (#20869)
add ssb_flat mv test case
2023-06-19 10:51:50 +08:00
1efd345963 [Enhancement](table) adding information_schema.parameters table (#20259)
this is a virtual table for compatibility information_schema parameters table
2023-06-19 09:05:46 +08:00
8366ce7a81 [enhancement](insert-stmt) Make insert into tbl values(); compatible with mysql (#20694) 2023-06-18 19:56:07 +08:00
ac3290021d [fix](Nereids): MergeSetOperations can merge SetOperation ALL. (#20902) 2023-06-18 17:49:03 +08:00
5ae14549d1 [Feature](Nereids) support delete using syntax to delete data from unique key table (#20452) 2023-06-18 16:22:21 +08:00
dd482b74c8 [Test](regression) CCR syncer thrift interface regression test (#20935) 2023-06-18 00:13:09 +08:00
fe18cfa2fb [improvement](pg jdbc)Support for automatically obtaining the precision of the postgresql timestamp type (#20909) 2023-06-16 23:41:09 +08:00
367f64e7bd [improvement](jdbc) support insert autoinc and default value column to mysql (#20765)
In JdbcMysqlClient, I've added methods to retrieve auto-increment and default value columns from MySQL. These columns are then mapped into Doris metadata to make them visible to users.

When handling the InsertStmt into an execution plan, Doris used to automatically fill in NULL or default values for columns not specified in the InsertStmt. However, in the JDBC catalog, we don't need Doris to handle these unspecified columns, so I've made changes to skip them directly.

For the insert prepared statement required for writing, our previous behavior was to obtain all columns for placeholders. So, the change I made is to pass in the columns processed by the execution plan during the sink task generation stage for dynamic generation.
2023-06-16 23:38:11 +08:00
e834637a5b [improvement](ck jdbc) Support for automatically getting the precision of clickhouse's datetime64 type (#20887) 2023-06-16 23:37:30 +08:00
bf197ee8d2 [opt](nereids) adjust cost model for BroadCastJoin and PartitionJoin (#20713)
we add penalty for broadcast join (bc for brief in the following).
the intuition of penalty is as follow:
1. if the build side is very small (< 1M), we prefer bc, and set `penalty=1`, which means no penalty
2. if build side is more than 1M, we consider the ratio of the probe row count to the build row count. the less the ratio is, the higher penalty is.

this pr has positive impact on tpch queries. Only q3 is changed. in out test (tpch 1T, 3BE) q3 improved from 5.1sec to 2.5 sec.
this pr has positive impact on tpcds queries. test on tpcds sf100 (3BE), cold run improve from 163 sec to 156 sec, hot run improves from 155 sec to 149 sec
2023-06-16 22:49:04 +08:00
5dc0f90c7f [opt](Nereids) revert convert IN with 2 options to OR expression rule (#20894)
revert this rule because it has negative effect on predicate push-down-to-storage-layer
2023-06-16 19:11:37 +08:00
5573858cb4 [Enhance](regression-test) use another db than test_query_db for nereids_p0 (#19467)
replace test_query_db to nereids_test_query_db in nereids_p0 directory
split test_join into five files to run faster.
2023-06-16 16:11:14 +08:00
97135a1cbb [Feature] (json)add json_contains function (#20824) 2023-06-16 15:10:12 +08:00
179b933ccc [test](regression) update some case in p2 (#20683) 2023-06-16 10:09:22 +08:00
d9b3c2aba2 [improvement](jdbc) support support get mysql information_schema's table and clickhouse system's table (#20768) 2023-06-15 14:53:51 +08:00
Pxl
01e53f4e67 [Bug](materialized-view) fix problems about create mv on ssb_flat q4.1 failed (#20658)
fix problems about create mv on ssb_flat q4.1 failed
2023-06-15 14:38:21 +08:00
09d187ec77 [improvement](ck jdbc) Optimized reading of datetime and ip types of the ClickHouse JDBC Catalog (#20804) 2023-06-14 23:28:08 +08:00
Pxl
a0d4f11667 [Bug](function) catch error state in function cast to avoid core dump (#20751)
catch error state in function cast to avoid core dump
2023-06-14 17:34:34 +08:00
1c9f107185 [feature](nereids) support match syntax (#20781)
Support match syntax in nereids.
match syntax use like:
```sql
select * from test where msg match "hello";
select * from test where msg match_any "hello";
select * from test where msg match_all "hello hi";
select * from test where msg match_phrase "hello world";
```
`match` is same as `match_any`.

the pr of match syntax in original planner: https://github.com/apache/doris/pull/14211
2023-06-14 17:30:27 +08:00
affe36d32e [test](find_in_set) add find_in_set function test case (#20718) 2023-06-14 09:43:48 +08:00
7636dd1fdc [fix](nereids) always use colocate scan when agg's fragment has olap scan (#20695) 2023-06-13 17:59:17 +08:00
7942bd0bf9 [fix](planner) cast string literal to date like type should not be an implict cast (#20709)
1. cast string literal to date like type should not be an implict cast
2. the string representation of float like type should not be scientific notation
3. the data type of like function's regex expr should be string type even if it's a null literal
4. add -Xss4m in fe.conf to prevent stack overflow in some case
2023-06-13 17:57:14 +08:00
feb21fc9e9 [fix](group_concat) use default seperator ',' instead of ', ' for group_concat, to be consistant with mysql (#20741) 2023-06-13 17:20:29 +08:00
2adf5169e6 [improvement](test) improve p2 case of githubevents (#20727)
Check rows of github_events table after restore finish.
2023-06-13 14:31:24 +08:00
73ad885e19 [Feature][Fix](multi-catalog) Implements transactional hive full acid tables. (#20679)
After supporting insert-only transactional hive full acid tables #19518, #19419, this PR support transactional hive full acid tables.

Support hive3 transactional hive full acid tables.
Hive2 transactional hive full acid tables need to run major compactions.
2023-06-13 08:55:16 +08:00
c25c19bddc [test](regression) Add cases to test join condition push and not like (#20453)
Add testing cases to issue #19613
2023-06-12 18:26:23 +08:00
99c0592157 [Feature](array-function) Support array_pushback function #17417 (#19988)
Implement array_pushback.

mysql> select array_pushback([1, 2], 3);
+--------------------------------+
| array_pushback(ARRAY(1, 2), 3) |
+--------------------------------+
| [1, 2, 3]                      |
+--------------------------------+
1 row in set (0.01 sec)
2023-06-12 16:51:12 +08:00
141813b476 [tpcds](nereids) estimate distribution cost by byte size instead of row count (#20642)
this pr impacts tpch q16 Agg strategy, but no performance issue
this pr improves tpcds sf100

before:
cold 141 sec
hot 133 sec

after:
code 137 sec
hot 128 sec
2023-06-12 16:23:49 +08:00
0b228b3414 [fix](load)Support load json data with default value (#20624)
* support json default value

---------

Co-authored-by: duanxujian <duanxujian@jd.com>
2023-06-12 14:51:31 +08:00
10134ea8c6 [fix](planner) fix RewriteInPredicateRule may be useless (#20668)
Issue Number: close #20669

RewriteInPredicateRule may cast InPredicate expr's two child to the same type, for example: where cast(age as char) in ('11'), the type of age is int, RewriteInPredicateRule will cast expr's two child type to int. As in the example above, child 0 will be such struct: 
```
child 0: type: int
    |---  child: type : char
            |-- child: type : int
```

Due to the RewriteInPredicateRule cast the type of the expr to int, it will reanalyze stmt, but it will reset stmt first before reanalyze the stmt, and reset opt will change child 0 to such struct:
```
child: type : char
    |-- child: type : int
```
It cause two child's type will be cast to varchar in func castAllToCompatibleType, the logic of RewriteInPredicateRule will be useless.

In 1.1-lts and 1.2-lts, such case  " where cast(age as char) in ('11')"  can't work well,  because func castAllToCompatibleType will cast int to char but int can't cast to char(master can work well because func castAllToCompatibleType will cast int to varchar in such case).
```
MySQL [test]> select user_id from test_cast where cast(age as char) in ('45');
ERROR 1105 (HY000): errCode = 2, detailMessage = type not match, originType=INT, targeType=CHAR(*)
```
2023-06-12 14:39:01 +08:00
Pxl
7f8c5c81e7 [Feature](agg_state) support agg_state combinator on nereids (#20164)
support agg_state combinator on nereids
2023-06-12 12:49:26 +08:00
bcc37c9405 [fix](planner)the common type of floating and decimal should be floating type (#20634)
* [fix](planner)the common type of floating and decimal should be floating type

* fix test cases
2023-06-12 11:32:23 +08:00
4c340f2851 [Feature] (Multi-Catalog) support query hll column in doris jdbc table - part 1 (#19413)
Issue Number: close #17895
2023-06-12 11:16:19 +08:00
a347063390 [fix](case expr) fix coredump of case for null value 2 (#20635)
fix coredump of case for null value 2
2023-06-11 23:08:53 +08:00
987b29ded5 [fix](nereids)avoid to derive rowCount NaN (#20523)
the formula used to compute ndv after filter implies that the new rowCount is smaller than the original rowCount. When we apply this formula to join, we should add branch if new row count is bigger than original row count.
when new row count is bigger, the ndv is not changed.
2023-06-10 15:40:14 +08:00
def6a8ec94 [regression](nereids) check tpch sf1T and sf500 plan shape on 3 BE environment #20610 2023-06-09 22:46:40 +08:00