632 Commits

Author SHA1 Message Date
f2f06c1acc [feature](nereids) Support select temp partition (#15579)
Support the following grammar:
    select * from t_p temporary partition(tp1);
    select * from t_p temporary partitions(tp1);
    select * from t_p temporary partition tp1;
2023-01-04 11:04:36 +08:00
eef1f432dd [Bug](datetimev2/decimalv3) Fix wrong predicate infer rule (#15574) 2023-01-04 10:03:43 +08:00
a97f582b93 [fix](nereids) use DAYS as default unit for DATE_ADD and DATE_SUB function (#15559) 2023-01-04 01:55:15 +08:00
18bc354c06 [fix](Nereids) use correct column unique id when read data from non-base index (#15534)
When light schema change is enabled by default, a column in OLAP scan is retrieved by column unique id instead of the column name. Columns with the same name would use different unique IDs among materialized indexes.
This PR ensures that the column in the OLAP scan node could use the correct column unique id.
2023-01-04 01:41:25 +08:00
8d0c06c897 [fix](nereids) binding priority in agg-sort, having, group_by_key (#15240)
This PR defines order_key and having_key binding priority.

1. order key priority
 ```
                select
                        col1 * -1 as col1    # inner_col1 * -1 as alias_col1
                from
                        t
                order by col1;     # order by order_col1
```
When binding `order_col1`, `alias_col1` has higher priority than `inner_col1`.

2. having key priority
```
       select (a-1) as a  # inner_a - 1 as alias_a
       from bind_priority_tbl 
       group by a 
       having a=1;
```
When binding the having key, `inner_a` has higher priority than `alias_a`.

3. group by key binding priority
```
SELECT date_format(b.k10,
         '%Y%m%d') AS k10
FROM test a
LEFT JOIN 
    (SELECT k10
    FROM baseall) b
    ON a.k10 = b.k10
GROUP BY  k10;
```
group_by_key (k10) binding priority:

- agg.child.output
- agg.output
If binding with agg.child.output fails (the slot is not found, or more than one candidate slot is found in agg.child.output), Nereids tries to bind group_by_key with agg.output.
In the above example, Nereids finds 2 candidate slots (a.k10, b.k10) in agg.child.output for group_by_key (k10), so binding with agg.child.output fails. Nereids then tries to bind group_by_key with agg.output, that is, `date_format(b.k10, '%Y%m%d') AS k10`, and group_by_key is finally bound to the alias `k10`.
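
The fallback order described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in this message, not the Nereids implementation; `bind_group_by_key` and the tuple shapes are hypothetical names chosen for the example.

```python
def bind_group_by_key(name, child_output, agg_output):
    """Bind a group-by key by name, preferring agg.child.output over agg.output.

    child_output: list of (qualifier, column_name) slots (hypothetical shape).
    agg_output:   list of (alias, expression_text) items (hypothetical shape).
    """
    # First try the child's output: binding succeeds only with exactly one match.
    candidates = [slot for slot in child_output if slot[1] == name]
    if len(candidates) == 1:
        return ("child", candidates[0])

    # Not found or ambiguous: fall back to the aliases in agg.output.
    aliases = [item for item in agg_output if item[0] == name]
    if len(aliases) == 1:
        return ("alias", aliases[0])

    raise ValueError(f"cannot bind group by key {name!r}")


# The example from this message: both a.k10 and b.k10 are visible below the
# aggregate, so the child binding is ambiguous and the alias wins.
child_output = [("a", "k10"), ("b", "k10")]
agg_output = [("k10", "date_format(b.k10, '%Y%m%d')")]
result = bind_group_by_key("k10", child_output, agg_output)
```

Here `result` is bound to the alias entry, matching the outcome described for the `GROUP BY k10` query.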
2023-01-03 22:09:28 +08:00
55dc541c90 [Fix](Nereids) aggregate function except COUNT should nullable without group by expr (#15547)
Co-authored-by: mch_ucchi
2023-01-03 21:28:07 +08:00
Pxl
85fe9d2496 [Bug](filter) fix not in(null) return true (#15466)
fix not in(null) return true
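
The message does not describe the fix itself. As background, the sketch below models standard SQL three-valued logic for `NOT IN`, using Python's `None` for NULL; `sql_not_in` is a hypothetical helper, and the candidate list is assumed to be non-empty, as SQL grammar requires.

```python
def sql_not_in(value, candidates):
    """Evaluate `value NOT IN (candidates)` under SQL three-valued logic.

    Returns True, False, or None (unknown). Assumes a non-empty candidate list.
    """
    if value is None:
        return None  # comparing NULL with anything is unknown
    saw_null = False
    for candidate in candidates:
        if candidate is None:
            saw_null = True  # this comparison is unknown, keep scanning
        elif candidate == value:
            return False  # a definite match makes NOT IN false
    # No match found: the result is unknown if any comparison was unknown.
    return None if saw_null else True
```

Under these semantics `sql_not_in(1, [None])` is unknown rather than true, so a `WHERE` clause, which keeps only rows whose predicate is true, would filter the row out.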
2023-01-03 21:14:50 +08:00
1dabcb0111 [Fix](Nereids) fix except and intersect error for statsCalculator (#15557)
When calculating statistics for except and intersect in statsCalculator, the slotId of the corresponding column was not replaced with the slotId of the output, resulting in an NPE.
2023-01-03 17:06:57 +08:00
b50448d5c4 [vectorized](udaf) fix udaf result is null when has multiple aggs (#15554) 2023-01-03 16:03:43 +08:00
8748f65a1b [fix](nereids)support nulls first/last in order by clause (#15530) 2023-01-03 14:56:00 +08:00
31548cfe2a [fix](nereids) check failed that exchange node under agg must from PhysicalDistribute (#15473)
When Nereids translates a PhysicalHashAggregate node to the original plan, if the input fragment root is an exchange node, Nereids assumes that this exchange node is generated from a PhysicalDistribute node.
But this assumption does not always hold. For example, a sort node could be translated to exchange (merge phase) + sort (local phase).
2023-01-03 11:19:25 +08:00
ad9a67a76a [Bug](decimalv3) Fix wrong decimalv3 value after insertion (#15505) 2023-01-01 11:08:59 +08:00
781fa17993 [fix](Nereids) round function return type should be double (#15502) 2022-12-30 23:36:15 +08:00
100834df8b [fix](nereids) fix some arrgregate bugs in Nereids (#15326)
1. the agg function without distinct keyword should be a "merge" function in threePhaseAggregateWithDistinct
2. use aggregateParam.aggMode.consumeAggregateBuffer instead of aggregateParam.aggPhase.isGlobal() to indicate if an agg function is a "merge" function
3. add an AvgDistinctToSumDivCount rule to support avg(distinct xxx) in some case
4. AggregateExpression's nullable method should call inner function's nullable method.
5. add a bind slot rule to bind pattern "logicalSort(logicalHaving(logicalProject()))"
6. don't remove project node in PhysicalPlanTranslator
7. add a cast to bigint expr when count( distinct datelike type )
8. fallback to old optimizer if bitmap runtime filter is enabled.
9. fix exchange node mem leak
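
Judging by its name, the AvgDistinctToSumDivCount rule in item 3 rests on the identity avg(distinct x) = sum(distinct x) / count(distinct x). The Python sketch below checks that identity on a small list; it is an illustration with hypothetical helper names, not the rule's code, and it uses `None` for NULL, which SQL aggregates ignore.

```python
def sum_distinct(values):
    """sum(distinct x): add each non-NULL value once."""
    return sum({v for v in values if v is not None})


def count_distinct(values):
    """count(distinct x): number of different non-NULL values."""
    return len({v for v in values if v is not None})


def avg_distinct(values):
    """avg(distinct x), computed directly over the de-duplicated values."""
    distinct = {v for v in values if v is not None}
    if not distinct:
        return None  # aggregate over no rows yields NULL
    return sum(distinct) / len(distinct)


values = [1, 1, 2, 3, None, 3]
rewritten = sum_distinct(values) / count_distinct(values)
direct = avg_distinct(values)
```

Both `rewritten` and `direct` evaluate to the same number for this input, which is what lets the planner replace one aggregate with two simpler ones.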
2022-12-30 23:07:37 +08:00
93a25e1af5 [fix](nereids) the project node is lost when creating PhysicalStorageLayerAggregate node (#15467) 2022-12-30 16:33:24 +08:00
5ec4e5586f [refactor]remove seek block in segmentIterator (#15413)
* remove seek block

* add reg test

Co-authored-by: Wang Bo <wangbo36@meituan.com>
2022-12-30 14:14:16 +08:00
520b6d7910 [Improvement](decimalv3) Add a config to check overflow for DECIMALV3 (#15463) 2022-12-30 14:02:24 +08:00
2339dcda05 [fix](icebergv2)update icebergv2 regression case (#15442)
update icebergv2 regression case
Co-authored-by: jinzhe <jinzhe@selectdb.com>
2022-12-30 12:24:26 +08:00
917b266799 [fix](planner) table valued function could not used in subquery (#15496) 2022-12-30 10:01:25 +08:00
edb9a3b58d [Bug](timediff) Fix wrong result for function timediff (#15312) 2022-12-30 00:28:51 +08:00
9a517d6a8f [DataType](Deciamlv3) change the avg function scale of decimalv3 (#15445) 2022-12-30 00:27:51 +08:00
3ff01ca799 [feature-wip](multi-catalog) support Iceberg time travel in external table (#15418)
For example:
SELECT * FROM tbl FOR VERSION AS OF 10963874102873;
SELECT * FROM tbl FOR TIME AS OF '1986-10-26 01:21:00';
2022-12-30 00:25:21 +08:00
e651a9bb11 [feature](nereids) add variance function for nereids (#15370)
Support the variance function. Currently, it does not support the decimalV3 type.
2022-12-29 18:33:52 +08:00
c22ba8e160 [Bug](Decimalv3) coredump of decimalv3 multiply (#15452) 2022-12-29 15:35:17 +08:00
25b257e37c [enhancement](session var) varariable to control whether to rewrite OR to IN or not (#15437) 2022-12-29 14:50:32 +08:00
5b09d27d54 [feature-wip](nereids) Made decimal in nereids more complete (#15087)
1. Add IntegralDivide operator to support `DIV` semantics
2. Add more operator rewriter to keep expression type consistent between operators
3. Support the conversion between float type and decimal type.

After this PR, the cases below can be executed normally, as with the legacy optimizer:
  use test_query_db;
  select k1, k5,100000*k5 from test order by k1, k2, k3, k4;
  select avg(k9) as a from test group by k1 having a < 100.0 order by a;
2022-12-29 13:01:47 +08:00
1f98dd2c74 [fix](Nereids) Generate is missing on alias query (#15416)
Support table-generating functions on a query alias. Syntax:
```sql
SELECT * FROM (SELECT * FROM tbl) tmp LATERAL VIEW explode(c1) gtmp AS ce;
```
2022-12-29 11:11:25 +08:00
0e154feeb9 [feature](multi catalog nereids)Add file scan node to nereids. (#15201)
Add file scan node to nereids, so that the new planner could support external hms table.
2022-12-29 10:31:11 +08:00
95e6553d90 [feature-wip](nereids) Implement using join (#15311) 2022-12-28 19:22:20 +08:00
0f8b15b902 [feature](nereids) support string alias in select list (#15369)
support such syntax: select '' as 'b', col1 from select_with_const
2022-12-28 17:26:48 +08:00
d05f430ca2 [feature](nereids) support syntax: count(all *) (#15376) 2022-12-28 11:09:56 +08:00
2af831de33 [Fix](Nereids)fix group by binding error, resulting in incorrect results (#15328)
Original: group by is bound to the outputExpression of the current node.

Problem: When a new reference in outputExpression has the same name as one of the child's output columns, the child's output column should be used for group by. Instead, the new reference in the node's outputExpression was used, producing incorrect results.

Now: Give priority to the child's output for group by binding. If the child does not have a corresponding column, use the outputExpression of this node for binding
2022-12-28 10:42:21 +08:00
28bb13a026 [feature](light-schema-change) enable light schema change by default (#15344) 2022-12-28 09:29:26 +08:00
22b31e516c [Bug](decimalv3) select view of decimalv3 error (#15404) 2022-12-28 08:38:33 +08:00
03aef7a8ac [fix](nereids) sender in union's child fragment has no destination (#15402)
1. always create an exchange node for set operation node's children
2. fix cast expr's nullability bug.
2022-12-27 22:36:54 +08:00
51b14c06d3 [enhancement](nereids) support approx_count_distinct function (#15406) 2022-12-27 22:25:21 +08:00
c63dda99db [test](Nereids) Disable some regression test for materialized index. (#15387)
When light schema change is enabled by default (#15344), regression tests that run SQL by selecting data from the materialized index will fail.
This PR disables those failing queries in the regression test. They will be added back once the Nereids planner can produce the correct plan with light schema change enabled.
2022-12-27 19:24:03 +08:00
0550dfaeb2 [enhancement](rewrite) add OrToIn rule and fix ExtractCommonFactorsRule apply problems (#12872)
Co-authored-by: wuhangze <wuhangze@jd.com>
2022-12-27 18:39:53 +08:00
a07ca41f8e [Fix](Nereids) fix repeat node nullable error bugs (#15251) 2022-12-27 17:01:33 +08:00
5a8201320a [fix](nereids) group by constants produce wrong result (#15322)
SELECT 2 FROM tbl GROUP BY 1

It should produce 2 when the table is not empty. Before this PR, executing the plan generated by Nereids produced an empty result set.
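
The expected behavior can be reproduced with any SQL engine that accepts this query. The sketch below uses Python's built-in `sqlite3` module purely as a stand-in for Doris (an assumption made for illustration; the column name `c1` and the sample rows are made up).

```python
import sqlite3

# In-memory database used only as a stand-in engine for the query above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (c1 INTEGER)")

# Empty table: grouping produces no groups, so no rows come back.
empty_result = conn.execute("SELECT 2 FROM tbl GROUP BY 1").fetchall()

# Non-empty table: every row falls into the same group, so one row with 2.
conn.executemany("INSERT INTO tbl VALUES (?)", [(10,), (20,), (30,)])
non_empty_result = conn.execute("SELECT 2 FROM tbl GROUP BY 1").fetchall()

conn.close()
```

With three rows in `tbl`, the query returns a single row containing 2, which is the result the fix restores.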
2022-12-27 14:35:02 +08:00
8879400419 [feature](nereids) Support query on specific partitions (#15243) 2022-12-27 00:32:14 +08:00
a1c6ea876f [fix](inbitmap) fix core dump caused by bitmap filter with union (#15333)
The join node needs a project operation to remove unnecessary columns from the output tuples.
For SetOperationNode, the output tuple and input tuple are consistent, so no project is needed,
but the children of SetOperationNode may be join nodes, so the children of the SetOperationNode
need to do the project operation.
2022-12-26 23:14:32 +08:00
fc8f6a0715 [fix](multi-catalog) throw NPE when reading data after EOF (#15358)
1. Fix 1 bug:  
A null pointer exception was thrown when reading data after the reader had reached the end of the file, so return directly when `_do_lazy_read` reads no data.

2. Optimize code:  
Remove unused parameters.

3. Fix regression test
2022-12-26 22:49:35 +08:00
72f0003753 [enhancement](regression) use sf0.1 data in datev2 and decimalv3 cases (#15342) 2022-12-26 19:15:49 +08:00
8b6e4e74e7 [improvement](jdbc) add default jdbc driver's dir (#15346)
Add a new config "jdbc_drivers_dir" for both FE and BE.
Users can put JDBC driver jar files in this dir and specify only the file name in the "driver_url" property
when creating a jdbc resource.
Doris will then find the jar files in this dir.

Also modify the logic so that when the jdbc resource is modified, the corresponding jdbc table
will get the latest properties.
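
A minimal sketch of how such a lookup could work, assuming that a "driver_url" containing "://" is already a full URL and that anything else is a bare file name under "jdbc_drivers_dir". The function name, the detection rule, and the paths are hypothetical; the actual FE and BE logic may differ.

```python
from pathlib import PurePosixPath


def resolve_driver_url(driver_url, jdbc_drivers_dir):
    """Turn a driver_url property into a URL that points at the jar file."""
    # Assumption: values with a scheme (file://, http://, ...) are used as-is.
    if "://" in driver_url:
        return driver_url
    # Otherwise treat the value as a file name inside the configured directory.
    return "file://" + str(PurePosixPath(jdbc_drivers_dir) / driver_url)


# Hypothetical directory and jar names, for illustration only.
resolved = resolve_driver_url("mysql-connector.jar", "/opt/doris/jdbc_drivers")
unchanged = resolve_driver_url("http://example.com/driver.jar", "/opt/doris/jdbc_drivers")
```

Keeping only the file name in the resource definition means the same definition stays valid when the drivers directory differs between nodes.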
2022-12-26 11:51:12 +08:00
bf71943605 [feature](load) stream load trim double quotes for csv (#15241) 2022-12-26 11:45:54 +08:00
6bec1ffc47 [feature](planner) remove restrict of offset without order by (#15218)
Support SELECT * FROM tbl LIMIT 5, 3;
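
Assuming the MySQL-style reading of `LIMIT 5, 3` (skip 5 rows, then return at most 3), the clause behaves like a list slice. The sketch below shows this with a hypothetical `apply_limit` helper; without an ORDER BY, the engine is free to return the rows in any order, so which rows are skipped is not guaranteed.

```python
def apply_limit(rows, offset, count):
    """Model `LIMIT offset, count`: skip `offset` rows, keep at most `count`."""
    return rows[offset:offset + count]


rows = list(range(10))          # stand-in for whatever order the scan produced
page = apply_limit(rows, 5, 3)  # rows at positions 5, 6 and 7
short_page = apply_limit(rows, 8, 3)  # fewer than 3 rows remain after offset 8
```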
2022-12-26 09:37:41 +08:00
ec055e1acb [feature](new file reader) Integrate new file reader (#15175) 2022-12-26 08:55:52 +08:00
82d316b419 [bug](decimalv3) Fix wrong decimal scale for arithmetic expr (#15316) 2022-12-24 21:57:46 +08:00
e72404c537 [fix](scan) fix that be may core dump when the predicates are all false (#15332) 2022-12-24 15:27:43 +08:00