609 Commits

Author SHA1 Message Date
c22ba8e160 [Bug](Decimalv3) coredump of decimalv3 multiply (#15452) 2022-12-29 15:35:17 +08:00
25b257e37c [enhancement](session var) variable to control whether to rewrite OR to IN or not (#15437) 2022-12-29 14:50:32 +08:00
5b09d27d54 [feature-wip](nereids) Made decimal in nereids more complete (#15087)
1. Add IntegralDivide operator to support `DIV` semantics
2. Add more operator rewriters to keep expression types consistent between operators
3. Support conversion between float types and decimal types.

After this PR, the cases below can be executed normally, as with the legacy optimizer:
  use test_query_db;
  select k1, k5,100000*k5 from test order by k1, k2, k3, k4;
  select avg(k9) as a from test group by k1 having a < 100.0 order by a;
2022-12-29 13:01:47 +08:00
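A minimal Python sketch of the `DIV` semantics that an integral-divide operator provides: divide, then discard the fractional part. The function name `integral_divide` is hypothetical, and returning `None` for a zero divisor assumes MySQL-compatible behavior (NULL on division by zero); neither is taken from the PR itself.

```python
from decimal import Decimal
from typing import Optional


def integral_divide(left: Decimal, right: Decimal) -> Optional[int]:
    """Sketch of DIV: integer part of left / right, truncated toward zero."""
    if right == 0:
        # Assumption: a zero divisor yields NULL (None), as in MySQL.
        return None
    # Decimal's // operator truncates toward zero (unlike int's floor division).
    return int(left // right)


if __name__ == "__main__":
    print(integral_divide(Decimal("7"), Decimal("2")))
    print(integral_divide(Decimal("-7"), Decimal("2")))
```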
1f98dd2c74 [fix](Nereids) Generate is missing on alias query (#15416)
Support table generating functions on a query alias. Syntax:
```sql
SELECT * FROM (SELECT * FROM tbl) tmp LATERAL VIEW explode(c1) gtmp AS ce;
```
2022-12-29 11:11:25 +08:00
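To illustrate what `LATERAL VIEW explode(c1) gtmp AS ce` computes, here is a small Python sketch. The helper name `lateral_view_explode` and the dict-based row format are assumptions for illustration only; rows whose array is empty produce no output, which is the usual (non-outer) explode behavior.

```python
from typing import Any, Dict, Iterator, List


def lateral_view_explode(
    rows: List[Dict[str, Any]], array_col: str, out_col: str
) -> Iterator[Dict[str, Any]]:
    """Emit one output row per array element, keeping the original columns."""
    for row in rows:
        for element in row[array_col]:
            # Copy the source row and add the generated column.
            yield {**row, out_col: element}


if __name__ == "__main__":
    tbl = [{"id": 1, "c1": [10, 20]}, {"id": 2, "c1": []}]
    for out in lateral_view_explode(tbl, "c1", "ce"):
        print(out)
```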
0e154feeb9 [feature](multi catalog nereids)Add file scan node to nereids. (#15201)
Add a file scan node to Nereids so that the new planner can support external HMS tables.
2022-12-29 10:31:11 +08:00
95e6553d90 [feature-wip](nereids) Implement using join (#15311) 2022-12-28 19:22:20 +08:00
0f8b15b902 [feature](nereids) support string alias in select list (#15369)
support such syntax: select '' as 'b', col1 from select_with_const
2022-12-28 17:26:48 +08:00
d05f430ca2 [feature](nereids) support syntax: count(all *) (#15376) 2022-12-28 11:09:56 +08:00
2af831de33 [Fix](Nereids)fix group by binding error, resulting in incorrect results (#15328)
Original: GROUP BY was bound to the outputExpression of the current node.

Problem: when a new reference in outputExpression has the same name as one of the child's output columns, GROUP BY should use the child's output column. Instead, it was bound to the new reference in the node's outputExpression, which produced incorrect results.

Now: binding gives priority to the child's output. If the child has no matching column, GROUP BY is bound to the outputExpression of this node.
2022-12-28 10:42:21 +08:00
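The binding priority described in #15328 can be sketched as a two-step lookup. This is an illustrative model only: `bind_group_by` and the name-to-expression dictionaries are hypothetical and do not mirror the actual Nereids classes.

```python
from typing import Dict


def bind_group_by(
    name: str, child_output: Dict[str, str], node_output: Dict[str, str]
) -> str:
    """Resolve a GROUP BY name: child's output first, then this node's output."""
    if name in child_output:
        return child_output[name]
    if name in node_output:
        return node_output[name]
    raise KeyError(f"cannot bind GROUP BY column: {name}")


if __name__ == "__main__":
    # Models: SELECT k1 + 1 AS k1 FROM t GROUP BY k1
    child = {"k1": "t.k1"}
    node = {"k1": "t.k1 + 1"}
    print(bind_group_by("k1", child, node))
```

With a query such as `SELECT k1 + 1 AS k1 FROM t GROUP BY k1`, the lookup returns the child's column rather than the alias of the same name.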
28bb13a026 [feature](light-schema-change) enable light schema change by default (#15344) 2022-12-28 09:29:26 +08:00
22b31e516c [Bug](decimalv3) select view of decimalv3 error (#15404) 2022-12-28 08:38:33 +08:00
03aef7a8ac [fix](nereids) sender in union's child fragment has no destination (#15402)
1. always create an exchange node for set operation node's children
2. fix cast expr's nullability bug.
2022-12-27 22:36:54 +08:00
51b14c06d3 [enhancement](nereids) support approx_count_distinct function (#15406) 2022-12-27 22:25:21 +08:00
c63dda99db [test](Nereids) Disable some regression test for materialized index. (#15387)
When light schema change is enabled by default (#15344), regression tests that run SQL by selecting data from the materialized index will fail.
This PR disables those failing queries in the regression test. They will be added back once the Nereids planner can produce the correct plan with light schema change enabled.
2022-12-27 19:24:03 +08:00
0550dfaeb2 [enhancement](rewrite) add OrToIn rule and fix ExtractCommonFactorsRule apply problems (#12872)
Co-authored-by: wuhangze <wuhangze@jd.com>
2022-12-27 18:39:53 +08:00
a07ca41f8e [Fix](Nereids) fix repeat node nullable error bugs (#15251) 2022-12-27 17:01:33 +08:00
5a8201320a [fix](nereids) group by constants produce wrong result (#15322)
SELECT 2 FROM tbl GROUP BY 1

It should return a row containing 2 when the table is not empty. Before this PR, executing the plan generated by Nereids returned an empty result set.
2022-12-27 14:35:02 +08:00
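The expected result of the query in #15322 can be checked against an independent engine. The sketch below uses Python's built-in `sqlite3` purely as a stand-in (it is not Doris, and the helper name `group_by_constant` and the single-column table layout are assumptions).

```python
import sqlite3
from typing import Iterable, List, Tuple


def group_by_constant(values: Iterable[int]) -> List[Tuple[int]]:
    """Run `SELECT 2 FROM tbl GROUP BY 1` over a throwaway in-memory table."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE tbl (k INTEGER)")
        conn.executemany("INSERT INTO tbl VALUES (?)", [(v,) for v in values])
        return conn.execute("SELECT 2 FROM tbl GROUP BY 1").fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    print(group_by_constant([10, 20, 30]))  # non-empty table
    print(group_by_constant([]))            # empty table
```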
8879400419 [feature](nereids) Support query on specific partitions (#15243) 2022-12-27 00:32:14 +08:00
a1c6ea876f [fix](inbitmap) fix core dump caused by bitmap filter with union (#15333)
A join node needs a project operation to remove unnecessary columns from its output tuples.
For a SetOperationNode, the output tuple and input tuple are consistent, so no project is needed.
However, the children of a SetOperationNode may be join nodes, so those children
still need to perform the project operation.
2022-12-26 23:14:32 +08:00
fc8f6a0715 [fix](multi-catalog) throw NPE when reading data after EOF (#15358)
1. Fix one bug:  
A null pointer exception was thrown when reading data after the reader had reached the end of the file, so return directly when `_do_lazy_read` reads no data.

2. Optimize code:  
Remove unused parameters.

3. Fix regression test
2022-12-26 22:49:35 +08:00
72f0003753 [enhancement](regression) use sf0.1 data in datev2 and decimalv3 cases (#15342) 2022-12-26 19:15:49 +08:00
8b6e4e74e7 [improvement](jdbc) add default jdbc driver's dir (#15346)
Add a new config "jdbc_drivers_dir" for both FE and BE.
Users can put JDBC driver jar files in this directory and specify only the file name in the "driver_url" property
when creating a JDBC resource.
Doris will then look for the jar files in this directory.

Also modify the logic so that when a JDBC resource is modified, the corresponding JDBC table
gets the latest properties.
2022-12-26 11:51:12 +08:00
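One way to picture the "driver_url" lookup is the following Python sketch. It is a guess at the shape of the logic, not the actual FE/BE code: the function `resolve_driver_url`, the example directory path, and the rule "a value containing `://` is already a full URL" are all assumptions.

```python
import posixpath


def resolve_driver_url(driver_url: str, jdbc_drivers_dir: str) -> str:
    """Return a full URL for the driver jar.

    Assumption: a value that already contains a scheme ("://") is used as-is;
    a bare file name is looked up inside jdbc_drivers_dir.
    """
    if "://" in driver_url:
        return driver_url
    return "file://" + posixpath.join(jdbc_drivers_dir, driver_url)


if __name__ == "__main__":
    # Hypothetical directory; the real default is set by the jdbc_drivers_dir config.
    print(resolve_driver_url("mysql-connector-java-8.0.25.jar", "/opt/doris/jdbc_drivers"))
```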
bf71943605 [feature](load) stream load trim double quotes for csv (#15241) 2022-12-26 11:45:54 +08:00
6bec1ffc47 [feature](planner) remove restrict of offset without order by (#15218)
Support SELECT * FROM tbl LIMIT 5, 3;
2022-12-26 09:37:41 +08:00
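In `LIMIT 5, 3` the first number is the offset and the second is the row count. The sketch below shows this with Python's built-in `sqlite3`, which accepts the same `LIMIT offset, count` form; it is a stand-in for illustration, not Doris, and the helper name is hypothetical. Without ORDER BY, which rows come back is not guaranteed, so only the row count is meaningful.

```python
import sqlite3
from typing import Iterable, List, Tuple


def limit_with_offset(values: Iterable[int], offset: int, count: int) -> List[Tuple[int]]:
    """Run `SELECT v FROM tbl LIMIT offset, count` with no ORDER BY."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE tbl (v INTEGER)")
        conn.executemany("INSERT INTO tbl VALUES (?)", [(v,) for v in values])
        return conn.execute("SELECT v FROM tbl LIMIT ?, ?", (offset, count)).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    print(len(limit_with_offset(range(10), 5, 3)))
```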
ec055e1acb [feature](new file reader) Integrate new file reader (#15175) 2022-12-26 08:55:52 +08:00
82d316b419 [bug](decimalv3) Fix wrong decimal scale for arithmetic expr (#15316) 2022-12-24 21:57:46 +08:00
e72404c537 [fix](scan) fix that be may core dump when the predicates are all false (#15332) 2022-12-24 15:27:43 +08:00
8c0de789e4 [feature](Nereids) support table generating function (#15121) 2022-12-23 20:36:33 +08:00
ede68e075d [fix](iceberg-v2) fix fe iceberg split, add regression case (#15299) 2022-12-23 19:33:00 +08:00
a98636a970 [bugfix](from_unixtime) fix timezone not work for from_unixtime (#15298)
2022-12-23 19:05:09 +08:00
bfaaa2bd7c [feature](Nereids) support digital_masking function (#15252) 2022-12-23 18:59:08 +08:00
e7a077a81f [fix](jdbc catalog) fix bugs of jdbc catalog and table valued function (#15216)
* fix bugs

* add `desc function` test

* add test

* fix
2022-12-23 16:46:39 +08:00
8a810cd554 [fix](bitmapfilter) fix core dump caused by bitmap filter (#15296)
Do not push down the bitmap filter to a non-integer column
2022-12-23 16:42:45 +08:00
cb295de981 [Bug](decimalv3) Fix wrong precision of DECIMALV3 (#15302)
* update
2022-12-23 14:11:08 +08:00
82fbfab77f [fix](union)the union node should not pass through children in some case (#15286)
The union node passed its children through under the wrong condition. If a child's materialized slots differ from the union node's, that child cannot be passed through.
2022-12-23 10:27:49 +08:00
09a22813e4 [feature](Nereids) support syntax SELECT DISTINCT (#15197)
Add a new rule 'ProjectWithDistinctToAggregate' to support "select distinct xx from table".
This rule checks the logicalProject node's isDistinct property and replaces the logicalProject node with a LogicalAggregate node.
Any rule that runs before this one and creates a new logicalProject node must therefore make sure the isDistinct property is passed along correctly.
See the rules BindSlotReference or BindFunction for examples.
2022-12-22 23:54:08 +08:00
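The rewrite performed by 'ProjectWithDistinctToAggregate' can be sketched as follows. The dataclasses are simplified stand-ins with hypothetical fields, not the real Nereids plan nodes; the point is only that a distinct project becomes an aggregate that groups by the projected columns.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class LogicalProject:
    projects: Tuple[str, ...]
    is_distinct: bool
    child: str  # child plan, reduced to a table name for this sketch


@dataclass(frozen=True)
class LogicalAggregate:
    group_by: Tuple[str, ...]
    output: Tuple[str, ...]
    child: str


Plan = Union[LogicalProject, LogicalAggregate]


def project_with_distinct_to_aggregate(plan: Plan) -> Plan:
    """Replace a distinct project with an aggregate grouping by its projections."""
    if isinstance(plan, LogicalProject) and plan.is_distinct:
        return LogicalAggregate(plan.projects, plan.projects, plan.child)
    # Non-distinct projects (and other nodes) are left untouched.
    return plan


if __name__ == "__main__":
    print(project_with_distinct_to_aggregate(LogicalProject(("xx",), True, "table")))
```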
df5969ab58 [Feature] Support function roundBankers (#15154) 2022-12-22 22:53:09 +08:00
e331e0420b [improvement](topn)add per scanner limit check for new scanner (#15231)
Optimize key top-n queries such as `SELECT * FROM store_sales ORDER BY ss_sold_date_sk, ss_sold_time_sk LIMIT 100`
(where ss_sold_date_sk, ss_sold_time_sk is a prefix of the table's sort key).

Check the limit in each scanner and set eof to true once it is reached, reducing the amount of data that needs to be read.
2022-12-22 22:39:31 +08:00
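Why a per-scanner limit is safe for this kind of query can be seen in a small Python model. It assumes each scanner yields rows already ordered by the sort-key prefix; the function names are hypothetical and the real scanner code is C++ in the BE.

```python
import heapq
from itertools import islice
from typing import Iterable, List


def scan_with_limit(sorted_rows: Iterable[int], limit: int) -> List[int]:
    """One scanner: stop reading (eof = true) once `limit` rows were produced."""
    out: List[int] = []
    for row in sorted_rows:
        if len(out) >= limit:
            break  # corresponds to setting eof to true
        out.append(row)
    return out


def top_n(scanners: List[List[int]], limit: int) -> List[int]:
    """Merge the truncated scanner outputs and keep the first `limit` rows."""
    per_scanner = [scan_with_limit(rows, limit) for rows in scanners]
    return list(islice(heapq.merge(*per_scanner), limit))


if __name__ == "__main__":
    print(top_n([[1, 4, 7, 10], [2, 5, 8], [3, 6, 9, 12, 15]], 4))
```

Because the global top `limit` rows can include at most `limit` rows from any single scanner, truncating each scanner does not change the final result.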
1fdd4172bd [fix](Inbitmap) fix in bitmap result error when left expr is constant (#15271)
1. When the left expr of the IN predicate is a constant, rewrite the SQL to use `bitmap_contains` instead of generating a bitmap filter.
  For example,"select k1, k2 from (select 2 k1, 11 k2) t where k1 in (select bitmap_col from bitmap_tbl)"
  => "select k1, k2 from (select 2 k1, 11 k2) t left semi join bitmap_tbl b on bitmap_contains(b.bitmap_col, t.k1)"

* add regression test
2022-12-22 19:25:09 +08:00
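The rewritten form in #15271 is a left semi join on `bitmap_contains`. A rough Python model, with bitmaps represented as plain sets and a hypothetical helper name, looks like this:

```python
from typing import Any, Dict, List, Set


def semi_join_bitmap_contains(
    left_rows: List[Dict[str, Any]], bitmaps: List[Set[int]], key: str
) -> List[Dict[str, Any]]:
    """Keep a left row if at least one bitmap contains row[key] (left semi join)."""
    return [row for row in left_rows if any(row[key] in bitmap for bitmap in bitmaps)]


if __name__ == "__main__":
    t = [{"k1": 2, "k2": 11}]       # models: select 2 k1, 11 k2
    bitmap_tbl = [{1, 2, 3}, {7}]   # models: bitmap_col values
    print(semi_join_bitmap_contains(t, bitmap_tbl, "k1"))
```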
a87f905a2d [Feature](Nereids) unnest subquery in 'not in' predicate into NULL AWARE ANTI JOIN (#15230)
When processing a NOT IN subquery, if the column returned by the subquery is nullable, a NULL AWARE ANTI JOIN is needed instead of an ANTI JOIN.
Doris already supports NULL AWARE ANTI JOIN as of PR #13871.
Nereids needs to do the same.
2022-12-22 14:13:47 +08:00
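The difference between the two join types comes from SQL's three-valued logic for `NOT IN`. The Python sketch below models NULL as `None`; the function names are illustrative and not taken from the Doris code base.

```python
from typing import List, Optional


def not_in(value: Optional[int], subquery: List[Optional[int]]) -> Optional[bool]:
    """Three-valued result of `value NOT IN (subquery)`; None means UNKNOWN."""
    if not subquery:
        return True  # NOT IN over an empty set is always TRUE
    if value is None:
        return None
    if value in [v for v in subquery if v is not None]:
        return False
    if any(v is None for v in subquery):
        return None  # no match found, but a NULL could have matched
    return True


def null_aware_anti_join(left: List[Optional[int]], right: List[Optional[int]]) -> List[Optional[int]]:
    """Keep only rows for which NOT IN is definitely TRUE."""
    return [v for v in left if not_in(v, right) is True]


def plain_anti_join(left: List[Optional[int]], right: List[Optional[int]]) -> List[Optional[int]]:
    """Keep rows with no equal right-side row; NULL never equals anything."""
    return [v for v in left if not any(v is not None and v == r for r in right)]


if __name__ == "__main__":
    print(plain_anti_join([1, 2, None], [2, None]))
    print(null_aware_anti_join([1, 2, None], [2, None]))
```

With a NULL on the right side, the plain anti join still returns rows, while `NOT IN` semantics require an empty result.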
1520a4af6d [refactor](resource) use resource to create external catalog (#14978)
Use resource to create external catalog.
-- HMS
mysql> create resource hms_resource properties(
    -> "type"="hms",
    -> 'hive.metastore.uris' = 'thrift://172.21.0.44:7004',
    -> 'dfs.nameservices'='HANN',
    -> 'dfs.ha.namenodes.HANN'='nn1,nn2',
    -> 'dfs.namenode.rpc-address.HANN.nn1'='172.21.0.32:4007',
    -> 'dfs.namenode.rpc-address.HANN.nn2'='172.21.0.44:4007',
    -> 'dfs.client.failover.proxy.provider.HANN'='org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider'
    -> );

-- MYSQL
mysql> create resource mysql_resource properties (
    -> "type"="jdbc",
    -> "user"="root",
    -> "password"="123456",
    -> "jdbc_url" = "jdbc:mysql://127.0.0.1:3316/doris_test?useSSL=false",
    -> "driver_url" = "https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/jdbc_driver/mysql-connector-java-8.0.25.jar",
    -> "driver_class" = "com.mysql.cj.jdbc.Driver");

-- ES
mysql> create resource es_resource properties (
    -> "type"="es",
    -> "hosts"="http://127.0.0.1:29200",
    -> "nodes_discovery"="false",
    -> "enable_keyword_sniff"="true");
2022-12-22 13:45:55 +08:00
2bb4ea5dea [regression-test](icebergv2) add icebergv2 test case (#15187) 2022-12-22 13:45:07 +08:00
1ca1417824 [feature](multi-catalog) support show tables/table status from catalog.db (#15180)
support 'show tables from catalog.db' and 'show table status from catalog.db'
2022-12-22 09:22:40 +08:00
56f7ba19c0 [opt](planner) add session var: COMPACT_EQUAL_TO_IN_PREDICATE_THRESHOLD (#15225)
In a previous PR (#14876), equalities such as "a=1 or a=2 or a = 3 " were compacted into "a in (1, 2, 3)".
This PR sets a lower bound on the number of equalities, COMPACT_EQUAL_TO_IN_PREDICATE_THRESHOLD (default is 2).

For performance reasons, a hashSet is created to collect the literals, like {1,2,3}. Hence, the literals in the resulting in-predicates are in random order.

For regression tests that need stable explain-string output, set COMPACT_EQUAL_TO_IN_PREDICATE_THRESHOLD to a large number to avoid the compact rule.
2022-12-21 21:10:47 +08:00
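A minimal Python sketch of the compaction with a threshold. The representation of predicates as `(column, literal)` pairs, the function name, and the choice to compact when the number of distinct literals is greater than or equal to the threshold are assumptions; the commit message does not state whether the comparison is inclusive.

```python
from typing import Dict, FrozenSet, List, Tuple

Equality = Tuple[str, int]                    # models: column = literal
Predicate = Tuple[str, str, FrozenSet[int]]   # (column, "IN" or "=", literals)


def compact_equal_to_in(disjuncts: List[Equality], threshold: int = 2) -> List[Predicate]:
    """Group OR-ed equalities by column and compact large groups into IN."""
    by_column: Dict[str, set] = {}
    for column, literal in disjuncts:
        # Literals are collected in a set, so their order is not preserved.
        by_column.setdefault(column, set()).add(literal)

    result: List[Predicate] = []
    for column, literals in by_column.items():
        if len(literals) >= threshold:  # assumed inclusive comparison
            result.append((column, "IN", frozenset(literals)))
        else:
            result.extend((column, "=", frozenset({lit})) for lit in literals)
    return result


if __name__ == "__main__":
    print(compact_equal_to_in([("a", 1), ("a", 2), ("a", 3), ("b", 5)]))
```

Passing a large threshold leaves every equality as-is, which is the effect the commit recommends for tests that need stable explain output.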
649bbc1e58 [fix](nereids) Fix case-when (#15150) 2022-12-21 21:03:50 +08:00
90349f0e61 [Feature](Nereids) support mask function (#15120)
support function for nereids: mask, mask_first_n, mask_last_n
2022-12-21 10:25:11 +08:00
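The three functions can be approximated in Python as below. This assumes Hive-style defaults (uppercase letters become `X`, lowercase letters become `x`, digits become `n`, all other characters unchanged) and a non-negative `n`; it is a behavioral sketch, not the Doris implementation.

```python
def mask(text: str) -> str:
    """Mask letters and digits; other characters are kept (assumed defaults)."""
    out = []
    for ch in text:
        if "A" <= ch <= "Z":
            out.append("X")
        elif "a" <= ch <= "z":
            out.append("x")
        elif "0" <= ch <= "9":
            out.append("n")
        else:
            out.append(ch)
    return "".join(out)


def mask_first_n(text: str, n: int) -> str:
    """Mask only the first n characters (n assumed >= 0)."""
    return mask(text[:n]) + text[n:]


def mask_last_n(text: str, n: int) -> str:
    """Mask only the last n characters (n assumed >= 0)."""
    split = max(len(text) - n, 0)
    return text[:split] + mask(text[split:])


if __name__ == "__main__":
    print(mask("abc123EFG"), mask_first_n("abc123", 4), mask_last_n("abc123", 4))
```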
d0d7a6d8ad [fix](multi-catalog) can't show databases when creating a new user in external catalog (#15204)
Fix bug: A new user with grants to access external catalog can't show databases.
2022-12-21 08:58:06 +08:00
732417258c [Bug](pipeline) Fix bugs to pass TPCDS cases (#15194) 2022-12-20 22:29:55 +08:00
5cf21fa7d1 [feature](planner) mark join to support subquery in disjunction (#14579)
Co-authored-by: Gabriel <gabrielleebuaa@gmail.com>
2022-12-20 15:22:43 +08:00
d9550c311e [feature](Nereids) implement setOperation (#15020)
The pr implements the SetOperation.

- Adapt to the EliminateUnnecessaryProject rule to ensure that the project under SetOperation is not deleted.
- Add predicate pushdown of SetOperation
- Optimization: Merge multiple SetOperations with the same type and the same qualifier
- Optimization: merge oneRowRelation and union
2022-12-20 15:14:29 +08:00
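The "merge multiple SetOperations with the same type and the same qualifier" optimization can be sketched as a tree flattening. The `SetOp` class is a hypothetical stand-in for the Nereids node, and the sketch deliberately handles only UNION, where flattening is clearly safe; how the real rule treats INTERSECT and EXCEPT is not covered here.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SetOp:
    kind: str                      # e.g. "UNION"
    qualifier: str                 # "ALL" or "DISTINCT"
    children: Tuple[object, ...]   # nested SetOp nodes or table names (str)


def merge_set_operations(node: object) -> object:
    """Flatten nested UNIONs that share the parent's kind and qualifier."""
    if not isinstance(node, SetOp):
        return node
    merged = []
    for child in node.children:
        child = merge_set_operations(child)
        same = (
            isinstance(child, SetOp)
            and node.kind == "UNION"
            and child.kind == node.kind
            and child.qualifier == node.qualifier
        )
        if same:
            merged.extend(child.children)  # lift grandchildren into the parent
        else:
            merged.append(child)
    return SetOp(node.kind, node.qualifier, tuple(merged))


if __name__ == "__main__":
    inner = SetOp("UNION", "ALL", ("t1", "t2"))
    print(merge_set_operations(SetOp("UNION", "ALL", (inner, "t3"))))
```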