Commit Graph

5755 Commits

Author SHA1 Message Date
917b266799 [fix](planner) table valued function could not be used in subquery (#15496) 2022-12-30 10:01:25 +08:00
2c8de30cce [optimize](multi-catalog) use dictionary encode&filter to process delete files (#15441)
**Optimize**
PR #14470 used `Expr` to filter the delete rows that match the current data file.
However, the rows in a delete file are [sorted by file_path then position](https://iceberg.apache.org/spec/#position-delete-files)
to optimize row filtering during scans, so this PR removes `Expr` and uses binary search to filter delete rows.

In addition, delete files are likely to be dictionary-encoded, and decoding the `file_path`
column into `ColumnString` is time-consuming, so this PR uses `ColumnDictionary` to read the `file_path` column.

After testing, the performance of iceberg v2's MOR is improved by 30%+.
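
A minimal sketch of the binary-search idea described above, assuming the delete file has already been read into two parallel, sorted lists. The function and variable names are hypothetical, and this is not the code from the PR.

```python
from bisect import bisect_left, bisect_right


def deleted_positions(file_paths, positions, data_file):
    """Return the deleted row positions that belong to `data_file`.

    `file_paths` and `positions` are parallel lists holding the rows of a
    position delete file, sorted by file_path and then by position.
    """
    # Binary search for the contiguous run of rows whose file_path matches.
    lo = bisect_left(file_paths, data_file)
    hi = bisect_right(file_paths, data_file)
    return positions[lo:hi]


paths = ["a.parquet", "a.parquet", "b.parquet", "b.parquet", "c.parquet"]
rows = [0, 5, 1, 9, 3]
print(deleted_positions(paths, rows, "b.parquet"))  # [1, 9]
```

Because the rows are sorted by path, the matching rows form one contiguous slice, so two binary searches replace a per-row predicate evaluation.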

**Fix Bug**
The lazy-read block may not contain the filter column if the whole group is filtered by `Expr`
and the batch_eof is generated from the next batch.
2022-12-30 08:57:55 +08:00
85c7c531f1 [vectorized](jdbc) support array type in jdbc external table (#15303) 2022-12-30 00:29:08 +08:00
9a517d6a8f [DataType](Decimalv3) change the avg function scale of decimalv3 (#15445) 2022-12-30 00:27:51 +08:00
3ff01ca799 [feature-wip](multi-catalog) support Iceberg time travel in external table (#15418)
For example:
SELECT * FROM tbl FOR VERSION AS OF 10963874102873;
SELECT * FROM tbl FOR TIME AS OF '1986-10-26 01:21:00';
2022-12-30 00:25:21 +08:00
6c847daba0 [Feature](Nereids) Support grouping set for materialized index. (#15383)
This PR adds support for materialized index selection when the query has grouping sets.
2022-12-29 23:17:02 +08:00
dda505487c [fix](nereids) SimplifyArithmeticRuleTest ut failed (#15486)
This PR removes typeCoercion on the expected expr in ExpressionRewriteTestHelper. The expected expr should not be rewritten at all, because rewriting changes it unexpectedly.
2022-12-29 22:53:27 +08:00
c54c2f8035 [fix](statistics) fix npe when __internal_schema not created (#15464) 2022-12-29 21:24:33 +08:00
79113b0cd1 [Fix](storage) Fix bug that cooldown time is error (#15444)
The cooldown time is wrong for data on SSD, because the cooldown time for all tables/partitions
is calculated only once, when class `DataProperty` is loaded, and cannot be updated later.
This patch ensures that the cooldown time for each table/partition is calculated in real time
when the table/partition is created.
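
The difference between the two behaviors can be sketched as follows. This is an illustration only: the class names and the 30-day default are hypothetical and do not come from the Doris code.

```python
import time

COOLDOWN_SECONDS = 30 * 24 * 3600  # hypothetical default: 30 days


class LoadTimeProperty:
    # Evaluated once, when the class body runs; every instance shares it.
    DEFAULT_COOLDOWN_AT = time.time() + COOLDOWN_SECONDS

    def __init__(self):
        self.cooldown_at = self.DEFAULT_COOLDOWN_AT


class CreateTimeProperty:
    def __init__(self, now=None):
        # Evaluated per instance, i.e. when the table/partition is created.
        created_at = time.time() if now is None else now
        self.cooldown_at = created_at + COOLDOWN_SECONDS
```

With `LoadTimeProperty`, a partition created days after startup still gets the timestamp computed at load time. `CreateTimeProperty` ties the cooldown time to the creation moment.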
Co-authored-by: weizuo <weizuo@xiaomi.com>
2022-12-29 21:01:36 +08:00
e651a9bb11 [feature](nereids) add variance function for nereids (#15370)
Support the variance function. Currently, it does not support the decimalV3 type.
2022-12-29 18:33:52 +08:00
25b257e37c [enhancement](session var) variable to control whether to rewrite OR to IN or not (#15437) 2022-12-29 14:50:32 +08:00
d95be84629 [enhancement](profile) add session variable parallel_fragment_exec_instance_num to profile (#15457) 2022-12-29 14:46:07 +08:00
657f3e6318 [fix](pipeline) disable sharing hashtable for broadcast join for pipeline engine (#15432) 2022-12-29 14:19:57 +08:00
7ab6ea684b [Improvement](meta) hide password of show catalog xxx stmt and for es catalog (#15410)
* [Improvement](meta) hide password of show catalog xxx

* hide es password in show create ctlg and show ctlg xx stmt
2022-12-29 14:16:32 +08:00
5b09d27d54 [feature-wip](nereids) Made decimal in nereids more complete (#15087)
1. Add IntegralDivide operator to support `DIV` semantics
2. Add more operator rewriters to keep expression types consistent between operators
3. Support the conversion between float type and decimal type.
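
As a rough illustration of integer-division semantics, the sketch below assumes that `DIV` discards the fractional part (truncating toward zero, as MySQL's `DIV` does) and that division by zero yields NULL. Both are assumptions for the example, not statements about this PR's implementation, and `integral_divide` is a hypothetical name.

```python
def integral_divide(a, b):
    """Integer division that truncates toward zero; None stands in for NULL."""
    if b == 0:
        return None  # assumption: division by zero yields NULL
    quotient = abs(a) // abs(b)
    # The result is negative only when the operands have different signs.
    return quotient if (a >= 0) == (b > 0) else -quotient


# Python's own // floors instead of truncating, hence the explicit handling.
print(integral_divide(7, 2), integral_divide(-7, 2), -7 // 2)  # 3 -3 -4
```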

After this PR, the cases below can be executed normally, as with the legacy optimizer:
  use test_query_db;
  select k1, k5,100000*k5 from test order by k1, k2, k3, k4;
  select avg(k9) as a from test group by k1 having a < 100.0 order by a;
2022-12-29 13:01:47 +08:00
304781d837 [Improvement](meta) update show create function result (#15414)
Currently, show create function is designed for native functions, and some of its output is unsuitable for other function types.

This PR brings several improvements so that the result of show create function can be used to create the function:

1. Add the type property.
2. Add the ALWAYS_NULLABLE property for java_udf.
3. Use the file property rather than object_file for java_udf, following the usage of create java_udf.
4. Remove the md5 property, because the file may vary when the function is created again.
5. Remove the INIT_FN, UPDATE_FN, MERGE_FN, SERIALIZE_FN, etc. properties for java_udf, because java_udf does not need them.
2022-12-29 11:40:14 +08:00
3146fc8189 [bug](jdbc) fix jdbc external table with char type length error (#15386)
Tested against pg and oracle with char(100): if data='abc',
the string read back has length 100, so the extra trailing spaces need to be trimmed.
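
The padding problem can be reproduced in a few lines. This is a sketch only: `trim_char_padding` is a hypothetical helper, not the function used in the fix.

```python
def trim_char_padding(value):
    """Strip the trailing spaces that CHAR(n) padding adds."""
    return value.rstrip(" ")


padded = "abc".ljust(100)  # what a CHAR(100) column hands back for 'abc'
print(len(padded), len(trim_char_padding(padded)))  # 100 3
```

One trade-off of this approach is that trailing spaces that were part of the original value are removed as well.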
2022-12-29 11:19:03 +08:00
1f98dd2c74 [fix](Nereids) Generate is missing on alias query (#15416)
Support table-generating functions on a query alias. Syntax:
```sql
SELECT * FROM (SELECT * FROM tbl) tmp LATERAL VIEW explode(c1) gtmp AS ce;
```
2022-12-29 11:11:25 +08:00
0e154feeb9 [feature](multi catalog nereids)Add file scan node to nereids. (#15201)
Add file scan node to nereids, so that the new planner could support external hms table.
2022-12-29 10:31:11 +08:00
1b1083eb52 [fix](metric) fix prometheus metric format error for doris_fe_query_latency_ms (#15447)
Co-authored-by: caiconghui1 <caiconghui1@jd.com>
2022-12-29 08:51:15 +08:00
4336aaa01a [bug](datetimev2) fix wrong info when show create table (#15422)
* [bug](datetimev2) fix wrong info when show create table

* update
2022-12-28 19:55:43 +08:00
95e6553d90 [feature-wip](nereids) Implement using join (#15311) 2022-12-28 19:22:20 +08:00
75aa00d3d0 [Feature](NGram BloomFilter Index) add new ngram bloom filter index to speed up like query (#11579)
This PR implements a new bloom filter index, the NGram bloom filter index, which was proposed in #10733.
The new index can greatly improve LIKE query performance; in some of our test cases it gives an order-of-magnitude improvement.
For usage, see the docs in this PR. The index depends on `enable_function_pushdown`:
you need to set it to `true` for the index to work for LIKE queries.
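
To show why an n-gram bloom filter can speed up LIKE queries, here is a minimal, self-contained sketch of the general technique. It is not the implementation from this PR: the class name, the parameters (`n`, `num_bits`, `num_hashes`), and the SHA-256-based hashing are all assumptions chosen for the example.

```python
import hashlib


class NGramBloomFilter:
    def __init__(self, n=3, num_bits=1024, num_hashes=3):
        self.n = n
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit set

    def _ngrams(self, text):
        # All substrings of length n; empty if the text is shorter than n.
        return [text[i:i + self.n] for i in range(len(text) - self.n + 1)]

    def _bit_positions(self, gram):
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{gram}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, text):
        """Index every n-gram of a stored column value."""
        for gram in self._ngrams(text):
            for pos in self._bit_positions(gram):
                self.bits |= 1 << pos

    def might_match_like(self, pattern):
        """Return False only if no indexed value can contain `pattern`."""
        grams = self._ngrams(pattern)
        if not grams:
            return True  # pattern shorter than n: the filter cannot prune
        return all(
            (self.bits >> pos) & 1
            for gram in grams
            for pos in self._bit_positions(gram)
        )


index = NGramBloomFilter()
index.add("hello world")
print(index.might_match_like("lo wo"))  # True
```

A `False` answer lets the scan skip the indexed data block for `LIKE '%pattern%'`. A `True` answer may be a false positive, so the rows still have to be checked.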
2022-12-28 18:01:50 +08:00
0f8b15b902 [feature](nereids) support string alias in select list (#15369)
support such syntax: select '' as 'b', col1 from select_with_const
2022-12-28 17:26:48 +08:00
69d95c857a [feature](remote)Add alter storage policy (#15381)
* add alter storage policy
2022-12-28 16:09:06 +08:00
8342691b62 [feature](remote)Add drop storage policy (#15364)
* add drop storage policy
2022-12-28 16:04:30 +08:00
8ce62600dc [Bug] #14876 && #15225 have some bugs in rewrite or to in, revert them (#15420) 2022-12-28 13:30:09 +08:00
Pxl
121f00b6e2 [Bug](function) forbid hll_union input not hll type param (#15397)
forbid hll_union input not hll type param
2022-12-28 12:23:34 +08:00
d05f430ca2 [feature](nereids) support syntax: count(all *) (#15376) 2022-12-28 11:09:56 +08:00
2af831de33 [Fix](Nereids)fix group by binding error, resulting in incorrect results (#15328)
Original: group by is bound to the outputExpression of the current node.

Problem: when a new reference in outputExpression has the same name as one of the child's output columns, group by should use the child's output column. Instead, it was bound to the new reference in the node's outputExpression, which produced incorrect results.

Now: the child's output takes priority for group by binding. If the child does not have a corresponding column, the outputExpression of this node is used for binding.
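
The binding priority described above can be sketched as a two-step lookup. The function name and the dictionary-based scopes are hypothetical simplifications, not the Nereids data structures.

```python
def bind_group_by(name, child_output, output_aliases):
    """Resolve a GROUP BY name, preferring the child's output columns."""
    if name in child_output:
        return ("child", child_output[name])
    if name in output_aliases:
        return ("alias", output_aliases[name])
    raise KeyError(f"cannot bind GROUP BY column: {name}")


# SELECT k1 + 1 AS k1, k2 * 2 AS doubled FROM t GROUP BY ...
child = {"k1": "t.k1", "k2": "t.k2"}
aliases = {"k1": "t.k1 + 1", "doubled": "t.k2 * 2"}
print(bind_group_by("k1", child, aliases))       # ('child', 't.k1')
print(bind_group_by("doubled", child, aliases))  # ('alias', 't.k2 * 2')
```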
2022-12-28 10:42:21 +08:00
28bb13a026 [feature](light-schema-change) enable light schema change by default (#15344) 2022-12-28 09:29:26 +08:00
22b31e516c [Bug](decimalv3) select view of decimalv3 error (#15404) 2022-12-28 08:38:33 +08:00
5ac7b09765 [feature](Nereids) Support SchemaScan (#15411)
such as: select * from information_schema.backends;
2022-12-28 00:33:48 +08:00
03aef7a8ac [fix](nereids) sender in union's child fragment has no destination (#15402)
1. always create an exchange node for set operation node's children
2. fix cast expr's nullability bug.
2022-12-27 22:36:54 +08:00
51b14c06d3 [enhancement](nereids) support approx_count_distinct function (#15406) 2022-12-27 22:25:21 +08:00
0550dfaeb2 [enhancement](rewrite) add OrToIn rule and fix ExtractCommonFactorsRule apply problems (#12872)
Co-authored-by: wuhangze <wuhangze@jd.com>
2022-12-27 18:39:53 +08:00
849adca225 [fix](Nereids) expressions are not extracted when binding function (#15361) 2022-12-27 17:12:07 +08:00
a07ca41f8e [Fix](Nereids) fix repeat node nullable error bugs (#15251) 2022-12-27 17:01:33 +08:00
5a8201320a [fix](nereids) group by constants produce wrong result (#15322)
SELECT 2 FROM tbl GROUP BY 1

It should produce 2 when the table is not empty. Before this PR, executing the Nereids-generated plan produced an empty result set.
2022-12-27 14:35:02 +08:00
b3f77a2e00 [feature](Show) add one show type cast command (#15137) 2022-12-27 14:19:04 +08:00
69068f9835 [fix](planner) fix hll_union plan: Invalid Aggregate Operator: hll_union (#14931)
When using hll_union aggregate function, PREAGGREGATION is always OFF and Rollup cannot be hit.
2022-12-27 11:20:41 +08:00
99868f9929 [fix](planner) set limit to negative value when has offset in limit (#15367) 2022-12-27 11:16:19 +08:00
0f922a83c7 [fix](nereids) group_bit_xxx signature update (#15318) 2022-12-27 00:41:42 +08:00
325d247b92 [Feature](Nereids) Support hll and count for materialized index. (#15275) 2022-12-27 00:38:04 +08:00
650136c32e [Enhancement](fe): replace assertTrue(X.equals(X)) with assertEquals (#15356) 2022-12-27 00:37:24 +08:00
8879400419 [feature](nereids) Support query on specific partitions (#15243) 2022-12-27 00:32:14 +08:00
a1c6ea876f [fix](inbitmap) fix core dump caused by bitmap filter with union (#15333)
The join node needs a project operation to remove unnecessary columns from its output tuples.
For SetOperationNode, the output tuple and input tuple are consistent, so it does not need a project itself.
However, the children of a SetOperationNode may be join nodes, so those children
need to perform the project operation.
2022-12-26 23:14:32 +08:00
301640d3c0 [fix](string) fix offsets overflow for extremely large String column (#15360)
* [fix](string) fix offsets overflow for extremely large String column

* fix
2022-12-26 21:23:58 +08:00
ae87415174 [Feature](Nereids) add simplify arithmetic rule (#15242)
Support a simplify arithmetic rule.

For example:
a + 1 > 1
=> a > 0
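
The rewrite in the example can be sketched as moving the constant to the right-hand side. This is a simplified illustration with a hypothetical function name and tuple encoding. It handles only `+` and `-` and ignores overflow and type concerns that a real rule would need to consider.

```python
def simplify_comparison(column, op, const, cmp, rhs):
    """Rewrite `column op const cmp rhs` as `column cmp rhs'` for + and -."""
    if op == "+":
        return (column, cmp, rhs - const)
    if op == "-":
        return (column, cmp, rhs + const)
    # Leave other operators untouched; * and / would need sign handling.
    return (column, op, const, cmp, rhs)


print(simplify_comparison("a", "+", 1, ">", 1))  # ('a', '>', 0)
```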
2022-12-26 16:57:59 +08:00
1400a89065 [Bug](Compile) fix compile error by using correct method name (#15355)
fix compile error by using correct method name
2022-12-26 14:58:01 +08:00