Commit Graph

2605 Commits

Author SHA1 Message Date
ed8db3727c [feature](partial update) support MOW partial update for insert statement (#21597) 2023-09-16 17:11:59 +08:00
81b6ab9b68 [Fix](topn opt) only allow duplicate key or MOW model to use 2 phase read opt in nereids planner (#24485)
The fetch phase does not support aggregation at present.
2023-09-16 10:01:36 +08:00
4dad7c94da [fix](orc) fix the count(*) pushdown issue in orc format (#24446)
Previously, when querying a Hive table in ORC format whose file was split,
the result of select count(*) could be a multiple of the real row count.

This is because the row count must be obtained after ORC stripe pruning;
otherwise, the result may be wrong.
2023-09-16 09:57:39 +08:00
298bf0885d [fix](nereids) correlated anti join shouldn't be translated to null aware anti join (#24290)
original SQL
select t1.* from t1 where t1.k1 not in ( select t3.k1 from t3 where t1.k2 = t3.k2 );

rewrite SQL
before (wrong):
select t1.* from t1 null aware left anti join t3 on t1.k1 = t3.k1 and t1.k2 = t3.k2;
now (correct):
select t1.* from t1 left anti join t3 on t1.k2 = t3.k2 and (t1.k1 = t3.k1 or t3.k1 is null or t1.k1 is null);
2023-09-15 22:50:36 +08:00
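
The equivalence claimed in #24290 can be checked with a small simulation. This is a minimal sketch, not Doris code: rows are `(k1, k2)` tuples, `None` stands in for SQL NULL, and the function names `not_in_correlated` and `left_anti_join` are hypothetical helpers introduced only for illustration.

```python
def not_in_correlated(t1, t3):
    """Reference semantics of:
    select t1.* from t1 where t1.k1 not in
        (select t3.k1 from t3 where t1.k2 = t3.k2)
    """
    kept = []
    for k1, k2 in t1:
        # Correlated filter: NULL = x is never TRUE, so a NULL k2 matches nothing.
        sub = [s1 for s1, s2 in t3
               if k2 is not None and s2 is not None and k2 == s2]
        if not sub:
            kept.append((k1, k2))  # NOT IN over an empty set is TRUE
        elif k1 is None or any(s is None for s in sub):
            continue               # result is FALSE or UNKNOWN: row is dropped
        elif k1 not in sub:
            kept.append((k1, k2))
    return kept


def left_anti_join(t1, t3):
    """Semantics of the rewritten query:
    t1 left anti join t3 on t1.k2 = t3.k2
        and (t1.k1 = t3.k1 or t3.k1 is null or t1.k1 is null)
    """
    kept = []
    for k1, k2 in t1:
        def matches(s1, s2):
            k2_equal = k2 is not None and s2 is not None and k2 == s2
            k1_hit = k1 is None or s1 is None or k1 == s1
            return k2_equal and k1_hit
        # An anti join keeps a left row only if no right row matches.
        if not any(matches(s1, s2) for s1, s2 in t3):
            kept.append((k1, k2))
    return kept


t1 = [(1, 10), (None, 10), (2, 20), (3, 30), (4, None)]
t3 = [(1, 10), (None, 20), (5, 30)]
print(not_in_correlated(t1, t3))
print(left_anti_join(t1, t3))
```

Both functions keep the same rows for this sample data, including the row whose correlated subquery is empty, which is the case a plain null-aware anti join on both keys does not model.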
1c142309a6 [refactor](jdbc catalog) refactor JdbcFunctionPushDownRule (#23826)
1. Change from using string matching function to using Expr matching
2. Replace the `nvl` function with `ifnull` when pushed down to MySQL
3. Adapt ClickHouse's `from_unixtime` function to push down
4. Non-function filtering can still be pushed down when `enable_func_pushdown` is set to false
2023-09-15 22:16:07 +08:00
ba4c738ac7 [Feature](Nereids) support values table (#23121)
support insert into table values(...) for Nereids.
sql like:
insert into t values(1, 2, 3)
insert into t values(1 + 1, dayofweek(now()), 4), (4, 5, 6)
insert into t values('1', '6.5', cast(1.5 as int))
2023-09-15 21:46:37 +08:00
b407f275c8 [fix](hive) fix partition prune issue and some external table test cases (#24338)
1. Fix a Hive partition pruning bug, introduced by #23845, that made the `test_hive_default_partition` test case fail.
2. Fix the `test_local_tvf.groovy` test case: the path of the local tvf should be a relative path.
3. Fix the `test_external_catalog_hive` test case: `partitions` is now a reserved keyword.
4. Support the `local` tvf in Nereids, and fix a related issue such as:

```
Caused by: java.lang.NullPointerException
        at org.apache.doris.nereids.stats.ExpressionEstimation.castMinMax(ExpressionEstimation.java:171) ~[doris-fe.jar:1.2-SNAPSHOT]
        at org.apache.doris.nereids.stats.ExpressionEstimation.visitCast(ExpressionEstimation.java:167) ~[doris-fe.jar:1.2-SNAPSHOT]
        at org.apache.doris.nereids.stats.ExpressionEstimation.visitCast(ExpressionEstimation.java:109) ~[doris-fe.jar:1.2-SNAPSHOT]
        at org.apache.doris.nereids.trees.expressions.Cast.accept(Cast.java:55) ~[doris-fe.jar:1.2-SNAPSHOT]
        at org.apache.doris.nereids.stats.ExpressionEstimation.visitAlias(ExpressionEstimation.java:394) ~[doris-fe.jar:1.2-SNAPSHOT]
        at org.apache.doris.nereids.stats.ExpressionEstimation.visitAlias(ExpressionEstimation.java:109) ~[doris-fe.jar:1.2-SNAPSHOT]
        at org.apache.doris.nereids.trees.expressions.Alias.accept(Alias.java:145) ~[doris-fe.jar:1.2-SNAPSHOT]
        at org.apache.doris.nereids.stats.ExpressionEstimation.estimate(ExpressionEstimation.java:119) ~[doris-fe.jar:1.2-SNAPSHOT]
        at org.apache.doris.nereids.stats.StatsCalculator.lambda$computeProject$7(StatsCalculator.java:785) ~[doris-fe.jar:1.2-SNAPSHOT]
        at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193) ~[?:1.8.0_341]
        at java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948) ~[?:1.8.0_341]
        at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482) ~[?:1.8.0_341]
```
2023-09-15 20:57:04 +08:00
ab69416922 [Bug](pipelineX) fix streaming agg (#24449)
fix streaming agg
2023-09-15 19:22:54 +08:00
c4c4162da2 [fix](test)rename table name to prevent conflicting with other test case (#24391) 2023-09-15 19:11:41 +08:00
07d4769134 [fix](bitmap) fix coredump of bitmap_from_array caused by null array literal (#24404) 2023-09-15 18:36:33 +08:00
0742af70ea [Fix](planner) fix select from inline table return only the first row (#24365) 2023-09-15 18:14:54 +08:00
4ad5845dcc [fix](planner) add if function signature for jsonb data type (#24436) 2023-09-15 17:50:51 +08:00
fa37a8bba8 [opt](stats) remove corresponding col stats status if the loading at the end of analyze task is failed (#24405) 2023-09-15 17:46:48 +08:00
dc0c39f1d8 [Enhance](external)change hive docker to host network and add hive case (#24401)
1. Change the external hive docker network mode from the bridge mode to the host mode to support the external test of the multi-node doris cluster
2. Added more hive test data in various formats
3. Added a test case with hive
2023-09-15 17:46:24 +08:00
699023069d [regression](lateral view) add test case for explode_bitmap (#24421) 2023-09-15 17:30:26 +08:00
7fd72351f9 [fix](agg) window_funnel compatibility issue with multi backends (#24385) 2023-09-15 17:22:47 +08:00
d24f3efd4a [pipelineX](profile) Phase 1: refactor pipelineX detailed profile (#24322) 2023-09-15 16:14:05 +08:00
6dbe07bd3b [Enhancement](inverted index) use conjunction query to accelerate fulltext equal query (#24373) 2023-09-15 15:34:57 +08:00
3c18ed4e86 [test](fix) remove unused test case test_mtmv_ssb_ddl.groovy (#24434)
* forbid: test_mtmv_ssb_ddl

* remove: test_mtmv_ssb_ddl.groovy
2023-09-15 15:02:31 +08:00
08740b47cd [FIX](decimalv3) fix decimalv3 value with leading zeros (#24416)
Decimal values with leading zeros are currently handled incorrectly: checking type_precision >= precision lets the value overflow and the DCHECK fail. When the value has leading zeros, only type_precision > precision should be checked so that the value is correct.
2023-09-15 13:35:20 +08:00
29fe87982f [improve](outfile) add file_suffix options for outfile (#24334) 2023-09-15 12:58:41 +08:00
c5e7f55b63 [performance](executor) optimize time_round function (#23058)
optimize time_round function
2023-09-15 10:49:22 +08:00
dbd7733e02 [feature](regression) Add p2 level test for schema change (#20243) 2023-09-15 10:39:07 +08:00
00bb32cfc0 [opt](nereids) enable two phase partition topn opt #23870
Enable two phase partition topn optimization, instead of original full sort at the second phase.
E.g., a partial plan of TPC-DS q67 is shown below. A full sort after the exchange hurts performance, especially if the window column's NDV is very high and the number of windows is huge.

------PhysicalTopN
--------filter((rk <= 100))
----------PhysicalWindow
------------PhysicalQuickSort
--------------PhysicalDistribute
----------------PhysicalPartitionTopN
------------------PhysicalProject

In this scenario, the second-phase full sort can be transformed into a global PhysicalPartitionTopN, avoiding the cost of the full sort. The plan is optimized to the following:

------PhysicalTopN
--------filter((rk <= 100))
----------PhysicalWindow
------------PhysicalPartitionTopN
--------------PhysicalDistribute
----------------PhysicalPartitionTopN
------------------PhysicalProject
2023-09-15 10:30:34 +08:00
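
The idea behind the two-phase plan in #23870 can be sketched outside the planner. The following is a simplified illustration, not the Nereids implementation: it assumes row_number-style semantics (ties are not preserved as `rank()` would require), and `partition_topn` and `two_phase_partition_topn` are hypothetical names. Each shard keeps only its local top `n` rows per partition key, and a second partition top-n over the merged rows replaces the full sort.

```python
import heapq
from collections import defaultdict


def partition_topn(rows, n):
    """Keep the n smallest values for every partition key."""
    groups = defaultdict(list)
    for key, value in rows:
        groups[key].append(value)
    result = []
    for key in sorted(groups):
        for value in heapq.nsmallest(n, groups[key]):
            result.append((key, value))
    return result


def two_phase_partition_topn(shards, n):
    """Phase 1: local top-n per shard. Phase 2: global top-n after the exchange."""
    local = [partition_topn(shard, n) for shard in shards]
    merged = [row for part in local for row in part]  # stands in for the distribute step
    return partition_topn(merged, n)


shards = [
    [("a", 5), ("a", 1), ("b", 7)],
    [("a", 3), ("a", 2), ("b", 4), ("b", 9)],
]
all_rows = [row for shard in shards for row in shard]
print(two_phase_partition_topn(shards, 2))
print(partition_topn(all_rows, 2))
```

The two calls print the same rows: any row in the global top `n` of a partition is also in the local top `n` of its shard, so the first phase never discards a row the second phase needs.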
c5ef6cfea2 [fix](Table-Valued Function) fix be core when user specified empty column_separator using hdfs tvf (#24369) 2023-09-14 23:19:48 +08:00
5ba1f62da8 [enhancement](Nereids) make stats unchanged (#23737)
Keep stats unchanged when exploring plans.
2023-09-14 22:18:54 +08:00
d4756d3118 [feature](Nereids): fold Cast(s as date/datetime) on FE (#24353)
cast("20210101" as Date) -> DateLiteral(2021, 1, 1)
2023-09-14 22:08:26 +08:00
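
As a rough illustration of the folding in #24353, the sketch below turns a string literal into a date value ahead of execution. It is an assumption-laden example rather than the FE logic: it only handles the eight-digit `yyyyMMdd` form shown in the commit message, and `fold_cast_to_date` is a hypothetical name. Returning `None` here means "leave the cast unfolded".

```python
from datetime import date, datetime


def fold_cast_to_date(literal):
    """Fold cast("<yyyyMMdd>" as Date) into a date value, or return None."""
    # Only the exact eight-digit form is folded in this sketch.
    if len(literal) != 8 or not literal.isdigit():
        return None
    try:
        return datetime.strptime(literal, "%Y%m%d").date()
    except ValueError:
        # The digits do not form a valid calendar date, e.g. February 30.
        return None


print(fold_cast_to_date("20210101"))
```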
f61e6483bf [enhancement](broker-load) support compress type for old broker load, and split compress type from file format (#23882) 2023-09-14 21:42:28 +08:00
07720d3ff9 [feature](replica version) Add admin set replica version statement (#23706) 2023-09-14 21:12:00 +08:00
927de33166 [config](log) disable StreamLoad log default and enable in regression pipeline (#24354)
disable StreamLoad log default and enable in regression pipeline
2023-09-14 20:47:26 +08:00
eb65cc6954 [Fix](nereids) eliminate_outer_join regression case fix #24262 2023-09-14 18:22:17 +08:00
3ee89aea35 [Feature](merge-on-write)Support ignore mode for merge-on-write unique table (#21773) 2023-09-14 18:03:51 +08:00
68b13ab50f [Fix](Full compaction) Fix full compaction by table id case (#24265) 2023-09-14 18:03:28 +08:00
9c6734e68e [bugfix](index) Fix build index limitations (#24358)
1. skip an existing index on a column with a different id when building an index
2. allow build index for CANCELED or FINISHED state
2023-09-14 17:53:22 +08:00
d035a58374 [feature](nereids) support unnest subquery in LogicalOneRowRelation (#24355)
select (select 1);
before : 
ERROR 1105 (HY000): errCode = 2, detailMessage = Subquery is not supported in the select list.
after:
mysql> select (select 1);
+---------------------------------------------------------------------+
|  (SCALARSUBQUERY) (LogicalOneRowRelation ( projects=[1 AS `1`#0] )) |
+---------------------------------------------------------------------+
|                                                                   1 |
+---------------------------------------------------------------------+
1 row in set (0.61 sec)
2023-09-14 17:22:08 +08:00
4fbb25bc55 [Enhancement](function) Support date_trunc(date) and use it in auto partition (#24341)
Support date_trunc(date) and use it in auto partition
2023-09-14 16:53:09 +08:00
b6d7116dea [fix](datetime) fix compare of DatetimeLiteral (#24343)
fix compare of DatetimeLiteral
2023-09-14 16:51:50 +08:00
4efc68a33d [fix](test)disable join reorder for test_bitmap_filter regression test (#23150)
The Nereids planner may reorder joins without any statistics info. This can lead to a very bad join order that causes the query to time out. This PR disables join reorder for this SQL.
2023-09-14 16:05:09 +08:00
ccba5a729a [fix](planner)cast string to float like type should return NULL literal if it fails (#24222) 2023-09-14 15:59:20 +08:00
1ef22d7f7c [Feature](variant) add variant type (#24170)
Add the variant type for metadata. Add persistent information for variant, including the paths of variant sub-columns, and persist it to the segment footer and the tablet schema of the rowset.
2023-09-14 14:21:53 +08:00
268c867679 [Improve](serde)replace function_cast from_string to serde (#24087)
Currently, stream load does not support columns whose type is a map/array nested inside a map/array.
Serde can handle this now, so we replace from_string with it.
Note: if an item inside complex-type data is empty, we return an error instead of making up a default value, because a correct default cannot currently be defined for complex types.
2023-09-14 13:53:16 +08:00
d23d1870a2 [fix](Nereids): fix regression-test (#24329) 2023-09-14 13:21:50 +08:00
ed108d48fa [fix](invert index) fix query use char filter (#24268) 2023-09-14 11:42:47 +08:00
46f5988245 [fix](Nereids) set operation children output order not same (#24060)
We generate a project for every child of a set operation to ensure that the
column order of the children does not change. However, some rules, such as
PushDownProjectThroughLimit, can remove these projects unintentionally.
When this happens, the column order is wrong and leads to a BE core dump.
This PR uses a new variable in SetOperation to save the output order of
the children of the set operation. The children's output order can then
change without affecting SetOperation at all.
2023-09-14 11:09:58 +08:00
1a4929b59e [fix](planner) having clause analyze bug #24288 2023-09-14 09:54:09 +08:00
9b7f041bea [Bug](function) fix explode_json_array_int can't handle min/max values (#24284)
The value parsed from the JSON string may be beyond the max/min of Int64,
so add a check to limit the value and return the max/min of Int64 instead.
2023-09-14 09:20:59 +08:00
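
The clamping described in #24284 amounts to saturating each element at the Int64 bounds. A minimal sketch follows, assuming the input is a JSON array of integers; it reuses the function name only for readability and is not the BE implementation, and `clamp_int64` is a hypothetical helper.

```python
import json

INT64_MIN = -(2 ** 63)
INT64_MAX = 2 ** 63 - 1


def clamp_int64(value):
    """Saturate an arbitrary-precision integer to the Int64 range."""
    return max(INT64_MIN, min(INT64_MAX, value))


def explode_json_array_int(json_str):
    """Return the array elements, clamped to Int64 (assumes integer elements)."""
    return [clamp_int64(v) for v in json.loads(json_str)]


print(explode_json_array_int("[1, 9223372036854775808, -9223372036854775809]"))
```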
93a9f1007c [fix](Nereids): fix regression test (#24336)
Fix the regression test failure caused by #23842.
2023-09-14 01:55:09 +08:00
11afd321cb [fix](es catalog) fix issue with select and insert from es catalog core (#24318)
Issue Number: close #24315

The root cause of this issue is that Elasticsearch's long type allows inserting floats and strings. Doris did not handle these cases when doing type conversion. The current strategy is to take the integer before the decimal point if a float or string is found.
2023-09-13 23:07:31 +08:00
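
The conversion strategy in #24318 ("take the integer before the decimal point") might look like the sketch below. This is an illustration under assumptions, not the Doris code path: `es_long_to_int` is a hypothetical name, and the handling of edge cases such as a string with no digits before the point is a guess, since the commit message does not specify it.

```python
def es_long_to_int(value):
    """Convert a value read from an Elasticsearch long field to an integer."""
    if isinstance(value, bool):
        raise TypeError("bool is not a valid long value")
    if isinstance(value, int):
        return value
    if isinstance(value, float):
        return int(value)  # drops the fractional part
    if isinstance(value, str):
        integer_part = value.strip().split(".", 1)[0]
        # Assumption: a string such as ".5" has no integer part and maps to 0.
        if integer_part in ("", "+", "-"):
            return 0
        return int(integer_part)
    raise TypeError(f"unsupported type: {type(value).__name__}")


for raw in (42, 3.9, "7.25", "-1.9", "15"):
    print(repr(raw), "->", es_long_to_int(raw))
```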
d5b490b2e7 [test](regression) add file cache regression test (#24192)
Add file cache regression test in tpch 1g on orc&parquet format.
tpch will run 3 times:
1. running without file cache
2. running with file cache for the first time
3. running with file cache for the second time

The file cache configuration is already added in `be/conf/be.conf` on the regression test environment, and the available capacity is 100MB. After running the tpch 1g test, the metrics introduced by https://github.com/apache/doris/pull/19177 look like this:
```
doris_be_file_cache_normal_queue_curr_size{path="/mnt/datadisk1/gaoxin/file_cache"} 92808933
doris_be_file_cache_normal_queue_curr_elements{path="/mnt/datadisk1/gaoxin/file_cache"} 59
doris_be_file_cache_normal_queue_max_elements{path="/mnt/datadisk1/gaoxin/file_cache"} 102400
doris_be_file_cache_normal_queue_max_size{path="/mnt/datadisk1/gaoxin/file_cache"} 89128960
doris_be_file_cache_removed_elements{path="/mnt/datadisk1/gaoxin/file_cache"} 2132
doris_be_file_cache_segment_reader_cache_size{path="/mnt/datadisk1/gaoxin/file_cache"} 54
```
2023-09-13 22:59:01 +08:00
9847f7789f [Feature](Export) Export sql supports to export data of view and external table (#24070)
Previously, EXPORT only supported exporting OLAP tables.
This PR adds support for exporting views and external tables.
2023-09-13 22:55:19 +08:00