Commit Graph

1783 Commits

Author SHA1 Message Date
3e686f4306 [feature](javaudf)support no input java udf (#24457)
* support no input java udf

* add license
2023-09-25 18:56:44 +08:00
90791f0b19 [fix](nereids)fix bug of exists subquery with limit clause (#24630)
create table t1(c1 int, c2 int);
create table t2(c1 int, c2 int);
insert into t1 values (1,1);
insert into t2 values (1,1);

select * from t1 where exists (select * from t2 where t1.c1 = t2.c1 limit 0);

the result should be empty set.
2023-09-25 17:15:08 +08:00
xfz
1b95ce1d93 [feature](json-function) add json_insert, json_replace, json_set functions (#24384)
[feature](json-function) add three json funcitons
2023-09-25 12:52:29 +08:00
cdc8189697 [Impro](regression-test) More test cases for function round (#24791) 2023-09-25 10:39:45 +08:00
9cd9e195d8 [test](load) add some s3 load regression test (#24399) 2023-09-23 22:40:45 +08:00
ce79711b0d [FIX](serde) fix map/array deserialize string with quote pair (#24808) 2023-09-23 01:12:20 +08:00
ac55d45f79 [Fix](topn opt) fix heap use after free when shrink in fetch phase (#24774) 2023-09-22 19:48:05 +08:00
35c0697a60 [fix](planner)SelectStmt's constructor should initialize originSelectList member (#24755)
resetSelectList method will use originSelectList to recover the origin select list. If the originSelectList is lost in constructor, the resetSelectList will fail to recover and make the analyze process fail.
2023-09-22 19:31:29 +08:00
263506f8ab [refactor](pipelineX) add MultiCast operator (#24656) 2023-09-22 15:41:14 +08:00
74bba4bdaf [enhancement](regression-test) Add routine load case (#24536) 2023-09-22 14:55:01 +08:00
320fc1481a [fix](Nereids) some expression not cast in in predicate (#24680)
1. should use castIfNotSameType in InPredicate and CaseWhen
2. StringLikeLiteral should override equals to ignore type
2023-09-22 12:58:33 +08:00
22616d125d [function](bitmap) add function alias bitmap_andnot and bitmap_andnot_count (#24771) 2023-09-22 12:18:31 +08:00
090be20ca4 [cases](regresstests) add negative case for agg table and fix agg table support replace typ… #24715
add negative case for agg table
fix agg table support replace agg type for complex type , and Now We only support complex type with agg state for replace only
fix test output
2023-09-22 09:05:20 +08:00
58ab25ccaa Revert "[Feature](merge-on-write)Support ignore mode for merge-on-write unique table (#21773)" (#24731)
This reverts commit 3ee89aea35726197cb7e94bb4f2c36bc9d50da84.
2023-09-21 21:01:28 +08:00
a48b19ceb6 [feature](Outfile) select into outfile supports to export struct/map/array type data to orc file format (#24350)
We do not support nested complex type in this pr.
2023-09-21 20:15:18 +08:00
7630fe7b7b [bug](node)fix dense_rank function in partition sort node return wrong rows (#24727) 2023-09-21 19:13:30 +08:00
5b590bbfcf [feat](Nereids) add lambda func array_last and array_first (#24682) 2023-09-21 12:23:34 +08:00
Pxl
01b7cb8db7 [Bug](materialized-view) fix failed insert into mv meet default null (#24545)
fix failed insert into mv meet default null
2023-09-21 10:47:29 +08:00
0d20a61587 [fix](nereids)left outer join estimation (#24462)
A left join B on A.x=B.x and A.y=B.y
B.x and B.y make result tuple number scale out.
suppose A is scaled out by B.x N1 time, and scaled out by B.y N2 time, and N1 < N2.
we should choose N1 as the final scale out factor, not N2.

this pr impact on tpcds_sf100 59/17/29/25/47/40/54

before
query59 77295 75279 75230 75230
query17 22642 21566 21599 21566
query29 16508 16092 16006 16006
query25 20262 20571 21171 20571
query47 23571 23264 23107 23107
query40 3305 2849 3064 2849
query54 9052 8882 8715 8715
Total cold run time: 172635 ms
Total hot run time: 168044 ms

after
query59 56435 54717 53919 53919
query17 24167 22377 23237 22377
query29 16950 18325 16333 16333
query25 21478 22975 21358 21358
query47 24412 24611 23920 23920
query40 5491 4779 5176 4779
query54 8671 8664 8658 8658
Total cold run time: 157604 ms
Total hot run time: 151344 ms
2023-09-21 09:45:01 +08:00
4ca650f306 [improvement](jdbc catalog) Adjust function replacement order and add new function support (#24685) 2023-09-21 08:45:27 +08:00
2e85e0163d Revert "[feature](function) add json->operator convert to json_extract (#19899)" (#24679)
Revert "[feature](function) add json->operator convert to json_extract (#19899)"
because it conflict with lambda syntax
This reverts commit f54a068d82e88e8535f3ed55a4224886b752e46b.
2023-09-20 21:16:19 +08:00
294062d519 [bug](function) fix width bucket function return wrong result (#24673) 2023-09-20 20:37:22 +08:00
2a260be10c [improvement](jdbc catalog) when lower_case_table_names of jdbc catalog properties is set to true, use the real table name to query the jdbc data source (#24520) 2023-09-20 17:47:11 +08:00
81e65f4a12 [feature](function) Support SHA family functions (#24342) 2023-09-20 17:21:45 +08:00
5eb8fe3d6e [improvement](type) modify the inner type display of the Array/Map/Struct type (#24459)
In the old code, when using desc command to view the table schema
It will display as follows
```
ARRAY<TINYINT(4)>
ARRAY<SMALLINT(6)>
ARRAY<INT(11)>
ARRAY<BIGINT(20)>
ARRAY<LARGEINT(40)>
```
However, for normal integer type displays, the width is not displayed
So, I changed it to the following
```
ARRAY<TINYINT>
ARRAY<SMALLINT>
ARRAY<INT>
ARRAY<BIGINT>
ARRAY<LARGEINT>
```
2023-09-20 17:03:03 +08:00
a946f99b8c [Fix](regression-test) fix regression-test of export parquet file format (#24450) 2023-09-20 15:41:49 +08:00
cf4143f3d1 [testcase](map) update regress test for array/map nested insert into nagetive cases (#24534) 2023-09-20 15:07:29 +08:00
7c32b2b4ed [Fix](broker load) broker load with or predicate error fix #24157
Co-authored-by: wangqingtao6 <wangqingtao6@jd.com>
2023-09-20 14:56:32 +08:00
fc12362a6d [feature-wip](arrow-flight)(step2) FE support Arrow Flight server (#24314)
This is a POC, the design documentation will be updated soon
2023-09-20 14:42:54 +08:00
a3361df7b9 [Feat](Nereids) support json and jsonb datatype (#24156)
Feature:
support jsonb and json type in nereids

Document:
this feature supports these two datatype in nereids optimizer like original planner, the sql reference is same as before
[JSON - Apache Doris](https://doris.apache.org/zh-CN/docs/dev/sql-manual/sql-reference/Data-Types/JSON)
2023-09-20 14:32:22 +08:00
e9435c14f8 [Improve](array-func)improve array union support multi params (#24327) 2023-09-20 14:29:48 +08:00
deafa2dd88 [fix](Nereids) fix row count unconsistent when join ordering (#24589)
In the context of reorder join, when a new plan is generated, it may include a project operation. In this case, the newly generated join root and the original join root will no longer be in the same group. To avoid inconsistencies in the statistics between these two groups, we keep the child group's row count unchanged when the parent group expression is a project operation.
2023-09-20 13:11:35 +08:00
c704497d02 [fix](csv_reader)Fixed bug when parsing multi-character delimiters. (#24572)
Fixed bug when parsing multi-character delimiters.
2023-09-20 12:41:35 +08:00
49f6eda843 [fix](nested_join) incorrect result of semi/anti mark join (#24616) 2023-09-20 10:41:06 +08:00
14bd290aec [feature](jsonb)support json_length and json_contains function (#24332) 2023-09-20 10:40:44 +08:00
e59aa49f28 [feature](datetime-func)support milliseconds_add/sub/diff and microseconds_diff (#24114) 2023-09-20 10:38:56 +08:00
a71d7f2beb [pipelineX](operator) support partition sort operator and distinct streaming agg operator (#24544) 2023-09-20 09:50:51 +08:00
527b284e90 [improvement](jdbc catalog) Extend conjunctExprToString to Support both 'AND' and 'OR' with Optimized DateLiteral Handling (#24537) 2023-09-19 23:11:44 +08:00
32c6f5f905 [opt](test) set longer timeout for hive query cache test case (#24569)
Sometimes the first run of query may be longer then former given threshold, which case test fail.
Also add a new session variable test_query_cache_hit

So that we can use it to test if cache is hit in regression test
2023-09-19 22:25:18 +08:00
c3bd2a22d4 [feature](Nereids) add many array functions (#24301)
Add function array_filter, array_sortby, array_last_index, array_first_index, array_orderby, array_count
2023-09-19 18:58:49 +08:00
c9f5142420 [Imporve](UNIX_TIMESTAMP) UNIX_TIMESTAMP func support 'yyyy-MM-dd HH:mm:ss' format (#24561)
UNIX_TIMESTAMP function data format parameter supports 'yyyy-MM-dd HH:mm:ss'
The implementation is the same as the date_format function
before:
```sql
mysql> select UNIX_TIMESTAMP('2023-09-18 00:00:00','yyyy-MM-dd HH:mm:ss');
+--------------------------------------------------------------+
| unix_timestamp('2023-09-18 00:00:00', 'yyyy-MM-dd HH:mm:ss') |
+--------------------------------------------------------------+
|                                                         NULL |
+--------------------------------------------------------------+
1 row in set (0.04 sec)
```
now:
```sql
mysql> select UNIX_TIMESTAMP('2023-09-18 00:00:00','yyyy-MM-dd HH:mm:ss');
+------------+
| 1694966400 |
+------------+
| 1694966400 |
+------------+
1 row in set (0.01 sec)
```
2023-09-19 18:41:59 +08:00
96f197114c [Improve](explode) improve explode func with array nested other type (#24455)
improve explode func with array nested other type
2023-09-18 16:05:30 +08:00
b4432ce577 [Feature](statistics)Support external table analyze partition (#24154)
Enable collect partition level stats for hive external table.
2023-09-18 14:59:26 +08:00
f3e350e8ec [Improvement](statistics)Improve statistics user experience (#24414)
Two improvements:
1. Move the `Job_id` column for the return info of `Analyze table` command to the first column. To keep consistent with `show analyze`.
```
mysql> analyze table hive.tpch100.region;
+--------+--------------+-------------------------+------------+--------------------------------+
| Job_Id | Catalog_Name | DB_Name                 | Table_Name | Columns                        |
+--------+--------------+-------------------------+------------+--------------------------------+
| 14403  | hive         | default_cluster:tpch100 | region     | [r_regionkey,r_comment,r_name] |
+--------+--------------+-------------------------+------------+--------------------------------+
1 row in set (0.03 sec)
```
2. Add `analyze_timeout` session variable, to control `analyze table/database with sync` timeout.
2023-09-18 13:36:41 +08:00
c56d3237e8 [opt](Nereids) remove canEliminate flag on LogicalProject (#24362)
since we have three infrastructure to ensure changing input column order
not lead to wrong result, we could remove this flag on LogicalProject to
eliminate project as mush as possible and let code clear.

1. output list in ResultSink node
2. regular children output in SetOperation node
3. producer to consumer slot id map in CteConsumer
2023-09-18 12:22:33 +08:00
7a8e3a6587 [fix](nereids) fix cte filter pushdown if the filters can be aggregated (#24489)
Current cte common filter extraction doesn't work if the filters can be aggregated, which will lead the common filter can't be pushed down inside cte. Consider the following case:
with main as (select c1 from t1) select * from (select m1.* from main m1, main m2 where m1.c1 = m2.c1) abc where c1 = 1;
The common c1=1 filter can't be pushed down.

This pr fixed the original extraction logic from set to list to make the logic works, and this will also resolve the tpcds query4/11's pattern works well also.
2023-09-18 11:26:55 +08:00
23a75d0277 [FIX](decimalv3) Fix decimalv3 with abnormal value same with mysql result (#24499)
Fix decimalv3 with abnormal value same with mysql result
2023-09-18 11:12:26 +08:00
4b5cea1ef8 [enhancement](fix)change ordinary type null value is \N,complex type null value is null (#24207) 2023-09-16 21:46:42 +08:00
88adab3114 [fix](Nereids): fix be core when array_map is not nullable (#24488)
fix be core when array_map is not nullable
2023-09-16 20:39:15 +08:00
ed8db3727c [feature](partial update) support MOW partial update for insert statement (#21597) 2023-09-16 17:11:59 +08:00