Commit Graph

3174 Commits

Author SHA1 Message Date
ff6fa33021 [opt](inverted index) mow supports index optimization #(#38180)
## Proposed changes

https://github.com/apache/doris/pull/37428
https://github.com/apache/doris/pull/37429

2024-08-06 11:18:13 +08:00
bcea54147c [feature](inverted index) String type inverted index match function c… (#38872)
https://github.com/apache/doris/pull/38170
2024-08-06 09:06:05 +08:00
c7b59b38ef [fix](hist) Fix unstable result of aggregate function hist #38608 (#38893)
cherry pick from #38608
2024-08-06 08:52:03 +08:00
65154f8abe [branch-2.1] (doris-future) Support auto partition name function (#38853)
cherry-pick https://github.com/apache/doris/pull/34258 to branch-2.1
2024-08-05 16:04:24 +08:00
607c0b82a9 [opt](serde)Optimize the filling of fixed values into block columns without repeated deserialization. (#37377) (#38245) (#38810)
## Proposed changes
Pick PR #38575 and the fix for a bug in that PR: #38245
2024-08-05 09:13:08 +08:00
2653087843 [pick](array-funcs)fix array with empty arg in be behavior (#38708)
## Proposed changes
backport: https://github.com/apache/doris/pull/36845
2024-08-05 09:08:28 +08:00
1b3d4b4d31 [cherry-pick](branch-21)fix operator do_projections should use local_state intermediate_projections (#38612) (#38765)
## Proposed changes

cherry-pick from master https://github.com/apache/doris/pull/38612

2024-08-05 09:07:16 +08:00
5d02c48715 [feature](hive)Support reading renamed Parquet Hive and Orc Hive tables. (#38432) (#38809)
bp #38432 

## Proposed changes
Add the `hive_parquet_use_column_names` and `hive_orc_use_column_names`
session variables so that a table can still be read after a column has
been renamed in `Hive`.

These two session variables are modeled on `parquet_use_column_names`
and `orc_use_column_names` of the `Trino` hive connector.

By default, both session variables are true, and columns are matched by
name. When they are set to false, reading orc/parquet accesses the
columns according to their ordinal position in the Hive table definition.

For example:
```mysql
-- In Hive:
hive> create table tmp (a int , b string) stored as parquet;
hive> insert into table tmp values(1,"2");
hive> alter table tmp  change column  a new_a int;
hive> insert into table tmp values(2,"4");

-- In Doris:
mysql> set hive_parquet_use_column_names=true;
Query OK, 0 rows affected (0.00 sec)

mysql> select  * from tmp;
+-------+------+
| new_a | b    |
+-------+------+
|  NULL | 2    |
|     2 | 4    |
+-------+------+
2 rows in set (0.02 sec)

mysql> set hive_parquet_use_column_names=false;
Query OK, 0 rows affected (0.00 sec)

mysql> select  * from tmp;
+-------+------+
| new_a | b    |
+-------+------+
|     1 | 2    |
|     2 | 4    |
+-------+------+
2 rows in set (0.02 sec)
```

In Hive 3, `set parquet.column.index.access = true/false` and
`set orc.force.positional.evolution = true/false` control how the table
is read in a similar way to these two session variables. However, when a
field inside a struct column of a Parquet table is renamed, Hive and
Doris behave differently.
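
The difference between the two modes can be sketched in a few lines of Python. This is only an illustration of name-based versus position-based column resolution, not the Doris reader; `resolve_row` and its parameters are hypothetical names.

```python
def resolve_row(table_columns, file_columns, file_row, use_column_names):
    """Map one row stored in a data file onto the current table schema.

    table_columns: column names in the current table definition, in order
    file_columns:  column names recorded in the file when it was written
    file_row:      the values of one row, in file order
    """
    if use_column_names:
        # Name-based: a renamed column is not found in older files, so it reads as NULL.
        by_name = dict(zip(file_columns, file_row))
        return [by_name.get(name) for name in table_columns]
    # Position-based: the i-th table column reads the i-th file column.
    return [file_row[i] if i < len(file_row) else None
            for i in range(len(table_columns))]


# The file written before the rename still records the old name `a`.
old_file = (["a", "b"], [1, "2"])
table = ["new_a", "b"]
print(resolve_row(table, *old_file, use_column_names=True))   # [None, '2']
print(resolve_row(table, *old_file, use_column_names=False))  # [1, '2']
```

The two printed rows correspond to the first row of each result set in the example above.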
2024-08-05 09:06:49 +08:00
79b07d0b8a [fix](routine load) fix enclose and escape can not set in routine load job (#38402) (#38825)
pick (#38402)
2024-08-04 22:17:12 +08:00
8e4fad99a1 [test](routine load) add routine load case with timestamp as offset(#38567) (#38822)
pick (#38567)
2024-08-04 22:05:19 +08:00
7bdc508ac7 [Bug](fix) fix coredump case in (not null, null) except (not null, not null) case (#38756)
## Proposed changes

Issue Number: close #38612

2024-08-04 10:44:10 +08:00
64b69ed1ba [branch-2.1] Picks "[opt](merge-on-write) Skip the alignment process of some rowsets in partial update #38487" (#38682)
## Proposed changes

picks https://github.com/apache/doris/pull/38487
2024-08-02 20:05:31 +08:00
556f0fc784 [pick](json-keys) support json_keys function (#38631)
## Proposed changes
backport: https://github.com/apache/doris/pull/36411
2024-08-02 19:10:00 +08:00
9b07cd2069 [pick](json-serde)pick jsonb string deserialize with spec char (#38711)
## Proposed changes
backport: https://github.com/apache/doris/pull/37176
2024-08-02 13:37:41 +08:00
7bcda89881 [pick](case) fix one_nested_types cases (#38723)
## Proposed changes
backport: https://github.com/apache/doris/pull/38410
2024-08-02 12:07:14 +08:00
f21d7e3833 [test](inverted index)Add cases for inverted index format v2 (#38132)(#38443) (#38222)
## Proposed changes

backport #38132 #38443
2024-08-02 12:04:26 +08:00
84d9b2fcf4 [pick](nestedtypes) support nested type with agg replace_if_not_null (#38719)
## Proposed changes
backport: https://github.com/apache/doris/pull/38304
2024-08-02 11:18:33 +08:00
0da388ade5 [fix](inverted index) fix match_phrase_ edge query result error #38327 (#38740)
2024-08-01 23:17:53 +08:00
4d980b8235 [feature](http action)Add http action to show nested inverted index file (#38272) (#38672)
backport #38272
2024-08-01 19:30:59 +08:00
28998300d4 [Bug](fix) fix ubsan use int32_t pointer access bool value (#38621)
## Proposed changes

Issue Number: close #38617

2024-08-01 13:52:12 +08:00
057ee1905f [bugfix](hudi)add timetravel for nereids for 2.1 (#38324) (#38582)
## Proposed changes

bp #38324
2024-08-01 11:37:57 +08:00
338fa32303 [pick](simdjson) fix simdjson with object array when jsonroot is not empty (#38633)
## Proposed changes
backport: https://github.com/apache/doris/pull/38490
2024-08-01 11:04:54 +08:00
41fa7bc9fd [bugfix](paimon)Fixed the reading of timestamp with time zone type data for 2.1 (#37716) (#38592)
bp: #37716
2024-08-01 10:23:06 +08:00
184b8cbbe4 [pick](json)fix jsonb deserialize (#38630)
## Proposed changes
backport: https://github.com/apache/doris/pull/37251
2024-08-01 10:18:27 +08:00
6bd93b119f [pick](cast)Feature cast complexttype2 json (#38632)
## Proposed changes
backport: https://github.com/apache/doris/pull/36548
2024-08-01 09:18:15 +08:00
4c330e3fc6 [Fix](test) fix pull up literal predicate regression (#38564)
2024-07-31 22:59:08 +08:00
b21b906306 [Fix](outfile) FE check the hdfs URI of outfile (#38602)
bp: #38203

1. Previously, if the root path of the HDFS URI started with two
slashes, the outfile was exported without errors, but the exported path
was not the expected one.
FE now removes the repeated '/' characters from the path specified by
the user.

2. Move the test case for outfile HDFS from p2 to p0.
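
As a rough illustration of the first point, repeated slashes in the path part of a URI can be collapsed as below. This is a Python sketch under assumptions, not the FE code (which is Java and is not shown here); `collapse_slashes` and the sample host name are hypothetical.

```python
import re
from urllib.parse import urlsplit, urlunsplit


def collapse_slashes(uri: str) -> str:
    """Collapse runs of '/' in the path of a URI, leaving 'scheme://host' intact."""
    parts = urlsplit(uri)
    # Only the path component is rewritten; the '//' after the scheme is untouched.
    clean_path = re.sub(r"/{2,}", "/", parts.path)
    return urlunsplit(parts._replace(path=clean_path))


print(collapse_slashes("hdfs://namenode:8020//user/doris//out_"))
# hdfs://namenode:8020/user/doris/out_
```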
2024-07-31 22:46:37 +08:00
ef8a1918c3 [case][fix](iceberg)move rest cases from p2 to p0 and fix iceberg version issue for 2.1 (#37898) (#38589)
bp: #37898
2024-07-31 22:41:56 +08:00
96413e679d [branch-2.1](mtmv) Support read sync materialized view in async materialized view (#38462)
## Proposed changes

pick #37396


---------

Co-authored-by: liutang123 <liulijia@gmail.com>
2024-07-31 22:32:28 +08:00
66ebf709ba [Fix](inverted index) fix fast execute for not_in expr #37745 (#38594)
cherry pick from #37745
2024-07-31 19:58:12 +08:00
7730aa2170 [Fix](inverted index) fix wrong no need read data when same column in inverted index and like function #36687 (#38581)
cherry pick from #36687
2024-07-31 19:41:39 +08:00
7357d7bd3b [Update](inverted index) Add column name to debug point for "no need to read data" optimization #37649 (#38579)
cherry pick from #37649
2024-07-31 19:17:46 +08:00
aa9bdd76d0 [Pick](Variant) pick some fix #38413 #38364 (#38512)
2024-07-31 11:03:31 +08:00
9d8b2e85ae [fix](partial-update) insert only without auto_inc column should not use partial update (#38229) (#38504)
cherry-pick #38229 to branch-2.1


2024-07-31 11:01:08 +08:00
9e696d72f1 [fix](nereids)check functionBuilders is not null before using it (#38535)
## Proposed changes

pick from master https://github.com/apache/doris/pull/38457

2024-07-31 11:00:39 +08:00
94111da2a9 [Fix](nereids) fix normalize repeat alias rewrite (#38166) (#38454)
cherry-pick #38166 to branch-2.1
2024-07-31 10:59:15 +08:00
017dad8c54 [fix](type)support runtime predicate for time type (#38258) (#38465)
## Proposed changes
https://github.com/apache/doris/pull/38258
2024-07-31 10:27:36 +08:00
742b98185c [test](Nereids) Add all hint tpcds tests (#38081) (#38368)
cherry-pick:#38081
Add all TPC-DS cases controlled by the leading hint.
Some cases are not supported, for these reasons:
scalar/in/exists subquery: 6, 9, 28, 35, 41, 45, 58, 69, 95
set operation to joins: 14, 23, 49, 56, 60, 66, 74, 75, 76, 77, 80, 87

2024-07-31 10:24:10 +08:00
43ec98a30b [Feat](nereids) add pull up literal when infer predicates (#37314) (#38156)
cherry-pick from master #37314
2024-07-30 17:19:18 +08:00
2f6b2dbdc4 [opt](Nereids) add where Null rule to create empty relation as where false (#38135) (#38361)
pick from master #38135 

`explain shape plan select * from table2 where Null;`
`explain shape plan select * from table2 where false;`

In this case, the null literal can be regarded as equivalent to the
false literal.
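
The equivalence follows from how a WHERE filter treats its predicate: a row is kept only when the predicate evaluates to TRUE, so NULL and FALSE both discard it. A minimal Python sketch of that rule, with `None` standing in for SQL NULL (`where` is a hypothetical helper, not Nereids code):

```python
def where(rows, predicate):
    """Keep a row only if the predicate is exactly True.

    Both False and None (standing in for SQL NULL) drop the row, which is
    why a constant NULL predicate yields the same empty result as FALSE.
    """
    return [row for row in rows if predicate(row) is True]


rows = [{"k": 1}, {"k": 2}]
print(where(rows, lambda row: None))   # []
print(where(rows, lambda row: False))  # []
```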
2024-07-26 14:50:06 +08:00
e2bb86e7f8 [fix](inverted index) fixed in_list condition not indexed on pipelinex (#38178)
## Proposed changes

https://github.com/apache/doris/pull/36565
https://github.com/apache/doris/pull/37842
https://github.com/apache/doris/pull/37921
https://github.com/apache/doris/pull/37386

2024-07-25 14:42:34 +08:00
73fc55b203 [Pick](Variant) fix some issue by RQG (#38336)
#38318 
#38291
2024-07-25 12:19:07 +08:00
70cde39fe0 [cherry-pick](branch-21) fix conv function get wrong result as parse overflow (#38001) (#38309)
## Proposed changes

cherry-pick from https://github.com/apache/doris/pull/38001

2024-07-25 12:06:46 +08:00
e9052e2180 [cherry-pick](branch-21) fix mod function cause core dump (#37999) (#38308)
## Proposed changes
cherry-pick from master https://github.com/apache/doris/pull/37999

2024-07-25 12:06:21 +08:00
57864e8554 [cherry-pick](branch-21) fix collect_set function core dump without arena pool (#38234) (#38307)
## Proposed changes

cherry-pick from master #38234

2024-07-25 12:05:52 +08:00
bc7fc4106d [branch-2.1](function) fix FE impl of some time functions (#37746) (#38316)
pick https://github.com/apache/doris/pull/37746

before:
```sql
mysql> select date_ceil("2020-12-12 12:12:12.123", interval 2 second);
+-----------------------+
| '2020-12-12 12:12:12' |
+-----------------------+
| 2020-12-12 12:12:12   |
+-----------------------+
1 row in set (0.10 sec)

mysql> select CONVERT_TZ('9999-12-31 23:59:59.999999', 'Pacific/Galapagos', 'Pacific/Galapagos');
+------+
| NULL |
+------+
| NULL |
+------+
1 row in set (0.09 sec)

mysql [(none)]>select CONVERT_TZ('9999-12-31 23:59:59.999999', 'Pacific/Galapagos', 'Pacific/GalapaGoS');
+-----------------------------------------------------------------------------------------------------------+
| convert_tz(cast('9999-12-31 23:59:59.999999' as DATETIMEV2(6)), 'Pacific/Galapagos', 'Pacific/GalapaGoS') |
+-----------------------------------------------------------------------------------------------------------+
| 9999-12-31 23:59:59.999999                                                                                |
+-----------------------------------------------------------------------------------------------------------+
1 row in set (0.08 sec) --- gone to BE
```
after:
```sql
mysql> select date_ceil("2020-12-12 12:12:12.123", interval 2 second);
+------------------------------+
| '2020-12-12 12:12:14.000000' |
+------------------------------+
| 2020-12-12 12:12:14          |
+------------------------------+
1 row in set (0.11 sec)

mysql> select CONVERT_TZ('9999-12-31 23:59:59.999999', 'Pacific/Galapagos', 'Pacific/Galapagos');
+-----------------------------------------------------------------------------------------------------------+
| convert_tz(cast('9999-12-31 23:59:59.999999' as DATETIMEV2(6)), 'Pacific/Galapagos', 'Pacific/Galapagos') |
+-----------------------------------------------------------------------------------------------------------+
| 9999-12-31 23:59:59.999999                                                                                |
+-----------------------------------------------------------------------------------------------------------+
1 row in set (0.23 sec)

mysql> select CONVERT_TZ('9999-12-31 23:59:59.999999', 'Pacific/Galapagos', 'Pacific/GalapaGoS');
+------------------------------+
| '9999-12-31 23:59:59.999999' |
+------------------------------+
| 9999-12-31 23:59:59.999999   |
+------------------------------+
1 row in set (0.11 sec) --- finished in FE
```
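
The corrected `date_ceil` result is what rounding up to the next multiple of the interval gives. A minimal Python sketch, assuming periods are counted from `0001-01-01 00:00:00` (an assumption made here, not something the commit states); `date_ceil_seconds` is a hypothetical helper, not the FE implementation.

```python
from datetime import datetime, timedelta

# Assumed origin from which the periods are counted.
ORIGIN = datetime(1, 1, 1)


def date_ceil_seconds(ts: datetime, period: int) -> datetime:
    """Round ts up to the next boundary of `period` seconds since ORIGIN."""
    step = timedelta(seconds=period)
    remainder = (ts - ORIGIN) % step  # how far ts is past the previous boundary
    if remainder == timedelta(0):
        return ts  # already on a boundary
    return ts - remainder + step


print(date_ceil_seconds(datetime(2020, 12, 12, 12, 12, 12, 123000), 2))
# 2020-12-12 12:12:14
```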
2024-07-25 11:38:27 +08:00
21b3fc3d1e [branch-2.1](function) fix coredump for MULTI_MATCH_ANY (#37959) (#38314)
pick https://github.com/apache/doris/pull/37959

[INVALID_ARGUMENT][E33] Compile regexp expression failed. got Embedded
start anchors not supported.. some expressions may be illegal
2024-07-25 11:34:22 +08:00
79a6496bb6 [branch-2.1](function) fix wrong result when convert_tz is out of bound (#37358) (#38313)
## Proposed changes

pick https://github.com/apache/doris/pull/37358

before:
```sql
mysql> select CONVERT_TZ(cast('0000-01-01 00:00:00.00001'  as DATETIMEV1), cast('Asia/Shanghai' as VARCHAR(65533)), cast('America/Los_Angeles' as VARCHAR(65533)));
+---------------------------------------------------------------------------------------------------------------------------------------------------+
| convert_tz(cast('0000-01-01 00:00:00.00001' as DATETIME), cast('Asia/Shanghai' as VARCHAR(65533)), cast('America/Los_Angeles' as VARCHAR(65533))) |
+---------------------------------------------------------------------------------------------------------------------------------------------------+
| q535-12-31 08:01:19                                                                                                                               |
+---------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.12 sec)
```
now:
```sql
mysql> select CONVERT_TZ(cast('0000-01-01 00:00:00.00001'  as DATETIMEV1), cast('Asia/Shanghai' as VARCHAR(65533)), cast('America/Los_Angeles' as VARCHAR(65533)));
+---------------------------------------------------------------------------------------------------------------------------------------------------+
| convert_tz(cast('0000-01-01 00:00:00.00001' as DATETIME), cast('Asia/Shanghai' as VARCHAR(65533)), cast('America/Los_Angeles' as VARCHAR(65533))) |
+---------------------------------------------------------------------------------------------------------------------------------------------------+
| NULL                                                                                                                                              |
+---------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.09 sec)
```
2024-07-25 11:32:44 +08:00
8ad4390edb [fix](nereids) refine row count estimation for mark join (#38270) (#38297)
pick from master #38270
2024-07-25 10:19:13 +08:00
e23c1339a8 [fix](group commit) Fix prepare stmt setNull return too many filtered rows error (#38262) (#38276)
Pick https://github.com/apache/doris/pull/38262
2024-07-24 19:02:59 +08:00