Commit Graph

3439 Commits

Author SHA1 Message Date
e175c63d41 [Fix](inverted index) Fix wrong need read data opt when enable_common_expr_pushdown is disabled #40689 (#41101)
cherry pick from #40689
2024-09-23 14:21:30 +08:00
0e5c4281dc [fix](function) fix Substring/SubReplace error result with input utf8… (#40954)
… string (#40929)
https://github.com/apache/doris/pull/40929
```

mysql [(none)]>select sub_replace("你好世界","a",1);
+-------------------------------------+
| sub_replace('你好世界', 'a', 1)     |
+-------------------------------------+
| �a�好世界                             |
+-------------------------------------+

mysql [(none)]>select SUBSTRING('中文测试',5);
+------------------------------------------+
| substring('中文测试', 5, 2147483647)     |
+------------------------------------------+
| 中文测试                                 |
+------------------------------------------+
1 row in set (0.04 sec)

now
mysql [(none)]>select sub_replace("你好世界","a",1);
+-------------------------------------+
| sub_replace('你好世界', 'a', 1)     |
+-------------------------------------+
| 你a世界                             |
+-------------------------------------+
1 row in set (0.05 sec)

mysql [(none)]>select SUBSTRING('中文测试',5);
+------------------------------------------+
| substring('中文测试', 5, 2147483647)     |
+------------------------------------------+
|                                          |
+------------------------------------------+
1 row in set (0.13 sec)
```
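
The before/after outputs above are consistent with indexing by bytes versus indexing by characters. The following is a minimal Python sketch of that difference, not the Doris implementation: the function names are hypothetical, and the 0-based `start` for `sub_replace` and the 1-based position for `substring` are assumptions inferred from the outputs shown.

```python
def sub_replace_bytes(s: str, new: str, start: int) -> str:
    """Byte-oriented variant: treats `start` as an offset into the UTF-8 bytes."""
    raw = s.encode("utf-8")
    patch = new.encode("utf-8")
    out = raw[:start] + patch + raw[start + len(patch):]
    # Cutting through a multi-byte character leaves invalid bytes behind,
    # which show up as U+FFFD replacement characters when decoded.
    return out.decode("utf-8", errors="replace")


def sub_replace_chars(s: str, new: str, start: int) -> str:
    """Character-oriented variant: treats `start` as a character index."""
    return s[:start] + new + s[start + len(new):]


def substring_chars(s: str, pos: int) -> str:
    """1-based, character-aware substring running to the end of the string."""
    return s[pos - 1:]


print(sub_replace_bytes("你好世界", "a", 1))   # garbled: the first character is split
print(sub_replace_chars("你好世界", "a", 1))   # one whole character replaced
print(repr(substring_chars("中文测试", 5)))    # position 5 is past the 4 characters
```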

2024-09-23 10:44:03 +08:00
7d64c8cbc6 [branch-2.1] Picks "[opt](autoinc) Remove some restrictions on schema change on table that has auto-increment column #40280" (#41096)
picks https://github.com/apache/doris/pull/40280
2024-09-23 09:30:15 +08:00
9dc55f90eb [opt](nereids) set lower bound for range-selectivity(2.1) (#41061)
pick #40089
2024-09-22 07:32:22 +08:00
9877a08834 [feature](function) support ngram_search function #38226 (#40893)
https://github.com/apache/doris/pull/38226 
```
mysql [test]>select ngram_search('123456789' , '12345' , 3);
+---------------------------------------+
| ngram_search('123456789', '12345', 3) |
+---------------------------------------+
|                                   0.6 |
+---------------------------------------+
1 row in set (0.01 sec)

mysql [test]>select ngram_search("abababab","babababa",2);
+-----------------------------------------+
| ngram_search('abababab', 'babababa', 2) |
+-----------------------------------------+
|                                       1 |
+-----------------------------------------+
1 row in set (0.01 sec)
```
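
Both results above are reproduced by a Dice-style coefficient over the sets of n-grams of the two strings, 2 * |common| / (|text n-grams| + |pattern n-grams|). The sketch below is an assumption inferred from those two outputs, not code taken from Doris; the function names are hypothetical.

```python
def ngrams(s: str, n: int) -> set:
    """Return the set of distinct length-n substrings of s."""
    return {s[i:i + n] for i in range(len(s) - n + 1)}


def ngram_similarity(text: str, pattern: str, n: int) -> float:
    """Dice coefficient over the two n-gram sets (0.0 if either set is empty)."""
    a, b = ngrams(text, n), ngrams(pattern, n)
    if not a or not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


print(ngram_similarity("123456789", "12345", 3))    # 7 vs 3 trigrams, 3 shared
print(ngram_similarity("abababab", "babababa", 2))  # both sets are {"ab", "ba"}
```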

doc https://github.com/apache/doris-website/pull/899

2024-09-21 20:34:44 +08:00
d5115a21b5 [pick](ShortCircuit) Conjuncts out of key columns' order should be handled (#41071)
#37900
2024-09-21 20:34:05 +08:00
c744eb87c5 [fix](regression)fix some regression test (#40928) (#41046)
bp #40928
2024-09-20 18:17:44 +08:00
1259fe2bd5 [fix](covar) Fix covar nullable on branch-2.1 (#40841)
covar should not always be nullable.

This fix on branch-2.1 makes covar behave the same as on master in the FE.
2024-09-20 17:35:27 +08:00
8e860a26a7 [fix](systable) fix unstable case for partitions table (#40553) (#41043)
bp #40553
2024-09-20 17:13:30 +08:00
6539c8fd35 [fix](decimal) throw overflow exception if result is NaN or Infinity when converting from decimal to float (#40290) (#41007)
BP #40290
2024-09-20 14:05:51 +08:00
e0fac66223 [branch-2.1](fix) fix snappy decompressor bug (#40862)
## Proposed changes
Hadoop SnappyCodec source:

https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/src/codec/SnappyCodec.cc
Example:
OriginData (the original data is divided into several large data blocks):
     large data block1 | large data block2 | large data block3 | ....
Each large data block is divided into several small data blocks.
Suppose a large data block is divided into three small blocks:
large data block1: | small block1 | small block2 | small block3 |
CompressData: <A [B1 compress(small block1)] [B2 compress(small block2)] [B3 compress(small block3)]>

A : original (uncompressed) length of the current large data block.
sizeof(A) = 4 bytes.
A = length(small block1) + length(small block2) + length(small block3)
Bx : compressed length of small data block x.
sizeof(Bx) = 4 bytes.
Bx = length(compress(small blockx))
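
The layout described above can be sketched as a small reader and writer. This is an illustration under stated assumptions, not the Doris or Hadoop code: the function names are hypothetical, the 4-byte lengths are assumed to be big-endian (the description above does not state the byte order), and `zlib` stands in for Snappy so the example runs with the standard library alone.

```python
import struct
import zlib
from typing import Callable, Iterable, Iterator


def write_large_block(small_blocks: Iterable[bytes],
                      compress: Callable[[bytes], bytes]) -> bytes:
    """Frame one large block as <A [B1 c1] [B2 c2] ...>."""
    small_blocks = list(small_blocks)
    # A: total uncompressed length of the large block (assumed big-endian).
    out = [struct.pack(">I", sum(len(b) for b in small_blocks))]
    for block in small_blocks:
        compressed = compress(block)
        out.append(struct.pack(">I", len(compressed)))  # Bx
        out.append(compressed)
    return b"".join(out)


def read_large_blocks(data: bytes,
                      decompress: Callable[[bytes], bytes]) -> Iterator[bytes]:
    """Yield the uncompressed content of each large block in `data`."""
    pos = 0
    while pos < len(data):
        (original_len,) = struct.unpack_from(">I", data, pos)  # A
        pos += 4
        parts, produced = [], 0
        # Keep reading small blocks until A uncompressed bytes are produced.
        while produced < original_len:
            (compressed_len,) = struct.unpack_from(">I", data, pos)  # Bx
            pos += 4
            chunk = decompress(data[pos:pos + compressed_len])
            pos += compressed_len
            parts.append(chunk)
            produced += len(chunk)
        if produced != original_len:
            raise ValueError("small blocks do not add up to the declared length")
        yield b"".join(parts)


# zlib is only a stand-in codec here; a real reader would use Snappy.
framed = write_large_block([b"abc", b"defg", b"hi"], zlib.compress)
print(list(read_large_blocks(framed, zlib.decompress)))
```

The reader needs the decompressor because A counts uncompressed bytes while each Bx counts compressed bytes, so the end of a large block is only known after decompressing its small blocks.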
2024-09-20 11:57:14 +08:00
64880a10d6 [branch-2.1] Picks "[Fix](partial update) Fix partial update failed when merge_type=MERGE #40730" (#40951)
picks https://github.com/apache/doris/pull/40730
2024-09-20 00:02:17 +08:00
5f583fa329 [branch-2.1][test](jdbc catalog) add oceanbase ce jdbc catalog test (#40978)
pick #34972
2024-09-19 22:11:24 +08:00
b8bc9b699c [fix](scan) Incorrect scan keys lead to wrong query results. (#40814) (#40971)
## Proposed changes
pick #40814
```
mysql [doris_14555]>select * from table_9436528_3;
+------+------+------+------+------------------------+--------------------+------+
| col1 | col2 | col3 | col5 | col4                   | col6               | col7 |
+------+------+------+------+------------------------+--------------------+------+
| -100 |    1 |  -82 |    1 | 2024-02-16 04:37:37.00 | -1299962421.904282 | NULL |
| -100 |    1 |   92 |    1 | 2024-02-16 04:37:37.00 |   23423423.0324234 | NULL |
| -100 |    0 |  -82 |    0 | 2023-11-11 10:49:43.00 |   840968969.872149 | NULL |
+------+------+------+------+------------------------+--------------------+------+
```
wrong result:
```
mysql [doris_14555]>select * from table_9436528_3 where col1 <= -100 and col2 in (true, false) and col3 = -82;
+------+------+------+------+------------------------+--------------------+------+
| col1 | col2 | col3 | col5 | col4                   | col6               | col7 |
+------+------+------+------+------------------------+--------------------+------+
| -100 |    1 |  -82 |    1 | 2024-02-16 04:37:37.00 | -1299962421.904282 | NULL |
| -100 |    1 |   92 |    1 | 2024-02-16 04:37:37.00 |   23423423.0324234 | NULL |
+------+------+------+------+------------------------+--------------------+------+
```
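
For reference, applying the predicate row by row to the three rows listed above should keep the first and third rows and drop the row with col3 = 92, which the wrong result returns instead. A small Python check, reproducing only the three filtered columns and modeling true/false as 1/0 (both simplifications are assumptions for illustration):

```python
rows = [
    {"col1": -100, "col2": 1, "col3": -82},
    {"col1": -100, "col2": 1, "col3": 92},
    {"col1": -100, "col2": 0, "col3": -82},
]


def matches(row: dict) -> bool:
    """col1 <= -100 and col2 in (true, false) and col3 = -82"""
    return row["col1"] <= -100 and row["col2"] in (1, 0) and row["col3"] == -82


expected = [row for row in rows if matches(row)]
for row in expected:
    print(row)
```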

2024-09-19 22:01:02 +08:00
8302261dd2 [Fix](nereids) set all nullable aggregate function to alwaysnullable in window expression (#40693) (#40809)
cherry-pick from master #40693
2024-09-19 15:19:06 +08:00
f483a7605c [fix](nestedtypes) fix nested type with is_exclusive (#40434)
is_exclusive in column_array/map/struct has the wrong semantics. We should
make sure its nested columns are also exclusive, so that operators such
as join behave correctly.

## Proposed changes
backport: https://github.com/apache/doris/pull/40398
2024-09-19 12:28:51 +08:00
84f0b1fbfe [feature](view) support create or replace view stmt (#40715) (#40915)
pick #40715 to branch-2.1
2024-09-19 01:10:43 +08:00
774efe78e6 [fix](regression)fix maxcompute p0 case (#40933)
2024-09-19 01:09:53 +08:00
4511d3e900 [cherry-pick](branch2.1) fix unstable case of partitions (#40886)
backport #40861
2024-09-18 09:49:24 +08:00
148f385901 [fix](tests) Fix export p2 2.1 (#40852)
bp: #40198
2024-09-15 21:38:43 +08:00
f3b1f1c19b [fix](encrypt) wrong mode arg of encrypt and decrypt function make BE crash (#40726) (#40868)
pick #40726 to branch-2.1
2024-09-15 21:31:00 +08:00
ea4d166edb [branch-2.1] Picks "[opt](partial update) Allow to only specify key columns in partial update #40736" (#40863)
picks https://github.com/apache/doris/pull/40736
2024-09-15 18:57:43 +08:00
cecd214345 [branch-2.1](Column) refactor ColumnNullable to provide flags safety (#40769) (#40848)
pick https://github.com/apache/doris/pull/40769

Co-authored-by: Jerry Hu <mrhhsg@gmail.com>
2024-09-14 16:27:43 +08:00
963415ce45 [test](case) add some test case for encrypt/decrypt functions (#40427) (#40847)
pick #40427 to branch-2.1

Co-authored-by: zhangstar333 <87313068+zhangstar333@users.noreply.github.com>
2024-09-14 16:26:39 +08:00
9a79edca84 [cherry-pick](branch-21) fix partition_topn not reset output rows after do_partition_topn_sort (#40761) (#40792)
## Proposed changes

cherry-pick from master https://github.com/apache/doris/pull/40761

2024-09-14 11:15:56 +08:00
f9b79c613a [Fix](Job)Replaying logs should not modify the original information of the job (#40474) (#40808)
…

## Proposed changes
```
        JobExecutionConfiguration jobConfig = new JobExecutionConfiguration();
        jobConfig.setExecuteType(JobExecuteType.INSTANT);
        setJobConfig(jobConfig);
```
- Replaying logs should not modify the original information of the job
- Use the new optimizer to check whether the executed statement is legal

(cherry picked from commit de90051162de7004cf171bbf4d21bd95ff9f3540)

## Proposed changes

Issue Number: #40474
2024-09-13 20:47:57 +08:00
51c8b62d1c [opt](Nereids) fix several insert into related issues (#40467) (#40755)
pick from master #40467

- http_stream TVF should always generate a one-fragment plan
- http_stream TVF plan should not check that the root is a scan node
- distinguish group_commit TVF from a normal insert statement
- index and generated slots should be based on the type-cast base slot
- agg_state can be cast from nullable to non-nullable
- colocated and bucket scan range computation should apply only to scan nodes
2024-09-13 10:19:56 +08:00
4b7b43b5ca [bugfix](hive/iceberg)align with Hive insert overwrite table functionality (#39840) (#40724)
bp #39840
2024-09-12 19:20:15 +08:00
0f8176dee0 [fix](nereids) build agg for random distributed agg table in bindRelation phase (#40181) (#40702)
pick from master #40181
2024-09-12 14:08:50 +08:00
3604d63184 [Branch 2.1] backport systable PR (#34384,#40153,#40456,#40455,#40568) (#40687)
backport
https://github.com/apache/doris/pull/40568
https://github.com/apache/doris/pull/40455
https://github.com/apache/doris/pull/40456
https://github.com/apache/doris/pull/40153
https://github.com/apache/doris/pull/34384

Test result:
2024-09-11 11:00:45.618 INFO [suite-thread-1] (SuiteContext.groovy:309)
- Recover original connection
2024-09-11 11:00:45.619 INFO [suite-thread-1] (Suite.groovy:359) -
Execute sql: REVOKE SELECT_PRIV ON
test_partitions_schema_db.duplicate_table FROM partitions_user
2024-09-11 11:00:45.625 INFO [suite-thread-1] (SuiteContext.groovy:299)
- Create new connection for user 'partitions_user'
2024-09-11 11:00:45.632 INFO [suite-thread-1] (Suite.groovy:1162) -
Execute tag: select_check_5, sql: select
TABLE_CATALOG,TABLE_SCHEMA,TABLE_NAME,PARTITION_NAME,SUBPARTITION_NAME,PARTITION_ORDINAL_POSITION,SUBPARTITION_ORDINAL_POSITION,PARTITION_METHOD,SUBPARTITION_METHOD,PARTITION_EXPRESSION,SUBPARTITION_EXPRESSION,PARTITION_DESCRIPTION,TABLE_ROWS,AVG_ROW_LENGTH,DATA_LENGTH,MAX_DATA_LENGTH,INDEX_LENGTH,DATA_FREE,CHECKSUM,PARTITION_COMMENT,NODEGROUP,TABLESPACE_NAME
from information_schema.partitions where
table_schema="test_partitions_schema_db" order by
TABLE_CATALOG,TABLE_SCHEMA,TABLE_NAME,PARTITION_NAME,SUBPARTITION_NAME,PARTITION_ORDINAL_POSITION,SUBPARTITION_ORDINAL_POSITION,PARTITION_METHOD,SUBPARTITION_METHOD,PARTITION_EXPRESSION,SUBPARTITION_EXPRESSION,PARTITION_DESCRIPTION,TABLE_ROWS,AVG_ROW_LENGTH,DATA_LENGTH,MAX_DATA_LENGTH,INDEX_LENGTH,DATA_FREE,CHECKSUM,PARTITION_COMMENT,NODEGROUP,TABLESPACE_NAME
2024-09-11 11:00:45.644 INFO [suite-thread-1] (SuiteContext.groovy:309)
- Recover original connection
2024-09-11 11:00:45.645 INFO [suite-thread-1] (ScriptContext.groovy:120)
- Run test_partitions_schema in
/root/doris/workspace/doris/regression-test/suites/query_p0/system/test_partitions_schema.groovy
succeed
2024-09-11 11:00:45.652 INFO [main] (RegressionTest.groovy:259) - Start
to run single scripts
2024-09-11 11:01:10.321 INFO [main] (RegressionTest.groovy:380) -
Success suites:

/root/doris/workspace/doris/regression-test/suites/query_p0/system/test_partitions_schema.groovy:
group=default,p0, name=test_partitions_schema
2024-09-11 11:01:10.322 INFO [main] (RegressionTest.groovy:459) - All
suites success.
 ____   _    ____ ____  _____ ____
|  _ \ / \  / ___/ ___|| ____|  _ \
| |_) / _ \ \___ \___ \|  _| | | | |
|  __/ ___ \ ___) |__) | |___| |_| |
|_| /_/   \_\____/____/|_____|____/

2024-09-11 11:01:10.322 INFO [main] (RegressionTest.groovy:410) - Test 1
suites, failed 0 suites, fatal 0 scripts, skipped 0 scripts
2024-09-11 11:01:10.322 INFO [main] (RegressionTest.groovy:119) - Test
finished


2024-09-11 11:03:00.712 INFO [suite-thread-1] (Suite.groovy:1162) -
Execute tag: select_check_5, sql: select * from
information_schema.table_options ORDER BY
TABLE_CATALOG,TABLE_SCHEMA,TABLE_NAME,TABLE_MODEL,TABLE_MODEL_KEY,DISTRIBUTE_KEY,DISTRIBUTE_TYPE,BUCKETS_NUM,PARTITION_NUM;
2024-09-11 11:03:00.729 INFO [suite-thread-1] (SuiteContext.groovy:309)
- Recover original connection
2024-09-11 11:03:00.731 INFO [suite-thread-1] (ScriptContext.groovy:120)
- Run test_table_options in
/root/doris/workspace/doris/regression-test/suites/query_p0/system/test_table_options.groovy
succeed
2024-09-11 11:03:04.817 INFO [main] (RegressionTest.groovy:259) - Start
to run single scripts
2024-09-11 11:03:28.741 INFO [main] (RegressionTest.groovy:380) -
Success suites:

/root/doris/workspace/doris/regression-test/suites/query_p0/system/test_table_options.groovy:
group=default,p0, name=test_table_options
2024-09-11 11:03:28.742 INFO [main] (RegressionTest.groovy:459) - All
suites success.
 ____   _    ____ ____  _____ ____
|  _ \ / \  / ___/ ___|| ____|  _ \
| |_) / _ \ \___ \___ \|  _| | | | |
|  __/ ___ \ ___) |__) | |___| |_| |
|_| /_/   \_\____/____/|_____|____/

2024-09-11 11:03:28.742 INFO [main] (RegressionTest.groovy:410) - Test 1
suites, failed 0 suites, fatal 0 scripts, skipped 0 scripts
2024-09-11 11:03:28.742 INFO [main] (RegressionTest.groovy:119) - Test
finished


*************************** 7. row ***************************
             PartitionId: 18035
           PartitionName: p100
          VisibleVersion: 2
      VisibleVersionTime: 2024-09-11 10:59:28
                   State: NORMAL
            PartitionKey: col_1
                   Range: [types: [INT]; keys: [83647]; ..types: [INT]; keys: [2147483647]; )
         DistributionKey: pk
                 Buckets: 10
          ReplicationNum: 1
           StorageMedium: HDD
            CooldownTime: 9999-12-31 15:59:59
     RemoteStoragePolicy: 
LastConsistencyCheckTime: NULL
                DataSize: 2.872 KB
              IsInMemory: false
       ReplicaAllocation: tag.location.default: 1
               IsMutable: true
      SyncWithBaseTables: true
            UnsyncTables: NULL
        CommittedVersion: 2
                RowCount: 4
7 rows in set (0.01 sec)

---------

Co-authored-by: Mingyu Chen <morningman.cmy@gmail.com>
2024-09-12 11:50:09 +08:00
361a59dec8 [feature](aes_encrypt) support GCM mode for aes_encrypt and aes_decrypt (#40004) (#40672)
pick #40004 to branch-2.1
2024-09-11 23:28:28 +08:00
ebe031c019 [fix](inverted index) Fix match_regexp to correctly handle empty string patterns (#40659)
https://github.com/apache/doris/pull/39503
2024-09-11 18:10:33 +08:00
8708fae420 [fix](ES Catalog)Support parse single value for array column (#40614) (#40660)
bp #40614
2024-09-11 17:26:48 +08:00
d554f600bc [branch-2.1](partition) Support use Auto and Dynamic partition at the same time (#39580) (#40649)
pick https://github.com/apache/doris/pull/39580
2024-09-11 15:35:20 +08:00
3246baa451 [branch-2.1](function) Refine crypto functions signature to fix wrong result(#40285) (#40648)
pick https://github.com/apache/doris/pull/40285
2024-09-11 15:32:19 +08:00
86647df45b [fix] (inverted index) fix error result in complex compound expr (#40630)
2024-09-11 15:27:40 +08:00
db8fb66dba [fix](mtmv)fix nested mtmv not refresh (#40433) (#40560)
pick: https://github.com/apache/doris/pull/40433
2024-09-10 11:51:41 +08:00
8eda15ae16 [opt](routine load) support routine load perceived schema change (#39412) (#40508)
pick #39412

Currently, a routine load job cannot detect changes to the table
structure. As a long-running load, it should be able to detect schema
changes on the target table.
2024-09-10 11:05:58 +08:00
f69063ea87 [Fix](Variant) use unique id to access column reader (#39841) (#40269)
#39841
#40295
2024-09-09 18:01:12 +08:00
8f37eccbf2 [Cherry-pick](branch-2.1) Pick "[Feature](default value) Support bitmap_empty default value (#40364)" (#40487)
Pick #40364
2024-09-09 16:57:38 +08:00
44a7efff4f [branch-2.1] Picks "[Opt](delete) Skip newly inserted rows check in non-strict mode partial update if the row's delete sign is marked #40322" (#40383)
picks https://github.com/apache/doris/pull/40322
2024-09-09 16:32:24 +08:00
314f6ae823 [fix](ES Catalog)Fix int parse error when querying by doc_values (#40385) (#40521)
bp #40385
2024-09-09 14:29:21 +08:00
c32d9a129a [test](mtmv) SSB mv rewrite test use little data set for test performance (#40188) (#40437)
## Proposed changes

commitId: 0baa9366
pr: https://github.com/apache/doris/pull/40188
2024-09-09 11:23:47 +08:00
a67f20f073 [opt](mtmv) Support to contain select constant clause when create async materialized view (#40244) (#40435)
## Proposed changes
commitId: 518a0fc0
pr: https://github.com/apache/doris/pull/40244
2024-09-09 11:23:15 +08:00
ecb75c2e7d [fix](mtmv) Mtmv support set both immediate and starttime (#39573) (#40418)
pick: https://github.com/apache/doris/pull/39573
2024-09-09 11:13:51 +08:00
2023eab11e [Fix](ShortCircuit) consider delete sign flag when hits row (#40300) (#40408)
https://github.com/apache/doris/pull/40300
2024-09-09 10:04:05 +08:00
962c382077 [fix](jdbc catalog) Fix type recognition error when using query tvf to query doris (#40481)
pick  (#40122)

Matching Doris types by string does not work with the query TVF, so
match on fields instead.
2024-09-06 19:30:32 +08:00
8104b992d1 [fix](ES Catalog)Do not extract doc_values of field with ignore_above setting (#40314) (#40464)
bp #40314
2024-09-06 16:25:30 +08:00
cb0613e249 [fix] (inverted index) fix error result in compound query (#40425)
## Proposed changes

`select count() from table where a + b > 0 or b > 0`


![image](https://github.com/user-attachments/assets/df56bb36-660d-4b4f-8e38-4eebcaa09e51)



- When _execute_predicates_except_leafnode_of_andnode runs, the Expr
tree is traversed bottom-up. When traversal reaches the leaf node b, the
information for column b is placed into new_predicate_info.

- However, this step is skipped at an ADD node, so the GT node one level
up generates a signature equivalent to b > 0, identical to the signature
of b > 0 on the right side.

- As a result, the compound OR evaluation assumes that both GT
conditions beneath it have already been evaluated and computes this
Expr prematurely, when in fact the ADD node has not been evaluated.

- If the SQL is written as SELECT COUNT(*) FROM table WHERE b + a > 0 OR
b > 0, the result is correct, because the signature generated for this >
node is equivalent to a > 0, which differs from b > 0 on the right
side.
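
The collision described in the list above can be reproduced with a toy model. Everything in this sketch is hypothetical and simplified for illustration: the `Node` type, the `state` dictionary standing in for new_predicate_info, and the signature strings are not Doris code, and the `handle_add=True` path is only one conceivable way to avoid the collision, not necessarily what the fix does.

```python
from dataclasses import dataclass


@dataclass
class Node:
    kind: str              # "col", "lit", "add", "gt", or "or"
    value: str = ""
    children: tuple = ()


def col(name): return Node("col", name)
def lit(text): return Node("lit", text)
def add(left, right): return Node("add", children=(left, right))
def gt(left, right): return Node("gt", children=(left, right))
def either(left, right): return Node("or", children=(left, right))


def gt_signatures(root: Node, handle_add: bool) -> list:
    """Walk the tree bottom-up and collect one signature per GT node."""
    sigs = []
    state = {"column": None}  # toy stand-in for new_predicate_info

    def visit(node: Node) -> str:
        texts = [visit(child) for child in node.children]  # children first
        if node.kind == "col":
            state["column"] = node.value  # leaf column records itself
            return node.value
        if node.kind == "lit":
            return node.value
        if node.kind == "add":
            text = f"({texts[0]} + {texts[1]})"
            if handle_add:
                state["column"] = text
            # When skipped, the last leaf visited is still recorded.
            return text
        if node.kind == "gt":
            sigs.append(f"{state['column']} > {texts[1]}")
            return sigs[-1]
        if node.kind == "or":
            return " OR ".join(texts)
        raise ValueError(f"unknown node kind: {node.kind}")

    visit(root)
    return sigs


a_plus_b = either(gt(add(col("a"), col("b")), lit("0")), gt(col("b"), lit("0")))
b_plus_a = either(gt(add(col("b"), col("a")), lit("0")), gt(col("b"), lit("0")))

print(gt_signatures(a_plus_b, handle_add=False))  # both GT nodes look identical
print(gt_signatures(b_plus_a, handle_add=False))  # signatures differ by luck
print(gt_signatures(a_plus_b, handle_add=True))   # ADD recorded, no collision
```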
2024-09-06 10:27:59 +08:00
0928c9c6ed [fix](unary function) Fix wrong result of asin, acos and sqrt when processing invalid input #40267 (#40358)
cherry pick from #40267
2024-09-05 19:51:01 +08:00