Commit Graph

1334 Commits

Author SHA1 Message Date
443e87e203 2.1.7-rc03 (#43338)
Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-11-06 15:21:23 +08:00
0efca04183 2.1.7-rc02 (#43285)
### What problem does this PR solve?
<!--
You need to clearly describe your PR in this part:

1. What problem was fixed (it's best to include specific error reporting
information). How it was fixed.
2. Which behaviors were modified. What was the previous behavior, what
is it now, why was it modified, and what possible impacts might there
be.
3. What features were added. Why this function was added.
4. Which codes were refactored and why this part of the code was
refactored.
5. Which functions were optimized and what is the difference before and
after the optimization.

The description of the PR needs to enable reviewers to quickly and
clearly understand the logic of the code modification.
-->

<!--
If there are related issues, please fill in the issue number.
- If you want the issue to be closed after the PR is merged, please use
"close #12345". Otherwise, use "ref #12345"
-->
Issue Number: close #xxx

<!--
If this PR is followup a preivous PR, for example, fix the bug that
introduced by a related PR,
link the PR here
-->
Related PR: #xxx

Problem Summary:

### Check List (For Committer)

- Test <!-- At least one of them must be included. -->

    - [ ] Regression test
    - [ ] Unit Test
    - [ ] Manual test (add detailed scripts or steps below)
    - [ ] No need to test or manual test. Explain why:
- [ ] This is a refactor/code format and no logic has been changed.
        - [ ] Previous test can cover this change.
        - [ ] No colde files have been changed.
        - [ ] Other reason <!-- Add your reason?  -->

- Behavior changed:

    - [ ] No.
    - [ ] Yes. <!-- Explain the behavior change -->

- Does this need documentation?

    - [ ] No.
- [ ] Yes. <!-- Add document PR link here. eg:
https://github.com/apache/doris-website/pull/1214 -->

- Release note

    <!-- bugfix, feat, behavior changed need a release note -->
    <!-- Add one line release note for this PR. -->
    None

### Check List (For Reviewer who merge this PR)

- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label <!-- Add branch pick label that this PR
should merge into -->

Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-11-05 20:17:25 +08:00
6c3d42e09a [cherry-pick](branch-21) cherry-pick pr about (#42488) (#42099) (#42055) (#42916)
## Proposed changes

Issue Number: close #xxx

<!--Describe your changes.-->
2024-10-31 14:14:19 +08:00
4a62d9e44b Revert "[2.1][improvement](jdbc catalog) Add catalog property to enable jdbc connection pool" (#42481)
Reverts apache/doris#42255

We have found that after closing the connection pool, there will be
class loading problems and connection release problems for some data
sources. We will remove this function first and re-add it after solving
and testing it completely.
2024-10-25 19:37:36 +08:00
bde8e2d474 [2.1][improvement](jdbc catalog) Add catalog property to enable jdbc connection pool (#42255)
pick (#41992)

We initially introduced jdbc connection pool to improve the connection
performance of jdbc catalog, but we always found that connection pool
would bring some unexpected errors, so we chose to add a catalog
property: `enable_connection_pool` to choose whether to enable the jdbc
connection pool of jdbc catalog, and the default false.However, the
created catalog will still open the connection pool when it is upgraded,
and only the newly created catalog will be false

And we conducted performance tests on this, the performance loss is
within the expected range.

- Enable connection pool: mysqlslap -uroot -h127.0.0.1 -P9030
--concurrency=1 --iterations=100 --query='SELECT * FROM mysql.test.test
limit 1;' --create-schema=mysql --delimiter=";" --verbose
Benchmark
        Average number of seconds to run all queries: 0.008 seconds
        Minimum number of seconds to run all queries: 0.004 seconds
        Maximum number of seconds to run all queries: 0.133 seconds
        Number of clients running queries: 1
        Average number of queries per client: 1

- Disable connection pool: mysqlslap -uroot -h127.0.0.1 -P9030
--concurrency=1 --iterations=100 --query='SELECT * FROM
mysql_no_pool.test.test limit 1;' --create-schema=mysql --delimiter=";"
--verbose
Benchmark
        Average number of seconds to run all queries: 0.054 seconds
        Minimum number of seconds to run all queries: 0.047 seconds
        Maximum number of seconds to run all queries: 0.184 seconds
        Number of clients running queries: 1
        Average number of queries per client: 1
2024-10-22 23:28:28 +08:00
bbd4970ed8 [feature](jdbc catalog) support gbase jdbc catalog #41027 #41587 (#42123)
cherry pick from #41027 #41587

---------

Co-authored-by: zy-kkk <zhongyk10@gmail.com>
2024-10-21 16:52:23 +08:00
968e33f07e [cherry-pick](branch-21) pick (#39057) (#41352) (#41958)
## Proposed changes

pick from master (#39057) (#41352)

<!--Describe your changes.-->

---------

Co-authored-by: Mryange <59914473+Mryange@users.noreply.github.com>
2024-10-17 14:30:40 +08:00
e6545a36a3 [improvement](iceberg)Parallelize splits for count(*) for 2.1 (#41169) (#41880)
bp: #41169
2024-10-16 10:52:06 +08:00
d97642e9b5 [cherry-pick](branch-21) fix tablet sink shuffle without project not match the output tuple (#40299)(#41293) (#41327)
## Proposed changes

cherry-pick from master  (#40299)(#41293)

<!--Describe your changes.-->
2024-10-15 00:12:23 +08:00
4888c632f4 [cherry-pick](branch2.1) support escape.delim and serialization.null.format for hive text (#41684)
## Proposed changes
pick from master:
https://github.com/apache/doris/pull/40291
2024-10-15 00:08:23 +08:00
f112af0fd2 [pick](branch-2.1) pick #41555 #41592 #38204 (#41781)
pick #41555 #41592 #38204
2024-10-14 14:05:08 +08:00
4ac07fe918 [Feature](json) Support json_search function in 2.1 (#41590)
cherry-pick #40948 

Like mysql, json_search returns the path which point to a json string
witch match the pattern.
`SELECT JSON_SEARCH('["A",[{"B":"1"}],{"C":"AB"},{"D":"BC"}]', 'one',
'A_') as res;`
```
+----------+
| res      |
+----------+
| "$[2].C" |
+----------+
```

Co-authored-by: liutang123 <liulijia@gmail.com>
2024-10-11 16:33:07 +08:00
8c0f73cb90 [Enhancement](MaxCompute)Refactoring maxCompute catalog using Storage API.(#40225 , #40888 ,#41386 ) (#41610)
bp #40225 , #40888 ,#41386

## Proposed changes
Among them, #40225 is the new api of mc,
#40888 is used to fix the bug when reading null between the new and old
apis,
#41386 is used for compatibility between the new and old versions
2024-10-11 11:55:41 +08:00
3120bfb6e3 [fix](pipelinex) fix fragment instance progress reports (part 2) (#40694) (#41641)
backport #40694
2024-10-10 17:49:41 +08:00
0b4552f74b [cherry-pick](branch-2.1) pick hive text write from master (#40537)
## Proposed changes
pick prs:
https://github.com/apache/doris/pull/38549
https://github.com/apache/doris/pull/40183
https://github.com/apache/doris/pull/40315

---------

Co-authored-by: Calvin Kirs <kirs@apache.org>
2024-09-27 20:57:07 +08:00
eb13cd4154 [branch-2.1] Picks "[Fix](partial update) Fix __DORIS_SEQUENCE_COL__ is not set for newly inserted rows in partial update #40272" (#40964)
picks https://github.com/apache/doris/pull/40272
2024-09-26 22:54:27 +08:00
1459517568 [improvement](binlog) filter dropped indexes #41246 (#41300)
cherry pick from #41246
2024-09-26 08:38:28 +08:00
5b3b2cec80 [feat](metatable) support table$partitions for hive table (#40774) (#41230)
bp #40774
and pick part of #34552, add `isPartitionedTable()` interface in `TableIf`
2024-09-25 09:52:07 +08:00
0d38a9a36d [feature](restore) support atomic restore (#41107)
Cherry-pick #40353, #40734, #40817, #40876, #40921, #41017, #41083
2024-09-24 09:41:41 +08:00
549bc3e288 [fix](pipelinex) fix fragment instance progress reports (#40325) (#40987)
backport #40325
2024-09-19 23:58:38 +08:00
55e92da7e7 2.1.7-rc01 (#40851)
## Proposed changes

Issue Number: close #xxx

<!--Describe your changes.-->

Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-09-14 13:58:53 +08:00
0b1d517caa [improvement](statistics)Return -1 to neredis if report olap table row count for new table is not done for all tablets. (#40457) (#40838)
backport: https://github.com/apache/doris/pull/40457
2024-09-14 13:19:35 +08:00
e3db3a2a49 [minor](thrift) Change field ID (#40845)
## Proposed changes

Issue Number: close #xxx

<!--Describe your changes.-->
2024-09-14 11:08:11 +08:00
3395cd5ce9 [PipelineX](improvement) Prepare tasks in parallel (#40270)
## Proposed changes

Issue Number: close #xxx

<!--Describe your changes.-->
2024-09-13 13:34:29 +08:00
3604d63184 [Branch 2.1] backport systable PR (#34384,#40153,#40456,#40455,#40568) (#40687)
backport
https://github.com/apache/doris/pull/40568
https://github.com/apache/doris/pull/40455
https://github.com/apache/doris/pull/40456
https://github.com/apache/doris/pull/40153
https://github.com/apache/doris/pull/34384

Test result:
2024-09-11 11:00:45.618 INFO [suite-thread-1] (SuiteContext.groovy:309)
- Recover original connection
2024-09-11 11:00:45.619 INFO [suite-thread-1] (Suite.groovy:359) -
Execute sql: REVOKE SELECT_PRIV ON
test_partitions_schema_db.duplicate_table FROM partitions_user
2024-09-11 11:00:45.625 INFO [suite-thread-1] (SuiteContext.groovy:299)
- Create new connection for user 'partitions_user'
2024-09-11 11:00:45.632 INFO [suite-thread-1] (Suite.groovy:1162) -
Execute tag: select_check_5, sql: select
TABLE_CATALOG,TABLE_SCHEMA,TABLE_NAME,PARTITION_NAME,SUBPARTITION_NAME,PARTITION_ORDINAL_POSITION,SUBPARTITION_ORDINAL_POSITION,PARTITION_METHOD,SUBPARTITION_METHOD,PARTITION_EXPRESSION,SUBPARTITION_EXPRESSION,PARTITION_DESCRIPTION,TABLE_ROWS,AVG_ROW_LENGTH,DATA_LENGTH,MAX_DATA_LENGTH,INDEX_LENGTH,DATA_FREE,CHECKSUM,PARTITION_COMMENT,NODEGROUP,TABLESPACE_NAME
from information_schema.partitions where
table_schema="test_partitions_schema_db" order by
TABLE_CATALOG,TABLE_SCHEMA,TABLE_NAME,PARTITION_NAME,SUBPARTITION_NAME,PARTITION_ORDINAL_POSITION,SUBPARTITION_ORDINAL_POSITION,PARTITION_METHOD,SUBPARTITION_METHOD,PARTITION_EXPRESSION,SUBPARTITION_EXPRESSION,PARTITION_DESCRIPTION,TABLE_ROWS,AVG_ROW_LENGTH,DATA_LENGTH,MAX_DATA_LENGTH,INDEX_LENGTH,DATA_FREE,CHECKSUM,PARTITION_COMMENT,NODEGROUP,TABLESPACE_NAME
2024-09-11 11:00:45.644 INFO [suite-thread-1] (SuiteContext.groovy:309)
- Recover original connection
2024-09-11 11:00:45.645 INFO [suite-thread-1] (ScriptContext.groovy:120)
- Run test_partitions_schema in
/root/doris/workspace/doris/regression-test/suites/query_p0/system/test_partitions_schema.groovy
succeed
2024-09-11 11:00:45.652 INFO [main] (RegressionTest.groovy:259) - Start
to run single scripts
2024-09-11 11:01:10.321 INFO [main] (RegressionTest.groovy:380) -
Success suites:

/root/doris/workspace/doris/regression-test/suites/query_p0/system/test_partitions_schema.groovy:
group=default,p0, name=test_partitions_schema
2024-09-11 11:01:10.322 INFO [main] (RegressionTest.groovy:459) - All
suites success.
 ____   _    ____ ____  _____ ____
|  _ \ / \  / ___/ ___|| ____|  _ \
| |_) / _ \ \___ \___ \|  _| | | | |
|  __/ ___ \ ___) |__) | |___| |_| |
|_| /_/   \_\____/____/|_____|____/

2024-09-11 11:01:10.322 INFO [main] (RegressionTest.groovy:410) - Test 1
suites, failed 0 suites, fatal 0 scripts, skipped 0 scripts
2024-09-11 11:01:10.322 INFO [main] (RegressionTest.groovy:119) - Test
finished


2024-09-11 11:03:00.712 INFO [suite-thread-1] (Suite.groovy:1162) -
Execute tag: select_check_5, sql: select * from
information_schema.table_options ORDER BY
TABLE_CATALOG,TABLE_SCHEMA,TABLE_NAME,TABLE_MODEL,TABLE_MODEL_KEY,DISTRIBUTE_KEY,DISTRIBUTE_TYPE,BUCKETS_NUM,PARTITION_NUM;
2024-09-11 11:03:00.729 INFO [suite-thread-1] (SuiteContext.groovy:309)
- Recover original connection
2024-09-11 11:03:00.731 INFO [suite-thread-1] (ScriptContext.groovy:120)
- Run test_table_options in
/root/doris/workspace/doris/regression-test/suites/query_p0/system/test_table_options.groovy
succeed
2024-09-11 11:03:04.817 INFO [main] (RegressionTest.groovy:259) - Start
to run single scripts
2024-09-11 11:03:28.741 INFO [main] (RegressionTest.groovy:380) -
Success suites:

/root/doris/workspace/doris/regression-test/suites/query_p0/system/test_table_options.groovy:
group=default,p0, name=test_table_options
2024-09-11 11:03:28.742 INFO [main] (RegressionTest.groovy:459) - All
suites success.
 ____   _    ____ ____  _____ ____
|  _ \ / \  / ___/ ___|| ____|  _ \
| |_) / _ \ \___ \___ \|  _| | | | |
|  __/ ___ \ ___) |__) | |___| |_| |
|_| /_/   \_\____/____/|_____|____/

2024-09-11 11:03:28.742 INFO [main] (RegressionTest.groovy:410) - Test 1
suites, failed 0 suites, fatal 0 scripts, skipped 0 scripts
2024-09-11 11:03:28.742 INFO [main] (RegressionTest.groovy:119) - Test
finished


*************************** 7. row ***************************
             PartitionId: 18035
           PartitionName: p100
          VisibleVersion: 2
      VisibleVersionTime: 2024-09-11 10:59:28
                   State: NORMAL
            PartitionKey: col_1
Range: [types: [INT]; keys: [83647]; ..types: [INT]; keys: [2147483647];
)
         DistributionKey: pk
                 Buckets: 10
          ReplicationNum: 1
           StorageMedium: HDD
            CooldownTime: 9999-12-31 15:59:59
     RemoteStoragePolicy: 
LastConsistencyCheckTime: NULL
                DataSize: 2.872 KB
              IsInMemory: false
       ReplicaAllocation: tag.location.default: 1
               IsMutable: true
      SyncWithBaseTables: true
            UnsyncTables: NULL
        CommittedVersion: 2
                RowCount: 4
7 rows in set (0.01 sec)

---------

Co-authored-by: Mingyu Chen <morningman.cmy@gmail.com>
2024-09-12 11:50:09 +08:00
361a59dec8 [feature](aes_encrypt) support GCM mode for aes_encrypt and aes_decrypt (#40004) (#40672)
pick #40004 to branch-2.1
2024-09-11 23:28:28 +08:00
4e453dc1bb Revert "[improvement](statistics)Return -1 to neredis if report olap table row count for new table is not done for all tablets. (#40457)" (#40616)
Reverts apache/doris#40540
2024-09-10 17:17:13 +08:00
e43e6e2bba [improvement](statistics)Return -1 to neredis if report olap table row count for new table is not done for all tablets. (#40457) (#40540)
backport: https://github.com/apache/doris/pull/40457
2024-09-10 12:55:53 +08:00
a3eba2aad5 [fix](auth) ordinary users can see the processes of other users (#39747) (#40415)
pick: https://github.com/apache/doris/pull/39747
2024-09-09 11:13:18 +08:00
a963709fed [opt](scanner) Control the degree of parallelism of scanner when only limit involved #39927 (#40357)
cherry pick from #39927
2024-09-09 10:42:19 +08:00
653e315ba5 2.1.6-rc04 (#40491)
## Proposed changes

Issue Number: close #xxx

<!--Describe your changes.-->

Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-09-06 19:41:37 +08:00
b4beec8ea8 [fix](OrcWriter) fix be core when upgrading BE without upgrading FE (#40303)
bp: #40282
2024-09-04 10:24:41 +08:00
92752b90e7 [feature](metacache) add system table catalog_meta_cache_statistics #40155 (#40210)
bp #40155
2024-09-02 23:23:35 +08:00
70daa1f85d [opt](inverted index) Controls whether the in_list can execute fast_execute. (#40141)
https://github.com/apache/doris/pull/40022
2024-08-30 10:32:43 +08:00
ca07a00c93 Revert "[branch-2.1](hive) support hive write text table (#38549) (#4… (#40157)
…0063)"

This reverts commit c6df7c21a3c09ae1664deabacb88dfcea9d94b68.

## Proposed changes

Issue Number: close #xxx

<!--Describe your changes.-->

Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-08-30 10:25:38 +08:00
c6df7c21a3 [branch-2.1](hive) support hive write text table (#38549) (#40063)
1. Support write hive text table
2. Add SessionVariable `hive_text_compression` to write compressed hive
text table
3. Supported compression type: gzip, bzip2, snappy, lz4, zstd

pick from https://github.com/apache/doris/pull/38549
2024-08-29 16:50:40 +08:00
173aafc86f [Enhancement] add information_schema.table_properties #38745 (#38746) (#39886)
bp #38746

---------

Co-authored-by: Vallish Pai <vallishpai@gmail.com>
2024-08-27 17:22:19 +08:00
8dc3f3f347 [fix](inverted index) Fix Session Variable Compatibility (#39939) 2024-08-27 10:17:47 +08:00
d563621f6e [enhancement](thrift) add value number to thrift definition for enum type (#39880) (#39896)
Issue Number: close #xxx
 cherry pick : https://github.com/apache/doris/pull/39880

---------

## Proposed changes

Issue Number: close #xxx

<!--Describe your changes.-->

Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-08-26 08:07:57 +08:00
44b80fb03b [fix](inverted index) Fix Session Variable Compatibility (#39884)
https://github.com/apache/doris/pull/39889
2024-08-25 08:42:36 +08:00
e0534c9bfc [bugfix](thrift) the definition number should consistent with master branch (#39879)
## Proposed changes
introduced by pr https://github.com/apache/doris/pull/35103

<!--Describe your changes.-->

Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-08-25 00:22:19 +08:00
2dea859bdb [debug](rpc) debug rpc time consumption problem (#39852)
## Proposed changes

Issue Number: close #xxx

Add detail RPC time info for each channel, sorted by max rpc time of
channels:
```
                     DATA_STREAM_SINK_OPERATOR  (id=1,dst_id=1):
                          -  Partitioner:  Crc32HashPartitioner(64)
                          -  BlocksProduced:  74
                          -  BrpcSendTime:  2.689us
                          -  BrpcSendTime.Wait:  0ns
                          -  BytesSent:  89.35  KB
                          -  CloseTime:  680.152us
                          -  CompressTime:  0ns
                          -  ExecTime:  160.663ms
                          -  InitTime:  263.608us
                          -  InputRows:  32.512K  (32512)
                          -  LocalBytesSent:  0.00  
                          -  LocalSendTime:  0ns
                          -  LocalSentRows:  0
                          -  MemoryUsage:  
                              -  PeakMemoryUsage:  80.00  KB
                          -  MergeBlockTime:  0ns
                          -  OpenTime:  4.113ms
                          -  OverallThroughput:  0.0  /sec
                          -  PendingFinishDependency:  41.179ms
                          -  RowsProduced:  32.512K  (32512)
                          -  RpcAvgTime:  11.850ms
                          -  RpcCount:  10
                          -  RpcMaxTime:  86.891ms
                          -  RpcMinTime:  15.200ms
                          -  RpcSumTime:  118.503ms
                          -  SerializeBatchTime:  13.517ms
                          -  SplitBlockDistributeByChannelTime:  38.923ms
                          -  SplitBlockHashComputeTime:  2.659ms
                          -  UncompressedRowBatchSize:  135.19  KB
                          -  WaitForDependencyTime:  0ns
                              -  WaitForRpcBufferQueue:  0ns
                        RpcInstanceDetails:
                              -  Instance  85d4f75b72a9ea61:  Count:  4,  MaxTime:  36.238ms,  MinTime:  12.107ms,  AvgTime:  21.722ms,  SumTime:  86.891ms
                              -  Instance  85d4f75b72a9ea91:  Count:  3,  MaxTime:  11.107ms,  MinTime:  2.431ms,  AvgTime:  5.470ms,  SumTime:  16.412ms
                              -  Instance  85d4f75b72a9eac1:  Count:  3,  MaxTime:  7.554ms,  MinTime:  3.160ms,  AvgTime:  5.066ms,  SumTime:  15.200m
```
2024-08-24 19:59:39 +08:00
14a2a66106 [fix](paimon) fix not able to read paimon data from hdfs with HA (#39806) (#39876)
bp #39806
2024-08-24 17:51:15 +08:00
76596e5f73 [fix](thrift) fix wrong order of field 27 and 28 in TCreateTabletReq thrift (#39873)
## Proposed changes

fix wrong order of field 27 and 28 in `TCreateTabletReq` thrift
introduced by #32418 and 0cde0cbf19011bc8d421add4734d7cd57308973f.

`TCreateTabletReq` is used for creating tablet, so this PR will fix
problem creating tablet when upgrading 2.0.x -> 2.1.4/2.1.5 or
2.1.4/2.1.5 -> 3.0.x, BUT will cause problem creating tablet when
upgrading 2.1.4/2.1.5 -> 2.1.6+.

master and branch-2.0 
```
    27: optional i64 time_series_compaction_level_threshold = 1
    28: optional TInvertedIndexStorageFormat inverted_index_storage_format = TInvertedIndexStorageFormat.V1
```

branch-2.1 (affects 2.1.4 and 2.1.5)
```
    27: optional TInvertedIndexStorageFormat inverted_index_storage_format = TInvertedIndexStorageFormat.V1
    28: optional i64 time_series_compaction_level_threshold = 1
```
2024-08-24 16:02:08 +08:00
6ceb574aa0 [branch-2.1]Pick IO limit/workload group usage table (#39839) 2024-08-23 18:51:47 +08:00
7a7292ad5a [branch-2.1][Refactor]use async to get be resource (#38389) (#39826)
pick #38389
2024-08-23 17:16:19 +08:00
0eadfbefc6 [Fix](branch-2.1) Fix wrong thrift index introduced by #37830 (#39824)
## Proposed changes

Issue Number: close #xxx

<!--Describe your changes.-->
2024-08-23 15:05:35 +08:00
8bbd3db1bc [branch-2.1](thrift) fix TLoadTxnBeginRequest backend_id's field no (#39823)
Make backend_id's field no the same with master branch.

For upgrading, change backend_id's field no is safe, because old fe can
torrent with TLoadTxnBeginRequest not setting backend id.

backend_id was introduce by #36437
2024-08-23 12:24:55 +08:00
1c566253a8 [Pick][Improment]Query queued by be memory (#37559) (#39733)
pick #37559
2024-08-22 15:14:47 +08:00
ed9794a0fe [Pick][Improment]publish workload to BE by tag (#38486) (#39730)
A workload group's tag property may be three cases as below: 1 empty
string, null or '', it could be published to all BE. 2 a value match
some BE' location, then the workload group could only be published to
the BE with same tag.
3 not an empty string, but some invalid string which can not math any
BE's location, then it could not be published any BE.

## Proposed changes

Issue Number: close #xxx

<!--Describe your changes.-->
2024-08-22 00:48:16 +08:00