4a62d9e44b
Revert "[2.1][improvement](jdbc catalog) Add catalog property to enable jdbc connection pool" ( #42481 )
...
Reverts apache/doris#42255
We found that after disabling the connection pool there are class loading
problems and connection release problems for some data sources. We will
remove this feature for now and re-add it once those issues are fully
resolved and tested.
2024-10-25 19:37:36 +08:00
bde8e2d474
[2.1][improvement](jdbc catalog) Add catalog property to enable jdbc connection pool ( #42255 )
...
pick (#41992 )
We initially introduced a JDBC connection pool to improve the connection
performance of the JDBC catalog, but the pool kept producing unexpected
errors, so we added a catalog property, `enable_connection_pool`, to choose
whether to enable the JDBC connection pool of the JDBC catalog; it defaults
to false. However, catalogs created before the upgrade will still use the
connection pool after upgrading; only newly created catalogs default to false.
We ran performance tests on this; the performance loss is within the
expected range.
- Enable connection pool:
```
mysqlslap -uroot -h127.0.0.1 -P9030 --concurrency=1 --iterations=100 --query='SELECT * FROM mysql.test.test limit 1;' --create-schema=mysql --delimiter=";" --verbose
Benchmark
    Average number of seconds to run all queries: 0.008 seconds
    Minimum number of seconds to run all queries: 0.004 seconds
    Maximum number of seconds to run all queries: 0.133 seconds
    Number of clients running queries: 1
    Average number of queries per client: 1
```
- Disable connection pool:
```
mysqlslap -uroot -h127.0.0.1 -P9030 --concurrency=1 --iterations=100 --query='SELECT * FROM mysql_no_pool.test.test limit 1;' --create-schema=mysql --delimiter=";" --verbose
Benchmark
    Average number of seconds to run all queries: 0.054 seconds
    Minimum number of seconds to run all queries: 0.047 seconds
    Maximum number of seconds to run all queries: 0.184 seconds
    Number of clients running queries: 1
    Average number of queries per client: 1
```
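For reference, a catalog that opts back into the pool would set the new property at creation time. A minimal sketch with placeholder connection details; only `enable_connection_pool` comes from this change:
```
-- enable_connection_pool defaults to false for newly created catalogs
CREATE CATALOG mysql_pool PROPERTIES (
    "type" = "jdbc",
    "user" = "root",
    "password" = "example_password",
    "jdbc_url" = "jdbc:mysql://127.0.0.1:3306/test",
    "driver_url" = "mysql-connector-java-8.0.25.jar",
    "driver_class" = "com.mysql.cj.jdbc.Driver",
    "enable_connection_pool" = "true"
);
```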
2024-10-22 23:28:28 +08:00
bbd4970ed8
[feature](jdbc catalog) support gbase jdbc catalog #41027 #41587 ( #42123 )
...
cherry pick from #41027 #41587
---------
Co-authored-by: zy-kkk <zhongyk10@gmail.com >
2024-10-21 16:52:23 +08:00
968e33f07e
[cherry-pick](branch-21) pick ( #39057 ) ( #41352 ) ( #41958 )
...
pick from master (#39057 ) (#41352 )
---------
Co-authored-by: Mryange <59914473+Mryange@users.noreply.github.com >
2024-10-17 14:30:40 +08:00
e6545a36a3
[improvement](iceberg)Parallelize splits for count(*) for 2.1 ( #41169 ) ( #41880 )
...
bp: #41169
2024-10-16 10:52:06 +08:00
d97642e9b5
[cherry-pick](branch-21) fix tablet sink shuffle without project not match the output tuple ( #40299 )( #41293 ) ( #41327 )
...
cherry-pick from master (#40299 )(#41293 )
2024-10-15 00:12:23 +08:00
4888c632f4
[cherry-pick](branch2.1) support escape.delim and serialization.null.format for hive text ( #41684 )
...
pick from master:
https://github.com/apache/doris/pull/40291
2024-10-15 00:08:23 +08:00
8c0f73cb90
[Enhancement](MaxCompute)Refactoring maxCompute catalog using Storage API.( #40225 , #40888 ,#41386 ) ( #41610 )
...
bp #40225 , #40888 ,#41386
Among them, #40225 introduces the new MaxCompute Storage API, #40888 fixes
a bug when reading null values between the new and old APIs, and #41386
handles compatibility between the new and old versions.
2024-10-11 11:55:41 +08:00
3120bfb6e3
[fix](pipelinex) fix fragment instance progress reports (part 2) ( #40694 ) ( #41641 )
...
backport #40694
2024-10-10 17:49:41 +08:00
0b4552f74b
[cherry-pick](branch-2.1) pick hive text write from master ( #40537 )
...
pick prs:
https://github.com/apache/doris/pull/38549
https://github.com/apache/doris/pull/40183
https://github.com/apache/doris/pull/40315
---------
Co-authored-by: Calvin Kirs <kirs@apache.org >
2024-09-27 20:57:07 +08:00
1459517568
[improvement](binlog) filter dropped indexes #41246 ( #41300 )
...
cherry pick from #41246
2024-09-26 08:38:28 +08:00
5b3b2cec80
[feat](metatable) support table$partitions for hive table ( #40774 ) ( #41230 )
...
bp #40774
and pick part of #34552 , add `isPartitionedTable()` interface in `TableIf`
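A minimal usage sketch of the new meta table, assuming a Hive catalog `hive_ctl` with a partitioned table `db1.tbl1` (both names hypothetical):
```
-- list the partitions of a partitioned Hive table via the $partitions meta table
SELECT * FROM hive_ctl.db1.tbl1$partitions;
```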
2024-09-25 09:52:07 +08:00
0d38a9a36d
[feature](restore) support atomic restore ( #41107 )
...
Cherry-pick #40353 , #40734 , #40817 , #40876 , #40921 , #41017 , #41083
2024-09-24 09:41:41 +08:00
549bc3e288
[fix](pipelinex) fix fragment instance progress reports ( #40325 ) ( #40987 )
...
backport #40325
2024-09-19 23:58:38 +08:00
0b1d517caa
[improvement](statistics)Return -1 to nereids if report olap table row count for new table is not done for all tablets. ( #40457 ) ( #40838 )
...
backport: https://github.com/apache/doris/pull/40457
2024-09-14 13:19:35 +08:00
e3db3a2a49
[minor](thrift) Change field ID ( #40845 )
...
2024-09-14 11:08:11 +08:00
3395cd5ce9
[PipelineX](improvement) Prepare tasks in parallel ( #40270 )
...
2024-09-13 13:34:29 +08:00
3604d63184
[Branch 2.1] backport systable PR (#34384,#40153,#40456,#40455,#40568) ( #40687 )
...
backport
https://github.com/apache/doris/pull/40568
https://github.com/apache/doris/pull/40455
https://github.com/apache/doris/pull/40456
https://github.com/apache/doris/pull/40153
https://github.com/apache/doris/pull/34384
Test result:
```
2024-09-11 11:00:45.618 INFO [suite-thread-1] (SuiteContext.groovy:309)
- Recover original connection
2024-09-11 11:00:45.619 INFO [suite-thread-1] (Suite.groovy:359) -
Execute sql: REVOKE SELECT_PRIV ON
test_partitions_schema_db.duplicate_table FROM partitions_user
2024-09-11 11:00:45.625 INFO [suite-thread-1] (SuiteContext.groovy:299)
- Create new connection for user 'partitions_user'
2024-09-11 11:00:45.632 INFO [suite-thread-1] (Suite.groovy:1162) -
Execute tag: select_check_5, sql: select
TABLE_CATALOG,TABLE_SCHEMA,TABLE_NAME,PARTITION_NAME,SUBPARTITION_NAME,PARTITION_ORDINAL_POSITION,SUBPARTITION_ORDINAL_POSITION,PARTITION_METHOD,SUBPARTITION_METHOD,PARTITION_EXPRESSION,SUBPARTITION_EXPRESSION,PARTITION_DESCRIPTION,TABLE_ROWS,AVG_ROW_LENGTH,DATA_LENGTH,MAX_DATA_LENGTH,INDEX_LENGTH,DATA_FREE,CHECKSUM,PARTITION_COMMENT,NODEGROUP,TABLESPACE_NAME
from information_schema.partitions where
table_schema="test_partitions_schema_db" order by
TABLE_CATALOG,TABLE_SCHEMA,TABLE_NAME,PARTITION_NAME,SUBPARTITION_NAME,PARTITION_ORDINAL_POSITION,SUBPARTITION_ORDINAL_POSITION,PARTITION_METHOD,SUBPARTITION_METHOD,PARTITION_EXPRESSION,SUBPARTITION_EXPRESSION,PARTITION_DESCRIPTION,TABLE_ROWS,AVG_ROW_LENGTH,DATA_LENGTH,MAX_DATA_LENGTH,INDEX_LENGTH,DATA_FREE,CHECKSUM,PARTITION_COMMENT,NODEGROUP,TABLESPACE_NAME
2024-09-11 11:00:45.644 INFO [suite-thread-1] (SuiteContext.groovy:309)
- Recover original connection
2024-09-11 11:00:45.645 INFO [suite-thread-1] (ScriptContext.groovy:120)
- Run test_partitions_schema in
/root/doris/workspace/doris/regression-test/suites/query_p0/system/test_partitions_schema.groovy
succeed
2024-09-11 11:00:45.652 INFO [main] (RegressionTest.groovy:259) - Start
to run single scripts
2024-09-11 11:01:10.321 INFO [main] (RegressionTest.groovy:380) -
Success suites:
/root/doris/workspace/doris/regression-test/suites/query_p0/system/test_partitions_schema.groovy:
group=default,p0, name=test_partitions_schema
2024-09-11 11:01:10.322 INFO [main] (RegressionTest.groovy:459) - All
suites success.
____ _ ____ ____ _____ ____
| _ \ / \ / ___/ ___|| ____| _ \
| |_) / _ \ \___ \___ \| _| | | | |
| __/ ___ \ ___) |__) | |___| |_| |
|_| /_/ \_\____/____/|_____|____/
2024-09-11 11:01:10.322 INFO [main] (RegressionTest.groovy:410) - Test 1
suites, failed 0 suites, fatal 0 scripts, skipped 0 scripts
2024-09-11 11:01:10.322 INFO [main] (RegressionTest.groovy:119) - Test
finished
2024-09-11 11:03:00.712 INFO [suite-thread-1] (Suite.groovy:1162) -
Execute tag: select_check_5, sql: select * from
information_schema.table_options ORDER BY
TABLE_CATALOG,TABLE_SCHEMA,TABLE_NAME,TABLE_MODEL,TABLE_MODEL_KEY,DISTRIBUTE_KEY,DISTRIBUTE_TYPE,BUCKETS_NUM,PARTITION_NUM;
2024-09-11 11:03:00.729 INFO [suite-thread-1] (SuiteContext.groovy:309)
- Recover original connection
2024-09-11 11:03:00.731 INFO [suite-thread-1] (ScriptContext.groovy:120)
- Run test_table_options in
/root/doris/workspace/doris/regression-test/suites/query_p0/system/test_table_options.groovy
succeed
2024-09-11 11:03:04.817 INFO [main] (RegressionTest.groovy:259) - Start
to run single scripts
2024-09-11 11:03:28.741 INFO [main] (RegressionTest.groovy:380) -
Success suites:
/root/doris/workspace/doris/regression-test/suites/query_p0/system/test_table_options.groovy:
group=default,p0, name=test_table_options
2024-09-11 11:03:28.742 INFO [main] (RegressionTest.groovy:459) - All
suites success.
____ _ ____ ____ _____ ____
| _ \ / \ / ___/ ___|| ____| _ \
| |_) / _ \ \___ \___ \| _| | | | |
| __/ ___ \ ___) |__) | |___| |_| |
|_| /_/ \_\____/____/|_____|____/
2024-09-11 11:03:28.742 INFO [main] (RegressionTest.groovy:410) - Test 1
suites, failed 0 suites, fatal 0 scripts, skipped 0 scripts
2024-09-11 11:03:28.742 INFO [main] (RegressionTest.groovy:119) - Test
finished
*************************** 7. row ***************************
PartitionId: 18035
PartitionName: p100
VisibleVersion: 2
VisibleVersionTime: 2024-09-11 10:59:28
State: NORMAL
PartitionKey: col_1
Range: [types: [INT]; keys: [83647]; ..types: [INT]; keys: [2147483647]; )
DistributionKey: pk
Buckets: 10
ReplicationNum: 1
StorageMedium: HDD
CooldownTime: 9999-12-31 15:59:59
RemoteStoragePolicy:
LastConsistencyCheckTime: NULL
DataSize: 2.872 KB
IsInMemory: false
ReplicaAllocation: tag.location.default: 1
IsMutable: true
SyncWithBaseTables: true
UnsyncTables: NULL
CommittedVersion: 2
RowCount: 4
7 rows in set (0.01 sec)
```
---------
Co-authored-by: Mingyu Chen <morningman.cmy@gmail.com >
2024-09-12 11:50:09 +08:00
4e453dc1bb
Revert "[improvement](statistics)Return -1 to neredis if report olap table row count for new table is not done for all tablets. ( #40457 )" ( #40616 )
...
Reverts apache/doris#40540
2024-09-10 17:17:13 +08:00
e43e6e2bba
[improvement](statistics)Return -1 to nereids if report olap table row count for new table is not done for all tablets. ( #40457 ) ( #40540 )
...
backport: https://github.com/apache/doris/pull/40457
2024-09-10 12:55:53 +08:00
a3eba2aad5
[fix](auth) ordinary users can see the processes of other users ( #39747 ) ( #40415 )
...
pick: https://github.com/apache/doris/pull/39747
2024-09-09 11:13:18 +08:00
a963709fed
[opt](scanner) Control the degree of parallelism of scanner when only limit involved #39927 ( #40357 )
...
cherry pick from #39927
2024-09-09 10:42:19 +08:00
b4beec8ea8
[fix](OrcWriter) fix be core when upgrading BE without upgrading FE ( #40303 )
...
bp: #40282
2024-09-04 10:24:41 +08:00
92752b90e7
[feature](metacache) add system table catalog_meta_cache_statistics #40155 ( #40210 )
...
bp #40155
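A usage sketch, assuming the new system table is exposed under `information_schema` like the other statistics tables in this changelog:
```
SELECT * FROM information_schema.catalog_meta_cache_statistics;
```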
2024-09-02 23:23:35 +08:00
70daa1f85d
[opt](inverted index) Controls whether the in_list can execute fast_execute. ( #40141 )
...
https://github.com/apache/doris/pull/40022
2024-08-30 10:32:43 +08:00
ca07a00c93
Revert "[branch-2.1](hive) support hive write text table ( #38549 ) (#4… ( #40157 )
...
This reverts commit c6df7c21a3c09ae1664deabacb88dfcea9d94b68.
Co-authored-by: yiguolei <yiguolei@gmail.com >
2024-08-30 10:25:38 +08:00
c6df7c21a3
[branch-2.1](hive) support hive write text table ( #38549 ) ( #40063 )
...
1. Support writing Hive text tables
2. Add session variable `hive_text_compression` to write compressed Hive
text tables
3. Supported compression types: gzip, bzip2, snappy, lz4, zstd
pick from https://github.com/apache/doris/pull/38549
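A rough usage sketch of the new session variable; the catalog, database and table names are placeholders, and only `hive_text_compression` and the listed codecs come from this change:
```
-- write a compressed Hive text table; gzip is one of the supported codecs
SET hive_text_compression = 'gzip';
INSERT INTO hive_ctl.db1.text_tbl VALUES (1, 'a');
```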
2024-08-29 16:50:40 +08:00
173aafc86f
[Enhancement] add information_schema.table_properties #38745 ( #38746 ) ( #39886 )
...
bp #38746
---------
Co-authored-by: Vallish Pai <vallishpai@gmail.com >
2024-08-27 17:22:19 +08:00
8dc3f3f347
[fix](inverted index) Fix Session Variable Compatibility ( #39939 )
2024-08-27 10:17:47 +08:00
d563621f6e
[enhancement](thrift) add value number to thrift definition for enum type ( #39880 ) ( #39896 )
...
cherry pick : https://github.com/apache/doris/pull/39880
---------
Co-authored-by: yiguolei <yiguolei@gmail.com >
2024-08-26 08:07:57 +08:00
44b80fb03b
[fix](inverted index) Fix Session Variable Compatibility ( #39884 )
...
https://github.com/apache/doris/pull/39889
2024-08-25 08:42:36 +08:00
e0534c9bfc
[bugfix](thrift) the definition number should be consistent with master branch ( #39879 )
...
introduced by pr https://github.com/apache/doris/pull/35103
Co-authored-by: yiguolei <yiguolei@gmail.com >
2024-08-25 00:22:19 +08:00
2dea859bdb
[debug](rpc) debug rpc time consumption problem ( #39852 )
...
Add detailed RPC time info for each channel, sorted by the max RPC time of
the channels:
```
DATA_STREAM_SINK_OPERATOR (id=1,dst_id=1):
- Partitioner: Crc32HashPartitioner(64)
- BlocksProduced: 74
- BrpcSendTime: 2.689us
- BrpcSendTime.Wait: 0ns
- BytesSent: 89.35 KB
- CloseTime: 680.152us
- CompressTime: 0ns
- ExecTime: 160.663ms
- InitTime: 263.608us
- InputRows: 32.512K (32512)
- LocalBytesSent: 0.00
- LocalSendTime: 0ns
- LocalSentRows: 0
- MemoryUsage:
- PeakMemoryUsage: 80.00 KB
- MergeBlockTime: 0ns
- OpenTime: 4.113ms
- OverallThroughput: 0.0 /sec
- PendingFinishDependency: 41.179ms
- RowsProduced: 32.512K (32512)
- RpcAvgTime: 11.850ms
- RpcCount: 10
- RpcMaxTime: 86.891ms
- RpcMinTime: 15.200ms
- RpcSumTime: 118.503ms
- SerializeBatchTime: 13.517ms
- SplitBlockDistributeByChannelTime: 38.923ms
- SplitBlockHashComputeTime: 2.659ms
- UncompressedRowBatchSize: 135.19 KB
- WaitForDependencyTime: 0ns
- WaitForRpcBufferQueue: 0ns
RpcInstanceDetails:
- Instance 85d4f75b72a9ea61: Count: 4, MaxTime: 36.238ms, MinTime: 12.107ms, AvgTime: 21.722ms, SumTime: 86.891ms
- Instance 85d4f75b72a9ea91: Count: 3, MaxTime: 11.107ms, MinTime: 2.431ms, AvgTime: 5.470ms, SumTime: 16.412ms
- Instance 85d4f75b72a9eac1: Count: 3, MaxTime: 7.554ms, MinTime: 3.160ms, AvgTime: 5.066ms, SumTime: 15.200ms
```
2024-08-24 19:59:39 +08:00
14a2a66106
[fix](paimon) fix not able to read paimon data from hdfs with HA ( #39806 ) ( #39876 )
...
bp #39806
2024-08-24 17:51:15 +08:00
76596e5f73
[fix](thrift) fix wrong order of field 27 and 28 in TCreateTabletReq thrift ( #39873 )
...
fix wrong order of fields 27 and 28 in `TCreateTabletReq` thrift,
introduced by #32418 and 0cde0cbf19011bc8d421add4734d7cd57308973f.
`TCreateTabletReq` is used for creating tablets, so this PR fixes the
problem of creating tablets when upgrading 2.0.x -> 2.1.4/2.1.5 or
2.1.4/2.1.5 -> 3.0.x, BUT will cause problems creating tablets when
upgrading 2.1.4/2.1.5 -> 2.1.6+.
master and branch-2.0
```
27: optional i64 time_series_compaction_level_threshold = 1
28: optional TInvertedIndexStorageFormat inverted_index_storage_format = TInvertedIndexStorageFormat.V1
```
branch-2.1 (affects 2.1.4 and 2.1.5)
```
27: optional TInvertedIndexStorageFormat inverted_index_storage_format = TInvertedIndexStorageFormat.V1
28: optional i64 time_series_compaction_level_threshold = 1
```
2024-08-24 16:02:08 +08:00
6ceb574aa0
[branch-2.1]Pick IO limit/workload group usage table ( #39839 )
2024-08-23 18:51:47 +08:00
7a7292ad5a
[branch-2.1][Refactor]use async to get be resource ( #38389 ) ( #39826 )
...
pick #38389
2024-08-23 17:16:19 +08:00
0eadfbefc6
[Fix](branch-2.1) Fix wrong thrift index introduced by #37830 ( #39824 )
...
2024-08-23 15:05:35 +08:00
8bbd3db1bc
[branch-2.1](thrift) fix TLoadTxnBeginRequest backend_id's field no ( #39823 )
...
Make backend_id's field number the same as the master branch.
For upgrading, changing backend_id's field number is safe, because an old
FE can tolerate a TLoadTxnBeginRequest that does not set backend_id.
backend_id was introduced by #36437
2024-08-23 12:24:55 +08:00
1c566253a8
[Pick][Improvement]Query queued by be memory ( #37559 ) ( #39733 )
...
pick #37559
2024-08-22 15:14:47 +08:00
ed9794a0fe
[Pick][Improvement]publish workload to BE by tag ( #38486 ) ( #39730 )
...
A workload group's tag property can fall into three cases:
1. An empty string, null or '': the workload group is published to all BEs.
2. A value that matches some BEs' location tag: the workload group is
published only to the BEs with the same tag.
3. A non-empty string that does not match any BE's location tag: the
workload group is not published to any BE.
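A sketch of the three cases above, assuming a workload group `g1`, a BE location tag `group_a` (both hypothetical), and the usual ALTER WORKLOAD GROUP syntax:
```
-- case 2: publish g1 only to BEs whose location tag is group_a
ALTER WORKLOAD GROUP g1 PROPERTIES ("tag" = "group_a");
-- case 1: an empty tag publishes g1 to all BEs
ALTER WORKLOAD GROUP g1 PROPERTIES ("tag" = "");
```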
2024-08-22 00:48:16 +08:00
a55e109e97
[pick][Improvement]Add schema table workload_group_privileges ( #38436 ) ( #39708 )
...
pick #38436
2024-08-22 00:44:43 +08:00
1e47d11560
[Improvement](runtime-filter) send RUNTIME_BLOOM_FILTER_MAX_SIZE to backends ( #39686 )
...
pick from #38972
2024-08-22 00:37:25 +08:00
0bfcee1251
[opt](file-cache) support system table file_cache_statistics ( #39552 )
...
1. Add new system table: `file_cache_statistics`
This table is used for viewing metrics related to the file cache on the BE side.
```
mysql> select * from information_schema.file_cache_statistics limit 10;
+-------+---------------+----------------------------+--------------------------------+--------------------+
| BE_ID | BE_IP         | CACHE_PATH                 | METRIC_NAME                    | METRIC_VALUE       |
+-------+---------------+----------------------------+--------------------------------+--------------------+
| 10003 | 172.20.32.136 | /mnt/output/be/file_cache/ | disposable_queue_curr_elements | 0                  |
| 10003 | 172.20.32.136 | /mnt/output/be/file_cache/ | disposable_queue_curr_size     | 0                  |
| 10003 | 172.20.32.136 | /mnt/output/be/file_cache/ | disposable_queue_max_elements  | 102400             |
| 10003 | 172.20.32.136 | /mnt/output/be/file_cache/ | disposable_queue_max_size      | 21474836480        |
| 10003 | 172.20.32.136 | /mnt/output/be/file_cache/ | hits_ratio                     | 0.8539634687001242 |
| 10003 | 172.20.32.136 | /mnt/output/be/file_cache/ | hits_ratio_1h                  | 0                  |
| 10003 | 172.20.32.136 | /mnt/output/be/file_cache/ | hits_ratio_5m                  | 0                  |
| 10003 | 172.20.32.136 | /mnt/output/be/file_cache/ | index_queue_curr_elements      | 0                  |
| 10003 | 172.20.32.136 | /mnt/output/be/file_cache/ | index_queue_curr_size          | 0                  |
| 10003 | 172.20.32.136 | /mnt/output/be/file_cache/ | index_queue_max_elements       | 102400             |
+-------+---------------+----------------------------+--------------------------------+--------------------+
```
It will show metrics of file caches on each BE.
2. Add new metrics `hits_ratio_1h` and `hits_ratio_5m` for file cache
These two metrics show the hit ratio of the file cache over the most recent
1 hour and 5 minutes, so we can see the recent hit ratio instead of only
the global historical ratio.
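For example, the recent hit ratios can be pulled from the same table with a filter on `METRIC_NAME` (column names as shown in the output above):
```
SELECT BE_ID, CACHE_PATH, METRIC_NAME, METRIC_VALUE
FROM information_schema.file_cache_statistics
WHERE METRIC_NAME IN ('hits_ratio', 'hits_ratio_1h', 'hits_ratio_5m');
```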
2024-08-21 10:03:39 +08:00
a3fd13fee6
[fix](catalog) set timeout for split fetch ( #39346 ) ( #39624 )
...
bp #39346
2024-08-20 21:59:55 +08:00
830f250a80
[opt](query cancel) cancel query if it has pipeline task leakage #39223 ( #39537 )
...
pick #39223 with some modifications. The optimization is only applied to
PipelineX.
2024-08-19 14:33:59 +08:00
7e9aa2b9ac
[feature](restore) Support clean_tables/clean_partitions properties for restore job #39028 ( #39363 )
...
cherry pick from #39028
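A rough sketch of how the new properties might be passed on a restore; the database, repository, snapshot name and timestamp are placeholders, and only `clean_tables` and `clean_partitions` come from this change (semantics per #39028):
```
RESTORE SNAPSHOT example_db.snapshot_20240815
FROM `example_repo`
PROPERTIES
(
    "backup_timestamp" = "2024-08-15-10-00-00",
    -- new properties from this change
    "clean_tables" = "true",
    "clean_partitions" = "true"
);
```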
2024-08-15 09:58:26 +08:00
60eeec3754
[fix] (inverted index) Fix match function without inverted index ( #38989 ) ( #39220 )
...
pick from #38989
2024-08-13 10:55:54 +08:00
8cb5aa64f4
[test](inverted index) add an Inverted Index Testing Switch ( #38077 ) ( #38947 )
...
https://github.com/apache/doris/pull/38077
2024-08-07 11:25:36 +08:00
fc0222a64c
[opt](info) processlist schema table support show all fe ( #38701 ) ( #38953 )
...
pick #38701
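A usage sketch of the schema table; with this change it should list processes across all FEs rather than only the connected one (a plain select, since the column set is not shown in this entry):
```
SELECT * FROM information_schema.processlist;
```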
2024-08-07 11:01:46 +08:00