90d6985f91
[Fix](bug) Is null predicate get error query result ( #41704 )
...
cherry-pick #41668
2024-10-12 13:18:14 +08:00
34429bfa0e
[Chore](inverted index) remove useless code of compound filters for inverted index #40258 ( #41448 )
...
cherry pick from #40258
2024-09-29 17:27:29 +08:00
0b4552f74b
[cherry-pick](branch-2.1) pick hive text write from master ( #40537 )
...
## Proposed changes
pick prs:
https://github.com/apache/doris/pull/38549
https://github.com/apache/doris/pull/40183
https://github.com/apache/doris/pull/40315
---------
Co-authored-by: Calvin Kirs <kirs@apache.org >
2024-09-27 20:57:07 +08:00
eb13cd4154
[branch-2.1] Picks "[Fix](partial update) Fix __DORIS_SEQUENCE_COL__ is not set for newly inserted rows in partial update #40272 " ( #40964 )
...
picks https://github.com/apache/doris/pull/40272
2024-09-26 22:54:27 +08:00
c6a6adb3a4
[Fix](topn) avoid missmatched row count when upgrading ( #40999 )
...
#41000
2024-09-21 08:46:57 +08:00
8e860a26a7
[fix](systable) fix unstable case for partitions table ( #40553 ) ( #41043 )
...
bp #40553
2024-09-20 17:13:30 +08:00
e0fac66223
[branch-2.1](fix) fix snappy decompressor bug ( #40862 )
...
## Proposed changes
Hadoop SnappyCodec source:
https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/src/codec/SnappyCodec.cc
Example:
OriginData (the original data is divided into several large data blocks):
large data block1 | large data block2 | large data block3 | ....
Each large data block is divided into several small data blocks.
Suppose a large data block is divided into three small blocks:
large data block1: | small block1 | small block2 | small block3 |
CompressData: <A [B1 compress(small block1)] [B2 compress(small block2)] [B3 compress(small block3)]>
A : original (uncompressed) length of the current large data block.
sizeof(A) = 4 bytes.
A = length(small block1) + length(small block2) + length(small block3)
Bx : compressed length of small data block x.
sizeof(Bx) = 4 bytes.
Bx = length(compress(small blockx))
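The framing described above can be sketched in Python as follows. This is a minimal illustration, not the Doris or Hadoop implementation: the function names are hypothetical, the 4-byte length fields are assumed to be big-endian, and the codec is passed in as a parameter so the sketch runs without a snappy binding (zlib from the standard library serves as a stand-in).

```python
import struct
import zlib
from typing import Callable, Iterable, Iterator


def frame_large_block(small_blocks: Iterable[bytes],
                      compress: Callable[[bytes], bytes]) -> bytes:
    """Build <A [B1 c1] [B2 c2] ...> for one large data block."""
    body = b""
    total = 0
    for block in small_blocks:
        compressed = compress(block)
        # Bx: 4-byte compressed length (big-endian is an assumption here).
        body += struct.pack(">I", len(compressed)) + compressed
        total += len(block)
    # A: 4-byte sum of the uncompressed small-block lengths.
    return struct.pack(">I", total) + body


def iter_large_blocks(data: bytes,
                      decompress: Callable[[bytes], bytes]) -> Iterator[bytes]:
    """Yield each decompressed large data block from a framed buffer."""
    pos = 0
    while pos < len(data):
        (total_len,) = struct.unpack_from(">I", data, pos)  # A
        pos += 4
        out = bytearray()
        # Keep reading small blocks until A uncompressed bytes are recovered.
        while len(out) < total_len:
            (comp_len,) = struct.unpack_from(">I", data, pos)  # Bx
            pos += 4
            chunk = data[pos:pos + comp_len]
            if len(chunk) != comp_len:
                raise ValueError("truncated small block")
            pos += comp_len
            out += decompress(chunk)
        if len(out) != total_len:
            raise ValueError("small blocks do not add up to A")
        yield bytes(out)
```

The reader stops consuming small blocks once the recovered bytes reach A, which is why A has to be the sum of the uncompressed small-block lengths rather than a block count.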
2024-09-20 11:57:14 +08:00
b8bc9b699c
[fix](scan) Incorrect scan keys lead to wrong query results. ( #40814 ) ( #40971 )
...
## Proposed changes
pick #40814
```
mysql [doris_14555]>select * from table_9436528_3;
+------+------+------+------+------------------------+--------------------+------+
| col1 | col2 | col3 | col5 | col4 | col6 | col7 |
+------+------+------+------+------------------------+--------------------+------+
| -100 | 1 | -82 | 1 | 2024-02-16 04:37:37.00 | -1299962421.904282 | NULL |
| -100 | 1 | 92 | 1 | 2024-02-16 04:37:37.00 | 23423423.0324234 | NULL |
| -100 | 0 | -82 | 0 | 2023-11-11 10:49:43.00 | 840968969.872149 | NULL |
+------+------+------+------+------------------------+--------------------+------+
```
wrong result:
```
mysql [doris_14555]>select * from table_9436528_3 where col1 <= -100 and col2 in (true, false) and col3 = -82;
+------+------+------+------+------------------------+--------------------+------+
| col1 | col2 | col3 | col5 | col4 | col6 | col7 |
+------+------+------+------+------------------------+--------------------+------+
| -100 | 1 | -82 | 1 | 2024-02-16 04:37:37.00 | -1299962421.904282 | NULL |
| -100 | 1 | 92 | 1 | 2024-02-16 04:37:37.00 | 23423423.0324234 | NULL |
+------+------+------+------+------------------------+--------------------+------+
```
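For reference, the rows that the predicate should select can be checked with a small Python sketch. It uses only the three rows shown above (restricted to the filtered columns) and plain Python filtering as a stand-in for the query engine; the `matches` helper is hypothetical.

```python
# The three rows shown above, restricted to the columns used in the predicate.
rows = [
    {"col1": -100, "col2": 1, "col3": -82},
    {"col1": -100, "col2": 1, "col3": 92},
    {"col1": -100, "col2": 0, "col3": -82},
]


def matches(row):
    # col1 <= -100 and col2 in (true, false) and col3 = -82
    return row["col1"] <= -100 and row["col2"] in (1, 0) and row["col3"] == -82


expected = [row for row in rows if matches(row)]
```

Under this predicate the row with `col3 = 92` must be filtered out and the row with `col2 = 0` must be kept, which is the opposite of what the wrong result shows.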
2024-09-19 22:01:02 +08:00
b52b572ade
[branch-2.1](memory) When Load ends, check memory tracker value returns is equal to 0 ( #40850 )
...
pick
#38960
#39908
#40043
#40092
#40016
#40439
---------
Co-authored-by: hui lai <1353307710@qq.com >
Co-authored-by: yiguolei <676222867@qq.com >
2024-09-15 23:47:53 +08:00
7851563829
[fix](brpc_client_cache) resolve hostname in DNS cache before passing to brpc ( #40074 ) ( #40786 )
...
backport #40074
2024-09-13 14:28:01 +08:00
3604d63184
[Branch 2.1] backport systable PR (#34384,#40153,#40456,#40455,#40568) ( #40687 )
...
backport
https://github.com/apache/doris/pull/40568
https://github.com/apache/doris/pull/40455
https://github.com/apache/doris/pull/40456
https://github.com/apache/doris/pull/40153
https://github.com/apache/doris/pull/34384
Test result:
2024-09-11 11:00:45.618 INFO [suite-thread-1] (SuiteContext.groovy:309)
- Recover original connection
2024-09-11 11:00:45.619 INFO [suite-thread-1] (Suite.groovy:359) -
Execute sql: REVOKE SELECT_PRIV ON
test_partitions_schema_db.duplicate_table FROM partitions_user
2024-09-11 11:00:45.625 INFO [suite-thread-1] (SuiteContext.groovy:299)
- Create new connection for user 'partitions_user'
2024-09-11 11:00:45.632 INFO [suite-thread-1] (Suite.groovy:1162) -
Execute tag: select_check_5, sql: select
TABLE_CATALOG,TABLE_SCHEMA,TABLE_NAME,PARTITION_NAME,SUBPARTITION_NAME,PARTITION_ORDINAL_POSITION,SUBPARTITION_ORDINAL_POSITION,PARTITION_METHOD,SUBPARTITION_METHOD,PARTITION_EXPRESSION,SUBPARTITION_EXPRESSION,PARTITION_DESCRIPTION,TABLE_ROWS,AVG_ROW_LENGTH,DATA_LENGTH,MAX_DATA_LENGTH,INDEX_LENGTH,DATA_FREE,CHECKSUM,PARTITION_COMMENT,NODEGROUP,TABLESPACE_NAME
from information_schema.partitions where
table_schema="test_partitions_schema_db" order by
TABLE_CATALOG,TABLE_SCHEMA,TABLE_NAME,PARTITION_NAME,SUBPARTITION_NAME,PARTITION_ORDINAL_POSITION,SUBPARTITION_ORDINAL_POSITION,PARTITION_METHOD,SUBPARTITION_METHOD,PARTITION_EXPRESSION,SUBPARTITION_EXPRESSION,PARTITION_DESCRIPTION,TABLE_ROWS,AVG_ROW_LENGTH,DATA_LENGTH,MAX_DATA_LENGTH,INDEX_LENGTH,DATA_FREE,CHECKSUM,PARTITION_COMMENT,NODEGROUP,TABLESPACE_NAME
2024-09-11 11:00:45.644 INFO [suite-thread-1] (SuiteContext.groovy:309)
- Recover original connection
2024-09-11 11:00:45.645 INFO [suite-thread-1] (ScriptContext.groovy:120)
- Run test_partitions_schema in
/root/doris/workspace/doris/regression-test/suites/query_p0/system/test_partitions_schema.groovy
succeed
2024-09-11 11:00:45.652 INFO [main] (RegressionTest.groovy:259) - Start
to run single scripts
2024-09-11 11:01:10.321 INFO [main] (RegressionTest.groovy:380) -
Success suites:
/root/doris/workspace/doris/regression-test/suites/query_p0/system/test_partitions_schema.groovy:
group=default,p0, name=test_partitions_schema
2024-09-11 11:01:10.322 INFO [main] (RegressionTest.groovy:459) - All
suites success.
____ _ ____ ____ _____ ____
| _ \ / \ / ___/ ___|| ____| _ \
| |_) / _ \ \___ \___ \| _| | | | |
| __/ ___ \ ___) |__) | |___| |_| |
|_| /_/ \_\____/____/|_____|____/
2024-09-11 11:01:10.322 INFO [main] (RegressionTest.groovy:410) - Test 1
suites, failed 0 suites, fatal 0 scripts, skipped 0 scripts
2024-09-11 11:01:10.322 INFO [main] (RegressionTest.groovy:119) - Test
finished
2024-09-11 11:03:00.712 INFO [suite-thread-1] (Suite.groovy:1162) -
Execute tag: select_check_5, sql: select * from
information_schema.table_options ORDER BY
TABLE_CATALOG,TABLE_SCHEMA,TABLE_NAME,TABLE_MODEL,TABLE_MODEL_KEY,DISTRIBUTE_KEY,DISTRIBUTE_TYPE,BUCKETS_NUM,PARTITION_NUM;
2024-09-11 11:03:00.729 INFO [suite-thread-1] (SuiteContext.groovy:309)
- Recover original connection
2024-09-11 11:03:00.731 INFO [suite-thread-1] (ScriptContext.groovy:120)
- Run test_table_options in
/root/doris/workspace/doris/regression-test/suites/query_p0/system/test_table_options.groovy
succeed
2024-09-11 11:03:04.817 INFO [main] (RegressionTest.groovy:259) - Start
to run single scripts
2024-09-11 11:03:28.741 INFO [main] (RegressionTest.groovy:380) -
Success suites:
/root/doris/workspace/doris/regression-test/suites/query_p0/system/test_table_options.groovy:
group=default,p0, name=test_table_options
2024-09-11 11:03:28.742 INFO [main] (RegressionTest.groovy:459) - All
suites success.
____ _ ____ ____ _____ ____
| _ \ / \ / ___/ ___|| ____| _ \
| |_) / _ \ \___ \___ \| _| | | | |
| __/ ___ \ ___) |__) | |___| |_| |
|_| /_/ \_\____/____/|_____|____/
2024-09-11 11:03:28.742 INFO [main] (RegressionTest.groovy:410) - Test 1
suites, failed 0 suites, fatal 0 scripts, skipped 0 scripts
2024-09-11 11:03:28.742 INFO [main] (RegressionTest.groovy:119) - Test
finished
*************************** 7. row ***************************
PartitionId: 18035
PartitionName: p100
VisibleVersion: 2
VisibleVersionTime: 2024-09-11 10:59:28
State: NORMAL
PartitionKey: col_1
Range: [types: [INT]; keys: [83647]; ..types: [INT]; keys: [2147483647];
)
DistributionKey: pk
Buckets: 10
ReplicationNum: 1
StorageMedium: HDD
CooldownTime: 9999-12-31 15:59:59
RemoteStoragePolicy:
LastConsistencyCheckTime: NULL
DataSize: 2.872 KB
IsInMemory: false
ReplicaAllocation: tag.location.default: 1
IsMutable: true
SyncWithBaseTables: true
UnsyncTables: NULL
CommittedVersion: 2
RowCount: 4
7 rows in set (0.01 sec)
---------
Co-authored-by: Mingyu Chen <morningman.cmy@gmail.com >
2024-09-12 11:50:09 +08:00
8708fae420
[fix](ES Catalog)Support parse single value for array column ( #40614 ) ( #40660 )
...
bp #40614
2024-09-11 17:26:48 +08:00
314f6ae823
[fix](ES Catalog)Fix int parse error when querying by doc_values ( #40385 ) ( #40521 )
...
bp #40385
2024-09-09 14:29:21 +08:00
92752b90e7
[feature](metacache) add system table catalog_meta_cache_statistics #40155 ( #40210 )
...
bp #40155
2024-09-02 23:23:35 +08:00
ca07a00c93
Revert "[branch-2.1](hive) support hive write text table ( #38549 ) (#4… ( #40157 )
...
…0063)"
This reverts commit c6df7c21a3c09ae1664deabacb88dfcea9d94b68.
Co-authored-by: yiguolei <yiguolei@gmail.com >
2024-08-30 10:25:38 +08:00
c6df7c21a3
[branch-2.1](hive) support hive write text table ( #38549 ) ( #40063 )
...
1. Support write hive text table
2. Add SessionVariable `hive_text_compression` to write compressed hive
text table
3. Supported compression type: gzip, bzip2, snappy, lz4, zstd
pick from https://github.com/apache/doris/pull/38549
2024-08-29 16:50:40 +08:00
131238ff71
[fix](file-cache) change metric_value column in file_cache_statistics table to string ( #40083 )
...
Make it more flexible
followup #39552
2024-08-29 16:39:22 +08:00
173aafc86f
[Enhancement] add information_schema.table_properties #38745 ( #38746 ) ( #39886 )
...
bp #38746
---------
Co-authored-by: Vallish Pai <vallishpai@gmail.com >
2024-08-27 17:22:19 +08:00
6ceb574aa0
[branch-2.1]Pick IO limit/workload group usage table ( #39839 )
2024-08-23 18:51:47 +08:00
a55e109e97
[pick][Improment]Add schema table workload_group_privileges ( #38436 ) ( #39708 )
...
pick #38436
2024-08-22 00:44:43 +08:00
0bfcee1251
[opt](file-cache) support system table file_cache_statistics ( #39552 )
...
1. Add new system table: `file_cache_statistics`
This table is used for viewing metrics related to file cache on BE side
```
mysql> select * from information_schema.file_cache_statistics limit 10;
+-------+---------------+----------------------------+--------------------------------+--------------------+
| BE_ID | BE_IP         | CACHE_PATH                 | METRIC_NAME                    | METRIC_VALUE       |
+-------+---------------+----------------------------+--------------------------------+--------------------+
| 10003 | 172.20.32.136 | /mnt/output/be/file_cache/ | disposable_queue_curr_elements | 0                  |
| 10003 | 172.20.32.136 | /mnt/output/be/file_cache/ | disposable_queue_curr_size     | 0                  |
| 10003 | 172.20.32.136 | /mnt/output/be/file_cache/ | disposable_queue_max_elements  | 102400             |
| 10003 | 172.20.32.136 | /mnt/output/be/file_cache/ | disposable_queue_max_size      | 21474836480        |
| 10003 | 172.20.32.136 | /mnt/output/be/file_cache/ | hits_ratio                     | 0.8539634687001242 |
| 10003 | 172.20.32.136 | /mnt/output/be/file_cache/ | hits_ratio_1h                  | 0                  |
| 10003 | 172.20.32.136 | /mnt/output/be/file_cache/ | hits_ratio_5m                  | 0                  |
| 10003 | 172.20.32.136 | /mnt/output/be/file_cache/ | index_queue_curr_elements      | 0                  |
| 10003 | 172.20.32.136 | /mnt/output/be/file_cache/ | index_queue_curr_size          | 0                  |
| 10003 | 172.20.32.136 | /mnt/output/be/file_cache/ | index_queue_max_elements       | 102400             |
+-------+---------------+----------------------------+--------------------------------+--------------------+
```
It will show metrics of file caches on each BE.
2. Add new metrics `hits_ratio_1h` and `hits_ratio_5m` for file cache
These two metrics show the hit ratio of the file cache over the most recent
1 hour or 5 minutes, so that we can see the recent hit ratio instead of the
global historical hit ratio.
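To illustrate the difference between a recent and a global hit ratio, here is a minimal Python sketch of a time-windowed ratio. It is an assumption-laden illustration, not the BE implementation: the class name `WindowedHitRatio` and its methods are hypothetical, and the real metrics may be computed differently.

```python
from collections import deque


class WindowedHitRatio:
    """Hit ratio over the last `window_seconds` of recorded lookups (hypothetical)."""

    def __init__(self, window_seconds: float):
        self.window = window_seconds
        self.events = deque()  # (timestamp, hit) pairs, oldest first
        self.hits = 0

    def _evict(self, now: float) -> None:
        # Drop lookups that fell out of the window.
        while self.events and self.events[0][0] <= now - self.window:
            _, old_hit = self.events.popleft()
            self.hits -= old_hit

    def record(self, now: float, hit: bool) -> None:
        self.events.append((now, hit))
        self.hits += hit
        self._evict(now)

    def ratio(self, now: float) -> float:
        self._evict(now)
        return self.hits / len(self.events) if self.events else 0.0
```

With no lookups inside the window the sketch reports 0, while a lifetime ratio would keep reporting its historical value.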
2024-08-21 10:03:39 +08:00
43cc8d648d
[fix](ES Catalog)Check isArray before parse json to array ( #39104 ) ( #39273 )
...
## Proposed changes
bp #39104
2024-08-13 15:13:40 +08:00
fc0222a64c
[opt](info) processlist schema table support show all fe ( #38701 ) ( #38953 )
...
pick #38701
2024-08-07 11:01:46 +08:00
9d23ccf1f2
[Improvement](schema scan) Use async scanner for schema scanners (#38… ( #38666 )
...
…403)
2024-08-01 16:05:24 +08:00
017dad8c54
[fix](type)support runtime predicate for time type ( #38258 ) ( #38465 )
...
## Proposed changes
https://github.com/apache/doris/pull/38258
2024-07-31 10:27:36 +08:00
e2bb86e7f8
[fix](inverted index) fixed in_list condition not indexed on pipelinex ( #38178 )
...
## Proposed changes
https://github.com/apache/doris/pull/36565
https://github.com/apache/doris/pull/37842
https://github.com/apache/doris/pull/37921
https://github.com/apache/doris/pull/37386
2024-07-25 14:42:34 +08:00
10c5c336d8
[branch-2.1](arrow-flight-sql) Add config arrow_flight_result_sink_buffer_size_rows ( #38223 )
...
pick #38221
2024-07-24 15:15:39 +08:00
e5339a4014
[feature](ES Catalog)Support control scroll level by config #37180 ( #37290 )
...
## Proposed changes
backport #37180
2024-07-15 16:41:38 +08:00
f8cee439b6
[feature](ES Catalog) map nested/object type in ES to JSON type in Doris ( #37101 ) ( #37182 )
...
backport #37101
2024-07-05 10:48:32 +08:00
02fad48870
[Fix](upgrade) Fix fields not handled correctly during upgrade and downgrade ( #36691 )
...
master version is #36690
2024-06-22 14:23:04 +08:00
445d42a57d
[fix](topn-opt) remove redundant check for fetch phase ( #36676 )
...
#36629
2024-06-21 22:28:38 +08:00
bd47d5a681
[branch-2.1](auto-partition) Fix auto partition load failure in multi replica ( #36586 )
...
This PR:
1. picks #35630, which was previously reverted by #36098.
2. picks #36344 from master.
These two PRs fix existing bugs in auto partition load.
---------
Co-authored-by: Kaijie Chen <ckj@apache.org >
2024-06-20 17:51:18 +08:00
dda25cceb6
[Bug](information-schema) fix some bug of information_schema.PROCESSLIST ( #36447 )
...
## Proposed changes
pick from #36409
2024-06-18 16:45:48 +08:00
3b23eee37c
Revert "[fix](auto-partition) fix auto partition load lost data in multi sender ( #35287 )" ( #36098 )
...
Reverts apache/doris#35630 because it introduced more damaging bugs.
We will fix it and merge it in the next version.
2024-06-11 17:11:42 +08:00
75a6f28f2e
[cherry-pick]Add query type when report ( #35918 )
...
pick #34978
2024-06-11 10:51:59 +08:00
b5a35b9cef
[FIX] Pick array inverted index bugfix ( #35837 )
...
Picks several bug fixes for arrays with inverted indexes.
See also:
https://github.com/apache/doris/pull/34766
https://github.com/apache/doris/pull/35086
https://github.com/apache/doris/pull/34683
https://github.com/apache/doris/pull/34076
2024-06-06 09:54:14 +08:00
fe1a4c4136
[Feature](IP) support ipv4/ipv6 with inverted index and conjuncts for query ( #35734 )
...
Support the ipv4/ipv6 data types with inverted indexes, so that conjunct
expressions on IP columns such as ">", "<", ">=", "<=", and "in"/"not in"
can be accelerated by the inverted index.
2024-06-03 23:24:03 +08:00
c2fc485327
[fix](auto-partition) fix auto partition load lost data in multi sender ( #35287 ) ( #35630 )
...
## Proposed changes
Change the `use_cnt` mechanism for incremental (auto partition) channels and
streams; it is now counted dynamically.
Use `close_wait()` of regular partitions as a synchronization point to make
sure all sinks are in the close phase before closing any incremental (auto
partition) channels and streams.
Add a dummy (fake) partition and tablet if there is no regular partition
in the auto partition table.
Backport #35287
Co-authored-by: zhaochangle <zhaochangle@selectdb.com >
2024-05-31 10:27:03 +08:00
b91d2caab8
[Feature](iceberg-writer) Implements iceberg sink basic functionality for inserting into table. ( #35587 )
...
backport #34929
2024-05-29 16:40:54 +08:00
f38ecd349c
[enhancement](memory) return error if allocate memory failed during add rows method ( #35085 )
...
* return error when add rows failed
* f
---------
Co-authored-by: yiguolei <yiguolei@gmail.com >
2024-05-22 00:53:34 +08:00
fb28d0b185
[BUG] fix scan range boundary handling is incorrect ( #34832 )
...
fix scan range boundary handling is incorrect
Co-authored-by: shizhiqiang03 <shizhiqiang03@meituan.com >
2024-05-21 13:00:50 +08:00
c0fd98abe5
[Fix](tvf) Fix that tvf reading empty files in compressed formats. ( #34926 )
...
1. Fix the issue with tvf reading empty compressed files.
2. Move two test cases (`test_local_tvf_compression` and `test_s3_tvf_compression`) from p2 to p0.
2024-05-21 12:59:31 +08:00
42425808a1
[Cherry-Pick](branch-2.1) Pick "Fix multiple replica partial update auto inc data inconsistency problem #34788 " ( #35056 )
...
* [Fix](auto inc) Fix multiple replica partial update auto inc data inconsistency problem (#34788 )
* **Problem:** For tables with auto-increment columns, updating partial columns can cause data inconsistency among replicas.
**Cause:** Previously, the implementation for updating partial columns in tables with auto-increment columns was done independently on each BE (Backend), leading to potential inconsistencies in the auto-increment column values generated by each BE.
**Solution:** Before distributing blocks, determine if the update involves partial columns of a table with an auto-increment column. If so, add the auto-increment column to the last column of the block. After distributing to each BE, each BE will check if the data key for the partial column update exists. If it exists, the previous auto-increment column value is used; if not, the auto-increment column value from the last column of the block is used. This ensures that the auto-increment column values are consistent across different BEs.
* 2
* [Fix](regression-test) Fix auto inc partial update unstable regression test (#34940 )
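The solution described above can be sketched in Python. This is a simplified model under stated assumptions, not the Doris code: replicas are plain dictionaries keyed by row key, and `plan_block` and `apply_on_replica` are hypothetical names for the "before distributing" and "on each BE" steps.

```python
def plan_block(rows, next_auto_inc):
    """Before distribution: append one candidate auto-increment value per row."""
    return [(key, cols, next_auto_inc + i) for i, (key, cols) in enumerate(rows)]


def apply_on_replica(replica, planned):
    """On each BE: keep the existing auto-increment value if the key exists,
    otherwise take the candidate carried in the block's last column."""
    for key, cols, candidate in planned:
        if key in replica:
            merged = {**replica[key], **cols}
            merged["auto_inc"] = replica[key]["auto_inc"]
        else:
            merged = dict(cols)
            merged["auto_inc"] = candidate
        replica[key] = merged
```

Because every replica receives the same candidate values, none of them generates an auto-increment value on its own, so replicas that start from the same state end in the same state.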
2024-05-20 15:43:46 +08:00
9b5028785d
[fix](prepare) fix datetimev2 return err when binary_row_format ( #34662 )
...
Fix datetimev2 returning an error under binary_row_format. Before this PR, the backend always returned datetimev2 via to_string.
Fix the datetimev2 return metadata losing its scale.
2024-05-18 18:37:41 +08:00
cc00666be6
[opt](inverted index) add inlist condition handling to compound ( #34134 )
...
1. Previously, the compound did not support the inlist condition, which could impact performance if an inverted index was created.
2024-05-10 14:35:47 +08:00
39fdc9ba0c
[refactor](executor)Rename workload schedule policy #34497
2024-05-08 08:35:20 +08:00
299d069da9
Fix alter policy failed ( #33910 )
2024-04-22 22:33:24 +08:00
03c3419265
[Refactor](executor)Add workload schedule policy table ( #33729 )
2024-04-20 20:06:34 +08:00
25358564ca
[Fix](compile) Fix gcc compile on master ( #33864 )
...
This was introduced by #33511, which wrongly used
ColumnStr<T> ();
which violates the C++20 standard (see https://wg21.cmeerw.net/cwg/issue2237 ) but is still accepted by clang for now (see llvm/llvm-project#58112 )
2024-04-19 23:41:37 +08:00
74590e4836
[refine](node) Remove the cse DCHECK from the constructor ( #33856 )
...
It's possible that a failure in the fe caused the check to fail, and at that moment, it may not be possible to retrieve the corresponding query ID from be.out.
2024-04-19 23:41:37 +08:00