cecd214345
[branch-2.1](Column) refactor ColumnNullable to provide flags safety ( #40769 ) ( #40848 )
...
pick https://github.com/apache/doris/pull/40769
Co-authored-by: Jerry Hu <mrhhsg@gmail.com>
2024-09-14 16:27:43 +08:00
0b1d517caa
[improvement](statistics)Return -1 to nereids if report olap table row count for new table is not done for all tablets. ( #40457 ) ( #40838 )
...
backport: https://github.com/apache/doris/pull/40457
2024-09-14 13:19:35 +08:00
f16615a1fc
[branch-2.1](memory) Allocator support address sanitizers ( #40836 )
...
pick
#33396
#33862
#33853
#33732
#33841
#33933
#34901
#35014
---------
Co-authored-by: yiguolei <676222867@qq.com>
Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-09-14 12:12:44 +08:00
341b2e0693
[enhancement](compaction) Abort compaction tasks when corresponding tablet states have been changed ( #40271 ) ( #40828 )
...
## Proposed changes
pick: #40271
1. Change the criterion for cumulative compaction eligibility. Tablets in
the `running` or `not ready` state are eligible for cumulative compaction.
2. Abort a compaction task at the start when the tablet is no longer
eligible for compaction.
2024-09-14 11:19:31 +08:00
c79621fff1
[enhancement](schema-change) Make the schema change memory space adaptive ( #40822 )
...
pick: #34350
2024-09-14 11:17:41 +08:00
9a79edca84
[cherry-pick](branch-21) fix partition_topn not reset output rows after do_partition_topn_sort ( #40761 ) ( #40792 )
...
## Proposed changes
cherry-pick from master https://github.com/apache/doris/pull/40761
2024-09-14 11:15:56 +08:00
523f0baa80
[fix](scan) Fix incorrect query results due to data race of compaction and parallel scanners building ( #40552 ) ( #40829 )
...
pick: #40552
Capture rowset splits and delete predicates atomically in
`ParallelScannerBuilder::_load` as a single read source.
This prevents reading stale rowsets together with a set of delete
predicates from which (base) compaction has already removed entries.
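The idea of capturing both pieces as one read source can be sketched as follows. This is a toy model under stated assumptions, not the Doris implementation: `TabletModel`, `ReadSource`, `base_compact`, and `capture_read_source` are hypothetical names, and a single `std::mutex` stands in for whatever synchronization the real code uses.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a read source: the rowsets to scan plus the
// delete predicates that still have to be applied to them.
struct ReadSource {
    std::vector<std::int64_t> rowset_versions;
    std::vector<std::string> delete_predicates;
};

// Toy tablet model (not a Doris class).
class TabletModel {
public:
    void add_rowset(std::int64_t version) {
        assert(version >= 0);  // versions are non-negative in this model
        std::lock_guard<std::mutex> lock(_mutex);
        _rowset_versions.push_back(version);
    }

    void add_delete_predicate(std::string predicate) {
        std::lock_guard<std::mutex> lock(_mutex);
        _delete_predicates.push_back(std::move(predicate));
    }

    // Model of base compaction: rowsets are merged into one and the delete
    // predicates that were applied during the merge disappear.
    void base_compact(std::int64_t merged_version) {
        std::lock_guard<std::mutex> lock(_mutex);
        _rowset_versions = {merged_version};
        _delete_predicates.clear();
    }

    // Both members are copied under the same lock, so a caller never sees
    // pre-compaction rowsets paired with post-compaction predicates.
    ReadSource capture_read_source() const {
        std::lock_guard<std::mutex> lock(_mutex);
        return ReadSource{_rowset_versions, _delete_predicates};
    }

private:
    mutable std::mutex _mutex;
    std::vector<std::int64_t> _rowset_versions;
    std::vector<std::string> _delete_predicates;
};
```

A scanner built from a `ReadSource` captured before `base_compact` keeps both the old rowsets and their delete predicates; one captured afterwards sees the merged rowset and no predicates. Reading the two members in separate critical sections would allow the mixed, incorrect combination.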
2024-09-14 11:08:55 +08:00
873f70c262
[fix] (compaction) fix compaction score in time series policy ( #40242 ) ( #40779 )
...
## Proposed changes
pick from master #40242
2024-09-13 14:35:16 +08:00
7851563829
[fix](brpc_client_cache) resolve hostname in DNS cache before passing to brpc ( #40074 ) ( #40786 )
...
backport #40074
2024-09-13 14:28:01 +08:00
3395cd5ce9
[PipelineX](improvement) Prepare tasks in parallel ( #40270 )
...
2024-09-13 13:34:29 +08:00
431e2e1af9
[Enhancement](be-logger) Support custom date time format functionality in be log. ( #40727 )
...
## Proposed changes
backport #40347
2024-09-13 10:02:12 +08:00
ed76b0c5f5
[fix](compile) failed on MacOS because size_t is not uint64_t on MacOS ( #39296 ) ( #40720 )
...
pick #39296 to branch-2.1
Co-authored-by: morrySnow <101034200+morrySnow@users.noreply.github.com>
2024-09-12 18:47:59 +08:00
3604d63184
[Branch 2.1] backport systable PR (#34384,#40153,#40456,#40455,#40568) ( #40687 )
...
backport
https://github.com/apache/doris/pull/40568
https://github.com/apache/doris/pull/40455
https://github.com/apache/doris/pull/40456
https://github.com/apache/doris/pull/40153
https://github.com/apache/doris/pull/34384
Test result:
2024-09-11 11:00:45.618 INFO [suite-thread-1] (SuiteContext.groovy:309)
- Recover original connection
2024-09-11 11:00:45.619 INFO [suite-thread-1] (Suite.groovy:359) -
Execute sql: REVOKE SELECT_PRIV ON
test_partitions_schema_db.duplicate_table FROM partitions_user
2024-09-11 11:00:45.625 INFO [suite-thread-1] (SuiteContext.groovy:299)
- Create new connection for user 'partitions_user'
2024-09-11 11:00:45.632 INFO [suite-thread-1] (Suite.groovy:1162) -
Execute tag: select_check_5, sql: select
TABLE_CATALOG,TABLE_SCHEMA,TABLE_NAME,PARTITION_NAME,SUBPARTITION_NAME,PARTITION_ORDINAL_POSITION,SUBPARTITION_ORDINAL_POSITION,PARTITION_METHOD,SUBPARTITION_METHOD,PARTITION_EXPRESSION,SUBPARTITION_EXPRESSION,PARTITION_DESCRIPTION,TABLE_ROWS,AVG_ROW_LENGTH,DATA_LENGTH,MAX_DATA_LENGTH,INDEX_LENGTH,DATA_FREE,CHECKSUM,PARTITION_COMMENT,NODEGROUP,TABLESPACE_NAME
from information_schema.partitions where
table_schema="test_partitions_schema_db" order by
TABLE_CATALOG,TABLE_SCHEMA,TABLE_NAME,PARTITION_NAME,SUBPARTITION_NAME,PARTITION_ORDINAL_POSITION,SUBPARTITION_ORDINAL_POSITION,PARTITION_METHOD,SUBPARTITION_METHOD,PARTITION_EXPRESSION,SUBPARTITION_EXPRESSION,PARTITION_DESCRIPTION,TABLE_ROWS,AVG_ROW_LENGTH,DATA_LENGTH,MAX_DATA_LENGTH,INDEX_LENGTH,DATA_FREE,CHECKSUM,PARTITION_COMMENT,NODEGROUP,TABLESPACE_NAME
2024-09-11 11:00:45.644 INFO [suite-thread-1] (SuiteContext.groovy:309)
- Recover original connection
2024-09-11 11:00:45.645 INFO [suite-thread-1] (ScriptContext.groovy:120)
- Run test_partitions_schema in
/root/doris/workspace/doris/regression-test/suites/query_p0/system/test_partitions_schema.groovy
succeed
2024-09-11 11:00:45.652 INFO [main] (RegressionTest.groovy:259) - Start
to run single scripts
2024-09-11 11:01:10.321 INFO [main] (RegressionTest.groovy:380) -
Success suites:
/root/doris/workspace/doris/regression-test/suites/query_p0/system/test_partitions_schema.groovy:
group=default,p0, name=test_partitions_schema
2024-09-11 11:01:10.322 INFO [main] (RegressionTest.groovy:459) - All
suites success.
 ____   _    ____ ____  _____ ____
|  _ \ / \  / ___/ ___|| ____|  _ \
| |_) / _ \ \___ \___ \|  _| | | | |
|  __/ ___ \ ___) |__) | |___| |_| |
|_| /_/   \_\____/____/|_____|____/
2024-09-11 11:01:10.322 INFO [main] (RegressionTest.groovy:410) - Test 1
suites, failed 0 suites, fatal 0 scripts, skipped 0 scripts
2024-09-11 11:01:10.322 INFO [main] (RegressionTest.groovy:119) - Test
finished
2024-09-11 11:03:00.712 INFO [suite-thread-1] (Suite.groovy:1162) -
Execute tag: select_check_5, sql: select * from
information_schema.table_options ORDER BY
TABLE_CATALOG,TABLE_SCHEMA,TABLE_NAME,TABLE_MODEL,TABLE_MODEL_KEY,DISTRIBUTE_KEY,DISTRIBUTE_TYPE,BUCKETS_NUM,PARTITION_NUM;
2024-09-11 11:03:00.729 INFO [suite-thread-1] (SuiteContext.groovy:309)
- Recover original connection
2024-09-11 11:03:00.731 INFO [suite-thread-1] (ScriptContext.groovy:120)
- Run test_table_options in
/root/doris/workspace/doris/regression-test/suites/query_p0/system/test_table_options.groovy
succeed
2024-09-11 11:03:04.817 INFO [main] (RegressionTest.groovy:259) - Start
to run single scripts
2024-09-11 11:03:28.741 INFO [main] (RegressionTest.groovy:380) -
Success suites:
/root/doris/workspace/doris/regression-test/suites/query_p0/system/test_table_options.groovy:
group=default,p0, name=test_table_options
2024-09-11 11:03:28.742 INFO [main] (RegressionTest.groovy:459) - All
suites success.
 ____   _    ____ ____  _____ ____
|  _ \ / \  / ___/ ___|| ____|  _ \
| |_) / _ \ \___ \___ \|  _| | | | |
|  __/ ___ \ ___) |__) | |___| |_| |
|_| /_/   \_\____/____/|_____|____/
2024-09-11 11:03:28.742 INFO [main] (RegressionTest.groovy:410) - Test 1
suites, failed 0 suites, fatal 0 scripts, skipped 0 scripts
2024-09-11 11:03:28.742 INFO [main] (RegressionTest.groovy:119) - Test
finished
*************************** 7. row ***************************
PartitionId: 18035
PartitionName: p100
VisibleVersion: 2
VisibleVersionTime: 2024-09-11 10:59:28
State: NORMAL
PartitionKey: col_1
Range: [types: [INT]; keys: [83647]; ..types: [INT]; keys: [2147483647]; )
DistributionKey: pk
Buckets: 10
ReplicationNum: 1
StorageMedium: HDD
CooldownTime: 9999-12-31 15:59:59
RemoteStoragePolicy:
LastConsistencyCheckTime: NULL
DataSize: 2.872 KB
IsInMemory: false
ReplicaAllocation: tag.location.default: 1
IsMutable: true
SyncWithBaseTables: true
UnsyncTables: NULL
CommittedVersion: 2
RowCount: 4
7 rows in set (0.01 sec)
---------
Co-authored-by: Mingyu Chen <morningman.cmy@gmail.com>
2024-09-12 11:50:09 +08:00
361a59dec8
[feature](aes_encrypt) support GCM mode for aes_encrypt and aes_decrypt ( #40004 ) ( #40672 )
...
pick #40004 to branch-2.1
2024-09-11 23:28:28 +08:00
ebe031c019
[fix](inverted index) Fix match_regexp to correctly handle empty string patterns ( #40659 )
...
https://github.com/apache/doris/pull/39503
2024-09-11 18:10:33 +08:00
8708fae420
[fix](ES Catalog)Support parse single value for array column ( #40614 ) ( #40660 )
...
bp #40614
2024-09-11 17:26:48 +08:00
279c58fbc7
[branch-2.1](auto-partition) Re-add deduplication to auto partition rpc ( #40580 ) ( #40652 )
...
pick https://github.com/apache/doris/pull/40580
2024-09-11 15:34:17 +08:00
86647df45b
[fix] (inverted index) fix error result in complex compound expr ( #40630 )
...
2024-09-11 15:27:40 +08:00
b0ab02bcf4
[fix](scanner) Fix deadlock when scanner submit failed #40495 ( #40640 )
...
cherry pick from #40495
2024-09-11 10:15:29 +08:00
d6f459903d
[opt](pipeline) Distribute data evenly for passthrough local exchange ( #40637 )
...
2024-09-11 09:58:14 +08:00
4e453dc1bb
Revert "[improvement](statistics)Return -1 to nereids if report olap table row count for new table is not done for all tablets. ( #40457 )" ( #40616 )
...
Reverts apache/doris#40540
2024-09-10 17:17:13 +08:00
7d80630757
[opt](scanner profile) More counter for scanner #40144 ( #40571 )
...
cherry pick from #40144
2024-09-10 16:08:55 +08:00
e43e6e2bba
[improvement](statistics)Return -1 to nereids if report olap table row count for new table is not done for all tablets. ( #40457 ) ( #40540 )
...
backport: https://github.com/apache/doris/pull/40457
2024-09-10 12:55:53 +08:00
6961c95eca
[chore](error msg) More convenient error msg when function not found. #40296 ( #40570 )
...
cherry pick from #40296
2024-09-10 12:03:12 +08:00
185353e890
[Fix](inverted index) gc TEMP column when next_batch in segment iterator ( #40563 )
...
2024-09-10 09:56:03 +08:00
354967c09f
[branch-2.1](memory) pick reserve memory and workload group ( #40543 )
...
1. pick #38494
2. pick #39862
3. remove vdata_stream_test, which has already been removed on master
2024-09-09 21:16:06 +08:00
f69063ea87
[Fix](Variant) use unique id to access column reader ( #39841 ) ( #40269 )
...
#39841
#40295
2024-09-09 18:01:12 +08:00
8f37eccbf2
[Cherry-pick](branch-2.1) Pick "[Feature](default value) Support bitmap_empty default value ( #40364 )" ( #40487 )
...
## Proposed changes
Pick #40364
2024-09-09 16:57:38 +08:00
44a7efff4f
[branch-2.1] Picks "[Opt](delete) Skip newly inserted rows check in non-strict mode partial update if the row's delete sign is marked #40322 " ( #40383 )
...
picks https://github.com/apache/doris/pull/40322
2024-09-09 16:32:24 +08:00
314f6ae823
[fix](ES Catalog)Fix int parse error when querying by doc_values ( #40385 ) ( #40521 )
...
bp #40385
2024-09-09 14:29:21 +08:00
facce8b4d5
[fix](move-memtable) multi replica tables should tolerate minority failures ( #38003 ) ( #40477 )
...
backport #38003
2024-09-09 11:30:46 +08:00
e0b22b5104
[enhancement](schema-change) Log out end version before truncating new tablet data ( #39924 ) ( #40239 )
...
## Proposed changes
As title.
2024-09-09 10:46:41 +08:00
a963709fed
[opt](scanner) Control the degree of parallelism of scanner when only limit involved #39927 ( #40357 )
...
cherry pick from #39927
2024-09-09 10:42:19 +08:00
1c91fbc167
[fix](multi table) do not use strlen to calculate the length of msg ( #40367 ) ( #40511 )
...
pick #40367
A core dump occurs when using single-stream multi-table load:
```
SUMMARY: AddressSanitizer: heap-buffer-overflow /root/doris/be/src/io/fs/multi_table_pipe.cpp:99:22 in doris::io::MultiTablePipe::dispatch(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&, char const*, unsigned long, doris::Status (doris::io::KafkaConsumerPipe::*)(char const*, unsigned long))
```
1. It is hard to guarantee that msg is a C-style string ending in a '\0'
character. If it is not, strlen reads memory out of bounds, which may
cause a core dump.
2. There is no need to calculate the length of msg twice.
Therefore, remove the logic that uses strlen to calculate the length
of msg.
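A minimal sketch of the safer pattern, assuming the caller already knows the payload length. `copy_payload` is a hypothetical helper for illustration, not the function changed in `multi_table_pipe.cpp`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: copy a message payload whose length is supplied by
// the caller (for example, reported by the message consumer).
std::string copy_payload(const char* data, std::size_t len) {
    assert(data != nullptr || len == 0);
    // Use the reported length. Calling strlen(data) here would scan past the
    // end of the buffer whenever the payload has no trailing '\0', and would
    // stop early whenever the payload contains an embedded '\0'.
    return std::string(data, len);
}
```

With a four-byte buffer that has no terminator, `copy_payload(buf, 4)` copies exactly four bytes, whereas `strlen(buf)` has undefined behavior.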
2024-09-09 10:35:59 +08:00
2023eab11e
[Fix](ShortCircuit) consider delete sign flag when hits row ( #40300 ) ( #40408 )
...
https://github.com/apache/doris/pull/40300
2024-09-09 10:04:05 +08:00
c1abaa4679
[Bug](map) fix wrong result on map_agg with streaming agg ( #40471 )
...
pick from #40454
2024-09-06 19:29:38 +08:00
0e057c49e8
[fix](table-func) fix explode-func with old pipeline ( #40482 )
...
## Proposed changes
If a 2.0 FE is used with a 2.1 BE, where the pipeline uses the old logic,
the BE may core dump like this:
```
22:44:04 F20240905 22:31:46.818060 25429 assert_cast.h:45] Bad cast from type:doris::vectorized::ColumnVector<int>* to doris::vectorized::ColumnVector<double>*
22:44:04 *** Check failure stack trace: ***
22:44:04 @ 0x560836b66586 google::LogMessage::SendToLog()
22:44:04 @ 0x560836b62fd0 google::LogMessage::Flush()
22:44:04 @ 0x560836b66dc9 google::LogMessageFatal::~LogMessageFatal()
22:44:04 @ 0x5608197f8013 assert_cast<>()
22:44:04 @ 0x5608220349af doris::vectorized::VExplodeJsonArrayTableFunction<>::_insert_values_into_column()
22:44:04 @ 0x5608220345d9 doris::vectorized::VExplodeJsonArrayTableFunction<>::get_value()
22:44:04 @ 0x560822007812 doris::vectorized::VTableFunctionNode::_get_expanded_block()
22:44:04 @ 0x560822009506 doris::vectorized::VTableFunctionNode::pull()
22:44:04 @ 0x5608365c4cc4 _ZNSt5_BindIFMN5doris8ExecNodeEFNS0_6StatusEPNS0_12RuntimeStateEPNS0_10vectorized5BlockEPbEPNS5_18VTableFunctionNodeESt12_PlaceholderILi1EESD_ILi2EESD_ILi3EEEE6__callIS2_JOS4_OS7_OS8_EJLm0ELm1ELm2ELm3EEEET_OSt5tupleIJDpT0_EESt12_Index_tupleIJXspT1_EEE
22:44:04 @ 0x5608365c47b6 std::_Function_handler<>::_M_invoke()
22:44:04 @ 0x560810bcb5b0 doris::ExecNode::get_next_after_projects()
22:44:04 @ 0x5608365bf958 doris::pipeline::StatefulOperator<>::get_block()
22:44:04 @ 0x5608366bfe9d doris::pipeline::PipelineTask::execute()
22:44:04 @ 0x560836b3de7d doris::pipeline::TaskScheduler::_do_work()
22:44:04 @ 0x56081115a470 doris::ThreadPool::dispatch_thread()
22:44:04 @ 0x5608111399f9 doris::Thread::supervise_thread()
22:44:04 @ 0x7f43991edac3 (unknown)
```
2024-09-06 19:29:09 +08:00
cb0613e249
[fix] (inverted index) fix error result in compound query ( #40425 )
...
## Proposed changes
`select count() from table where a + b > 0 or b > 0`

- When `_execute_predicates_except_leafnode_of_andnode` runs, the Expr
tree is traversed from bottom to top. When the traversal reaches the leaf
node b, the information for column b is placed into `new_predicate_info`.
- However, this step is skipped at an ADD node, so the GT node above it
generates a sign equivalent to b > 0, identical to the sign of the
right-hand b > 0.
- As a result, the compound OR calculation assumes that both GT
conditions below it have been evaluated and computes the expression
prematurely, even though the ADD node has not been evaluated.
- If the SQL is written as `SELECT COUNT(*) FROM table WHERE b + a > 0 OR
b > 0`, the result is correct, because the sign generated by this GT node
is equivalent to a > 0, which differs from b > 0 on the right side.
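The collision described above can be reproduced in a toy model. Everything here is an assumption made for illustration: `Expr`, `col`, `lit`, `add`, `gt`, and `signature` are hypothetical and far simpler than the real segment iterator code, and the `skip_add == false` branch shows only one possible way to keep signatures distinct, not necessarily what the fix does.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

enum class Kind { Column, Literal, Add, Gt };

struct Expr {
    Kind kind;
    std::string text;  // column name or literal value; empty for operators
    std::vector<std::shared_ptr<Expr>> children;
};
using ExprPtr = std::shared_ptr<Expr>;

ExprPtr col(std::string name) {
    return std::make_shared<Expr>(Expr{Kind::Column, std::move(name), {}});
}
ExprPtr lit(std::string value) {
    return std::make_shared<Expr>(Expr{Kind::Literal, std::move(value), {}});
}
ExprPtr add(ExprPtr l, ExprPtr r) {
    return std::make_shared<Expr>(Expr{Kind::Add, "", {std::move(l), std::move(r)}});
}
ExprPtr gt(ExprPtr l, ExprPtr r) {
    return std::make_shared<Expr>(Expr{Kind::Gt, "", {std::move(l), std::move(r)}});
}

// Bottom-up signature of an expression. With skip_add == true the ADD node
// contributes nothing, so only the most recently visited leaf survives,
// which models the behavior described in the commit message.
std::string signature(const ExprPtr& e, bool skip_add) {
    switch (e->kind) {
    case Kind::Column:
    case Kind::Literal:
        return e->text;
    case Kind::Add: {
        assert(e->children.size() == 2);
        std::string l = signature(e->children[0], skip_add);
        std::string r = signature(e->children[1], skip_add);
        return skip_add ? r : "(" + l + " + " + r + ")";
    }
    case Kind::Gt:
        assert(e->children.size() == 2);
        return signature(e->children[0], skip_add) + " > " +
               signature(e->children[1], skip_add);
    }
    return {};
}
```

In this model, `a + b > 0` with `skip_add == true` yields the signature `b > 0`, colliding with the right-hand predicate, while `b + a > 0` yields `a > 0` and does not collide. With `skip_add == false` the left side becomes `(a + b) > 0`, so the OR node can no longer mistake it for an already evaluated `b > 0`.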
2024-09-06 10:27:59 +08:00
f64a728741
[enhance](variant) throw exception when field type is not supported in cast elimination ( #40448 )
...
#40388
2024-09-06 09:56:57 +08:00
7e27bb1ae6
[fix](window_funnel) fix wrong result of fixed mode ( #40460 )
...
BP #40459
2024-09-06 09:48:24 +08:00
87ac378c4a
[branch-2.1](be-ut) wait lazy open in ut ( #40453 )
...
## Proposed changes
The LRUFileCache test needs to wait until lazy open is done.
2024-09-06 09:47:47 +08:00
0928c9c6ed
[fix](unary function) Fix wrong result of asin, acos and sqrt when processing invalid input #40267 ( #40358 )
...
cherry pick from #40267
2024-09-05 19:51:01 +08:00
d9fa59be4d
[Chore](runtime-filter) avoid dcheck fail when rf merge failed ( #39172 ) ( #40409 )
...
pick from #39172
2024-09-05 14:50:47 +08:00
26feaab711
[fix](delete_predicate) fix wrong data after upgrade from v2.0 version ( #40400 )
...
pick https://github.com/apache/doris/pull/40401 to branch-2.1
related issue:
https://github.com/apache/doris/issues/40390
related pr:
https://github.com/apache/doris/pull/22442
2024-09-05 14:46:03 +08:00
cc20ecd738
Revert "[fix](compaction) fix the longest continuous rowsets cannot be selected when missing rowsets ( #38728 ) ( #39262 )" ( #40375 )
...
This reverts commit c9949f24e5c15e9529285f0e99b7ffdb1095558b.
This PR may increase the probability of full clone failures, so revert it
for now.
2024-09-05 00:01:03 +08:00
653daeb8cb
Revert "[enhancement](index compaction) Enable index compaction by default ( #36812 )" ( #40351 )
...
Reverts apache/doris#38676
2024-09-04 14:16:09 +08:00
b4beec8ea8
[fix](OrcWriter) fix be core when upgrading BE without upgrading FE ( #40303 )
...
bp: #40282
2024-09-04 10:24:41 +08:00
0e9fa3dff7
[fix](decimaltype) handle exception with tablet init ( #40263 )
...
## Proposed changes
To avoid a BE core dump like:
```
terminate called after throwing an instance of 'doris::Exception'
what(): [E6] meet invalid precision: real_precision=28, max_decimal_precision=27, min_decimal_precision=1
0# doris::Exception::Exception(int, std::basic_string_view<char, std::char_traits<char> > const&) at /home/zcp/repo_center/doris_branch-3.0/doris/be/src/common/exception.cpp:0
1# doris::Exception::Exception<unsigned int const&, unsigned long>(int, std::basic_string_view<char, std::char_traits<char> > const&, unsigned int const&, unsigned long&&) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.h:187
2# doris::vectorized::DataTypeDecimal<doris::vectorized::Decimal<__int128> >::DataTypeDecimal(unsigned int, unsigned int, unsigned int, unsigned int) at /home/zcp/repo_center/doris_branch-3.0/doris/be/src/vec/data_types/data_type_decimal.h:0
3# doris::vectorized::DataTypeFactory::create_data_type(doris::TypeDescriptor const&, bool) at /home/zcp/repo_center/doris_branch-3.0/doris/be/src/vec/data_types/data_type_factory.cpp:0
4# doris::vectorized::DataTypeFactory::create_data_type(doris::TypeDescriptor const&, bool) at /home/zcp/repo_center/doris_branch-3.0/doris/be/src/vec/data_types/data_type_factory.cpp:0
5# doris::vectorized::DataTypeFactory::create_data_type(doris::TypeDescriptor const&, bool) at /home/zcp/repo_center/doris_branch-3.0/doris/be/src/vec/data_types/data_type_factory.cpp:0
6# doris::SlotDescriptor::get_empty_mutable_column() const at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:1295
7# doris::VOlapTablePartitionParam::VOlapTablePartitionParam(std::shared_ptr<doris::OlapTableSchemaParam>&, doris::TOlapTablePartitionParam const&) at /home/zcp/repo_center/doris_branch-3.0/doris/be/src/vec/common/cow.h:154
8# doris::vectorized::VTabletWriter::_init(doris::RuntimeState*, doris::RuntimeProfile*) at /home/zcp/repo_center/doris_branch-3.0/doris/be/src/vec/sink/writer/vtablet_writer.cpp:1177
9# doris::vectorized::VTabletWriter::open(doris::RuntimeState*, doris::RuntimeProfile*) at /home/zcp/repo_center/doris_branch-3.0/doris/be/src/common/status.h:488
10# doris::vectorized::AsyncResultWriter::process_block(doris::RuntimeState*, doris::RuntimeProfile*) at /home/zcp/repo_center/doris_branch-3.0/doris/be/src/common/status.h:488
11# std::_Function_handler<void (), doris::vectorized::AsyncResultWriter::start_writer(doris::RuntimeState*, doris::RuntimeProfile*)::$_0>::_M_invoke(std::_Any_data const&) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/ext/atomicity.h:98
12# doris::ThreadPool::dispatch_thread() at /home/zcp/repo_center/doris_branch-3.0/doris/be/src/util/threadpool.cpp:0
13# doris::Thread::supervise_thread(void*) at /var/local/ldb-toolchain/bin/../usr/include/pthread.h:562
14# ?
```
2024-09-03 14:38:16 +08:00
67a2daf33c
[fix](execution) fix wrong check when blocking result sink ( #40289 )
...
Otherwise, when the client stops fetching result data, the result sink
keeps adding batches to the result queue,
which causes BE OOM.
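The intended backpressure can be sketched with a bounded queue. This is a simplified, single-threaded model under stated assumptions: `BoundedResultQueue`, `can_write`, `try_add_batch`, and `fetch` are hypothetical names, and the real result sink and its blocking check are not shown in this commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <utility>

// Toy result queue with a capacity limit (not a Doris class).
class BoundedResultQueue {
public:
    explicit BoundedResultQueue(std::size_t capacity) : _capacity(capacity) {
        assert(capacity > 0);
    }

    // The sink should block (stop producing) while this returns false.
    bool can_write() const { return _batches.size() < _capacity; }

    // Refuses the batch instead of growing without bound when the client
    // has stopped fetching.
    bool try_add_batch(std::string batch) {
        if (!can_write()) {
            return false;
        }
        _batches.push_back(std::move(batch));
        return true;
    }

    // Client side: take the oldest batch, if any.
    bool fetch(std::string* out) {
        if (_batches.empty()) {
            return false;
        }
        *out = std::move(_batches.front());
        _batches.pop_front();
        return true;
    }

    std::size_t size() const { return _batches.size(); }

private:
    std::size_t _capacity;
    std::deque<std::string> _batches;
};
```

If the blocking check is wrong and always reports that writing is allowed, the queue in this model would grow for as long as the producer runs, which corresponds to the OOM described above.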
2024-09-03 14:17:42 +08:00
41271ecba0
[fix](ES Catalog)Do not push down limit to ES when predicates can not be processed by ES. ( #40111 ) ( #40265 )
...
bp #40111
2024-09-03 11:17:24 +08:00