Commit Graph

7755 Commits

Author SHA1 Message Date
e66ffc1b6d [branch-2.1](arrow-flight-sql) Fix pipelineX Unknown result sink type (#37540)
pick #35804
2024-07-11 12:30:46 +08:00
9f4e7346fb [fix](compaction) fixing the inaccurate statistics of concurrent compaction tasks (#37318) (#37496) 2024-07-10 22:23:25 +08:00
741807bb22 [performance](move-memtable) only call _select_streams when necessary (#35576) (#37406)
cherry-pick #35576
2024-07-10 22:20:23 +08:00
afcc6170f6 [fix](txn_manager) Add ingested rowsets to unused rowsets when removing txn (#37417)
Generally speaking, as long as a rowset has a version, it can be
considered not to be in a pending state. However, if the rowset was
created through ingesting binlogs, it will have a version but should
still be considered in a pending state because the ingesting txn has not
yet been committed.

This PR updates the condition for determining the pending state. If a
rowset is COMMITTED, the txn should be allowed to roll back even if a
version exists.

Cherry-pick #36551
2024-07-10 14:25:44 +08:00
5247e0ff3a [fix](clone) Increase robustness for clone #36642 (#37413)
cherry pick from #36642
2024-07-09 17:18:14 +08:00
7cda8db020 [fix](load) The NodeChannel should be canceled when failed to add block #37500 (#37527)
cherry pick from #37500
2024-07-09 17:01:04 +08:00
f7f0c20f00 [branch-2.1](cgroup memory) Correct cgroup mem info cache (#37440)
pick #36966

Co-authored-by: Hongkun Xu <xuhongkun666@163.com>
2024-07-09 16:19:37 +08:00
005304953e [performance](load) do not copy input_block in memtable (#36939) (#37407)
cherry-pick #36939
2024-07-09 15:59:44 +08:00
81360cf897 [opt](test) shorten the external p0 running time (#37320) (#37473)
bp #37320
2024-07-09 15:35:15 +08:00
3337c1bbe3 [enhancement](compaction) adjust compaction concurrency based on compaction score and workload (#37491)
adjust compaction concurrency based on compaction score and workload
#36672
fix null pointer when retrieving CPU load average #37171
2024-07-09 09:56:35 +08:00
1e3ab0ff8c [fix](group commit) Pick make group commit cancel in time (#36249) (#37404)
pick https://github.com/apache/doris/pull/36249/
2024-07-09 09:25:11 +08:00
1a25270918 [fix](group commit) Pick Fix the incorrect group commit count in log; fix the core in get_first_block (#36408) (#37405)
Pick https://github.com/apache/doris/pull/36408/
2024-07-09 09:24:43 +08:00
0a103aa11f [improve](json)improve json support empty keys #36762 (#37351) 2024-07-08 19:04:51 +08:00
5280e277e7 [chore](be) Acquire and check MD5 digest of the file to download (#37418)
Cherry-pick #35807, #36621, #36726
2024-07-08 18:55:35 +08:00
494b54a5a5 [enhancement](trash) support skip trash, update trash default expire time (#37170) (#37409)
cherry-pick #37170
2024-07-08 15:33:02 +08:00
c66df8d9e6 [branch-2.1](load) fix no error url if no partition can be found (#36831) (#37401)
## Proposed changes

pick #36831

before
```
Stream load result: {
    "TxnId": 2014,
    "Label": "83ba46bd-280c-4e22-b581-4eb126fd49cf",
    "Comment": "",
    "TwoPhaseCommit": "false",
    "Status": "Fail",
    "Message": "[DATA_QUALITY_ERROR]Encountered unqualified data, stop processing",
    "NumberTotalRows": 1,
    "NumberLoadedRows": 1,
    "NumberFilteredRows": 0,
    "NumberUnselectedRows": 0,
    "LoadBytes": 1669,
    "LoadTimeMs": 58,
    "BeginTxnTimeMs": 0,
    "StreamLoadPutTimeMs": 10,
    "ReadDataTimeMs": 0,
    "WriteDataTimeMs": 47,
    "CommitAndPublishTimeMs": 0
}
```

after
```
Stream load result: {
    "TxnId": 2014,
    "Label": "83ba46bd-280c-4e22-b581-4eb126fd49cf",
    "Comment": "",
    "TwoPhaseCommit": "false",
    "Status": "Fail",
    "Message": "[DATA_QUALITY_ERROR]too many filtered rows",
    "NumberTotalRows": 1,
    "NumberLoadedRows": 0,
    "NumberFilteredRows": 1,
    "NumberUnselectedRows": 0,
    "LoadBytes": 1669,
    "LoadTimeMs": 58,
    "BeginTxnTimeMs": 0,
    "StreamLoadPutTimeMs": 10,
    "ReadDataTimeMs": 0,
    "WriteDataTimeMs": 47,
    "CommitAndPublishTimeMs": 0,
    "ErrorURL": "http://XXXX:8040/api/_load_error_log?file=__shard_4/error_log_insert_stmt_c6461270125a615b-2873833fb48d56a3_c6461270125a615b_2873833fb48d56a3"
}
```

2024-07-08 10:41:33 +08:00
dd18652861 [branch-2.1](routine-load) make get Kafka meta timeout configurable (#37399)
pick #36619
2024-07-08 10:39:17 +08:00
70f46c12b3 [improve](group commit) Pick Modify group commit case and modify cancel status (#35995) (#37398)
Pick https://github.com/apache/doris/pull/35995
2024-07-08 10:27:08 +08:00
a05406ecc9 [branch-2.1] Picks "[Fix](delete) Fix delete job timeout when executing delete from ... #37363" (#37374)
## Proposed changes

picks https://github.com/apache/doris/pull/37363
2024-07-07 18:33:17 +08:00
423483ed8f [branch-2.1](routine-load) optimize out of range error message (#37391)
## Proposed changes
pick #36450

before
```
ErrorReason{code=errCode = 105, msg='be 10002 abort task, task id: d846f3d3-7c9e-44a7-bee0-3eff8cd11c6f job id: 11310 with reason: [INTERNAL_ERROR]Offset out of range,

        0#  doris::Status doris::Status::Error<6, true>(std::basic_string_view<char, std::char_traits<char> >) at /mnt/disk1/laihui/doris/be/src/common/status.h:422
        1#  doris::Status doris::Status::InternalError<true>(std::basic_string_view<char, std::char_traits<char> >) at /mnt/disk1/laihui/doris/be/src/common/status.h:468
        2#  doris::KafkaDataConsumer::group_consume(doris::BlockingQueue<RdKafka::Message*>*, long) at /mnt/disk1/laihui/doris/be/src/runtime/routine_load/data_consumer.cpp:226
        3#  doris::KafkaDataConsumerGroup::actual_consume(std::shared_ptr<doris::DataConsumer>, doris::BlockingQueue<RdKafka::Message*>*, long, std::function<void (doris::Status const&)>) at /mnt/disk1/laihui/doris/be/src/runtime/routine_load/data_consumer_group.cpp:200
        4#  void std::__invoke_impl<void, void (doris::KafkaDataConsumerGroup::*&)(std::shared_ptr<doris::DataConsumer>, doris::BlockingQueue<RdKafka::Message*>*, long, std::function<void (doris::Status const&)>), doris::KafkaDataConsumerGroup*&, std::shared_ptr<doris::DataConsumer>&, doris::BlockingQueue<RdKafka::Message*>*&, long&, doris::KafkaDataConsumerGroup::start_all(std::shared_ptr<doris::StreamLoadContext>, std::shared_ptr<doris::io::KafkaConsumerPipe>)::$_0&>(std::__invoke_memfun_deref, void (doris::KafkaDataConsumerGroup::*&)(std::shared_ptr<doris::DataConsumer>, doris::BlockingQueue<RdKafka::Message*>*, long, std::function<void (doris::Status const&)>), doris::KafkaDataConsumerGroup*&, std::shared_ptr<doris::DataConsumer>&, doris::BlockingQueue<RdKafka::Message*>*&, long&, doris::KafkaDataConsumerGroup::start_all(std::shared_ptr<doris::StreamLoadContext>, std::shared_ptr<doris::io::KafkaConsumerPipe>)::$_0&) at /mnt/disk1/laihui/build/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:74
...
```

now
```
ErrorReason{code=errCode = 105, msg='be 10002 abort task, task id: 3ba0c0f4-d13c-4dfa-90ce-3df922fd9340 job id: 11310 with reason: [INTERNAL_ERROR]Offset out of range, consume partition 0, consume offset 100, the offset used by job does not exist in kafka, please check the offset, using the Alter ROUTINE LOAD command to modify it, and resume the job'}
```

2024-07-07 18:29:04 +08:00
89857d3780 [cherry-pick](branch-2.1) Pick "Use async group commit rpc call (#36499)" (#37380)

Pick #36499
2024-07-07 18:28:19 +08:00
7d423b3a6a [cherry-pick](branch-2.1) Pick "[Fix](group commit) Fix group commit block queue mem estimate fault" (#37379)
Pick [Fix](group commit) Fix group commit block queue mem estimate fault
#35314


**Problem:** When `group commit=async_mode` and NULL data is imported
into a `variant` type column, it causes incorrect memory statistics for
group commit backpressure, leading to a stuck issue. **Cause:** In group
commit mode, blocks are first added to a queue in batches using `add
block`, and then blocks are retrieved from the queue using `get block`.
To track memory usage during backpressure, we add the block size to the
memory statistics during `add block` and subtract the block size from
the memory statistics during `get block`. However, for `variant` types,
during the `add block` write to WAL, serialization occurs, which can
merge types (e.g., merging `int` and `bigint` into `bigint`), thereby
changing the block size. This results in a discrepancy between the block
size during `get block` and `add block`, causing memory statistics to
overflow.
**Solution:** Record the block size at the time of `add block` and use
this recorded size during `get block` instead of the actual block size.
This ensures consistency in the memory addition and subtraction.

2024-07-07 18:27:49 +08:00
61bc624938 [branch-2.1](move-memtable) fix move memtable core when use multi table load (#37370)
## Proposed changes

pick https://github.com/apache/doris/pull/35458
2024-07-07 18:25:00 +08:00
f2693152bb [fix](multi-table-load) fix be core when multi table load pipe finish fail (#37383)
pick #36269
2024-07-07 18:24:16 +08:00
c399a0e216 [opt](inverted index) reduce generation of the rowid_result if not necessary #35357 (#36569) 2024-07-06 21:33:03 +08:00
38b3870fe8 [branch-2.1] Picks "[fix](autoinc) Fix AutoIncrementGenerator and add more logs about auto-increment column #37306" (#37366)
## Proposed changes

picks https://github.com/apache/doris/pull/37306
2024-07-06 16:53:29 +08:00
a803e1493a [pipeline](fix) Set upstream operators always runnable once source operator closed (#37297) (#37325)

Some kinds of source operators have a 1-1 relationship with a sink
operator (such as AnalyticOperator). We must ensure AnalyticSinkOperator
will not be blocked once AnalyticSourceOperator is already closed.

pick #37297
2024-07-05 13:54:34 +08:00
f8cee439b6 [feature](ES Catalog) map nested/object type in ES to JSON type in Doris (#37101) (#37182)
backport #37101
2024-07-05 10:48:32 +08:00
c8978fc9d1 [fix](HadoopLz4BlockCompression) Fix the bug that HadoopLz4BlockCompression creates a _decompressor on every decompression (#37187) (#37299)
bp : #37187
2024-07-04 20:22:27 +08:00
Pxl
e2c2702dff [Bug](runtime-filter) fix some rf error problems (#37155)
## Proposed changes
pick from #37273
2024-07-04 20:03:46 +08:00
b272247a57 [pick]log thread num (#37258)
## Proposed changes

pick #37159
2024-07-04 15:27:52 +08:00
ceef9ee123 [feature](serde) support presto compatible output format (#37039) (#37253)
bp #37039
2024-07-04 13:56:05 +08:00
fb344b66ca [fix](hash join) fix numeric overflow when calculating hash table bucket size #37193 (#37213)
## Proposed changes

Bp #37193
2024-07-04 11:12:52 +08:00
4532ba990a [fix](pipeline) Avoid to close task twice (#36747) (#37115) 2024-07-04 10:02:56 +08:00
Pxl
70e1c563b3 [Chore](runtime-filter) enlarge sync filter size rpc timeout limit (#37103) (#37225)
pick from #37103
2024-07-03 21:02:26 +08:00
Pxl
ffc57c9ef4 [Bug](runtime-filter) fix brpc ctrl use after free (#37223)
part of #35186
2024-07-03 21:01:50 +08:00
97945af947 [fix](merge-on-write) when full clone failed, duplicate key might occur (#37001) (#37229)
cherry-pick #37001
2024-07-03 19:48:10 +08:00
0aeb768bf9 [Fix](export/outfile) Support compression when exporting data to Parquet / ORC. (#37167)
bp: #36490
2024-07-03 10:53:57 +08:00
bd24a8bdd9 [Fix](csv_reader) Add a session variable to control whether empty rows in CSV files are read as NULL values (#37153)
bp: #36668
2024-07-02 22:12:17 +08:00
e25717458e [opt](catalog) add some profile for parquet reader and change meta cache config (#37040) (#37146)
bp #37040
2024-07-02 20:58:43 +08:00
f5572ac732 [pick]reset memtable flush thread num (#37092)
## Proposed changes

pick #37028
2024-07-02 19:20:17 +08:00
239bc1a7e0 [fix](compile) fix compile failed on MacOS due to ambiguous std::abs (#37136)
cherry-pick #35125 to branch-2.1

Co-authored-by: morrySnow <101034200+morrySnow@users.noreply.github.com>
2024-07-02 17:45:33 +08:00
f5d0cdeeb4 [branch-2.1] avoid glog coredump when running with ASAN (#37134)
## Proposed changes

This is just a workaround to try to avoid coredumps like this:
```
#0 0x56414f0e8ed1 in __asan::CheckUnwind() crtstuff.c
    #1 0x56414f1009a2 in __sanitizer::CheckFailed(char const*, int, char const*, unsigned long long, unsigned long long) crtstuff.c
    #2 0x56414f0ecbf3 in __asan::AsanThread::GetStackFrameAccessByAddr(unsigned long, __asan::AsanThread::StackFrameAccess*) crtstuff.c
    #3 0x56414f050d87 in __asan::AddressDescription::AddressDescription(unsigned long, unsigned long, bool) crtstuff.c
    #4 0x56414f052a73 in __asan::ErrorGeneric::ErrorGeneric(unsigned int, unsigned long, unsigned long, unsigned long, unsigned long, bool, unsigned long) crtstuff.c
    #5 0x56414f0e6a9e in __asan::ReportGenericError(unsigned long, unsigned long, unsigned long, unsigned long, bool, unsigned long, unsigned int, bool) crtstuff.c
    #6 0x56414f066885 in gmtime_r (/mnt/hdd01/ci/branch21-deploy/be/lib/doris_be+0x17ef3885) (BuildId: f58eb5e327529636)
    #7 0x564177940521 in google::LogMessage::Init(char const*, int, int, void (google::LogMessage::*)()) crtstuff.c
    #8 0x564151de36fc in doris::Status doris::ThriftRpcHelper::rpc(std::__cxx11::basic_string, std::allocator> const&, int, std::function&)>, int) /home/zcp/repo_center/doris_branch-2.1/doris/be/src/util/thrift_rpc_helper.cpp:76:13
    #9 0x56417603cda7 in doris::vectorized::VRowDistribution::automatic_create_partition() /home/zcp/repo_center/doris_branch-2.1/doris/be/src/vec/sink/vrow_distribution.cpp:99:5
    #10 0x56417614cffa in doris::vectorized::VTabletWriter::_send_new_partition_batch() /home/zcp/repo_center/doris_branch-2.1/doris/be/src/vec/sink/writer/vtablet_writer.cpp:1346:9
....
```
2024-07-02 17:45:04 +08:00
d0eea3886d [fix](multi-catalog) Revert #36575 and check nullptr of data column (#37086)
Revert #36575, because `VScanner::get_block` will check
`DCHECK(block->rows() == 0)`, so the block should be cleared when `eof =
true`.
2024-07-02 15:32:52 +08:00
7443e8fcf2 [cherry-pick](branch-2.1) fix single compaction test p2 #34568 #36881 (#37075) 2024-07-02 15:22:04 +08:00
859a7c80b8 [fix](parquet) prevent parquet page reader print much warning logs (#37012)
bp #37011
2024-07-02 14:33:01 +08:00
6789f5bc80 [fix](null safe equal join) fix coredump if both sides of the conjunct is not nullable #36263 (#37073) 2024-07-02 11:01:55 +08:00
e686e85f27 [opt](split) add max wait time of getting splits (#36842)
bp: #36843
2024-07-01 22:05:25 +08:00
72c20d3ccc [branch-2.1](function) fix date_format and from_unixtime core when meet long format string (#35883) (#36158)
pick #35883
2024-07-01 20:35:31 +08:00
798d9d6fc6 [pick21][opt](mow) reduce memory usage for mow table compaction (#36865) (#36968)
cherry-pick https://github.com/apache/doris/pull/36865 to branch-2.1
2024-07-01 15:33:18 +08:00