Commit Graph

7932 Commits

Author SHA1 Message Date
ef031c5fb2 [branch-2.1](memory) Fix reserve memory compatible with memory GC and logging (#37682)
pick
#36307
#36412
2024-07-12 11:43:26 +08:00
4dc933bb28 [cherry-pick] (branch-2.1) fix query errors caused by ignore_above (#37685)
## Proposed changes
pick from master #37679
2024-07-12 09:31:45 +08:00
87912de93f [fix](scan) catch exceptions thrown in scanner (#36101) (#37408)
## Proposed changes

pick #36101

Uncaught exceptions thrown in the scanner cause the BE to crash.
2024-07-12 08:49:39 +08:00
79a208259e [cherry-pick] (branch-2.1) Remove the check for inverted index file exists #36945 (#37423) 2024-07-11 21:35:52 +08:00
217eac790b [pick](Variant) pick some refactor and fix #34925 #36317 #36201 #36793 (#37526) 2024-07-11 21:25:34 +08:00
cf2fb6945a [branch-2.1](memory) Refactor LRU cache policy memory tracking (#37658)
pick 
#36235
#35965
2024-07-11 21:04:01 +08:00
62e0230523 [branch-2.1](memory) Add ThreadMemTrackerMgr BE UT (#37654)
## Proposed changes

pick #35518
2024-07-11 21:03:49 +08:00
fed632bf4a [fix](move-memtable) check segment num when closing each tablet (#36753) (#37536)
cherry-pick #36753 and #37660
2024-07-11 20:33:44 +08:00
e66ffc1b6d [branch-2.1](arrow-flight-sql) Fix pipelineX Unknown result sink type (#37540)
pick #35804
2024-07-11 12:30:46 +08:00
9f4e7346fb [fix](compaction) fixing the inaccurate statistics of concurrent compaction tasks (#37318) (#37496) 2024-07-10 22:23:25 +08:00
741807bb22 [performance](move-memtable) only call _select_streams when necessary (#35576) (#37406)
cherry-pick #35576
2024-07-10 22:20:23 +08:00
afcc6170f6 [fix](txn_manager) Add ingested rowsets to unused rowsets when removing txn (#37417)
Generally speaking, as long as a rowset has a version, it can be
considered not to be in a pending state. However, if the rowset was
created through ingesting binlogs, it will have a version but should
still be considered in a pending state because the ingesting txn has not
yet been committed.

This PR updates the condition for determining the pending state. If a
rowset is COMMITTED, the txn should be allowed to roll back even if a
version exists.

Cherry-pick #36551
2024-07-10 14:25:44 +08:00
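The pending-state rule described in afcc6170f6 can be read as a small predicate. The following is a minimal Python sketch of that reading, not the Doris BE code (which is C++). The `Rowset` class, the `RowsetState` enum, and `is_pending` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class RowsetState(Enum):
    # Hypothetical states for this sketch only.
    PREPARED = 0
    COMMITTED = 1
    VISIBLE = 2


@dataclass
class Rowset:
    state: RowsetState
    # (start, end) version range, or None if no version was assigned yet.
    version: Optional[Tuple[int, int]] = None


def is_pending(rowset: Rowset) -> bool:
    """Return True if the owning txn should still be allowed to roll back."""
    # Old rule: a rowset without a version is pending.
    if rowset.version is None:
        return True
    # Updated rule: a rowset created by ingesting binlogs already has a
    # version, but it is still pending while it is only COMMITTED.
    return rowset.state is RowsetState.COMMITTED
```

Under this reading, a binlog-ingested rowset such as `Rowset(RowsetState.COMMITTED, (5, 5))` is still treated as pending, while the same rowset in the `VISIBLE` state is not.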
0cdb371624 [branch-2.1](memory) Disable Arrow Jemalloc step 2 (#37556)
pick #37533
2024-07-10 11:34:18 +08:00
5247e0ff3a [fix](clone) Increase robustness for clone #36642 (#37413)
cherry pick from #36642
2024-07-09 17:18:14 +08:00
7cda8db020 [fix](load) The NodeChannel should be canceled when failed to add block #37500 (#37527)
cherry pick from #37500
2024-07-09 17:01:04 +08:00
f7f0c20f00 [branch-2.1](cgroup memory) Correct cgroup mem info cache (#37440)
pick #36966

Co-authored-by: Hongkun Xu <xuhongkun666@163.com>
2024-07-09 16:19:37 +08:00
005304953e [performance](load) do not copy input_block in memtable (#36939) (#37407)
cherry-pick #36939
2024-07-09 15:59:44 +08:00
81360cf897 [opt](test) shorten the external p0 running time (#37320) (#37473)
bp #37320
2024-07-09 15:35:15 +08:00
3337c1bbe3 [enhancement](compaction) adjust compaction concurrency based on compaction score and workload (#37491)
adjust compaction concurrency based on compaction score and workload
#36672
fix null pointer when retrieving CPU load average #37171
2024-07-09 09:56:35 +08:00
1e3ab0ff8c [fix](group commit) Pick make group commit cancel in time (#36249) (#37404)
pick https://github.com/apache/doris/pull/36249/
2024-07-09 09:25:11 +08:00
1a25270918 [fix](group commit) Pick Fix the incorrect group commit count in log; fix the core in get_first_block (#36408) (#37405)
Pick https://github.com/apache/doris/pull/36408/
2024-07-09 09:24:43 +08:00
0a103aa11f [improve](json)improve json support empty keys #36762 (#37351) 2024-07-08 19:04:51 +08:00
5280e277e7 [chore](be) Acquire and check MD5 digest of the file to download (#37418)
Cherry-pick #35807, #36621, #36726
2024-07-08 18:55:35 +08:00
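The idea behind 5280e277e7, comparing the MD5 digest of a downloaded file against the digest reported by the source, can be sketched in a few lines. This is an illustrative Python sketch, not the BE implementation. The function names `md5_of_file` and `verify_download` are hypothetical, and how the expected digest is obtained is left out because the commit message does not describe it.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path: str, expected_md5: str) -> bool:
    """Return True if the local file matches the expected hex digest."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

A caller would typically discard and re-download the file when `verify_download` returns `False`.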
494b54a5a5 [enhancement](trash) support skip trash, update trash default expire time (#37170) (#37409)
cherry-pick #37170
2024-07-08 15:33:02 +08:00
c66df8d9e6 [branch-2.1](load) fix no error url if no partition can be found (#36831) (#37401)
## Proposed changes

pick #36831

before
```
Stream load result: {
    "TxnId": 2014,
    "Label": "83ba46bd-280c-4e22-b581-4eb126fd49cf",
    "Comment": "",
    "TwoPhaseCommit": "false",
    "Status": "Fail",
    "Message": "[DATA_QUALITY_ERROR]Encountered unqualified data, stop processing",
    "NumberTotalRows": 1,
    "NumberLoadedRows": 1,
    "NumberFilteredRows": 0,
    "NumberUnselectedRows": 0,
    "LoadBytes": 1669,
    "LoadTimeMs": 58,
    "BeginTxnTimeMs": 0,
    "StreamLoadPutTimeMs": 10,
    "ReadDataTimeMs": 0,
    "WriteDataTimeMs": 47,
    "CommitAndPublishTimeMs": 0
}
```

after
```
Stream load result: {
    "TxnId": 2014,
    "Label": "83ba46bd-280c-4e22-b581-4eb126fd49cf",
    "Comment": "",
    "TwoPhaseCommit": "false",
    "Status": "Fail",
    "Message": "[DATA_QUALITY_ERROR]too many filtered rows",
    "NumberTotalRows": 1,
    "NumberLoadedRows": 0,
    "NumberFilteredRows": 1,
    "NumberUnselectedRows": 0,
    "LoadBytes": 1669,
    "LoadTimeMs": 58,
    "BeginTxnTimeMs": 0,
    "StreamLoadPutTimeMs": 10,
    "ReadDataTimeMs": 0,
    "WriteDataTimeMs": 47,
    "CommitAndPublishTimeMs": 0,
    "ErrorURL": "http://XXXX:8040/api/_load_error_log?file=__shard_4/error_log_insert_stmt_c6461270125a615b-2873833fb48d56a3_c6461270125a615b_2873833fb48d56a3"
}
```

2024-07-08 10:41:33 +08:00
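With the change in c66df8d9e6, a failed stream load reports an `ErrorURL` field even when no partition can be found. A client could pick that field out of the result as sketched below. This Python sketch only assumes the field names visible in the output above (`Status`, `Message`, `ErrorURL`); the function name `summarize_stream_load` is hypothetical.

```python
import json
from typing import Optional, Tuple


def summarize_stream_load(result_text: str) -> Tuple[bool, str, Optional[str]]:
    """Return (failed, message, error_url) for a stream load result.

    Accepts the raw JSON or the JSON preceded by a label such as
    "Stream load result:", as shown in the logs above.
    """
    start = result_text.index("{")
    result = json.loads(result_text[start:])
    failed = result.get("Status") == "Fail"
    # ErrorURL is absent in the "before" output, so treat it as optional.
    return failed, result.get("Message", ""), result.get("ErrorURL")
```

When `error_url` is not `None`, fetching it should show the rows that were filtered out.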
dd18652861 [branch-2.1](routine-load) make get Kafka meta timeout configurable (#37399)
pick #36619
2024-07-08 10:39:17 +08:00
70f46c12b3 [improve](group commit) Pick Modify group commit case and modify cancel status (#35995) (#37398)
Pick https://github.com/apache/doris/pull/35995
2024-07-08 10:27:08 +08:00
a05406ecc9 [branch-2.1] Picks "[Fix](delete) Fix delete job timeout when executing delete from ... #37363" (#37374)
## Proposed changes

picks https://github.com/apache/doris/pull/37363
2024-07-07 18:33:17 +08:00
423483ed8f [branch-2.1](routine-load) optimize out of range error message (#37391)
## Proposed changes
pick #36450

before
```
ErrorReason{code=errCode = 105, msg='be 10002 abort task, task id: d846f3d3-7c9e-44a7-bee0-3eff8cd11c6f job id: 11310 with reason: [INTERNAL_ERROR]Offset out of range,

        0#  doris::Status doris::Status::Error<6, true>(std::basic_string_view<char, std::char_traits<char> >) at /mnt/disk1/laihui/doris/be/src/common/status.h:422
        1#  doris::Status doris::Status::InternalError<true>(std::basic_string_view<char, std::char_traits<char> >) at /mnt/disk1/laihui/doris/be/src/common/status.h:468
        2#  doris::KafkaDataConsumer::group_consume(doris::BlockingQueue<RdKafka::Message*>*, long) at /mnt/disk1/laihui/doris/be/src/runtime/routine_load/data_consumer.cpp:226
        3#  doris::KafkaDataConsumerGroup::actual_consume(std::shared_ptr<doris::DataConsumer>, doris::BlockingQueue<RdKafka::Message*>*, long, std::function<void (doris::Status const&)>) at /mnt/disk1/laihui/doris/be/src/runtime/routine_load/data_consumer_group.cpp:200
        4#  void std::__invoke_impl<void, void (doris::KafkaDataConsumerGroup::*&)(std::shared_ptr<doris::DataConsumer>, doris::BlockingQueue<RdKafka::Message*>*, long, std::function<void (doris::Status const&)>), doris::KafkaDataConsumerGroup*&, std::shared_ptr<doris::DataConsumer>&, doris::BlockingQueue<RdKafka::Message*>*&, long&, doris::KafkaDataConsumerGroup::start_all(std::shared_ptr<doris::StreamLoadContext>, std::shared_ptr<doris::io::KafkaConsumerPipe>)::$_0&>(std::__invoke_memfun_deref, void (doris::KafkaDataConsumerGroup::*&)(std::shared_ptr<doris::DataConsumer>, doris::BlockingQueue<RdKafka::Message*>*, long, std::function<void (doris::Status const&)>), doris::KafkaDataConsumerGroup*&, std::shared_ptr<doris::DataConsumer>&, doris::BlockingQueue<RdKafka::Message*>*&, long&, doris::KafkaDataConsumerGroup::start_all(std::shared_ptr<doris::StreamLoadContext>, std::shared_ptr<doris::io::KafkaConsumerPipe>)::$_0&) at /mnt/disk1/laihui/build/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:74
...
```

now
```
ErrorReason{code=errCode = 105, msg='be 10002 abort task, task id: 3ba0c0f4-d13c-4dfa-90ce-3df922fd9340 job id: 11310 with reason: [INTERNAL_ERROR]Offset out of range, consume partition 0, consume offset 100, the offset used by job does not exist in kafka, please check the offset, using the Alter ROUTINE LOAD command to modify it, and resume the job'}
```

2024-07-07 18:29:04 +08:00
89857d3780 [cherry-pick](branch-2.1) Pick "Use async group commit rpc call (#36499)" (#37380)
Pick #36499
2024-07-07 18:28:19 +08:00
7d423b3a6a [cherry-pick](branch-2.1) Pick "[Fix](group commit) Fix group commit block queue mem estimate fault" (#37379)
Pick [Fix](group commit) Fix group commit block queue mem estimate fault #35314

**Problem:** When `group commit=async_mode` is set and NULL data is
imported into a `variant` type column, the memory statistics used for
group commit backpressure become incorrect, and the load gets stuck.

**Cause:** In group commit mode, blocks are first added to a queue in
batches using `add block`, and then retrieved from the queue using
`get block`. To track memory usage for backpressure, the block size is
added to the memory statistics during `add block` and subtracted during
`get block`. However, for `variant` types, the write to WAL during
`add block` serializes the block, which can merge types (e.g., merging
`int` and `bigint` into `bigint`) and thereby change the block size. The
block size seen by `get block` then differs from the size seen by
`add block`, so the memory statistics overflow.

**Solution:** Record the block size at the time of `add block` and use
this recorded size during `get block` instead of the actual block size.
This keeps the additions and subtractions consistent.

2024-07-07 18:27:49 +08:00
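The fix described in 7d423b3a6a, recording the block size at `add block` time and subtracting that same recorded size at `get block` time, can be sketched as follows. This is a simplified Python illustration under stated assumptions, not the BE code: the `BlockQueue` class and its members are hypothetical names, and a `bytearray` stands in for a block whose size changes after it was enqueued.

```python
from collections import deque


class BlockQueue:
    """Queue that tracks memory using the size recorded at enqueue time."""

    def __init__(self) -> None:
        self._queue = deque()
        self.tracked_bytes = 0

    def add_block(self, block: bytearray) -> None:
        # Record the size once, before any later step (such as WAL
        # serialization in the commit message) can change the block.
        recorded = len(block)
        self.tracked_bytes += recorded
        self._queue.append((block, recorded))

    def get_block(self) -> bytearray:
        block, recorded = self._queue.popleft()
        # Subtract the recorded size, not len(block), so that every
        # addition is matched by an equal subtraction.
        self.tracked_bytes -= recorded
        return block
```

If the block grows between the two calls, subtracting `len(block)` in `get_block` would drive the counter below zero, which corresponds to the overflow described in the commit message. Subtracting the recorded size always returns the counter to its earlier value.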
61bc624938 [branch-2.1](move-memtable) fix move memtable core when use multi table load (#37370)
## Proposed changes

pick https://github.com/apache/doris/pull/35458
2024-07-07 18:25:00 +08:00
f2693152bb [fix](multi-table-load) fix be core when multi table load pipe finish fail (#37383)
pick #36269
2024-07-07 18:24:16 +08:00
c399a0e216 [opt](inverted index) reduce generation of the rowid_result if not necessary #35357 (#36569) 2024-07-06 21:33:03 +08:00
38b3870fe8 [branch-2.1] Picks "[fix](autoinc) Fix AutoIncrementGenerator and add more logs about auto-increment column #37306" (#37366)
## Proposed changes

picks https://github.com/apache/doris/pull/37306
2024-07-06 16:53:29 +08:00
a803e1493a [pipeline](fix) Set upstream operators always runnable once source operator closed (#37297) (#37325)

Some kinds of source operators have a one-to-one relationship with a
sink operator (such as AnalyticOperator). We must ensure that
AnalyticSinkOperator is not blocked if AnalyticSourceOperator is already
closed.

pick #37297
2024-07-05 13:54:34 +08:00
f8cee439b6 [feature](ES Catalog) map nested/object type in ES to JSON type in Doris (#37101) (#37182)
backport #37101
2024-07-05 10:48:32 +08:00
c8978fc9d1 [fix](HadoopLz4BlockCompression) Fixed the bug that HadoopLz4BlockCompression creates _decompressor every time it decompresses. (#37187) (#37299)
bp : #37187
2024-07-04 20:22:27 +08:00
Pxl
e2c2702dff [Bug](runtime-filter) fix some rf error problems (#37155)
## Proposed changes
pick from #37273
2024-07-04 20:03:46 +08:00
b272247a57 [pick]log thread num (#37258)
## Proposed changes

pick #37159
2024-07-04 15:27:52 +08:00
ceef9ee123 [feature](serde) support presto compatible output format (#37039) (#37253)
bp #37039
2024-07-04 13:56:05 +08:00
fb344b66ca [fix](hash join) fix numeric overflow when calculating hash table bucket size #37193 (#37213)
## Proposed changes

Bp #37193
2024-07-04 11:12:52 +08:00
4532ba990a [fix](pipeline) Avoid to close task twice (#36747) (#37115) 2024-07-04 10:02:56 +08:00
Pxl
70e1c563b3 [Chore](runtime-filter) enlarge sync filter size rpc timeout limit (#37103) (#37225)
pick from #37103
2024-07-03 21:02:26 +08:00
Pxl
ffc57c9ef4 [Bug](runtime-filter) fix brpc ctrl use after free (#37223)
part of #35186
2024-07-03 21:01:50 +08:00
97945af947 [fix](merge-on-write) when full clone failed, duplicate key might occur (#37001) (#37229)
cherry-pick #37001
2024-07-03 19:48:10 +08:00
0aeb768bf9 [Fix](export/outfile) Support compression when exporting data to Parquet / ORC. (#37167)
bp: #36490
2024-07-03 10:53:57 +08:00
bd24a8bdd9 [Fix](csv_reader) Add a session variable to control whether empty rows in CSV files are read as NULL values (#37153)
bp: #36668
2024-07-02 22:12:17 +08:00
e25717458e [opt](catalog) add some profile for parquet reader and change meta cache config (#37040) (#37146)
bp #37040
2024-07-02 20:58:43 +08:00
f5572ac732 [pick]reset memtable flush thread num (#37092)
## Proposed changes

pick #37028
2024-07-02 19:20:17 +08:00