Commit Graph

106 Commits

Author SHA1 Message Date
e2e9b9d8a3 [improve](insert-into) record rows info in log for check (#29581) 2024-01-05 17:28:07 +08:00
9434ee5710 [fix](load) fix memtracking orphan too large (#28600) 2023-12-19 12:41:19 +08:00
d7dd7b775b enhance performance for broken tablet checking under multi-core scenario with a coarse-grained read lock (#28552) 2023-12-19 12:33:34 +08:00
6da36e1077 [feature](merge-cloud) Refactor write path code by abstract base class (#26537)
Refactor the write path code around an abstract base class. Whether to use `StorageEngine` or `CloudStorageEngine` is now determined at compile time rather than by the runtime `config::cloud_mode` flag, to avoid unexpected null pointer or undefined behavior issues caused by merging code.

Classes that depend on `StorageEngine` but are shared with cloud mode need an abstract base class. Common code should be extracted into the base class, while code that depends on `StorageEngine` should be implemented in a `StorageEngine` mix-in class of the base class.
2023-12-08 14:50:36 +08:00
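The split described in #26537 can be sketched roughly as follows. This is a minimal illustration, not the actual Doris code: `BaseWriter`, `LocalWriter`, `CloudWriter`, `ActiveEngine`, `ActiveWriter`, and the `SKETCH_CLOUD_MODE` macro are hypothetical names, and the two engine structs are empty stand-ins for the real classes.

```cpp
#include <string>

// Hypothetical stand-ins for the two engines; the real classes are far larger.
struct StorageEngine { std::string name() const { return "local"; } };
struct CloudStorageEngine { std::string name() const { return "cloud"; } };

// Shared logic lives in the abstract base class.
class BaseWriter {
public:
    virtual ~BaseWriter() = default;
    // Common write path: validation is shared, the commit step is engine-specific.
    std::string write(const std::string& row) {
        if (row.empty()) return "skipped";
        return commit(row);
    }
protected:
    virtual std::string commit(const std::string& row) = 0;
};

// Code that depends on StorageEngine is isolated in its own subclass.
class LocalWriter final : public BaseWriter {
public:
    explicit LocalWriter(StorageEngine& engine) : _engine(engine) {}
protected:
    std::string commit(const std::string& row) override { return _engine.name() + ":" + row; }
private:
    StorageEngine& _engine;
};

// The cloud variant never touches StorageEngine, so it cannot hit a null one.
class CloudWriter final : public BaseWriter {
public:
    explicit CloudWriter(CloudStorageEngine& engine) : _engine(engine) {}
protected:
    std::string commit(const std::string& row) override { return _engine.name() + ":" + row; }
private:
    CloudStorageEngine& _engine;
};

// The engine is chosen at compile time instead of by a runtime flag.
#ifdef SKETCH_CLOUD_MODE
using ActiveEngine = CloudStorageEngine;
using ActiveWriter = CloudWriter;
#else
using ActiveEngine = StorageEngine;
using ActiveWriter = LocalWriter;
#endif
```

Because the choice is made by the preprocessor, a build contains only one concrete writer type, and shared callers can work against `BaseWriter` alone.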
Pxl
e3d2425d47 [Improvement](join) remove insert_indices_from_join and special judge for -1 (#27779)
remove insert_indices_from_join and special judge for -1
2023-12-04 11:03:22 +08:00
a5565f68b2 [Refactor](opentelemetry) Remove opentelemetry (#26605) 2023-11-09 18:05:34 +08:00
47ba4aaf30 [Enhancement](load) add timer and partitions number limit (#26549)
add timer and partitions number limit
2023-11-08 11:22:40 +08:00
1a83a39aec Revert "[fix](auto-partition) Fix auto partition concurrent conflict (#26166)" (#26448)
This reverts commit f22611769944e78c28f1b0a1eeb7b7414a16e8db.
2023-11-06 16:39:19 +08:00
f226117699 [fix](auto-partition) Fix auto partition concurrent conflict (#26166) 2023-11-06 10:34:26 +08:00
642e5cdb69 [Fix](Status) Make Status [[nodiscard]] and handle returned Status correctly (#23395) 2023-09-29 22:38:52 +08:00
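As a rough illustration of what marking a status type `[[nodiscard]]` buys, here is a minimal sketch. The `Status` class below is a hypothetical, stripped-down stand-in, and `flush_writer` and `close_writer` are made-up helpers, not functions from the repository.

```cpp
#include <string>
#include <utility>

// Hypothetical minimal status type; the real class carries error codes and more.
class [[nodiscard]] Status {
public:
    static Status OK() { return Status(true, ""); }
    static Status Error(std::string msg) { return Status(false, std::move(msg)); }
    bool ok() const { return _ok; }
    std::string to_string() const { return _ok ? "OK" : _msg; }
private:
    Status(bool ok, std::string msg) : _ok(ok), _msg(std::move(msg)) {}
    bool _ok;
    std::string _msg;
};

// Made-up helper that may fail.
Status flush_writer(bool fail) {
    return fail ? Status::Error("flush failed") : Status::OK();
}

Status close_writer(bool fail) {
    // flush_writer(fail);  // discarding the result here would draw a compiler warning
    Status st = flush_writer(fail);
    if (!st.ok()) return st;  // propagate the error instead of dropping it
    return Status::OK();
}
```

With the attribute on the class, every function returning `Status` is covered without annotating each one individually.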
b9ddcbf729 [feature](merge-cloud) Rewrite code related to IOContext (#24269) 2023-09-15 19:57:58 +08:00
d3f1388717 [Feature](partitions) Support auto-partition (#24153)
Co-authored-by: zhangstar333 <2561612514@qq.com>
2023-09-12 15:23:15 +08:00
fdb7a44f57 Revert "[Feature](partitions) Support auto partition" (#24024)
* Revert "[Feature](partitions) Support auto partition (#23236)"

This reverts commit 6c544dd2011d731b8c9c51384c77bcf19c017981.

* Update config.h
2023-09-07 17:08:26 +08:00
6c544dd201 [Feature](partitions) Support auto partition (#23236)
Co-authored-by: zhangstar333 <2561612514@qq.com>
2023-09-06 16:26:45 +08:00
62c075bf7e [improvement](Block) Replace Block(const PBlock&) with deserialize because it has heavy operations in ctor (#23672) 2023-08-31 14:44:17 +08:00
7e7cfd17bf [fix](tablet sink) check data valid of tablet sink data (#23530)
Co-authored-by: Yongqiang YANG <98214048+dataroaring@users.noreply.github.com>
2023-08-28 15:54:12 +08:00
0d7a61ae8c [fix](load) fix duplicate register of memtable writer in memory limiter (#23205) 2023-08-22 10:05:17 +08:00
6cf1efc997 [refactor](load) use smart pointers to manage writers in memtable memory limiter (#23019) 2023-08-16 16:34:57 +08:00
58e7952eea [refactor](load) use memtable writer in memtable memory limiter (#22780) 2023-08-10 17:08:47 +08:00
124c1b16cf [performance](load) remove unnecessary lock in TabletsChannel::add_batch (#22703)
This lock was introduced by lazy open in #18874.
It's unnecessary and costly to hold a lock while writing data to DeltaWriter in the first place.

However, since lazy open is reverted in #21821, we can completely omit this lock.
_tablet_writers is not supposed to be changed once we've reached TabletsChannel::add_batch.
2023-08-08 22:08:21 +08:00
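The reasoning in #22703 can be sketched as below, assuming a simplified model: `Channel` and `Writer` are hypothetical stand-ins for `TabletsChannel` and `DeltaWriter`, and the map plays the role of `_tablet_writers`. The point is only that a container which is never modified after `open()` can be read without a channel-level lock.

```cpp
#include <cstdint>
#include <memory>
#include <mutex>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a delta writer: it guards only its own state.
class Writer {
public:
    void write(int64_t rows) {
        std::lock_guard<std::mutex> guard(_mu);
        _rows += rows;
    }
    int64_t rows() const {
        std::lock_guard<std::mutex> guard(_mu);
        return _rows;
    }
private:
    mutable std::mutex _mu;
    int64_t _rows = 0;
};

class Channel {
public:
    // All writers are created here, before any add_batch call.
    void open(const std::vector<int64_t>& tablet_ids) {
        for (int64_t id : tablet_ids) {
            _writers.emplace(id, std::make_unique<Writer>());
        }
    }
    // No channel-level lock: the map is only read after open(), and concurrent
    // const lookups on an unmodified std::unordered_map are safe.
    bool add_batch(int64_t tablet_id, int64_t rows) const {
        auto it = _writers.find(tablet_id);
        if (it == _writers.end()) return false;
        it->second->write(rows);
        return true;
    }
    int64_t rows(int64_t tablet_id) const { return _writers.at(tablet_id)->rows(); }
private:
    std::unordered_map<int64_t, std::unique_ptr<Writer>> _writers;
};
```

Contention is then limited to batches that target the same writer, rather than every batch arriving at the channel.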
0ca0c162b1 [fix][load] fix memtable reset cause nullptr (#22577) 2023-08-07 10:45:09 +08:00
781c1d5238 [log](load) add debug logs for potential duplicate tablet ids (#22485) 2023-08-02 20:38:41 +08:00
4d3e56e2e7 [fix][regression-test] change lazy open regression test name (#22404) 2023-08-01 20:26:10 +08:00
ee754307bb [refactor](load) refactor memtable flush actively (#21634) 2023-07-30 21:31:54 +08:00
a6311e2f95 fix erase (#22235) 2023-07-26 17:06:37 +08:00
86e80ae175 [enhancement](merge-on-write) support concurrent delete bitmap calc while close_wait (#21488) 2023-07-24 10:09:28 +08:00
2b2ac10e93 [feature](partial update) add failure tolerance for strict mode partial update stream load 2023-07-21 16:46:44 +08:00
e4c6b9893a [improve](load) add more profiles in tablets channel (#21838) 2023-07-21 13:59:15 +08:00
c6063ed92f [Revert](lazy open) revert lazy open and add case (#21821) 2023-07-18 19:41:33 +08:00
fb14950887 [refactor](load) split flush_segment_writer into two parts (#21372) 2023-07-06 11:13:34 +08:00
6d579d924d [fix](profile) delete useless profile add_child #20989 2023-06-20 23:21:52 +08:00
ce1b39e79d [fix](profile) avoid unnecessary refresh profile of TabletsChannel
Previously, the TabletsChannel profile was refreshed in the LoadChannelMgr thread that refreshes memory statistics.

This meant the profile was refreshed even with enable_profile=false, which caused a performance loss in stress tests.
2023-06-20 21:09:43 +08:00
a3342cb088 [improvement](load) avoid producing small segment (#20852)
avoid producing small segment
2023-06-19 18:34:44 +08:00
e0d9f7f955 [enhancement](load) add some profile items for load (#20141) 2023-05-29 09:54:03 +08:00
e2b8c0004b [Fix](lazy_open) Fix dead lock in lazy open (#19652) 2023-05-15 23:18:33 +08:00
f8ef25bb10 [enhancement](load) lazy-open necessary partitions when load (#18874) 2023-05-14 16:09:55 +08:00
3899c08036 [optimize](compile) remove unused template param from load channel (#18980)
Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-04-24 23:36:47 +08:00
8e4710079d [improvement](profile) Insert into add LoadChannel runtime profile (#18908)
TabletSink and LoadChannel in BE have an M:N relationship.
Every so often, a LoadChannel returns its own runtime profile to a randomly chosen TabletSink, so each TabletSink usually ends up holding the runtime profiles of all LoadChannels, although the copies of the same LoadChannel profile held by different TabletSinks may differ in freshness. Each TabletSink periodically reports all the LoadChannel profiles it holds to FE, and the latest LoadChannel profile is kept according to its timestamp.
2023-04-24 09:41:57 +08:00
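The "keep the latest profile according to the timestamp" rule from #18908 could look something like the sketch below. `ChannelProfile` and `ProfileStore` are hypothetical names, and the profile is reduced to a timestamp plus a text payload.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical reduced profile: a timestamp and an opaque payload.
struct ChannelProfile {
    int64_t timestamp_ms;
    std::string text;
};

class ProfileStore {
public:
    // Stores the profile only if it is newer than the one already held for
    // this channel. Returns true when the stored profile was replaced.
    bool update(int64_t channel_id, const ChannelProfile& profile) {
        auto it = _latest.find(channel_id);
        if (it != _latest.end() && it->second.timestamp_ms >= profile.timestamp_ms) {
            return false;  // stale copy reported by another sink
        }
        _latest.insert_or_assign(channel_id, profile);
        return true;
    }
    // Returns nullptr when no profile has been reported for the channel.
    const ChannelProfile* get(int64_t channel_id) const {
        auto it = _latest.find(channel_id);
        return it == _latest.end() ? nullptr : &it->second;
    }
private:
    std::unordered_map<int64_t, ChannelProfile> _latest;
};
```

Since several sinks may report the same channel with different freshness, the comparison makes the result independent of the order in which reports arrive.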
e412dd12e8 [chore](build) Use include-what-you-use to optimize includes (PART II) (#18761)
Currently, there are some useless includes in the codebase. We can use a tool named include-what-you-use to optimize these includes. By using a strict include-what-you-use policy, we can get lots of benefits from it.
2023-04-19 23:11:48 +08:00
fcd25b53bf [Optimize](Random distribution) Improve the performance of tablet sin… (#17389)
The current distribution model for Doris is as follows:

OlapTableSink separates the original Block into several sub-blocks, one per node (BE), according to the tablet distribution, and sends them to the storage engine of each backend. The storage engine then separates each sub-block across multiple tablets channels, and each delta writer handles part of the block.

This model causes blocks to be split according to tablets, and the splitting process can be a relatively heavy operation. After splitting, the blocks are distributed to different DeltaWriters (Memtables) through RPCs to TabletChannels. The distribution operation on TabletChannels is also a relatively heavy operation. If the distribution property of the table is RANDOM distribution, then we have the opportunity to distribute each block whole instead of splitting it. The advantage of doing so is to reduce memory copying and improve write locality, similar to appending the entire block to the memtable.

This optimization could save 10% ~ 20% of the CPU cost of loading into a RANDOM distribution table when load_to_single_tablet is enabled.
2023-03-10 10:52:40 +08:00
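To make the contrast in #17389 concrete, here is a toy sketch of the two routing strategies. Everything in it is a simplifying assumption: a row is reduced to a non-negative integer key, a "tablet" is a map entry, and `split_by_tablet` and `send_whole_block` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

using Row = int64_t;                          // hypothetical: a row is just its key
using Block = std::vector<Row>;
using TabletData = std::map<int64_t, Block>;  // tablet id -> rows routed to it

// Row-by-row routing: the block is split and every row is copied into the
// piece that belongs to its tablet. Assumes non-negative keys.
TabletData split_by_tablet(const Block& block, int64_t num_tablets) {
    assert(num_tablets > 0);
    TabletData out;
    for (Row row : block) {
        out[row % num_tablets].push_back(row);
    }
    return out;
}

// Whole-block routing: with RANDOM distribution any tablet may take any row,
// so the block can be handed to one tablet without per-row work.
TabletData send_whole_block(Block block, int64_t tablet_id) {
    TabletData out;
    out.emplace(tablet_id, std::move(block));
    return out;
}
```

In the second function the block is moved rather than copied row by row, which mirrors the memory-copy and locality benefit the commit message describes.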
4692d6764c [refactor](remove string val) remove string val structure, it is same with string ref (#17461)
remove stringval, decimalv2val, bigintval
2023-03-08 10:42:20 +08:00
c071c327e7 [fix](load) fix add broken tablet core dump (#17104) 2023-02-24 23:59:03 +08:00
09d41c3479 [fix](log) clarify error msg for tablet writer write failure (#14078) (#16954) (#16950)
fmt::format doesn't support non-template objects as args, even if they implement
`to_string()` or `operator<<`, so the original code may cause `false` to be printed
instead of the real cause of the failure. So to_string() needs to be invoked manually.

Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>
2023-02-21 19:42:49 +08:00
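One plausible way for `false` to end up in a message is an implicit conversion to `bool`. The commit does not confirm that this is the exact mechanism, so the sketch below is an analogy that uses standard streams rather than fmt. `WriteStatus`, `describe_implicit`, and `describe_explicit` are hypothetical names.

```cpp
#include <sstream>
#include <string>

// Hypothetical status type: implicitly convertible to bool, with to_string()
// but no stream operator of its own.
struct WriteStatus {
    bool ok_flag;
    std::string message;
    operator bool() const { return ok_flag; }
    std::string to_string() const { return message; }
};

// The object is silently converted to bool, so the real cause is lost.
std::string describe_implicit(const WriteStatus& st) {
    std::ostringstream os;
    os << std::boolalpha << "write failed: " << st;
    return os.str();
}

// Calling to_string() explicitly keeps the real cause in the message.
std::string describe_explicit(const WriteStatus& st) {
    std::ostringstream os;
    os << "write failed: " << st.to_string();
    return os.str();
}
```

For a failed status carrying the message `disk full`, the first helper yields `write failed: false` and the second yields `write failed: disk full`.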
4e64ff6329 [enhancement](load) avoid schema copy to reduce cpu usage (#16034) 2023-01-28 11:13:57 +08:00
5521c7a236 [fix](load) fix that tablet channel doesn't set received rows for verify the number of rows (#15961) 2023-01-16 19:46:59 +08:00
d857b4af1b [refactor](remove row batch) remove impala rowbatch structure (#15767)
Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-01-11 09:37:35 +08:00
4380f1ec54 [Enhancement](load) reduce memory by memory size of global delta writer (#14491) 2023-01-03 20:05:21 +08:00
b23d068281 [refactor](remove-non-vec) Remove non vec load from memtable and delta writer (#15517)
Co-authored-by: yiguolei <yiguolei@gmail.com>
2022-12-30 21:22:58 +08:00
a807978882 [refactor](non-vec) Remove rowbatch code from delta writer and some rowbatch related code (#15349)
Co-authored-by: yiguolei <yiguolei@gmail.com>
2022-12-26 08:54:51 +08:00
03ea2866b7 [fix](load) add to error tablets when delta writer failed to close (#15118)
The load should fail when the delta writers of all tablets fail to close on a single node,
but the result returned to the client is success.
The reason is that the committed tablets and the error tablets are both empty, so publish succeeds.
We should add a tablet to the error tablets when its delta writer fails to close, so that the transaction fails.
2022-12-19 14:22:25 +08:00
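The fix in #15118 amounts to recording a tablet in one of two lists on every close, so that both lists can no longer be empty at once. A minimal sketch, assuming a hypothetical `TabletWriter` whose `close()` simply reports success or failure and a hypothetical `close_all` helper:

```cpp
#include <cstdint>
#include <vector>

// Hypothetical writer: close_ok simulates whether close() succeeds.
struct TabletWriter {
    int64_t tablet_id;
    bool close_ok;
    bool close() const { return close_ok; }
};

struct CloseResult {
    std::vector<int64_t> committed_tablets;
    std::vector<int64_t> error_tablets;
};

// Every writer lands in exactly one list. Before the fix, a failed close
// contributed to neither, so an all-failed load looked like an empty success.
CloseResult close_all(const std::vector<TabletWriter>& writers) {
    CloseResult result;
    for (const TabletWriter& writer : writers) {
        if (writer.close()) {
            result.committed_tablets.push_back(writer.tablet_id);
        } else {
            result.error_tablets.push_back(writer.tablet_id);
        }
    }
    return result;
}
```

With the error list populated, a caller can tell "nothing to commit" apart from "every writer failed to close".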