## Proposed changes
Sometimes, if an expr fails, a nullptr column is left in the block.
We should ignore the nullptr column; otherwise an exception will be
thrown and some profile metrics will not be computed correctly.
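A minimal sketch of the idea, using simplified stand-in types rather than the real Doris `Block`/`ColumnPtr` classes: when summing per-column sizes for the profile, skip entries whose column is nullptr instead of dereferencing them.
```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Minimal stand-ins for the real Doris types; illustration only.
struct Column {
    virtual ~Column() = default;
    virtual size_t byte_size() const = 0;
};
using ColumnPtr = std::shared_ptr<const Column>;

// Skip nullptr columns left behind by a failed expr instead of
// dereferencing them while summing sizes for the profile.
size_t block_bytes(const std::vector<ColumnPtr>& columns) {
    size_t res = 0;
    for (const auto& col : columns) {
        if (col == nullptr) {
            continue; // ignore the nullptr column; don't throw
        }
        res += col->byte_size();
    }
    return res;
}
```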
## Proposed changes
Issue Number: close #xxx
Co-authored-by: yiguolei <yiguolei@gmail.com>
Pick #35314: [Fix](group commit) Fix group commit block queue mem estimate failure
## Proposed changes
Issue Number: close #xxx
**Problem:** When `group commit=async_mode` and NULL data is imported
into a `variant` type column, the memory statistics for group commit
backpressure become incorrect, leading to a stuck issue.

**Cause:** In group commit mode, blocks are first added to a queue in
batches using `add block`, and then retrieved from the queue using
`get block`. To track memory usage for backpressure, we add the block
size to the memory statistics during `add block` and subtract it during
`get block`. However, for `variant` types, serialization occurs during
the `add block` write to the WAL, which can merge types (e.g., merging
`int` and `bigint` into `bigint`) and thereby change the block size.
This creates a discrepancy between the block size seen at `get block`
and at `add block`, causing the memory statistics to overflow.

**Solution:** Record the block size at the time of `add block` and use
this recorded size during `get block` instead of the actual block size.
This keeps the memory additions and subtractions consistent.
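A sketch of the fix under assumed names (this is not the exact Doris code, just the accounting pattern): the queue remembers the size observed at add time and subtracts that same recorded size at get time, so WAL serialization changing the variant column's size cannot unbalance the statistics.
```cpp
#include <cstddef>
#include <deque>
#include <memory>
#include <mutex>
#include <utility>

template <typename Block>
class GroupCommitQueue {
public:
    void add_block(std::shared_ptr<Block> block, size_t block_bytes) {
        std::lock_guard<std::mutex> lock(_mutex);
        _queue.emplace_back(std::move(block), block_bytes);
        _mem_usage += block_bytes; // record the size seen at add time
    }

    std::shared_ptr<Block> get_block() {
        std::lock_guard<std::mutex> lock(_mutex);
        if (_queue.empty()) {
            return nullptr;
        }
        auto [block, recorded_bytes] = std::move(_queue.front());
        _queue.pop_front();
        // Subtract the recorded size, NOT block->bytes(): WAL serialization
        // may have merged variant sub-types and changed the actual size.
        _mem_usage -= recorded_bytes;
        return block;
    }

    size_t mem_usage() {
        std::lock_guard<std::mutex> lock(_mutex);
        return _mem_usage;
    }

private:
    std::mutex _mutex;
    std::deque<std::pair<std::shared_ptr<Block>, size_t>> _queue;
    size_t _mem_usage = 0;
};
```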
## Proposed changes
1. Return an error when the bloom filter fails to allocate memory, as sketched below.
2. Return an error when deserializing a block fails, since deserialization may need a lot of memory.
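A sketch of item 1 under assumed names (the illustrative `Status` type below is a stub; Doris has its own, richer `Status` class): allocate the bloom filter buffer with nothrow `new` and surface failure as an error status instead of crashing the process.
```cpp
#include <cstddef>
#include <memory>
#include <new>
#include <string>
#include <utility>

// Illustrative Status type; Doris has its own, richer Status class.
struct Status {
    static Status OK() { return {true, {}}; }
    static Status MemAllocFailed(std::string m) { return {false, std::move(m)}; }
    bool ok;
    std::string msg;
};

// Sketch (function and parameter names assumed): allocate the bloom
// filter buffer with nothrow new and report failure as a Status.
Status alloc_bloom_filter(size_t num_bytes, std::unique_ptr<char[]>& out) {
    out.reset(new (std::nothrow) char[num_bytes]);
    if (out == nullptr) {
        return Status::MemAllocFailed("bloom filter: cannot allocate " +
                                      std::to_string(num_bytes) + " bytes");
    }
    return Status::OK();
}
```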
---------
Co-authored-by: yiguolei <yiguolei@gmail.com>
Improve performance on the TPC-H data set by reworking the join-related code and the use of the hash table
Co-authored-by: HappenLee <happenlee@hotmail.com>
Co-authored-by: BiteTheDDDDt <pxl290@qq.com>
When executing a broker load in ASAN mode, the BE may crash with an error like:
```
F20231010 18:18:17.044978 185490 block.cpp:694] Check failed: d.column->use_count() == 1 (3 vs. 1)
*** Check failure stack trace: ***
@ 0x55e9d94c4e46 google::LogMessage::SendToLog()
@ 0x55e9d94c1410 google::LogMessage::Flush()
@ 0x55e9d94c5689 google::LogMessageFatal::~LogMessageFatal()
@ 0x55e9c509f80d doris::vectorized::Block::clear_column_data()
@ 0x55e9b6c170b3 doris::PlanFragmentExecutor::get_vectorized_internal()
@ 0x55e9b6c147e6 doris::PlanFragmentExecutor::open_vectorized_internal()
@ 0x55e9b6c12d9a doris::PlanFragmentExecutor::open()
@ 0x55e9b6c18426 doris::PlanFragmentExecutor::execute()
@ 0x55e9b6945cca doris::FragmentMgr::_exec_actual()
@ 0x55e9b696456c doris::FragmentMgr::exec_plan_fragment()::$_0::operator()()
```
It may happen when there is a column mapping like:
```
(k1,v2,v3,v4,v5,v6,v7,v8)
set (k2=v4,k3=v4,k4=v4)
```
in the load statement.
This case is covered by Baidu test cases.
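To make the failure mode concrete, here is a simplified illustration using plain `std::shared_ptr` as a stand-in for Doris's COW `ColumnPtr`: mapping one source column to several destination columns makes the block entries share the same column pointer, so `use_count()` exceeds 1 and the strict `use_count() == 1` check in `Block::clear_column_data()` fires.
```cpp
#include <cassert>
#include <memory>
#include <vector>

struct Column {};
using ColumnPtr = std::shared_ptr<Column>; // stand-in for Doris's ColumnPtr

int main() {
    ColumnPtr v4 = std::make_shared<Column>();
    std::vector<ColumnPtr> block_columns;
    block_columns.push_back(v4); // k2 = v4
    block_columns.push_back(v4); // k3 = v4
    block_columns.push_back(v4); // k4 = v4
    // Every entry shares the same column, so use_count() is far above 1,
    // which is what the DCHECK in clear_column_data() trips over.
    assert(block_columns.front().use_count() == 4); // v4 + three entries
    return 0;
}
```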
We should check that the pb's type is set; otherwise deserialization will core dump.
We should not return an unknown type, because deserialization will core dump.
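A sketch of the check with hypothetical stand-ins for the generated protobuf message and the Status type (the real Doris names differ): validate the pb up front and reject a missing or unknown type instead of core dumping deep inside deserialization.
```cpp
#include <string>
#include <utility>

// Hypothetical stand-ins; the real Doris protobuf and Status names differ.
enum class PbTypeId { UNKNOWN, INT32, INT64, STRING };

struct PColumnPb {
    bool has_type() const { return _has_type; }
    PbTypeId type() const { return _type; }
    bool _has_type = false;
    PbTypeId _type = PbTypeId::UNKNOWN;
};

struct Status {
    static Status OK() { return {true, {}}; }
    static Status InternalError(std::string m) { return {false, std::move(m)}; }
    bool ok;
    std::string msg;
};

// Validate before deserializing: reject a missing or unknown type up front.
Status check_pb_type(const PColumnPb& pb) {
    if (!pb.has_type()) {
        return Status::InternalError("pb type is not set, refuse to deserialize");
    }
    if (pb.type() == PbTypeId::UNKNOWN) {
        return Status::InternalError("unknown pb type, refuse to deserialize");
    }
    return Status::OK();
}
```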
---------
Co-authored-by: yiguolei <yiguolei@gmail.com>
After the outer exception is caught, a faststring resize/reserve/build may throw a memory allocation failure exception from the Allocator.
Currently, page body compression catches the memory allocation failure exception.
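A sketch of the pattern with assumed names (stubbed `Status`, `std::string` standing in for faststring): buffer growth during page body compression can throw on allocation failure, and the compression path converts that into an error status rather than letting the exception escape.
```cpp
#include <new>
#include <string>
#include <utility>

struct Status {
    static Status OK() { return {true, {}}; }
    static Status MemAllocFailed(std::string m) { return {false, std::move(m)}; }
    bool ok;
    std::string msg;
};

// Guard the compression path: buffer growth (resize/reserve/build on the
// real faststring) may throw on allocation failure; convert that into an
// error Status instead of letting it propagate past the outer catch.
Status compress_page_body(const std::string& body, std::string* out) {
    try {
        out->reserve(body.size()); // may throw std::bad_alloc
        out->assign(body);         // placeholder for the codec call
    } catch (const std::bad_alloc&) {
        return Status::MemAllocFailed("page body compression: allocation failed");
    }
    return Status::OK();
}
```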
Disallow calling `new` explicitly; force the use of `create_shared` or `create_unique` to obtain shared/unique pointers.
Placement new is still allowed.
Following https://abseil.io/tips/42, add a factory method to every class.
I think we should follow this guide because if an exception is thrown inside `new`, the program will terminate.
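A minimal sketch of the factory-method convention; `PageWriter` is a made-up example class, not a real Doris one. The constructor is private, so `new PageWriter(...)` does not compile outside the class and callers must go through the factories.
```cpp
#include <memory>
#include <utility>

class PageWriter {
public:
    template <typename... Args>
    static std::shared_ptr<PageWriter> create_shared(Args&&... args) {
        return std::shared_ptr<PageWriter>(
                new PageWriter(std::forward<Args>(args)...));
    }

    template <typename... Args>
    static std::unique_ptr<PageWriter> create_unique(Args&&... args) {
        return std::unique_ptr<PageWriter>(
                new PageWriter(std::forward<Args>(args)...));
    }

private:
    explicit PageWriter(int page_size) : _page_size(page_size) {}
    int _page_size;
};

// Usage:
//   auto writer  = PageWriter::create_shared(4096);
//   auto writer2 = PageWriter::create_unique(8192);
// `new PageWriter(4096)` no longer compiles outside the class.
```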
---------
Co-authored-by: yiguolei <yiguolei@gmail.com>
Currently, there are some useless includes in the codebase. We can use a tool named include-what-you-use to optimize these includes. By enforcing a strict include-what-you-use policy, we gain many benefits, such as faster compilation and clearer header dependencies.