117 Commits

Author SHA1 Message Date
011baeb2d2 [bugfix](allocatebytes) ignore null ptr column in Block (#41093) (#41099)
## Proposed changes
If an expression fails, a nullptr column can be left in the block.
We should ignore the nullptr column; otherwise an exception is thrown
and some profile counters are not computed correctly.
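
A minimal sketch of the idea, not the actual Doris code: `ToyColumn`, `ToyColumnPtr`, and `toy_allocated_bytes` are hypothetical names invented for illustration. It shows a byte count over a block's columns that skips empty slots instead of dereferencing them.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a column; not the Doris column interface.
struct ToyColumn {
    std::size_t bytes;
    std::size_t allocated_bytes() const { return bytes; }
};

using ToyColumnPtr = std::shared_ptr<const ToyColumn>;

// Sums allocated bytes over the columns of a block, skipping null slots
// rather than dereferencing them or throwing.
std::size_t toy_allocated_bytes(const std::vector<ToyColumnPtr>& columns) {
    std::size_t total = 0;
    for (const auto& col : columns) {
        if (col == nullptr) {
            continue; // assumed case: a failed expression left this slot empty
        }
        total += col->allocated_bytes();
    }
    return total;
}
```

With columns of 16 and 8 bytes and one null slot between them, the function returns 24 and the null slot contributes nothing.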

Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-09-23 12:18:22 +08:00
12ed2951c4 [fix] (inverted index) remove tmp columns in block (#39369) (#39533) 2024-08-20 20:53:23 +08:00
005304953e [performance](load) do not copy input_block in memtable (#36939) (#37407)
cherry-pick #36939
2024-07-09 15:59:44 +08:00
7d423b3a6a [chery-pick](branch-2.1) Pick "[Fix](group commit) Fix group commit block queue mem estimate fault" (#37379)
Pick [Fix](group commit) Fix group commit block queue mem estimate fault
#35314

**Problem:** When `group commit=async_mode` and NULL data is imported
into a `variant` type column, the memory statistics used for group
commit backpressure become incorrect and the load gets stuck.

**Cause:** In group commit mode, blocks are first added to a queue in
batches using `add block`, and then retrieved from the queue using
`get block`. To track memory usage for backpressure, we add the block
size to the memory statistics during `add block` and subtract it during
`get block`. However, for `variant` types, the write to WAL during
`add block` serializes the block, which can merge types (e.g., merging
`int` and `bigint` into `bigint`) and thereby change the block size.
The size subtracted during `get block` then differs from the size added
during `add block`, causing the memory statistics to overflow.

**Solution:** Record the block size at the time of `add block` and use
this recorded size during `get block` instead of the actual block size.
This keeps the memory addition and subtraction consistent.
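
The accounting described above can be sketched as follows. This is an illustration under stated assumptions, not the group commit implementation: `ToyBlock`, `ToyBlockQueue`, and all member names are hypothetical. The queue stores each block together with the size recorded at enqueue time and releases exactly that amount at dequeue time.

```cpp
#include <cstddef>
#include <deque>
#include <utility>

// Hypothetical block whose byte size may change after it is queued,
// for example when serialization merges column types.
struct ToyBlock {
    std::size_t bytes = 0;
};

// Toy queue that charges a block's size once, at enqueue time, and
// releases exactly the recorded amount at dequeue time.
class ToyBlockQueue {
public:
    void add_block(ToyBlock block) {
        const std::size_t recorded = block.bytes;
        _tracked_bytes += recorded;
        _queue.emplace_back(std::move(block), recorded);
    }

    // Exposes the newest block so a caller can simulate a size change.
    // Precondition: the queue is not empty.
    ToyBlock& back() { return _queue.back().first; }

    // Pops the oldest block into *out; returns false if the queue is empty.
    bool get_block(ToyBlock* out) {
        if (_queue.empty()) {
            return false;
        }
        std::pair<ToyBlock, std::size_t> entry = std::move(_queue.front());
        _queue.pop_front();
        _tracked_bytes -= entry.second; // recorded size, not the current size
        *out = std::move(entry.first);
        return true;
    }

    std::size_t tracked_bytes() const { return _tracked_bytes; }

private:
    std::deque<std::pair<ToyBlock, std::size_t>> _queue;
    std::size_t _tracked_bytes = 0;
};
```

If a block is enqueued at 100 bytes and grows to 160 bytes while queued, dequeuing it still subtracts 100, so the counter returns to zero. Subtracting the current size instead would wrap the unsigned counter around.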

2024-07-07 18:27:49 +08:00
d0eea3886d [fix](multi-catalog) Revert #36575 and check nullptr of data column (#37086)
Revert #36575, because `VScanner::get_block` will check
`DCHECK(block->rows() == 0)`, so block should be cleared when `eof =
true`.
2024-07-02 15:32:52 +08:00
f80b856405 [enhancement](oom) return error when bloom filter allocate memory failed (#35790)
## Proposed changes


1. Return an error when the bloom filter fails to allocate memory.
2. Return an error when deserializing a block fails, since it may need a lot of memory.
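
As a rough illustration of turning an allocation failure into a returned error, here is a sketch. `ToyStatus` and `toy_reserve` are hypothetical names, and the code uses a plain `std::vector` rather than Doris's bloom filter or block types.

```cpp
#include <cstddef>
#include <new>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical status type for this sketch; not Doris's Status class.
struct ToyStatus {
    bool ok;
    std::string msg;
};

// Grows `buf` to hold at least `n` bytes and reports an allocation
// failure as a status instead of letting the exception escape.
ToyStatus toy_reserve(std::vector<char>& buf, std::size_t n) {
    try {
        buf.reserve(n);
    } catch (const std::bad_alloc&) {
        return {false, "memory allocation failed"};
    } catch (const std::length_error&) {
        return {false, "requested size exceeds the container limit"};
    }
    return {true, ""};
}
```

A caller can check `ok` and propagate the error upward instead of crashing when an oversized request cannot be satisfied.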

---------

Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-06-03 18:22:11 +08:00
f03cee5e30 [enhancement](oom) add exception in olap data convertor when memory is not enough to prevent oom (#35761)
---------

Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-06-02 21:12:53 +08:00
f38ecd349c [enhancement](memory) return error if allocate memory failed during add rows method (#35085)
* return an error when adding rows fails

* f

---------

Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-05-22 00:53:34 +08:00
1acd8e9fcb [fix](spill) incorrect result of hash join (#34450) 2024-05-08 10:06:32 +08:00
f6ec64c6ad [fix](exception) Fix Block noexcept method not throw exception (#34002) 2024-04-24 17:13:50 +08:00
8e19cdd745 [featrue](expr) support common subexpression elimination be part (#32673) 2024-04-10 11:56:21 +08:00
8b34915518 [Fix](compress) Fix occasional crushes when serializing blocks (#32672) 2024-03-23 06:20:45 +08:00
04a59d6071 [improve](distinct agg) add check of hash table to decide whether emplace value (#32063)
* [improve](distinct agg) add check of hash table to emplace value
2024-03-15 18:06:15 +08:00
a9ab094614 [Bug](fix) try to fix the coredump of streambyte decode of sse (#30190) 2024-01-23 10:07:51 +08:00
Pxl
3cf95d0fdf [Improvement](execute) optimize for ColumnNullable's serialize_vec/deserialize_vec (#28788)
optimize for ColumnNullable's serialize_vec/deserialize_vec
2024-01-12 11:59:52 +08:00
463a7ab212 [Performance](exec) opt the exchange performance (#29579) 2024-01-12 11:46:29 +08:00
7081139bdc [fix](block) fix be core while mutable block merge may cause different row size between columns in origin block (#27943) 2023-12-25 20:35:22 +08:00
2014396707 [fix](block) add block columns size dcheck (#28539) 2023-12-23 15:21:53 +08:00
fa0ad56817 [exec](compress) use FragmentTransmissionCompressionCodec control the exchange compress behavior (#28818) 2023-12-22 19:50:57 +08:00
7b96730e87 [fix](block) fix nullptr in MutableBlock::allocated_bytes (#28738) 2023-12-20 19:46:13 +08:00
fbe5a7c244 [improvement](decimalv2) support check overflow for decimalv2 arithmetics (#28456) 2023-12-18 10:54:25 +08:00
Pxl
e3d2425d47 [Improvement](join) remove insert_indices_from_join and special judge for -1 (#27779)
remove insert_indices_from_join and special judge for -1
2023-12-04 11:03:22 +08:00
Pxl
d969047b50 [Refactor](join) refactor of hash join (#27557)
Improve performance on the TPC-H data set by restructuring the join-related code and its use of the hash table

Co-authored-by: HappenLee <happenlee@hotmail.com>
Co-authored-by: BiteTheDDDDt <pxl290@qq.com>
2023-11-28 19:46:00 +08:00
b580ee91ce [fix](compile) fix macOS compile and format code (#27494) 2023-11-23 23:24:10 +08:00
2ea33518b0 [Opt](load) use batching to optimize auto partition (#26915)
use batching to optimize auto partition
2023-11-23 19:12:28 +08:00
a4d78682ff [Optimize](point query) clear names to reduce mem consumption and cpu cost related to block column name (#26931) 2023-11-17 10:18:21 +08:00
a5565f68b2 [Refactor](opentelemetry) Remove opentelemetry (#26605) 2023-11-09 18:05:34 +08:00
be27d4d921 [fix](broker-load) fix use_count() issue when doing broker load in debug mode (#25288)
When executing broker load in ASAN mode, BE may crash with error:
```
F20231010 18:18:17.044978 185490 block.cpp:694] Check failed: d.column->use_count() == 1 (3 vs. 1)
*** Check failure stack trace: ***
    @     0x55e9d94c4e46  google::LogMessage::SendToLog()
    @     0x55e9d94c1410  google::LogMessage::Flush()
    @     0x55e9d94c5689  google::LogMessageFatal::~LogMessageFatal()
    @     0x55e9c509f80d  doris::vectorized::Block::clear_column_data()
    @     0x55e9b6c170b3  doris::PlanFragmentExecutor::get_vectorized_internal()
    @     0x55e9b6c147e6  doris::PlanFragmentExecutor::open_vectorized_internal()
    @     0x55e9b6c12d9a  doris::PlanFragmentExecutor::open()
    @     0x55e9b6c18426  doris::PlanFragmentExecutor::execute()
    @     0x55e9b6945cca  doris::FragmentMgr::_exec_actual()
    @     0x55e9b696456c  doris::FragmentMgr::exec_plan_fragment()::$_0::operator()()
```

It may happen when there is a column mapping like:
```
(k1,v2,v3,v4,v5,v6,v7,v8)
set (k2=v4,k3=v4,k4=v4)
```

in load stmt.

Case is covered by Baidu test cases
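
One plausible reading of the `3 vs. 1` in the check failure is that the three target columns share a single underlying column object. The sketch below only illustrates that reference-counting effect with `std::shared_ptr`. `ToyColumn`, `ToyColumnPtr`, and `map_targets` are hypothetical names, and the real Doris column pointer type is not `std::shared_ptr`.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical column type used only for this illustration.
struct ToyColumn {
    std::vector<int> data;
};

using ToyColumnPtr = std::shared_ptr<ToyColumn>;

// Builds (name, column) pairs in which every target reuses the same
// source column object, as a mapping like k2=v4,k3=v4,k4=v4 might.
std::vector<std::pair<std::string, ToyColumnPtr>> map_targets(
        const ToyColumnPtr& source, const std::vector<std::string>& targets) {
    std::vector<std::pair<std::string, ToyColumnPtr>> out;
    out.reserve(targets.size());
    for (const auto& name : targets) {
        out.emplace_back(name, source); // shares ownership, no deep copy
    }
    return out;
}
```

After mapping three targets to one source and releasing the original handle, each entry reports a `use_count()` of 3, so a check that expects exclusive ownership (a count of 1) would fail.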
2023-10-12 17:04:29 +08:00
d31d99bf34 [pipeline](load) opt the pipeline load code (#24708)
opt the pipeline load code
2023-09-21 15:20:31 +08:00
c3b3f0f00a [enhancement](serialize) add dcheck to ensure pb type is set (#24645)
Check that the pb's type is set; otherwise deserialization will core dump.
Do not return an unknown type, because deserialization will core dump.
---------

Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-09-20 10:42:28 +08:00
d3f1388717 [Feature](partitions) Support auto-partition (#24153)
Co-authored-by: zhangstar333 <2561612514@qq.com>
2023-09-12 15:23:15 +08:00
fdb7a44f57 Revert "[Feature](partitions) Support auto partition" (#24024)
* Revert "[Feature](partitions) Support auto partition (#23236)"

This reverts commit 6c544dd2011d731b8c9c51384c77bcf19c017981.

* Update config.h
2023-09-07 17:08:26 +08:00
6c544dd201 [Feature](partitions) Support auto partition (#23236)
Co-authored-by: zhangstar333 <2561612514@qq.com>
2023-09-06 16:26:45 +08:00
62c075bf7e [improvement](Block) Replace Block(const PBlock&) with deserialize because it has heavy operations in ctor (#23672) 2023-08-31 14:44:17 +08:00
1410a15a61 [fix](compaction) print column name when checking block ColumnPtr is nullptr on get block byte (#23338) 2023-08-29 17:24:48 +08:00
Pxl
477961dc21 [Chore](agg) refactor of hash map (#22958)
refactor of hash map
2023-08-18 17:59:30 +08:00
Pxl
3f55d5d4d5 [Chore](excution) change some log fatal and dcheck to exception (#22890)
change some log fatal and dcheck to exception
2023-08-15 10:45:00 +08:00
5584d7a5ba [Improve](point query) Improve lookup connection cache from DoubleBuffer to LRU cache for better item pruning (#22041) 2023-07-27 22:22:50 +08:00
36524f2b72 [improvement](functions) avoid copying of block in create_block_with_nested_columns (#21526)
avoid copying of block in create_block_with_nested_columns
2023-07-10 17:21:23 +08:00
Pxl
f7c724f8a3 [Bug](excution) avoid core dump on filter_block_internal and add debug information (#21433)
avoid core dump on filter_block_internal and add debug information
2023-07-03 18:10:30 +08:00
2e6d91aa99 [chore](block) temporarily disable DCHECK for column name equality in MutableBlock (#21116)
* temporarily disable DCHECK for column name equality in MutableBlock::add_rows

* num columns EQ to LE
2023-06-26 10:49:27 +08:00
2c11ce0a02 [bugfix](topn) fix key topn merge block conflict with index predicate result columns (#20820) 2023-06-20 21:23:00 +08:00
93b53cf2f4 [improvement](exception-safe) create and prepare node/sink support exception safe (#20551) 2023-06-09 21:06:59 +08:00
068a32bc49 [Improvement](memory) faststring use Allocator #19762
Once the outer code catches exceptions, faststring's resize, reserve, and build may throw a memory allocation failure exception from the Allocator.

Currently, page body compression catches the memory allocation failure exception.
2023-05-18 15:00:49 +08:00
63a76ed115 [refactor](exceptionsafe) disallow call new method explicitly (#18830)
Disallow calling new explicitly.
Force the use of create_shared or create_unique to obtain smart pointers.
Placement new is still allowed.
See https://abseil.io/tips/42 for adding a factory method to every class.
I think we should follow this guide, because if an exception is thrown in the new method, the program will terminate.
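
A hedged sketch of the factory-method pattern: `ToyReader` is a hypothetical class, and this version hides the constructor, which may differ from the mechanism the commit actually uses to forbid explicit `new`. Only the names `create_shared` and `create_unique` come from the message above.

```cpp
#include <memory>
#include <utility>

// Hypothetical class whose instances can only be obtained through
// factory functions that hand back smart pointers.
class ToyReader {
public:
    template <typename... Args>
    static std::shared_ptr<ToyReader> create_shared(Args&&... args) {
        // The constructor is private, so `new` is reachable only from here.
        return std::shared_ptr<ToyReader>(new ToyReader(std::forward<Args>(args)...));
    }

    template <typename... Args>
    static std::unique_ptr<ToyReader> create_unique(Args&&... args) {
        return std::unique_ptr<ToyReader>(new ToyReader(std::forward<Args>(args)...));
    }

    int id() const { return _id; }

private:
    explicit ToyReader(int id) : _id(id) {}
    int _id;
};
```

Callers write `ToyReader::create_shared(7)` and never hold a raw owning pointer. A direct `new ToyReader(7)` outside the class does not compile.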

---------

Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-04-21 09:13:24 +08:00
e412dd12e8 [chore](build) Use include-what-you-use to optimize includes (PART II) (#18761)
Currently, there are some useless includes in the codebase. We can use a tool named include-what-you-use to optimize these includes. By using a strict include-what-you-use policy, we can get lots of benefits from it.
2023-04-19 23:11:48 +08:00
a68af93d30 [fix](compile) Fix block.cpp compilation failure (#18797) 2023-04-19 08:49:23 +08:00
79c446c89f [enhancement](exception) Column filter/replicate supports exception safety (#18503) 2023-04-18 19:23:09 +08:00
Pxl
307170030c [Bug](materialized-view) fix core dump when create mv have case different with base table (#18206)
fix core dump when create mv have case different with base table
2023-03-31 12:32:09 +08:00
7ae51c856e [refactor](unify exception) unify exception definition and error code (#18006)
* [refactor](unify exception) unify exception definition and error code


---------

Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-03-25 12:41:07 +08:00