Commit Graph

105 Commits

Author SHA1 Message Date
04a59d6071 [improve](distinct agg) add check of hash table to decide whether emplace value (#32063)
* [improve](distinct agg) add check of hash table to emplace value
2024-03-15 18:06:15 +08:00
a9ab094614 [Bug](fix) try to fix the coredump of streambyte decode of sse (#30190) 2024-01-23 10:07:51 +08:00
Pxl
3cf95d0fdf [Improvement](execute) optimize for ColumnNullable's serialize_vec/deserialize_vec (#28788)
optimize for ColumnNullable's serialize_vec/deserialize_vec
2024-01-12 11:59:52 +08:00
463a7ab212 [Performance](exec) opt the exchange performance (#29579) 2024-01-12 11:46:29 +08:00
7081139bdc [fix](block) fix be core while mutable block merge may cause different row size between columns in origin block (#27943) 2023-12-25 20:35:22 +08:00
2014396707 [fix](block) add block columns size dcheck (#28539) 2023-12-23 15:21:53 +08:00
fa0ad56817 [exec](compress) use FragmentTransmissionCompressionCodec control the exchange compress behavior (#28818) 2023-12-22 19:50:57 +08:00
7b96730e87 [fix](block) fix nullptr in MutableBlock::allocated_bytes (#28738) 2023-12-20 19:46:13 +08:00
fbe5a7c244 [improvement](decimalv2) support check overflow for decimalv2 arithmetics (#28456) 2023-12-18 10:54:25 +08:00
Pxl
e3d2425d47 [Improvement](join) remove insert_indices_from_join and special judge for -1 (#27779)
remove insert_indices_from_join and special judge for -1
2023-12-04 11:03:22 +08:00
Pxl
d969047b50 [Refactor](join) refactor of hash join (#27557)
Improve the performance under the tpch data set by reconstructing the join related code and the use of hash table

Co-authored-by: HappenLee <happenlee@hotmail.com>
Co-authored-by: BiteTheDDDDt <pxl290@qq.com>
2023-11-28 19:46:00 +08:00
b580ee91ce [fix](compile) fix macOS compile and format code (#27494) 2023-11-23 23:24:10 +08:00
2ea33518b0 [Opt](load) use batching to optimize auto partition (#26915)
use batching to optimize auto partition
2023-11-23 19:12:28 +08:00
a4d78682ff [Optimize](point query) clear names to reduce mem consumption and cpu cost related to block column name (#26931) 2023-11-17 10:18:21 +08:00
a5565f68b2 [Refactor](opentelemetry) Remove opentelemetry (#26605) 2023-11-09 18:05:34 +08:00
be27d4d921 [fix](broker-load) fix use_count() issue when doing broker load in debug mode (#25288)
When executing broker load in ASAN mode, BE may crash with error:
```
F20231010 18:18:17.044978 185490 block.cpp:694] Check failed: d.column->use_count() == 1 (3 vs. 1)
*** Check failure stack trace: ***
    @     0x55e9d94c4e46  google::LogMessage::SendToLog()
    @     0x55e9d94c1410  google::LogMessage::Flush()
    @     0x55e9d94c5689  google::LogMessageFatal::~LogMessageFatal()
    @     0x55e9c509f80d  doris::vectorized::Block::clear_column_data()
    @     0x55e9b6c170b3  doris::PlanFragmentExecutor::get_vectorized_internal()
    @     0x55e9b6c147e6  doris::PlanFragmentExecutor::open_vectorized_internal()
    @     0x55e9b6c12d9a  doris::PlanFragmentExecutor::open()
    @     0x55e9b6c18426  doris::PlanFragmentExecutor::execute()
    @     0x55e9b6945cca  doris::FragmentMgr::_exec_actual()
    @     0x55e9b696456c  doris::FragmentMgr::exec_plan_fragment()::$_0::operator()()
```

It may happen when there is column maping like:
```
(k1,v2,v3,v4,v5,v6,v7,v8)
set (k2=v4,k3=v4,k4=v4)
```

in load stmt.

Case is covered by Baidu test cases
2023-10-12 17:04:29 +08:00
d31d99bf34 [pipeline](load) opt the pipeline load code (#24708)
opt the pipeline load code
2023-09-21 15:20:31 +08:00
c3b3f0f00a [enhancement](serialize) add dcheck to ensure pb type is set (#24645)
should check the pb's type is set, or the deserialize will core.
should not return unknown type because deserialize will core.
---------

Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-09-20 10:42:28 +08:00
d3f1388717 [Feature](partitions) Support auto-partition (#24153)
Co-authored-by: zhangstar333 <2561612514@qq.com>
2023-09-12 15:23:15 +08:00
fdb7a44f57 Revert "[Feature](partitions) Support auto partition" (#24024)
* Revert "[Feature](partitions) Support auto partition (#23236)"

This reverts commit 6c544dd2011d731b8c9c51384c77bcf19c017981.

* Update config.h
2023-09-07 17:08:26 +08:00
6c544dd201 [Feature](partitions) Support auto partition (#23236)
Co-authored-by: zhangstar333 <2561612514@qq.com>
2023-09-06 16:26:45 +08:00
62c075bf7e [improvement](Block) Replace Block(const PBlock&) with deserialize because it has heavy operations in ctor (#23672) 2023-08-31 14:44:17 +08:00
1410a15a61 [fix](compaction) print column name when checking block ColumnPtr is nullptr on get block byte (#23338) 2023-08-29 17:24:48 +08:00
Pxl
477961dc21 [Chore](agg) refactor of hash map (#22958)
refactor of hash map
2023-08-18 17:59:30 +08:00
Pxl
3f55d5d4d5 [Chore](excution) change some log fatal and dcheck to exception (#22890)
change some log fatal and dcheck to exception
2023-08-15 10:45:00 +08:00
5584d7a5ba [Improve](point query) Improve lookup connection cache from DoubleBuffer to LRU cache for better item pruning (#22041) 2023-07-27 22:22:50 +08:00
36524f2b72 [improvement](functions) avoid copying of block in create_block_with_nested_columns (#21526)
avoid copying of block in create_block_with_nested_columns
2023-07-10 17:21:23 +08:00
Pxl
f7c724f8a3 [Bug](excution) avoid core dump on filter_block_internal and add debug information (#21433)
avoid core dump on filter_block_internal and add debug information
2023-07-03 18:10:30 +08:00
2e6d91aa99 [chore](block) temporarily disable DCHECK for column name equality in MutableBlock (#21116)
* tempororyly disable DCHECK for column name equality in MutableBlock::add_rows

* num columns EQ to LE
2023-06-26 10:49:27 +08:00
2c11ce0a02 [bugfix](topn) fix key topn merge block conflict with index predicate result columns (#20820) 2023-06-20 21:23:00 +08:00
93b53cf2f4 [improvement](exception-safe) create and prepare node/sink support exception safe (#20551) 2023-06-09 21:06:59 +08:00
068a32bc49 [Improvement](memory) faststring use Allocator #19762
After the outer catch exception, faststring resize reserve build may throw a memory alloc failure exception from the Allocator.

Currently page body compress will catch memory alloc failure exception
2023-05-18 15:00:49 +08:00
63a76ed115 [refactor](exceptionsafe) disallow call new method explicitly (#18830)
disallow call new method explicitly
force to use create_shared or create_unique to use shared ptr
placement new is allowed
reference https://abseil.io/tips/42 to add factory method to all class.
I think we should follow this guide because if throw exception in new method, the program will terminate.

---------

Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-04-21 09:13:24 +08:00
e412dd12e8 [chore](build) Use include-what-you-use to optimize includes (PART II) (#18761)
Currently, there are some useless includes in the codebase. We can use a tool named include-what-you-use to optimize these includes. By using a strict include-what-you-use policy, we can get lots of benefits from it.
2023-04-19 23:11:48 +08:00
a68af93d30 [fix](compile) Fix block.cpp compilation failure (#18797) 2023-04-19 08:49:23 +08:00
79c446c89f [enhancement](exception) Column filter/replicate supports exception safety (#18503) 2023-04-18 19:23:09 +08:00
Pxl
307170030c [Bug](materialized-view) fix core dump when create mv have case different with base table (#18206)
fix core dump when create mv have case different with base table
2023-03-31 12:32:09 +08:00
7ae51c856e [refactor](unify exception) unify exception definition and error code (#18006)
* [refactor](unify exception) unify exception definition and error code


---------

Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-03-25 12:41:07 +08:00
Pxl
40ca250678 [Feature](materialized-view) support where clause on create materialized view (#17534)
support where clause on create materialized view
2023-03-22 11:25:13 +08:00
Pxl
401836f523 [Bug](planner) fix core dump when lateral view above union node and have predicate (#17912)
fix core dump when lateral view above union node and have predicate
2023-03-22 11:24:45 +08:00
dd53bc1c8d [unify type system](remove unused type desc) remove some code (#17921)
There are many type definitions in BE. Should unify the type system and simplify the development.



---------

Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-03-19 14:05:02 +08:00
9b7596f1c6 [Feature](Dynamic schema table) step1 support schema change expression (#17494)
1. introduce a new type `VARIANT` to encapsulate dynamic generated columns for hidding the detail of types and names of newly generated columns
2. introduce a new expression `SchemaChangeExpr` for doing schema change for extensibility
2023-03-13 15:12:42 +08:00
f9baf9c556 [improvement](scan) Support pushdown execute expr ctx (#15917)
In the past, only simple predicates (slot=const), and, like, or (only bitmap index) could be pushed down to the storage layer. scan process:

Read part of the column first, and calculate the row ids with a simple push-down predicate.
Use row ids to read the remaining columns and pass them to the scanner, and the scanner filters the remaining predicates.
This pr will also push-down the remaining predicates (functions, nested predicates...) in the scanner to the storage layer for filtering. scan process:

Read part of the column first, and use the push-down simple predicate to calculate the row ids, (same as above)
Use row ids to read the columns needed for the remaining predicates, and use the pushed-down remaining predicates to reduce the number of row ids again.
Use row ids to read the remaining columns and pass them to the scanner.
2023-03-10 08:35:32 +08:00
caacee253d [fix](olap)Crashing caused by IS NULL expression (#17463)
Issue Number: close #17462
2023-03-07 15:32:52 +08:00
94e9a226a6 [Bug](Block compression) Fix bug if config::compress_rowbatches=false then the block column values could be empty (#17325) 2023-03-03 10:31:12 +08:00
1244eed1cd [Opt](exec) opt the dispose nullable column logic (#17192) 2023-03-01 23:25:40 +08:00
08adf914f9 [improvement](vec) avoid creating a new column while filtering mutable columns (#16850)
Currently, when filtering a column, a new column will be created to store the filtering result, which will cause some performance loss。 ssb-flat without pushdown expr from 19s to 15s.
2023-02-21 09:47:21 +08:00
37d1519316 [WIP](dynamic-table) support dynamic schema table (#16335)
Issue Number: close #16351

Dynamic schema table is a special type of table, it's schema change with loading procedure.Now we implemented this feature mainly for semi-structure data such as JSON, since JSON is schema self-described we could extract schema info from the original documents and inference the final type infomation.This speical table could reduce manual schema change operation and easily import semi-structure data and extends it's schema automatically.
2023-02-11 13:37:50 +08:00
737c73dcf0 [Improvement](topn) order by key topn query optimization (#15663) 2023-02-06 15:36:05 +08:00
90b12143a3 [refactor](remove unused code) remove runtime tuple structure and useless utils class (#16237) 2023-01-30 16:45:14 +08:00