24 Commits

Author SHA1 Message Date
53ae24912f [vectorized](feature) support partition sort node (#19708) 2023-05-25 11:22:02 +08:00
8d7a9fd21b [refactor](exceptionsafe) add factory creator to some class (#18978)
Make vexprcontext, vexpr, function, query context, and runtimestate thread safe.


---------

Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-04-24 10:32:11 +08:00
63a76ed115 [refactor](exceptionsafe) disallow call new method explicitly (#18830)
Disallow calling new explicitly.
Force callers to use create_shared or create_unique, which return smart pointers.
Placement new is still allowed.
Following https://abseil.io/tips/42, add a factory method to every class.
I think we should follow this guide because if an exception is thrown in the new method, the program will terminate.

---------

Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-04-21 09:13:24 +08:00
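The factory-method rule described in 63a76ed115 can be sketched roughly as follows. This is only an illustration under assumptions: the class `SortBuffer` and its `capacity` member are hypothetical, and the code is not the macro or helper actually used in the codebase. It hides the ordinary `operator new`, keeps placement new public, and exposes `create_shared` and `create_unique`.

```cpp
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical class used only for illustration.
class SortBuffer {
public:
    explicit SortBuffer(std::size_t capacity) : _capacity(capacity) {}

    // Factory returning a shared_ptr. make_shared constructs the object with
    // the global placement new, so the private operator new is not involved.
    template <typename... Args>
    static std::shared_ptr<SortBuffer> create_shared(Args&&... args) {
        return std::make_shared<SortBuffer>(std::forward<Args>(args)...);
    }

    // Factory returning a unique_ptr. The plain `new` is legal here because
    // the private operator new is accessible inside the class.
    template <typename... Args>
    static std::unique_ptr<SortBuffer> create_unique(Args&&... args) {
        return std::unique_ptr<SortBuffer>(new SortBuffer(std::forward<Args>(args)...));
    }

    // Placement new stays public, as the commit message allows it.
    static void* operator new(std::size_t, void* ptr) noexcept { return ptr; }

    std::size_t capacity() const { return _capacity; }

private:
    // Private, so `new SortBuffer(...)` outside the class does not compile.
    static void* operator new(std::size_t size) { return ::operator new(size); }

    std::size_t _capacity;
};
```

With this shape, `SortBuffer::create_unique(16)` compiles, while a bare `new SortBuffer(16)` in calling code is rejected at compile time.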
e412dd12e8 [chore](build) Use include-what-you-use to optimize includes (PART II) (#18761)
Currently, there are some useless includes in the codebase. We can use a tool named include-what-you-use to optimize these includes. By using a strict include-what-you-use policy, we can get lots of benefits from it.
2023-04-19 23:11:48 +08:00
79c446c89f [enhancement](exception) Column filter/replicate supports exception safety (#18503) 2023-04-18 19:23:09 +08:00
f800ba8f4c [Exec](opt) Optimize function call for const columns (#18212) 2023-03-31 11:36:21 +08:00
78abb40fdc [improvement](string) throw exception instead of log fatal if string column exceed total size limit (#17989)
Throw an exception instead of logging fatal if a string column exceeds the total size limit, so that we can catch it and let the query fail instead of causing the BE to exit.
2023-03-27 08:55:26 +08:00
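The idea in 78abb40fdc, failing the query rather than the process, might look like the sketch below. Everything here is assumed for illustration: `StringColumn`, `try_append`, and the tiny `kMaxStringColumnBytes` limit are hypothetical names and values, not the real column implementation.

```cpp
#include <cstddef>
#include <exception>
#include <stdexcept>
#include <string>

// Hypothetical, deliberately tiny limit so the behaviour is easy to observe.
constexpr std::size_t kMaxStringColumnBytes = 16;

// Hypothetical column that stores all values in one contiguous byte buffer.
class StringColumn {
public:
    void append(const std::string& value) {
        if (_bytes.size() + value.size() > kMaxStringColumnBytes) {
            // Throw instead of aborting the process; the caller decides
            // how to fail the query.
            throw std::length_error("string column exceeds total size limit");
        }
        _bytes += value;
    }

    std::size_t byte_size() const { return _bytes.size(); }

private:
    std::string _bytes;
};

// Query-level wrapper: converts the exception into an error message so that
// only the current query fails.
bool try_append(StringColumn& column, const std::string& value, std::string* error) {
    try {
        column.append(value);
        return true;
    } catch (const std::exception& e) {
        *error = e.what();
        return false;
    }
}
```

The check runs before the buffer is modified, so a rejected append leaves the column unchanged.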
aab8dad191 [fix](sort) fix bug of sort (#17151)
The logic of topn and full sort is wrong when there are both offsets and limits: the offset is not considered when doing the max heap optimization, which leads to wrong results.
2023-02-27 10:55:12 +08:00
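To illustrate the bug class fixed in aab8dad191: a bounded max heap for `ORDER BY ... LIMIT limit OFFSET offset` has to keep `offset + limit` rows, not just `limit`, before the offset is skipped. The following is a minimal sketch over plain integers; the function name `top_n_with_offset` is hypothetical and the code is not taken from the sort node.

```cpp
#include <cstddef>
#include <queue>
#include <vector>

// Returns rows [offset, offset + limit) of the ascending order of `rows`,
// using a max heap bounded to offset + limit elements.
std::vector<int> top_n_with_offset(const std::vector<int>& rows,
                                   std::size_t offset, std::size_t limit) {
    const std::size_t heap_capacity = offset + limit;  // the offset must be counted
    if (heap_capacity == 0) {
        return {};
    }

    std::priority_queue<int> max_heap;  // top() is the largest retained value
    for (int value : rows) {
        if (max_heap.size() < heap_capacity) {
            max_heap.push(value);
        } else if (value < max_heap.top()) {
            max_heap.pop();
            max_heap.push(value);
        }
    }

    // Drain the heap from largest to smallest into ascending order.
    std::vector<int> sorted(max_heap.size());
    for (std::size_t i = sorted.size(); i > 0; --i) {
        sorted[i - 1] = max_heap.top();
        max_heap.pop();
    }

    if (offset >= sorted.size()) {
        return {};
    }
    return std::vector<int>(sorted.begin() + static_cast<std::ptrdiff_t>(offset),
                            sorted.end());
}
```

If the heap were bounded to `limit` alone, the rows that belong after the offset would already have been evicted by the time the offset is applied.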
d390e63a03 [enhancement](stream receiver) make stream receiver exception safe (#16412)
Make stream receiver exception safe.
Change get_block(block**) to get_block(block*, bool* eos) to unify stream semantics.
2023-02-07 12:44:20 +08:00
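The signature change in d390e63a03 can be pictured with the sketch below, where the caller owns the block and end of stream is reported through a separate flag. `Block`, `StreamReceiver`, and `add_block` are simplified, hypothetical stand-ins, not the real types.

```cpp
#include <deque>
#include <utility>
#include <vector>

// Hypothetical, simplified block: just a list of row values.
struct Block {
    std::vector<int> rows;
};

// Hypothetical receiver holding queued blocks.
class StreamReceiver {
public:
    void add_block(Block block) { _queue.push_back(std::move(block)); }

    // The caller owns `block`; the receiver fills it in place.
    // `*eos` is set to true once no more blocks are available.
    void get_block(Block* block, bool* eos) {
        if (_queue.empty()) {
            block->rows.clear();
            *eos = true;
            return;
        }
        *block = std::move(_queue.front());
        _queue.pop_front();
        *eos = false;
    }

private:
    std::deque<Block> _queue;
};
```

Because the caller never receives a pointer into the receiver's internal storage, an exception thrown between two calls cannot leave it holding a dangling block.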
f94a78ab4a [Fix](topn) fix wrong nullable cast for RowId column and use heapsorter for two phase read (#16399)
convert_nullable_flags does not contain nullable info for the RowID column, but valid_column_ids contains the RowID column, so the nullable flag will be undefined for the RowID column.
2023-02-03 20:49:45 +08:00
a7b030778a [fix](sort) fix heap-use-after-free error if sort with limit and is spilled (#16267) 2023-01-31 09:59:03 +08:00
3894de49d2 [Enhancement](topn) support two phase read for topn query (#15642)
This PR optimizes topn queries like `SELECT * FROM tableX ORDER BY columnA ASC/DESC LIMIT N`.

A TopN plan is composed of a SortNode and a ScanNode. When the user table is wide (100+ columns), the ORDER BY clause covers just a few columns, but the ScanNode still has to scan all the data from the storage engine even if the limit is very small. This may lead to a lot of read amplification. So in this PR I divide the TopN query into two phases:
1. In the first phase we only need to read `columnA`'s data from the storage engine, along with an extra RowId column called `__DORIS_ROWID_COL__`. The other columns are pruned from the ScanNode.
2. The second phase is placed in the ExchangeNode because it is the central node for topn nodes in the cluster. The ExchangeNode issues an RPC to the other nodes using the RowIds (sorted and limited by the SortNode) read in the first phase, and reads the matching rows one by one from the storage engine.

After the second phase read, the Block contains all the data needed for the query.
2023-01-19 10:01:33 +08:00
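The two phases described in 3894de49d2 can be sketched in a single process as follows. This is a toy model under assumptions: the table is an in-memory vector, the row id is simply the vector index, and the names `Row` and `two_phase_top_n` are hypothetical. The real change spans ScanNode, SortNode, and an RPC from the ExchangeNode.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical wide row: one sort key plus a payload standing in for the
// many columns that are not part of ORDER BY.
struct Row {
    int key;
    std::string payload;
};

std::vector<Row> two_phase_top_n(const std::vector<Row>& table, std::size_t limit) {
    // Phase 1: read only the sort key and a row id, then sort and limit.
    std::vector<std::pair<int, std::size_t>> keys;
    keys.reserve(table.size());
    for (std::size_t row_id = 0; row_id < table.size(); ++row_id) {
        keys.emplace_back(table[row_id].key, row_id);
    }
    const std::size_t n = std::min(limit, keys.size());
    std::partial_sort(keys.begin(), keys.begin() + static_cast<std::ptrdiff_t>(n),
                      keys.end());

    // Phase 2: fetch the full rows for the surviving row ids only.
    std::vector<Row> result;
    result.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        result.push_back(table[keys[i].second]);
    }
    return result;
}
```

Only `limit` payloads are ever copied, which is the read-amplification saving the commit message describes.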
049f8ad2f9 [Bug](sort)fix merge sorter might div zero when block bytes less than block rows (#15859)
If a block's bytes are smaller than the corresponding block's rows, then avg_size_per_row would be zero, which would end up dividing by zero in the following logic.
2023-01-13 18:33:40 +08:00
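The failure mode in 049f8ad2f9 comes from integer division: when a block has fewer bytes than rows, `bytes / rows` truncates to zero and a later division by that average faults. One possible guard is sketched below. The function `rows_per_buffer` and its parameters are hypothetical, and clamping the average to 1 is an assumption for illustration, not necessarily the fix that was merged.

```cpp
#include <algorithm>
#include <cstddef>

// Estimates how many rows fit into a buffer of `buffer_bytes`, based on the
// average row size of a sample block.
std::size_t rows_per_buffer(std::size_t block_bytes, std::size_t block_rows,
                            std::size_t buffer_bytes) {
    if (block_rows == 0) {
        return 0;  // nothing to estimate from
    }
    // Integer division truncates to 0 when block_bytes < block_rows,
    // so clamp the average to at least one byte per row.
    const std::size_t avg_size_per_row =
            std::max<std::size_t>(block_bytes / block_rows, 1);
    return buffer_bytes / avg_size_per_row;
}
```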
730571e386 [fix](sort spill) fix bug of failed to create spilled file (#15864)
Also increase buffered block size when it has started to spill.
2023-01-13 09:23:26 +08:00
8f31a36429 [feature] support spill to disk for sort node (#15624) 2023-01-11 08:40:58 +08:00
9d1f02c580 [Improvement](topn) runtime prune for topn query (#15558) 2023-01-05 20:10:12 +08:00
af54299b26 [Pipeline](projection) Support projection on pipeline engine (#15220) 2022-12-21 15:47:29 +08:00
8c0e13ab51 [improvement](profile) add detail memory counter for exec nodes (#14806)
* [improvement](profile) improve accuracy of memory usage and add detail memory counter

* fix
2022-12-05 11:51:52 +08:00
dd11d5c0a5 [enhancement](memory) Support try catch bad alloc (#14135) 2022-11-13 11:22:56 +08:00
035657c5a1 [typo](comment) Fix a lot of spell errors in be comments (#14208)
fix typos.
2022-11-12 16:06:15 +08:00
ac037e57f5 [fix](sort)the sort expr's nullability property may not be right (#13328) 2022-10-18 22:09:02 +08:00
1ba9e4b568 [Improvement](sort) Reuse memory in sort node (#12921) 2022-09-28 09:44:35 +08:00
3cfaae0031 [Improvement](sort) Use heap sort to optimize sort node (#12700) 2022-09-21 10:01:52 +08:00
c05d736331 [Improvement](sort) fallback to partial sort small block if topN is small (#12604)
2022-09-16 10:20:17 +08:00