4ca0c0face
[fix](join) fix wrong result of right join ( #18365 )
...
When processing data in the hash table for right join and full outer join, if the output rows of one hash bucket exceed the batch size, the logic for continuing to process that bucket is wrong; it should differentiate between the join types.
2023-04-06 10:55:58 +08:00
e5793249cd
[opt](hashtable) Modify default filled strategy to 75% ( #18242 )
2023-03-31 09:28:11 +08:00
d27201f331
[fix](nested_loop_join)got incorrect result from nested loop join without condition ( #18139 )
2023-03-28 16:20:05 +08:00
78abb40fdc
[improvement](string) throw exception instead of log fatal if string column exceed total size limit ( #17989 )
...
Throw an exception instead of logging a fatal error if a string column exceeds the total size limit, so that the exception can be caught and the query fails, instead of causing the BE to exit.
2023-03-27 08:55:26 +08:00
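The idea behind #17989 above can be illustrated with a minimal sketch. It is not the Doris code: StringColumn, Status, run_query_step, and the tiny kMaxStringColumnBytes limit are hypothetical names chosen for the illustration. The column throws when the limit would be exceeded, and the caller turns the exception into a failed status so that only the query fails.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical limit, kept tiny so the failure path is easy to trigger.
constexpr std::size_t kMaxStringColumnBytes = 16;

// Hypothetical string column that stores all values in one byte buffer.
struct StringColumn {
    std::string chars;

    void insert(const std::string& value) {
        // Throw instead of aborting the process, so a caller can recover.
        if (chars.size() + value.size() > kMaxStringColumnBytes) {
            throw std::length_error("string column exceeds total size limit");
        }
        chars += value;
    }
};

// Hypothetical status type standing in for a query-level error result.
struct Status {
    bool ok;
    std::string msg;
};

// Catches the exception at the query boundary: the query fails, but the
// process keeps running.
Status run_query_step(StringColumn& col, const std::string& value) {
    try {
        col.insert(value);
        return {true, ""};
    } catch (const std::exception& e) {
        return {false, e.what()};
    }
}
```

With this shape, an oversized insert leaves the column unchanged and surfaces as a failed Status rather than a process exit.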
7d91114304
[fix](join) fix wrong result of null aware left anti join ( #17752 )
2023-03-14 09:35:46 +08:00
93a865c3e8
[improvement](join) Avoid reading from left child while hash table is empty(right join) ( #17655 )
...
When the right (build) side is empty in a right outer join, there is no need to read data from the left child.
2023-03-13 09:03:17 +08:00
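A minimal sketch of the short-circuit described in #17655 above, under stated assumptions: JoinOp and can_skip_probe are hypothetical names, not Doris identifiers. The commit only mentions right outer join; the other join types in the switch are added here on the basis of join semantics (an empty build side cannot contribute any output row for them).

```cpp
#include <cstddef>

// Hypothetical join-type enum for the illustration.
enum class JoinOp {
    INNER,
    LEFT_OUTER,
    RIGHT_OUTER,
    FULL_OUTER,
    LEFT_SEMI,
    LEFT_ANTI,
    RIGHT_SEMI,
    RIGHT_ANTI
};

// Returns true when the probe (left) child does not need to be read at all
// because the build (right) side is empty and the result must be empty.
bool can_skip_probe(JoinOp op, std::size_t build_rows) {
    if (build_rows != 0) {
        return false;
    }
    switch (op) {
    case JoinOp::RIGHT_OUTER:  // the case named in the commit
    case JoinOp::INNER:        // assumption: no build rows, no matches
    case JoinOp::LEFT_SEMI:    // assumption: nothing can match
    case JoinOp::RIGHT_SEMI:   // assumption: output comes from build rows
    case JoinOp::RIGHT_ANTI:   // assumption: output comes from build rows
        return true;
    default:
        // LEFT_OUTER, FULL_OUTER, and LEFT_ANTI still emit probe rows.
        return false;
    }
}
```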
00727e8c11
[fix](in-bitmap) fix result may be wrong if the left side of the in bitmap predicate is a constant ( #17570 )
2023-03-09 10:59:05 +08:00
1244eed1cd
[Opt](exec) opt the dispose nullable column logic ( #17192 )
2023-03-01 23:25:40 +08:00
e22a9ecc3b
[enhancement](execute model) using thread pool to execute report or join task instead of starting too many threads ( #17212 )
...
* [enhancement](execute model) using thread pool to execute report or join task instead of starting too many threads
Doris starts a report thread and a join thread during fragment execution. Creating and destroying threads very frequently causes many problems: jemalloc may not behave well, and it may crash.
jemalloc/jemalloc#1405
It is better to use a thread pool for these tasks.
---------
Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-03-01 08:35:27 +08:00
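The approach in #17212 above, reusing a fixed set of worker threads instead of creating one thread per task, can be sketched as follows. This is an illustrative fixed-size pool built on the C++ standard library only; SimpleThreadPool and run_report_tasks are hypothetical names and do not reflect the thread pool class Doris uses.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical fixed-size pool: threads are created once and reused.
class SimpleThreadPool {
public:
    explicit SimpleThreadPool(std::size_t num_threads) {
        assert(num_threads > 0);
        for (std::size_t i = 0; i < num_threads; ++i) {
            workers_.emplace_back([this] { work_loop(); });
        }
    }

    // Drains the remaining tasks, then joins every worker.
    ~SimpleThreadPool() {
        {
            std::lock_guard<std::mutex> lock(mu_);
            stopping_ = true;
        }
        cv_.notify_all();
        for (std::thread& t : workers_) {
            t.join();
        }
    }

    SimpleThreadPool(const SimpleThreadPool&) = delete;
    SimpleThreadPool& operator=(const SimpleThreadPool&) = delete;

    void submit(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(mu_);
            tasks_.push(std::move(task));
        }
        cv_.notify_one();
    }

private:
    void work_loop() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mu_);
                cv_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
                if (tasks_.empty()) {
                    return;  // stopping was requested and the queue is drained
                }
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();  // run outside the lock so other workers can proceed
        }
    }

    std::mutex mu_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> tasks_;
    bool stopping_ = false;
    std::vector<std::thread> workers_;
};

// Hypothetical usage: run num_tasks small tasks on the pool and return how
// many have completed once the pool has been destroyed.
int run_report_tasks(std::size_t num_threads, int num_tasks) {
    std::atomic<int> completed{0};
    {
        SimpleThreadPool pool(num_threads);
        for (int i = 0; i < num_tasks; ++i) {
            pool.submit([&completed] { ++completed; });
        }
    }  // the destructor drains the queue and joins the workers
    return completed.load();
}
```

In this sketch, 100 tasks on a pool of 2 create only 2 threads in total, which is the property the commit is after.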
a1c0054b4c
[fix](memory) fix memory GC details and join probe catch bad_alloc ( #16989 )
...
On Red Hat 4.x, /proc/meminfo has no MemAvailable field; in that case, do not use MemAvailable to control memory.
Record vm_rss_str and mem_available_str when GC is triggered, so that memory changes during GC do not cause inaccurate logs.
Catch bad_alloc in join probe, which may allocate 64 GB of memory at a time, to avoid OOM.
Modify the names of doris_be_all_segments_num and doris_be_all_rowsets_num in the documentation.
2023-02-23 08:33:30 +08:00
fb0d08ff4c
[fix](mark join) fix bug of mark join with other conjuncts ( #16655 )
...
Fix a bug where probe_index is not incremented for mark hash join with other conjuncts.
2023-02-14 14:47:15 +08:00
f71fc3291f
[Bug](fix) right anti join error result when batch size is low ( #16510 )
2023-02-08 17:26:19 +08:00
f6a20f844b
[fix](hashjoin) join produce blocks with rows larger than batch size: handle join with other conjuncts ( #16402 )
2023-02-08 14:26:35 +08:00
91229bb87d
[Bug](mark join) Fix mark join with other conjuncts ( #16435 )
2023-02-07 09:31:41 +08:00
696c6ffcc5
[fix](join) crash caused by canceling query ( #16311 )
...
If the query was canceled, the status in the shared context may be `OK` while other fields are not set.
2023-02-02 09:55:37 +08:00
bf16228851
[fix](hashjoin) join produce blocks with rows larger than batch size ( #16166 )
...
* [fix](hashjoin) join produce blocks with rows larger than batch size
* fix
2023-02-01 16:02:31 +08:00
46347a51d2
[Bug](exec) enable warning on ignoring function return value for vctx ( #16157 )
...
* enable warning on ignoring function return value for vctx
2023-01-29 17:23:21 +08:00
79ad74637d
[refactor](remove expr) remove non vectorized Expr and ExprContext related codes ( #16136 )
2023-01-24 10:45:35 +08:00
9f106161a7
[Bug](join) Fix null aware anti join error in fuzzy mode ( #15987 )
2023-01-17 11:32:16 +08:00
97fcad76f8
[enhancement](memtracker) Improve readability ( #15716 )
2023-01-16 16:30:35 +08:00
9468711f9f
[Bug](join) fix bug null aware left anti join not correct result ( #15841 )
2023-01-13 10:18:05 +08:00
d857b4af1b
[refactor](remove row batch) remove impala rowbatch structure ( #15767 )
...
* [refactor](remove row batch) remove impala rowbatch structure
Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-01-11 09:37:35 +08:00
9c0f96883a
[fix](hashjoin) Fix right join pull output block memory overflow ( #15440 )
...
For outer join / right outer join / right semi join, when HashJoinNode::pull->process_data_in_hashtable outputs a block, it writes all rows of a key in the hash table into the block. Only after the output of a key is complete does it check whether the block size exceeds the batch size, and if so, it stops the output.
If a key has more than 20 million rows, memory overflow occurs when subsequent block operations are performed on those rows.
2023-01-10 10:10:43 +08:00
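The problem described in #15440 above suggests checking the batch size per row rather than per key, and remembering the position inside a key between calls. The sketch below illustrates that idea only; Bucket, OutputCursor, and next_block are hypothetical names, and a vector of key/rows pairs stands in for the real hash table.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a build-side hash table entry: a key plus every
// build row that shares that key.
using Bucket = std::pair<std::string, std::vector<int>>;

// Hypothetical cursor remembering where the previous call stopped, so the
// next call resumes in the middle of a key instead of restarting it.
struct OutputCursor {
    std::size_t bucket = 0;  // index of the key currently being emitted
    std::size_t row = 0;     // next row to emit within that key
};

// Emits at most batch_size rows per call. The size check happens per row,
// not per key, so one very large key cannot inflate a single block.
std::vector<int> next_block(const std::vector<Bucket>& table,
                            OutputCursor& cur, std::size_t batch_size) {
    assert(batch_size > 0);
    std::vector<int> block;
    block.reserve(batch_size);
    while (cur.bucket < table.size() && block.size() < batch_size) {
        const std::vector<int>& rows = table[cur.bucket].second;
        if (cur.row < rows.size()) {
            block.push_back(rows[cur.row++]);
        } else {
            ++cur.bucket;  // this key is exhausted, move on to the next one
            cur.row = 0;
        }
    }
    return block;  // an empty block means the table has been fully emitted
}
```

A caller would invoke next_block repeatedly with the same cursor until it returns an empty block; every block stays within batch_size regardless of how many rows a single key holds.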
9c36278c4a
[improvement](pipeline) Support sharing hash table for broadcast join ( #15628 )
2023-01-06 15:11:28 +08:00
05d72e8919
[fix](join) fix anti join incorrectly outputs null values ( #15567 )
2023-01-06 09:55:48 +08:00
5ff5b8fc98
[feature](mark join) Support mark join for hash join node ( #15569 )
...
* [feature](mark join) Support mark join for hash join node
2023-01-05 09:32:26 +08:00
10be583e52
[chore](pipeline) optimize profile information ( #15433 )
2022-12-30 09:56:33 +08:00
06f71f2bca
[pipeline](fix) Fix bugs to pass all regression cases ( #15306 )
...
* [pipeline](fix) Fix bugs to pass all regression cases
* update
* update
2022-12-23 22:17:50 +08:00
b085ff49f0
[refactor](non-vec) delete non-vec data sink ( #15283 )
...
* [refactor](non-vec) delete non-vec data sink
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-12-23 14:10:47 +08:00
8ecf69b09b
[pipeline](regression) nested loop join test get error result in pipeline engine and refactor the code for need more input data ( #15208 )
2022-12-21 19:03:51 +08:00
af54299b26
[Pipeline](projection) Support projection on pipeline engine ( #15220 )
2022-12-21 15:47:29 +08:00
732417258c
[Bug](pipeline) Fix bugs to pass TPCDS cases ( #15194 )
2022-12-20 22:29:55 +08:00
5cf21fa7d1
[feature](planner) mark join to support subquery in disjunction ( #14579 )
...
Co-authored-by: Gabriel <gabrielleebuaa@gmail.com>
2022-12-20 15:22:43 +08:00
7c67fa8651
[Bug](pipeline) fix bug of right anti join error result in pipeline ( #15165 )
2022-12-19 19:28:44 +08:00
0732f31e5d
[Bug](pipeline) Fix bugs for scan node and join node ( #15164 )
...
* [Bug](pipeline) Fix bugs for scan node and join node
* update
2022-12-19 15:59:29 +08:00
874acdf68f
[vectorized](join) add try catch in create thread ( #15065 )
2022-12-16 19:55:09 +08:00
284a3351f4
[Refactor](exec) refactor the code of datasink eos logic ( #15009 )
2022-12-13 15:33:08 +08:00
68092fe514
[pipeline](NLJ) support nested loop join for pipeline ( #14966 )
2022-12-10 00:20:16 +08:00
9d36931038
[Refactor](NLJ) refactor the nested loop join node ( #14911 )
...
* [Refactor](NLJ) refactor the nested loop join node
* change the logic of alloc/release resource
2022-12-09 14:10:26 +08:00
0c817e6b3a
[Pipeline](hashjoin) Support hash join on pipeline engine ( #14898 )
2022-12-08 15:43:02 +08:00
8c0e13ab51
[improvement](profile) add detail memory counter for exec nodes ( #14806 )
...
* [improvement](profile) improve accuracy of memory usage and add detail memory counter
* fix
2022-12-05 11:51:52 +08:00
176f519fa1
[enhancement](memtracker) Optimize exec node memory tracking ( #14711 )
2022-12-01 14:52:21 +08:00
b4d32a0c44
[fix](join) runtime filter shared from other instance wasn't published ( #14717 )
2022-12-01 14:17:23 +08:00
6c70d794f6
[fix](bitmapfilter) fix core dump caused by bitmap filter ( #14702 )
2022-12-01 09:56:22 +08:00
7513c82431
[NLJoin](conjuncts) separate join conjuncts and general conjuncts ( #14608 )
2022-11-29 08:55:54 +08:00
78adecac1b
[enhancement](be) optimize mem usage in join and set node ( #14602 )
2022-11-27 13:38:49 +08:00
4728e75079
[feature](bitmap) Support in bitmap syntax and bitmap runtime filter ( #14340 )
...
1. Support in bitmap syntax, like 'where k1 in (select bitmap_column from tbl)';
2. Support bitmap runtime filter. Generate a bitmap filter using the right table bitmap and push it down to the left table storage layer for filtering.
2022-11-25 15:22:44 +08:00
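The bitmap runtime filter described in #14340 above can be sketched roughly as follows. This is an illustration under assumptions, not the Doris implementation: BitmapFilter, build_filter, and scan_with_filter are hypothetical names, and a dense std::vector<bool> stands in for a compressed bitmap, so it is only suitable for small ids.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical dense bitmap over small non-negative ids. Memory grows with
// the largest id added, so this is a stand-in for a compressed bitmap.
class BitmapFilter {
public:
    void add(std::uint32_t v) {
        if (v >= bits_.size()) {
            bits_.resize(static_cast<std::size_t>(v) + 1, false);
        }
        bits_[v] = true;
    }

    bool contains(std::uint32_t v) const {
        return v < bits_.size() && bits_[v];
    }

private:
    std::vector<bool> bits_;
};

// Build step: collect the right-side values into a filter.
BitmapFilter build_filter(const std::vector<std::uint32_t>& right_values) {
    BitmapFilter filter;
    for (std::uint32_t v : right_values) {
        filter.add(v);
    }
    return filter;
}

// Scan step: the left side applies the pushed-down filter and keeps only
// the keys present in the bitmap, before any join work happens.
std::vector<std::uint32_t> scan_with_filter(
        const std::vector<std::uint32_t>& left_keys,
        const BitmapFilter& filter) {
    std::vector<std::uint32_t> kept;
    for (std::uint32_t k : left_keys) {
        if (filter.contains(k)) {
            kept.push_back(k);
        }
    }
    return kept;
}
```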
9103ded1dd
[improvement](join)optimize sharing hash table for broadcast join ( #14371 )
...
This PR makes sharing the hash table for broadcast join more robust:
* Add a session variable to enable/disable this function.
* Do not block the hash join node's close function.
* Use shared pointers to share the hash table and runtime filter among broadcast join nodes.
* A hash join node that doesn't need to build the hash table closes the right child without reading any data (the child will close the corresponding sender).
2022-11-24 21:06:44 +08:00
6c7f758ef7
[improvement](hashjoin) support partitioned hash table in hash join ( #14480 )
2022-11-24 14:16:47 +08:00
2c42f0a905
[refactor](decimalv3) Refine code for DecimalV3 ( #14394 )
2022-11-19 16:57:17 +08:00