This is part of the array type support and has not been fully completed.
The following functions are implemented
1. fe array type support and implementation of array function, support array syntax analysis and planning
2. Support import array type data through insert into
3. Support select array type data
4. Only the array type is supported on the value lie of the duplicate table
this pr merge some code from #4655#4650#4644#4643#4623#2979
1. support in/bloomfilter/minmax
2. support broadcast/shuffle/bucket shuffle/colocate join
3. opt memory use and cpu cache miss while build runtime filter
4. opt memory use in left semi join (works well on tpcds-95)
The cause of the problem is that after query cancel, OlapScanNode::transfer_thread still continues to schedule
OlapScanNode::scanner_thread until all tasks are scheduled.
Although each task does not scan data and exits quickly, it still consumes a lot of resources.
(Guess)This may be the cause of the BUG (#5767) causing the I/O to be full.
So after query cancel, immediately exit the scheduling loop in transfer_thread, and after waiting for
the end of all scanner_threads, transfer_thread will also exit.
1.
Add timer to count the time the transfer thread waits for the scaner thread to return rowbatch.
2.
Add timer to count the time that the scanner thread waits for the available worker threads in the thread pool.
Co-authored-by: chenmingyu <chenmingyu@baidu.com>
At present, the application of vlog in the code is quite confusing.
It is inherited from impala VLOG_XX format, and there is also VLOG(number) format.
VLOG(number) format does not have a unified specification, so this pr standardizes the use of VLOG
Scanner threads may be running and using the member vars of OlapScanNode,
when the OlapScanNode has already destroyed.
We can use `_running_thread` to be the last accessed member variable.
And `transfer_thread` need to wait for `_running_thread==0`.
After `transfer_thread` joined, `OlapScanNode::close()` can continue.
Introduced by PR #5051.
As @liutang123 said, when PlanFragmentExecutor is destructed, it will call
`close -> ExecNode::close -> OlapScanNode::close`. OlapScanNode will wait for `_transfer_thread`.
`_transfer_thread` will wait for all OlapScanner processing to complete.
OlapScanner is processed by the scanner thread. When the last scanner processing is completed,
`_transfer_thread` will break out of the loop, and PlanFragmentExecutor will continue to destruct.
And if it is completed, its RuntimeProfile::Counter will also be destructed.
At this time, the ScopedTimer in the Scan thread may still use this Counter when it is destructed.
So we must make sure that the timer is deconstructed before deconstructing the runtime profile.
And Refactor ColumnRangeValue and OlapScanNode
This patch mainly do the following:
- Fix issue #5071
- Change type_min in ColumnRangeValue as static
- Add Class of type_limit make code clear
- Refactor the function of normalize_in_and_eq_predicate
mainly includes:
- `OLAP_SCAN_NODE` profile layering: `OLAP_SCAN_NODE`,`OlapScanner`, and `SegmentIterator`.
- Delete meaningless statistical values. mainly in scan_node.cpp.
- Increase `RowsConditionsFiltered` statistical, split from `RowsDelFiltered`, the meaning is the number of rows filtered by various column indexes, only in segment V2.
- Modify the document based on the above, and enhance readability.
We make all MemTrackers shared, in order to show MemTracker real-time consumptions on the web.
As follows:
1. nearly all MemTracker raw ptr -> shared_ptr
2. Use CreateTracker() to create new MemTracker(in order to add itself to its parent)
3. RowBatch & MemPool still use raw ptrs of MemTracker, it's easy to ensure RowBatch & MemPool destructor exec
before MemTracker's destructor. So we don't change these code.
4. MemTracker can use RuntimeProfile's counter to calc consumption. So RuntimeProfile's counter need to be shared
too. We add a shared counter pool to store the shared counter, don't change other counters of RuntimeProfile.
Note that, this PR doesn't change the MemTracker tree structure. So there still have some orphan trackers, e.g. RowBlockV2's MemTracker. If you find some shared MemTrackers are little memory consumption & too time-consuming, you could make them be the orphan, then it's fine to use the raw ptr.
Replace some boost to std in OlapScanNode.
This refactor seems solve the problem describe in #3929.
Because I found that BE will crash to calling `boost::condition_variable.notify_all()`.
But after upgrade to this, BE does not crash any more.
This CL mainly changes:
1. Add a new BE config `max_pushdown_conditions_per_column` to limit the number of conditions of a single column that can be pushed down to storage engine.
2. Add 2 new session variables `max_scan_key_num` and `doris_max_scan_key_num` which can set in session level and overwrite the config value in BE.
This CL mainly made the following modifications:
1. Delete Invalid MemoryUsed Counter and Add PeakMemUsage in each exec node and datastreamsender
2. Add intent in child execnode profile,make it is easily to know the relationship between execnode
3. Del _is_result_order we not support any more in olap_scan_node.h and olap_scan_node.cpp
4. Add scan_disk method to olap_scanner to fix the counter _num_disks_accessed_counter
5. Now we do not use buffer pool to read and write disk, so annotation eadio counter and
6. Delete the MemUsed counter in exec node.
This CL mainly made the following modifications:
1. Delete Invalid method in Running Profile Class.
2. Move Memlimit Counter from blockmgr to fragment and add PeakMemUsage Counter
3. Fix the bug of buffer pool memlimit counter
4. Call compute_time_in_profile() before pretty_print() to show the _local_time_percent without child running profile
5. Add TransferThread ThreadToken count in AveThreadToken Counter
Remove unused LLVM related codes of directory:be/src/exec (#2910)
there are many LLVM related codes in code base, but these codes are not really used.
The higher version of GCC is not compatible with the LLVM 3.4.2 version currently used by Doris.
The PR delete all LLVM related code of directory: be/src/exec.
1. MemTableFlushExecutor maintain a ThreadPool to receive FlushTask.
2. FlushToken is used to seperate different tasks from different tablets.
Every DeltaWriter of tablet constructs a FlushToken,
task in FlushToken are handle serially, task between FlushToken are
handle concurrently.
3. I have remove thread limit on data_dir, because of I/O is not the main
timer consumer of Flush thread. Much of time is consumed in CPU decoding
and compress.
For #2589
1. date(uint24_t)/datetime(int64_t)/largeint(int128_t) use frame of reference code as dict.
2. decimal(decimal12_t) also uses frame of reference code as dict.
3. float/double use bitshuffle code as dict.
[STORAGE][SEGMENTV2]
use block split bloom filter
build bloom filter against data page
add distinct value to bloom filter
add ordinal index to bloom filter index
* Change Null-safe equal operator from cross join to hash join
ISSUE-2136
This commit change the join method from cross join to hash join when the equal operator is Null-safe '<=>'.
It will improve the speed of query which has the Null-safe equal operator.
The finds_nulls field is used to save if there is Null-safe operator.
The finds_nulls[i] is true means that the i-th equal operator is Null-safe.
The equal function in hash table will return true, if both val and loc are NULL when finds_nulls[i] is true.
NOTE: This patch would modify all Backend's data.
And this will cause a very long time to restart be.
So if you want to interferer your product environment,
you should upgrade backend one by one.
1. Refactoring be is to clarify the structure the codes.
2. Use unique id to indicate a rowset.
Nameing rowset with tablet_id and version will lead to
many conflicts among compaction, clone, restore.
3. Extract an rowset interface to encapsulate rowsets
with different format.