Currently, the broadcast join cost is estimated from the compressed data size.
The amount of memory used may be significantly more than estimated.
This patch:
1. Add a compression ratio to the broadcast join cost and set it to 5 based on experience.
2. Add a new session variable `auto_broadcast_join_threshold` to limit the memory used by broadcast joins, in bytes; the default value is 1073741824 (1 GB), as shown in the sketch below.
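A minimal usage sketch, assuming the standard Doris `SET` / `SHOW VARIABLES` session-variable syntax; the default value shown is the one documented above:

```sql
-- Check the current limit (1073741824 bytes = 1 GB by default).
SHOW VARIABLES LIKE 'auto_broadcast_join_threshold';

-- Lower the limit for the current session, e.g. to 256 MB.
SET auto_broadcast_join_threshold = 268435456;
```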
1. We forgot to check the disk capacity when writing data.
2. TODO: the user-specified disk capacity is not used yet. We need to find a way to use it.
3. Avoid printing too many compaction logs when there is no suitable version for compaction.
Fixed #8850
A column in an inline view may be a function call instead of a SlotRef,
so when this column is used as the input of an explode function,
it can't be converted to a SlotRef directly.
The correct way is to treat it as an Expr and extract the required SlotRefs for materialization.
For example:
```sql
with d as (select k1+k1 as k1_plus from table)
select k1_plus from d explode_split(k1_plus, ",")
```
FnExpr: SlotRef<k1_plus>
SubstituteFnExpr: FunctionCallExpr<k1+k1>
originSlotRefList: SlotRef<k1>
Add [IF NOT EXISTS] / [IF EXISTS] support to the following statements (see the example after the list):
- CREATE [IF NOT EXISTS] USER
- CREATE [IF NOT EXISTS] ROLE
- DROP [IF EXISTS] USER
- DROP [IF EXISTS] ROLE
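A hedged sketch of the extended syntax; the user and role names here are hypothetical:

```sql
-- Creating a user or role that may already exist no longer fails.
CREATE USER IF NOT EXISTS 'jack'@'%' IDENTIFIED BY '123456';
CREATE ROLE IF NOT EXISTS analyst;

-- Dropping a user or role that may not exist no longer fails.
DROP USER IF EXISTS 'jack'@'%';
DROP ROLE IF EXISTS analyst;
```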
At Meituan, PR #6625 was reverted due to an OOM problem.
Currently we are trying to improve the old HyperLogLog implementation; based on PR #8555, we did some work.
In our tests it performs better than the old HLL and better than the apache:master HLL.
Changes summary:
- use SIMD max to speed up the heavy function `_merge_registers`
- use `phmap::flat_hash_set` rather than `std::set`
- replace `std::max`
- other small changes
When restoring a table with dynamic partition properties, 'dynamic_partition.enable' is set to the value it had at backup time,
but Doris does not turn dynamic partitioning on automatically when restoring.
So we could see a table that never performs dynamic partitioning even though 'dynamic_partition.enable' is set to 'true'. A hedged workaround sketch follows.
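As an illustration only, assuming the restored table is named `tbl1` (a hypothetical name), dynamic partition scheduling can be re-applied explicitly after the restore:

```sql
-- Re-set the property so the dynamic partition scheduler picks the table up again.
ALTER TABLE tbl1 SET ("dynamic_partition.enable" = "true");
```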
To reproduce:
When `enable_low_cardinality_optimize = true`, running the following SQL query on the TPC-H dataset causes a core dump:
```sql
select count(*) from lineitem where l_comment = 'ously even exc';
```
This SQL triggers the execution of `ColumnDictionary::convert_to_predicate_column_if_dictionary`, where `res->reserve(_codes.size())` is problematic: the current `_codes.size()` is smaller than the size that actually needs to be reserved, so inserting values into the `PredicateColumn` causes a core dump.
HTTP v2 has been tested in production and can completely replace the old HTTP code. To simplify code maintenance, remove the old HTTP code.
When restoring a snapshot from 0.13 to master, the restore job stays pending for a long time,
and we get the error "Could not set meta version to 93 since it is lower than minimum required version 100" in the log.
We should cancel the restore job as soon as we get that error.
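As a side note, and only as a hedged sketch (the database name is hypothetical), the stuck job can also be cancelled manually while this fix is not in place:

```sql
-- Cancel the pending restore job in the given database.
CANCEL RESTORE FROM example_db;
```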
The Repeat node changes the fragment's data partition,
so the output partition of the child fragment differs from the data partition of the current fragment.
When judging whether colocate can be enabled,
the current fragment's data partition should be used directly instead of the child's output partition. A hedged example of such a query follows.
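A hypothetical illustration (the schema is made up, not taken from the original report) of a query whose plan contains a Repeat node above a join, which is where the colocate decision matters:

```sql
-- The ROLLUP introduces a Repeat node, so the data partition of this fragment
-- no longer matches the output partition of the child fragment that runs the join.
select t1.tc1, t1.tc2, sum(t1.tc3) as total
from t1 join t2 on t1.tc1 = t2.id
group by rollup(t1.tc1, t1.tc2);
```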
Before this fix, queries with rollup and concurrency greater than 1 may return incorrect results.
For example:
```sql
select t1.tc1,t1.tc2,sum(t1.tc3) as total from t1 join[shuffle] t1 t2 on t1.tc1=t2.tc1
group by rollup(tc1,tc2) order by t1.tc1,t1.tc2,total;
```
Fixed #8778
The patch fixes two problems.
1. A memory ordering problem when accessing `_last_patch_processed_finished` and `in_flight`; `_last_patch_processed_finished` is actually redundant, so the patch removes it.
2. Synchronization in `join` on `cid`.
Fix for #8725.