Previously, delete statements with conditions on value columns were only supported on duplicate tables. After the delete sign mechanism was introduced for batch delete, a delete statement with conditions on value columns on a unique table is transformed into the corresponding `insert into ..., __DELETE_SIGN__ select ...` statement. However, for unique tables with merge-on-write enabled, the overhead of inserting this data can be eliminated. So this PR adds the ability to use delete predicates on value columns for merge-on-write unique tables.
The DCHECK may not always hold in the case of vertical compaction.
Remove it so that DEBUG builds can run.
Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>
This issue was introduced by #22765. If #22765 is picked to 2.0, this PR also needs to be picked.
When the shuffle type is BUCKET_SHFFULE_HASH_PARTITIONED, data from multiple buckets may be sent to the same channel, so sending eos too early may cause data loss.
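The eos problem above can be sketched as follows. This is a minimal illustration, not the actual BE code: the `Channel` class and its methods are hypothetical names. It assumes the fix amounts to sending eos only after every bucket mapped to the channel has finished.

```python
class Channel:
    """Hypothetical channel shared by several buckets (illustration only)."""

    def __init__(self, bucket_ids):
        self.pending = set(bucket_ids)  # buckets that may still send rows
        self.rows = []
        self.eos_sent = False

    def send(self, bucket_id, row):
        # Once eos is sent, later rows for this channel would be lost.
        if self.eos_sent:
            raise RuntimeError("row dropped: eos already sent")
        self.rows.append((bucket_id, row))

    def finish_bucket(self, bucket_id):
        # Send eos only when the last bucket of this channel is done.
        self.pending.discard(bucket_id)
        if not self.pending:
            self.eos_sent = True
```

Sending eos as soon as the first bucket finishes would make `send` fail for the remaining buckets, which is the data loss described above.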
Update the capacity coefficient for calculating the backend load score:
1. Add the FE config entry `backend_load_capacity_coeficient` to allow setting the capacity coefficient manually;
2. Adjust how the capacity coefficient is calculated, as described below.
We emphasize disk usage when calculating the load score.
If a BE has a high used capacity percent, we should increase its load score.
So we increase the capacity coefficient with a BE's used capacity percent.
But this is not enough, for example when the tablets differ greatly in data size.
Then for the two BEs below, the load scores may be the same:
BE A: disk usage = 60%, replica number = 2000 (it contains the big tablets)
BE B: disk usage = 30%, replica number = 4000 (it contains the small tablets)
But what we want is: first move some big tablets from A to B; once their disk usages are close,
move some small tablets from B to A, so that finally both their disk usages and replica numbers
are close.
To achieve this, when the max difference between all BEs' disk usages is >= 30%, we set the capacity coefficient to 1.0 to remove the effect of the replica number. After the disk usage difference decreases, we decrease the capacity coefficient so that the replica number takes effect again.
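A minimal sketch of this rule, under stated assumptions: only the 30% threshold and the value 1.0 come from the description above. The function names, the `manual` parameter (standing in for `backend_load_capacity_coeficient`), the 0.5 to 1.0 ramp between 50% and 75% usage, and the score formula are illustrative assumptions, not the FE implementation.

```python
def capacity_coefficient(all_used_pct, be_used_pct, manual=None):
    """Return the weight of disk usage in a BE's load score."""
    if manual is not None:
        # Assumed: a manually configured coefficient overrides everything.
        return manual
    if max(all_used_pct) - min(all_used_pct) >= 30.0:
        # Large usage gap: ignore the replica number entirely.
        return 1.0
    # Assumed ramp: the coefficient grows with the BE's own used percent.
    if be_used_pct <= 50.0:
        return 0.5
    if be_used_pct >= 75.0:
        return 1.0
    return 0.5 + (be_used_pct - 50.0) / 50.0


def load_score(be_used_pct, replica_share, coefficient):
    """Assumed score: weighted sum of disk usage and replica share."""
    return coefficient * be_used_pct / 100.0 + (1.0 - coefficient) * replica_share
```

With BE A at 60% and BE B at 30%, the gap reaches the threshold, the coefficient becomes 1.0, and A scores higher than B regardless of replica counts, so tablets move from A to B first.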
Sometimes BEs are deployed on the same node as an HDFS DataNode, so a more reasonable BE selection policy can take advantage of HDFS short-circuit read as much as possible.
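One way such a policy could look is sketched below. It is an assumption for illustration only: `pick_backend` and its arguments are hypothetical, and a real policy would also have to balance load across BEs.

```python
def pick_backend(block_hosts, backends):
    """Prefer a BE that runs on a host holding a replica of the block.

    block_hosts: hosts of the DataNodes that store the block.
    backends: list of (backend_id, host) tuples.
    """
    hosts = set(block_hosts)
    local = [be for be in backends if be[1] in hosts]
    # Fall back to any BE when no co-located one exists.
    candidates = local or backends
    # Lowest id wins, only to keep the sketch deterministic.
    return min(candidates)
```

For example, `pick_backend(["h2", "h3"], [(1, "h1"), (2, "h2")])` selects the BE on `h2`, where a short-circuit read is possible.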
Currently, we only return an ambiguous "INTERNAL ERROR" to the user when a
load fails. This commit no longer hides the root cause.
Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>
Previously, the parquet writer exported decimal as byte-binary,
but those fields could not be imported into Hive.
Now decimal is exported as fixed-len-byte-array so that it can be imported into Hive directly.
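For reference, the Parquet format stores a DECIMAL backed by a fixed-length byte array as the unscaled integer in big-endian two's complement. The sketch below shows that encoding only. The helper name and its parameters are hypothetical and this is not the BE writer code.

```python
from decimal import Decimal


def decimal_to_fixed_len_bytes(value: Decimal, scale: int, length: int) -> bytes:
    """Encode `value` as a big-endian two's complement unscaled integer."""
    scaled = value.scaleb(scale)  # shift the decimal point right by `scale`
    if scaled != scaled.to_integral_value():
        raise ValueError("value has more fractional digits than scale")
    # Raises OverflowError if the unscaled value does not fit in `length` bytes.
    return int(scaled).to_bytes(length, byteorder="big", signed=True)
```

For example, `Decimal("12.34")` with scale 2 has the unscaled value 1234, which is `b"\x00\x00\x04\xd2"` in 4 bytes.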
Type check could not work because there was no expression in the plan.
Sink and scan have no expressions at all, so types cannot be checked.
This PR adds expressions to the logical sink so that type check works.
## Proposed changes
Refactor thoughts: close #22383
Descriptions about `enclose` and `escape`: #22385
## Further comments
2023-08-09:
It's a pity that experiments show the original way of parsing plain CSV is faster. Therefore, the refactor is only applied to enclose-related code. The plain CSV parser uses the original logic.
Some performance regression is unavoidable anyway. From the `CSV reader`'s perspective, the real weak point may be the column-writing behavior, as shown by the flame graph.
Trimming escape will be enabled after the fix #22411 is merged.
Cases that should be discussed:
1. When an incomplete enclose appears at the beginning of large-scale data, the line delimiter will be unreachable until EOF. Will the buffer become extremely large?
2. What if an infinite line occurs in this case? Essentially, `1.` is equivalent to this.
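The concern in these two cases can be illustrated with a small enclose-aware line scanner. This is a sketch for discussion, not the reader implemented in this PR: `find_line_end` and its default characters are assumptions.

```python
def find_line_end(buf: str, enclose: str = '"', escape: str = "\\",
                  line_delim: str = "\n") -> int:
    """Return the index of the first line delimiter outside an enclosed
    field, or -1 if the buffer does not contain a complete line yet."""
    in_enclose = False
    i = 0
    while i < len(buf):
        ch = buf[i]
        if in_enclose and ch == escape and i + 1 < len(buf):
            i += 2  # skip the escaped character
            continue
        if ch == enclose:
            in_enclose = not in_enclose
        elif ch == line_delim and not in_enclose:
            return i
        i += 1
    return -1  # caller has to read more data and scan again
```

With an unterminated enclose such as `'a,"b\nc\nd'`, every line delimiter is treated as field content and the function keeps returning -1, so the caller keeps growing its buffer until EOF.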
Only stream load is supported as a trial in this PR, to avoid too many unrelated changes. Docs will be added when `enclose` and `escape` are available for all kinds of load.