* [Improvement](meta) add a default_value column to the result of the show_variables stmt
* add a Changed column to show whether the value has been modified
* fix code style issue
if we write SQL like: select array(1.0,2.0,null, null,2.0)
the null arguments are passed to the BE with type UINT8, which does not match the array() function signature with DECIMAL and makes the BE core dump. So on the BE side we should cast these null arguments, keeping the null flag, to the DECIMAL type.
Iceberg has its own metadata, which includes count statistics for the table data. If the table does not contain equality deletes, we can get the row count of the current table directly from these count statistics.
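A minimal sketch of this shortcut using the Iceberg Java API, assuming the snapshot summary carries the standard `total-records` and `total-equality-deletes` properties; the helper class and the fallback behavior are illustrative, not the exact code of this change:

```java
import org.apache.iceberg.Snapshot;
import org.apache.iceberg.SnapshotSummary;
import org.apache.iceberg.Table;

import java.util.Map;
import java.util.Optional;

public class IcebergRowCountHelper {
    /**
     * Try to read the table row count from the current snapshot summary.
     * Returns empty if the table has equality deletes (the plain count would
     * be inaccurate) or if the statistics are missing, so the caller can fall
     * back to a real scan.
     */
    public static Optional<Long> rowCountFromStats(Table table) {
        Snapshot snapshot = table.currentSnapshot();
        if (snapshot == null) {
            return Optional.empty();
        }
        Map<String, String> summary = snapshot.summary();
        long eqDeletes = Long.parseLong(
                summary.getOrDefault(SnapshotSummary.TOTAL_EQ_DELETES_PROP, "0"));
        if (eqDeletes > 0) {
            // Equality deletes logically remove rows from the data files,
            // so the raw count statistic cannot be trusted.
            return Optional.empty();
        }
        String totalRecords = summary.get(SnapshotSummary.TOTAL_RECORDS_PROP);
        return totalRecords == null
                ? Optional.empty()
                : Optional.of(Long.parseLong(totalRecords));
    }
}
```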
1. should not use ((LogicalPlanAdapter)parsedStmt).getStatementContext().getOriginStatement().originStmt.toLowerCase() as the cache key (do not invoke toLowerCase()). For example, select * from tbl1 where k1 = 'a' is different from select * from tbl1 where k1 = 'A', so the two statements must not hit the same cache entry.
2. according to issue 6735, the cache key should contain the DDL SQL of all referenced views (including nested views).
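A hypothetical sketch of what such a cache key could look like (the class and method names are illustrative, not the actual cache code): keep the original statement text case-sensitive and append the DDL of every referenced view, including nested ones, so that redefining any view produces a different key.

```java
import java.util.List;

public class SqlCacheKeyBuilder {
    /**
     * Build a cache key from the original statement text (kept case-sensitive,
     * no toLowerCase()) plus the DDL of every view the query references,
     * including nested views.
     */
    public static String buildCacheKey(String originStmt, List<String> viewDdls) {
        StringBuilder key = new StringBuilder(originStmt);
        for (String ddl : viewDdls) {
            key.append('|').append(ddl);
        }
        return key.toString();
    }
}
```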
should use getHashJoinConjuncts() and getOtherJoinConjuncts() to get the hash conjuncts and other conjuncts of the hash join node, instead of categorizing them by checking whether an expression is an 'EqualTo' expression
slot bind failed for the following queries:
select tpch.lineitem.* from lineitem
select tpch.lineitem.l_partkey from lineitem
the unbound slot is tpch.lineitem.l_partkey, but the bound slot is default_cluster:tpch.lineitem.l_partkey, so they do not match.
we need to ignore the default_cluster: prefix when comparing dbName
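A minimal sketch of the comparison fix, assuming the fully qualified name simply carries a `default_cluster:` prefix; the helper names are hypothetical:

```java
public class DbNameMatcher {
    // Illustrative constant: Doris prefixes database names with the cluster
    // name, e.g. "default_cluster:tpch".
    private static final String DEFAULT_CLUSTER_PREFIX = "default_cluster:";

    /** Strip the default cluster prefix so "tpch" and "default_cluster:tpch" compare equal. */
    private static String stripDefaultCluster(String dbName) {
        return dbName.startsWith(DEFAULT_CLUSTER_PREFIX)
                ? dbName.substring(DEFAULT_CLUSTER_PREFIX.length())
                : dbName;
    }

    public static boolean dbNameMatches(String unboundDbName, String boundDbName) {
        return stripDefaultCluster(unboundDbName).equals(stripDefaultCluster(boundDbName));
    }
}
```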
Problem:
When executing group_concat with order by inside a view, the column cannot be found during analysis.
Example:
create view if not exists test_view as select group_concat(c1,',' order by c1 asc) from table_group_concat;
select * from test_view;
it will return an error like: "can not find c1 in table_list"
Reason:
When we execute this SQL in a multi-instance environment, the planner tries to create a plan with multi-phase
aggregation. Because test_view is analyzed independently of the tables outside the view, we cannot get
the table information inside the view.
Solution:
Substitute the order by expressions of the merge aggregation expressions.
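A self-contained toy sketch of the substitution step (the Expr/SlotRef types and the map are hypothetical stand-ins for the planner's real expression classes): before the merge aggregation is built, each order by expression inside group_concat is replaced with the corresponding slot of the intermediate tuple, so the merge phase no longer has to resolve the view's columns.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MergeAggOrderBySubstitution {
    /** Hypothetical expression node. */
    interface Expr {}

    static final class SlotRef implements Expr {
        final String name;
        SlotRef(String name) { this.name = name; }
        @Override public String toString() { return name; }
    }

    /** Replace every expression found in the map with its substitute (e.g. an intermediate slot). */
    static List<Expr> substitute(List<Expr> orderByExprs, Map<Expr, Expr> smap) {
        List<Expr> result = new ArrayList<>();
        for (Expr e : orderByExprs) {
            result.add(smap.getOrDefault(e, e));
        }
        return result;
    }

    public static void main(String[] args) {
        SlotRef c1 = new SlotRef("c1");                    // column from inside the view
        SlotRef intermediate = new SlotRef("<slot 3> c1"); // slot of the merge agg's input tuple
        Map<Expr, Expr> smap = new HashMap<>();
        smap.put(c1, intermediate);
        List<Expr> orderBy = List.of(c1);
        // order by c1 inside group_concat(...) now refers to the intermediate slot.
        System.out.println(substitute(orderBy, smap));
    }
}
```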
when an insert into table command runs on a follower node, it is forwarded to the master node. The parsed statement is not set on the cascades context but is set on executor::parsedStmt, so we use the latter to get the user info.
Previously, delete statements with conditions on value columns were only supported on duplicate tables. After we introduced the delete sign mechanism for batch delete, a delete statement with conditions on value columns on a unique table is transformed into the corresponding insert into ..., __DELETE_SIGN__ select ... statement. However, for unique tables with merge-on-write enabled, the overhead of inserting these data can be eliminated. So this PR adds the ability to allow delete predicates on value columns for merge-on-write unique tables.
update the capacity coefficient for calculating the backend load score:
1. Add FE config entry `backend_load_capacity_coeficient` to allow setting the capacity coefficient manually;
2. Adjust the calculation of the capacity coefficient as below.
We emphasize disk usage when calculating the load score.
If a BE has a high used capacity percent, we should increase its load score,
so we increase the capacity coefficient with the BE's used capacity percent.
But this is not enough. For example, if the tablets differ greatly in data size,
then the following two BEs may end up with the same load score:
BE A: disk usage = 60%, replica number = 2000 (it contains the big tablets)
BE B: disk usage = 30%, replica number = 4000 (it contains the small tablets)
But what we want is: first move some big tablets from A to B; after their disk usages are close,
then move some small tablets from B to A, so that finally both their disk usages and replica numbers
are close.
To achieve this, when the max difference between all BEs' disk usages is >= 30%, we set the capacity coefficient to 1.0 to avoid the effect of the replica number. After the disk usage difference decreases, we then decrease the capacity coefficient so that the replica number becomes effective again.
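A simplified sketch of the coefficient logic described above (the clamp range, the linear growth, and the method signature are illustrative assumptions; the manual override corresponds to the `backend_load_capacity_coeficient` config):

```java
public class CapacityCoefficient {
    /**
     * Compute the capacity coefficient used to weight disk usage in the load score.
     *
     * @param usedCapacityPercent this BE's used disk percent, in [0.0, 1.0]
     * @param maxUsedPercentDiff  max disk-usage difference across all BEs, in [0.0, 1.0]
     * @param manualCoefficient   value of the FE config; <= 0 means "not set"
     */
    public static double capacityCoefficient(double usedCapacityPercent,
                                             double maxUsedPercentDiff,
                                             double manualCoefficient) {
        // 1. The config overrides everything when set.
        if (manualCoefficient > 0) {
            return Math.min(manualCoefficient, 1.0);
        }
        // 2. Large usage skew between BEs: only disk usage matters,
        //    so the replica count cannot mask the imbalance.
        if (maxUsedPercentDiff >= 0.3) {
            return 1.0;
        }
        // 3. Otherwise grow the coefficient with this BE's own usage,
        //    clamped so the replica number still contributes to the score.
        return Math.max(0.5, Math.min(1.0, usedCapacityPercent));
    }
}
```

The load score itself would then mix disk usage and replica count weighted by this coefficient, which is why forcing it to 1.0 removes the replica-count term while the imbalance is large.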
Sometimes the BEs are deployed on the same nodes as HDFS DataNodes, so we can use a more reasonable BE selection policy to take advantage of HDFS short-circuit reads as much as possible.
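A hedged sketch of such a locality-aware selection policy (the class, method, and parameter names are illustrative, not the actual scheduler code): prefer a BE whose host appears among the HDFS block's replica hosts so the read can go through short-circuit local read, and fall back to any available BE otherwise.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;

public class LocalReadBackendSelector {
    /**
     * Pick a backend host for a scan range: prefer one co-located with the
     * HDFS block's replicas, otherwise fall back to the first available backend.
     */
    public static Optional<String> chooseBackend(List<String> backendHosts,
                                                 Set<String> blockReplicaHosts) {
        return backendHosts.stream()
                .filter(blockReplicaHosts::contains)
                .findFirst()
                .or(() -> backendHosts.stream().findFirst());
    }
}
```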