in previous [PR 12872](https://github.com/apache/doris/pull/12872), we convert multi equals on same slot into `in predicate`. for example, `a =1 or a = 2` => `a in (1, 2)`
This pr makes 4 changes about convert or to in:
1. fix a bug: `Not IN` is merged with equal. `a =1 or a not in (2, 3)` => `a in (1, 2, 3)`
2. extends this rule on more cases
- merge for more than one slot: 'a =1 or a = 2 or b = 3 or b = 4' => `a in (1, 2) or b in (3, 4)`
- merge skip not-equal and not-in: 'a =1 or a = 2 or b !=3 or c not in (1, 2)' => 'a in (1, 2) or b!=3 or c not in (1,2)`
3. rewrite recursively.
4. OrToIn is implemented in ExtractCommonFactorsRule. This rule will generate new exprs. OrToIn should apply on such generated exprs. for example `(a=1 and b=2) or (a=3 and b=4)` => `(a=1 or a=3) and (b=2 or b=4) and [(a=1 and b=2) or (a=3 and b=4)]` => `a in (1,3) and b in (2 ,4) and [(a=1 and b=2) or (a=3 and b=4)]`
In addition, this pr add toString() for some Expr.
if a materialized view is selected, the olap scan node's NonUserVisibleOutput property may contains column from other materialized view. This pr remove invalid column
1. Remove some redundant code.
2. Fix the issue with the state of MTMV task.
3. Fix the case - test_create_mtmv.
## Problem summary
1. We used a retry policy to re-run the failed MTMV tasks, but we set the state to `FAILURE` during re-running the tasks.
We should do this after all the retry runs fail.
2. There are some redundant code can be removed.
3. In the case test_create_mtmv, we created many background tasks to refresh the data. Some task may fail due to the concurrency and cause the test fail. Actually, we only need single one task to verify the functionality.
For now, type information of child expr which is NullLiteral would get lost in the CastExpr#getResultValue, this will produce a NullLiteral with Null type which cause BE core when doing cast
we have some rules that change output's nullable in rewrite step. So we need a rule to adjust nullable at the end of rewrite step.
TODO
- remove the output slot map
- add nullable compare into slot reference
- use exprid to compare two slot if do not need to compare nullable
- merge all rules into one to adjust all type plans
example:
unbound slot k
bounded [k, t.k]
In previous binding algorithm, there are 2 candidate bindings,
in which bounded k is exactly matched unbound slot k, it has higher priority than that of t1.k
Because of the rollup has the same keys and the keys's order is same, BE will do linked schema change. The base tablet's segments will link to the new rollup tablet. But the unique id from the base tablet is starting from 0 and as the rollup tablet also. In this case, the unique id 4 in the base table is column 'city', but in the rollup tablet is 'cost'. It will decode the varcode page to bigint page so that be coredump. It needs to be rejected.
I think that if a rollup add by link schema change, it means this rollup is redundant. It brings no additional revenue and wastes storage space. So It needs to be rejected.
Reorder in analyze phase would produce a stmt which its corresponding SQL could not be analyzed correctly and cause an analyze exception that may be happened in the stmt rewrite, since the rewriter will reset and reanalyze the rewritten stmt.
The inferface of hive metastore changes from version to version.
Currently, Doris use hive 2.3.7 as hms client version.
When using to connect hive 1.x, some interface such as get_table_req does not exist
in hive 1.x. So we can't get metadata from hive 1.x.
In this PR, I copied the HiveMetastoreClient from hive 2.3.7 release, and modify some of interface's
implementation, so that it will use old interface to connect to hive 1.x.
And when creating hms catalog, you can specify the hive version, eg:
CREATE CATALOG `hive` PROPERTIES (
"hive.metastore.uris" = "thrift://127.0.0.1:9083",
"type" = "hms",
"hive.version" = "1.1"
);
If hive.version does not specified, Doris will use hive 2.3.x compatible interface to visit hms.
ComputeSignatureHelper.upgradeDateOrDateTimeToV2()
we upgrate return date type, but forget to upgrade arguments datatype.
The same problem in upgradeDecimalV2ToV3()
case 1
`select count(1) from t1 join t2 on t1.a = t2.a group by a`
`group by a` is ambiguous
case 2
`select t1.a from t1 join t2 on t1.a = t2.a group by a`
`group by a` is bound on t1.a