## Proposed changes
Issue Number: close#35635
cherry-pick https://github.com/apache/doris/pull/35637 from master to
branch-2.1
<!--Describe your changes.-->
Cast rules:Consistent with mysql.
String:Date
The first part is 1-digit x: 000x-month-day
The first part is 2-digit xy: 20xy-month-day / 19xy-month-day The first
part is 3-digit xyz: 20xy-0z-day / 19xy-0z-day The first part is 4-digit
xyzw: xyzw-month-day
## Proposed changes
Issue Number: close #xxx
<!--Describe your changes.-->
## Proposed changes
https://github.com/apache/doris/pull/37349
error code
```C++
return creator_without_type::create<AggregateFunctionForEach>(transform_arguments, true,
nested_function);
```
"transform_arguments is an internal type of array. All internal types of
the array are null, so an array that is not null was mistakenly treated
as a null array."
## Proposed changes
Issue Number: close #xxx
<!--Describe your changes.-->
pick (#37078)
In previous versions, we adopted the strategy of reading the object
address for Oracle's raw type, which would lead to unstable and
meaningless results. Here I changed it to read hexadecimal or UTF8
pick https://github.com/apache/doris/pull/37062
1. revert https://github.com/apache/doris/pull/25097. we decide to rely
on OS. not maintain independent tzdata anymore to keep result
consistency
2. refactor timezone load. removed rwlock.
before:
```sql
mysql [optest]>select count(convert_tz(d, 'Asia/Shanghai', 'America/Los_Angeles')), count(convert_tz(dt, 'America/Los_Angeles', '+00:00')) from dates;
+-------------------------------------------------------------------------------------+--------------------------------------------------------+
| count(convert_tz(cast(d as DATETIMEV2(6)), 'Asia/Shanghai', 'America/Los_Angeles')) | count(convert_tz(dt, 'America/Los_Angeles', '+00:00')) |
+-------------------------------------------------------------------------------------+--------------------------------------------------------+
| 16000000 | 16000000 |
+-------------------------------------------------------------------------------------+--------------------------------------------------------+
1 row in set (6.88 sec)
```
now:
```sql
mysql [optest]>select count(convert_tz(d, 'Asia/Shanghai', 'America/Los_Angeles')), count(convert_tz(dt, 'America/Los_Angeles', '+00:00')) from dates;
+-------------------------------------------------------------------------------------+--------------------------------------------------------+
| count(convert_tz(cast(d as DATETIMEV2(6)), 'Asia/Shanghai', 'America/Los_Angeles')) | count(convert_tz(dt, 'America/Los_Angeles', '+00:00')) |
+-------------------------------------------------------------------------------------+--------------------------------------------------------+
| 16000000 | 16000000 |
+-------------------------------------------------------------------------------------+--------------------------------------------------------+
1 row in set (2.61 sec)
```
3. now don't support timezone offset format string like 'UTC+8', like we
already said in
https://doris.apache.org/docs/dev/query/query-variables/time-zone/#usage
4. support case-insensitive timezone parsing in nereids.
5. a bug when parse timezone using nereids. should check DST by input,
but wrongly by now before. now fixed.
doc pr: https://github.com/apache/doris-website/pull/810
## Proposed changes
```
mysql [test]>set DEBUG_SKIP_FOLD_CONSTANT = true;
Query OK, 0 rows affected (0.00 sec)
mysql [test]>select cast(unix_timestamp("2024-01-01",'yyyy-MM-dd') as bigint);
+------------------------------------------------------------+
| cast(unix_timestamp('2024-01-01', 'yyyy-MM-dd') as BIGINT) |
+------------------------------------------------------------+
| 1704038400000000 |
+------------------------------------------------------------+
```
now
```
mysql [test]>select cast(unix_timestamp("2024-01-01",'yyyy-MM-dd') as bigint);
+------------------------------------------------------------+
| cast(unix_timestamp('2024-01-01', 'yyyy-MM-dd') as BIGINT) |
+------------------------------------------------------------+
| 1704038400 |
+------------------------------------------------------------+
1 row in set (0.01 sec)
```
The column does not have a scale set, but the cast uses the scale to
perform the cast.
<!--Describe your changes.-->
## Proposed changes
Issue Number: close #xxx
<!--Describe your changes.-->
pick from #37235
the algorithm for computing stats for "expr1 and expr2" predicate is as
following:
1. compute output stats of expr1 based on input stats. the result stats
is denoted by leftStats
2. compute stats of expr2 based on leftStats after step1, leftStats
should be normalized to avoid abnormal cases, such as ndv > rowCount or
numNulls > rowCount
Issue Number: close #xxx
<!--Describe your changes.-->
## Proposed changes
Issue Number: close #xxx
<!--Describe your changes.-->