Add negative test case for agg table.
Fix agg table support for the REPLACE agg type on complex types; now we only support complex types in an agg table when the agg type is REPLACE.
Fix test output.
Consider `A left join B on A.x = B.x and A.y = B.y`.
Both B.x and B.y make the result tuple count scale out.
Suppose A is scaled out by B.x N1 times and by B.y N2 times, with N1 < N2.
We should choose N1, not N2, as the final scale-out factor.
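A minimal sketch of the intuition (the row counts below are made up for illustration):
```sql
-- Illustration only: suppose each A row matches 2 B rows through A.x = B.x
-- (N1 = 2) and 5 B rows through A.y = B.y (N2 = 5). Both equalities must
-- hold at the same time, so the rows matching both are a subset of the rows
-- matching A.x = B.x alone; the output cannot expand by more than the
-- smaller factor, hence the estimate should use N1 = 2, not N2 = 5.
SELECT *
FROM A LEFT JOIN B
  ON A.x = B.x AND A.y = B.y;
```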
This PR impacts TPC-DS sf100 queries 59/17/29/25/47/40/54.
Before:

| Query   | Cold run (ms) | Hot run 1 (ms) | Hot run 2 (ms) | Best hot run (ms) |
|---------|---------------|----------------|----------------|-------------------|
| query59 | 77295         | 75279          | 75230          | 75230             |
| query17 | 22642         | 21566          | 21599          | 21566             |
| query29 | 16508         | 16092          | 16006          | 16006             |
| query25 | 20262         | 20571          | 21171          | 20571             |
| query47 | 23571         | 23264          | 23107          | 23107             |
| query40 | 3305          | 2849           | 3064           | 2849              |
| query54 | 9052          | 8882           | 8715           | 8715              |

Total cold run time: 172635 ms
Total hot run time: 168044 ms
After:

| Query   | Cold run (ms) | Hot run 1 (ms) | Hot run 2 (ms) | Best hot run (ms) |
|---------|---------------|----------------|----------------|-------------------|
| query59 | 56435         | 54717          | 53919          | 53919             |
| query17 | 24167         | 22377          | 23237          | 22377             |
| query29 | 16950         | 18325          | 16333          | 16333             |
| query25 | 21478         | 22975          | 21358          | 21358             |
| query47 | 24412         | 24611          | 23920          | 23920             |
| query40 | 5491          | 4779           | 5176           | 4779              |
| query54 | 8671          | 8664           | 8658           | 8658              |

Total cold run time: 157604 ms
Total hot run time: 151344 ms
Revert "[feature](function) add json->operator convert to json_extract (#19899)"
because it conflicts with the lambda syntax.
This reverts commit f54a068d82e88e8535f3ed55a4224886b752e46b.
In the old code, when using the desc command to view the table schema,
it is displayed as follows:
```
ARRAY<TINYINT(4)>
ARRAY<SMALLINT(6)>
ARRAY<INT(11)>
ARRAY<BIGINT(20)>
ARRAY<LARGEINT(40)>
```
However, for normal integer types the display width is not shown,
so I changed it to the following:
```
ARRAY<TINYINT>
ARRAY<SMALLINT>
ARRAY<INT>
ARRAY<BIGINT>
ARRAY<LARGEINT>
```
In the context of join reorder, when a new plan is generated, it may include a project operation. In this case, the newly generated join root and the original join root will no longer be in the same group. To avoid inconsistencies in the statistics between these two groups, we keep the child group's row count unchanged when the parent group expression is a project operation.
Sometimes the first run of a query takes longer than the given threshold, in which case the test fails.
Also add a new session variable test_query_cache_hit
so that we can use it to check whether the cache is hit in regression tests.
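A minimal usage sketch for a regression test; the accepted values of the variable and its exact failure behavior are assumptions of this sketch, not taken from this PR:
```sql
-- Enable the SQL cache and declare that the next query is expected to hit it
-- (the value 'sql_cache' and the semantics on a cache miss are assumptions).
SET enable_sql_cache = true;
SET test_query_cache_hit = 'sql_cache';
SELECT count(*) FROM t;
```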
The UNIX_TIMESTAMP function's date format parameter now supports 'yyyy-MM-dd HH:mm:ss'.
The implementation is the same as that of the date_format function.
before:
```sql
mysql> select UNIX_TIMESTAMP('2023-09-18 00:00:00','yyyy-MM-dd HH:mm:ss');
+--------------------------------------------------------------+
| unix_timestamp('2023-09-18 00:00:00', 'yyyy-MM-dd HH:mm:ss') |
+--------------------------------------------------------------+
| NULL |
+--------------------------------------------------------------+
1 row in set (0.04 sec)
```
now:
```sql
mysql> select UNIX_TIMESTAMP('2023-09-18 00:00:00','yyyy-MM-dd HH:mm:ss');
+--------------------------------------------------------------+
| unix_timestamp('2023-09-18 00:00:00', 'yyyy-MM-dd HH:mm:ss') |
+--------------------------------------------------------------+
| 1694966400                                                   |
+--------------------------------------------------------------+
1 row in set (0.01 sec)
```
Two improvements:
1. Move the `Job_id` column in the return info of the `Analyze table` command to the first column, to keep it consistent with `show analyze`.
```
mysql> analyze table hive.tpch100.region;
+--------+--------------+-------------------------+------------+--------------------------------+
| Job_Id | Catalog_Name | DB_Name | Table_Name | Columns |
+--------+--------------+-------------------------+------------+--------------------------------+
| 14403 | hive | default_cluster:tpch100 | region | [r_regionkey,r_comment,r_name] |
+--------+--------------+-------------------------+------------+--------------------------------+
1 row in set (0.03 sec)
```
2. Add an `analyze_timeout` session variable to control the timeout of `analyze table/database with sync` (a usage sketch follows).
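A minimal usage sketch (the timeout value is illustrative, and treating it as seconds is an assumption):
```sql
-- Raise the analyze timeout before running a synchronous analyze
-- (value and unit are illustrative assumptions).
SET analyze_timeout = 3600;
ANALYZE TABLE hive.tpch100.region WITH SYNC;
```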
Since we have three pieces of infrastructure that ensure changing the input column order
does not lead to wrong results, we can remove this flag on LogicalProject to
eliminate projects as much as possible and keep the code clean:
1. output list in ResultSink node
2. regular children output in SetOperation node
3. producer to consumer slot id map in CteConsumer
The current CTE common filter extraction doesn't work if the filters can be aggregated, so the common filter can't be pushed down inside the CTE. Consider the following case:
with main as (select c1 from t1) select * from (select m1.* from main m1, main m2 where m1.c1 = m2.c1) abc where c1 = 1;
The common `c1 = 1` filter can't be pushed down.
This PR changes the original extraction logic from a set to a list so that it works, which also makes the pattern in TPC-DS query4/query11 work as expected.
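Conceptually, once the common predicate is extracted it can be pushed into the CTE producer; the query above then behaves roughly like the hand-rewritten form below (a sketch for illustration, not the actual plan output):
```sql
-- Sketch: with c1 = 1 recognized as a filter common to both consumers,
-- it can be applied inside the CTE producer itself.
with main as (select c1 from t1 where c1 = 1)
select * from (select m1.* from main m1, main m2 where m1.c1 = m2.c1) abc;
```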
Previously, when querying a Hive table in ORC format whose file is split,
the result of select count(*) could be a multiple of the real row number.
This is because the number of rows should be obtained after ORC stripe pruning;
otherwise, a wrong result may be returned.
1. Change from using string matching functions to using Expr matching.
2. Replace the `nvl` function with `ifnull` when pushing down to MySQL (see the sketch after this list).
3. Adapt ClickHouse's `from_unixtime` function for pushdown.
4. Non-function filters can still be pushed down when `enable_func_pushdown` is set to false.
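A hedged sketch of item 2; the external catalog and table names are hypothetical:
```sql
-- mysql_ext.db1.tbl stands for a MySQL external table. A Doris query written
-- with nvl can still be pushed down because the pushed-down predicate is
-- rewritten to use ifnull, which MySQL understands.
SELECT * FROM mysql_ext.db1.tbl WHERE nvl(k1, 0) > 10;
-- pushed to MySQL roughly as: SELECT ... FROM tbl WHERE ifnull(k1, 0) > 10
```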
Support `insert into table values(...)` for Nereids,
with SQL like:
insert into t values(1, 2, 3)
insert into t values(1 + 1, dayofweek(now()), 4), (4, 5, 6)
insert into t values('1', '6.5', cast(1.5 as int))
1. Change the external hive docker network mode from bridge mode to host mode to support external tests of a multi-node Doris cluster.
2. Add more hive test data in various formats.
3. Add a test case with hive.
Currently we raise an error when dealing with leading zeros in a decimal value: allowing `type_precision >= precision` makes the value overflow and the DCHECK fail, so when there are leading zeros we should only allow `type_precision > precision` to keep the value right.
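A hypothetical illustration of the kind of input involved (the literal and target type are made up for this example):
```sql
-- Illustration only: the string contains 8 digit characters because of the
-- leading zeros, yet the actual value 123.45 fits into DECIMAL(5, 2) and
-- should not be rejected or trigger a DCHECK.
SELECT CAST('000123.45' AS DECIMAL(5, 2));
```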