This pr makes three changes to the display of complex types:
1. NULL value in complex types refers to being displayed as `null`, not `NULL`
2. struct type is displayed as "column_name": column_value
3. Time types such as `datetime` and `date`, are displayed with double quotes in complex types. like
`{1, "2023-10-26 12:12:12"}`
This pr also do a code refactor:
1. nesting_level is set to a member variable of the `DataTypeSerDe`, rather than a parameter in methods.
What's more, this pr fix a bug that fileSize is not correct, introduced by this pr: #25854
fix shrink char column in map/struct
before we has char with specific length defined in map or struct field
we select map or struct , the char column in which one has been padding if we just insert less than specific length chars
but in mysql here just show inserted chars not padding specific length chars
---------
Co-authored-by: yiguolei <676222867@qq.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
This function requires one arguments just as ARRAY_AGG(col) and col means the column whose values you want to aggregate.
This function Aggregates the values including NULL in a column into an array and returns a value of the ARRAY data type.
This function can be used to replace bitmap_union(to_bitmap(expr)), because bitmap_union(to_bitmap(expr)) need create many many small bitmaps firstly and then merge them into a single bitmap.
bitmap_agg will convert the column value into a bitmap directly. Its performance is better than bitmap_union(to_bitmap(expr)) . In our test , there is about 30% improvement.
New aggregation function: map_agg.
This function requires two arguments: a key and a value, which are used to build a map.
select map_agg(column1, column2) from t group by column3;
In the previous implementation, the check for groupby exprs was ignored. Add this necessary check to make sure it would work
You could reproduce it by runnning belowing sql:
CREATE TABLE t_push_filter_through_agg (col1 varchar(11451) not null, col2 int not null, col3 int not null)
UNIQUE KEY(col1)
DISTRIBUTED BY HASH(col1)
BUCKETS 3
PROPERTIES(
"replication_num"="1"
);
CREATE VIEW `view_i` AS
SELECT
`b`.`col1` AS `col1`,
`b`.`col2` AS `col2`
FROM
(
SELECT
`col1` AS `col1`,
sum(`cost`) AS `col2`
FROM
(
SELECT
`col1` AS `col1`,
sum(CAST(`col3` AS INT)) AS `cost`
FROM
`t_push_filter_through_agg`
GROUP BY
`col1`
) a
GROUP BY
`col1`
) b;
SELECT SUM(`total_cost`) FROM view_a WHERE `dt` BETWEEN '2023-06-12' AND '2023-06-18' LIMIT 1;
Since slot that reference to constant has been marked as constant expr either, just add condition check to make sure such slot wouldn't be eliminated as constant from group exprs
1. the agg function without distinct keyword should be a "merge" funcion in threePhaseAggregateWithDistinct
2. use aggregateParam.aggMode.consumeAggregateBuffer instead of aggregateParam.aggPhase.isGlobal() to indicate if a agg function is a "merge" function
3. add an AvgDistinctToSumDivCount rule to support avg(distinct xxx) in some case
4. AggregateExpression's nullable method should call inner function's nullable method.
5. add a bind slot rule to bind pattern "logicalSort(logicalHaving(logicalProject()))"
6. don't remove project node in PhysicalPlanTranslator
7. add a cast to bigint expr when count( distinct datelike type )
8. fallback to old optimizer if bitmap runtime filter is enabled.
9. fix exchange node mem leak