Previously, when executing `create table hive.db.table as select` to create a table in a hive catalog,
if the current catalog was not a hive catalog, the default engine name was filled with `olap`, which is wrong.
This PR fills in the default engine name based on the specified catalog.
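
A minimal sketch of the idea, not the actual FE code: the engine is chosen from the catalog named in the table identifier, and the session's current catalog is only a fallback. The catalog type strings, the lookup table and both helper functions are assumptions made for illustration.

```python
# Hypothetical mapping from catalog type to default engine name.
DEFAULT_ENGINE_BY_CATALOG_TYPE = {"internal": "olap", "hms": "hive"}


def target_catalog(table_name: str, current_catalog: str) -> str:
    """Return the catalog named in a `catalog.db.table` identifier,
    falling back to the current catalog when none is given."""
    parts = table_name.split(".")
    return parts[0] if len(parts) == 3 else current_catalog


def default_engine(table_name: str, current_catalog: str, catalog_types: dict) -> str:
    """Pick the default engine from the specified catalog, not the current one."""
    catalog = target_catalog(table_name, current_catalog)
    return DEFAULT_ENGINE_BY_CATALOG_TYPE[catalog_types[catalog]]


# Session is in the internal catalog, but the CTAS targets the hive catalog.
catalog_types = {"internal": "internal", "hive": "hms"}
engine = default_engine("hive.db.table", "internal", catalog_types)
```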
Before this fix, the join node retained some slots that were neither materialized nor required.
The join node needs to remove these slots so that they do not become output slots.
Signed-off-by: nextdreamblue <zxw520blue1@163.com>
The partition id changes on insert overwrite.
When a materialized view task runs while the base table is undergoing an insert overwrite, the task may report an error: partition not found by partitionId
Upgrade compatibility: Hive currently does not support automatic refresh, so it has no impact
Problem:
When current_date is used as the input of functions like date_sub,
constant folding fails because the function signature is missing in the Planner
Solved:
Add the complete function signatures for functions like date_sub
Slot mapping is used for materialized view rewriting.
Given the same relation mapping, the slot mapping is the same.
Optimize the slot mapping generation logic:
cache the slot mapping in the materialization context, keyed by relation mapping.
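
The caching described above could look roughly like the following sketch. The class name, the type aliases and `toy_generate` are hypothetical stand-ins, not the real Nereids classes; the only point shown is that an identical relation mapping key reuses the previously generated slot mapping.

```python
from typing import Callable, Dict, FrozenSet, Tuple

# Hypothetical stand-ins: a relation mapping is a set of
# (query_relation, view_relation) pairs; a slot mapping maps
# query slots to view slots.
RelationMapping = FrozenSet[Tuple[str, str]]
SlotMapping = Dict[str, str]


class MaterializationContextSketch:
    def __init__(self, generate: Callable[[RelationMapping], SlotMapping]):
        self._generate = generate
        self._cache: Dict[RelationMapping, SlotMapping] = {}
        self.generate_calls = 0  # counts how often the expensive path runs

    def slot_mapping(self, relation_mapping: RelationMapping) -> SlotMapping:
        cached = self._cache.get(relation_mapping)
        if cached is None:
            self.generate_calls += 1
            cached = self._generate(relation_mapping)
            self._cache[relation_mapping] = cached
        return cached


def toy_generate(relation_mapping: RelationMapping) -> SlotMapping:
    # Toy generator: pretend every relation has a single slot named "id".
    return {f"{q}.id": f"{v}.id" for q, v in relation_mapping}


ctx = MaterializationContextSketch(toy_generate)
key = frozenset({("orders", "mv_orders"), ("lineitem", "mv_lineitem")})
first = ctx.slot_mapping(key)
second = ctx.slot_mapping(key)  # served from the cache
```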
The materialization SQL pattern needs to be checked in different abstract rules when rewriting by materialized view,
such as the subclasses of AbstractMaterializedViewJoinRule, MaterializedViewScanRule, AbstractMaterializedViewAggregateRule.
The check result can be cached once it has been computed, which avoids unnecessary repeated checks.
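
As a rough illustration only, the check result could be remembered per rule on the object describing the materialization. `StructInfoSketch`, the rule-name key and the toy check functions are assumptions for this sketch, not the actual classes.

```python
class StructInfoSketch:
    """Hypothetical holder of a materialization plan and its check results."""

    def __init__(self, plan_kind: str):
        self.plan_kind = plan_kind
        self._pattern_check = {}  # rule name -> bool
        self.checks_run = 0

    def pattern_supported(self, rule_name: str, check) -> bool:
        # Run the check only the first time a rule asks; reuse it afterwards.
        if rule_name not in self._pattern_check:
            self.checks_run += 1
            self._pattern_check[rule_name] = check(self.plan_kind)
        return self._pattern_check[rule_name]


info = StructInfoSketch("join")
is_join = lambda kind: kind == "join"
is_agg = lambda kind: kind == "aggregate"

join_ok = info.pattern_supported("join_rule", is_join)
join_ok_again = info.pattern_supported("join_rule", is_join)  # cached
agg_ok = info.pattern_supported("aggregate_rule", is_agg)
```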
If an MTMV is created with `date_trunc(`xxx`,'month')`:
When the related table uses `range` partitioning and has 3 partitions:
```
20200101-20200102
20200102-20200103
20200201-20200202
```
then the MTMV will have 2 partitions:
```
20200101-20200201
20200201-20200301
```
When the related table uses `list` partitioning and has 3 partitions:
```
(20200101,20200102)
(20200103)
(20200201)
```
then the MTMV will have 2 partitions:
```
(20200101,20200102,20200103)
(20200201)
```
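
The two roll-ups above can be reproduced with a small sketch. It assumes that every range partition of the related table lies within a single month and that range upper bounds are exclusive; the function names are made up for illustration and do not correspond to the MTMV implementation.

```python
from datetime import date


def month_start(d: date) -> date:
    return d.replace(day=1)


def next_month_start(d: date) -> date:
    return date(d.year + 1, 1, 1) if d.month == 12 else date(d.year, d.month + 1, 1)


def roll_up_range(partitions):
    """partitions: list of (lower, upper) date pairs, upper exclusive.
    Each partition maps to the month that contains its lower bound."""
    rolled = {(month_start(lo), next_month_start(lo)) for lo, _ in partitions}
    return sorted(rolled)


def roll_up_list(partitions):
    """partitions: list of tuples of dates. Values are regrouped by month."""
    grouped = {}
    for values in partitions:
        for v in values:
            grouped.setdefault(month_start(v), set()).add(v)
    return [sorted(grouped[m]) for m in sorted(grouped)]


range_result = roll_up_range([
    (date(2020, 1, 1), date(2020, 1, 2)),
    (date(2020, 1, 2), date(2020, 1, 3)),
    (date(2020, 2, 1), date(2020, 2, 2)),
])
list_result = roll_up_list([
    (date(2020, 1, 1), date(2020, 1, 2)),
    (date(2020, 1, 3),),
    (date(2020, 2, 1),),
])
```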
When `skip_write_index_on_load` is turned on, users get an error when querying the latest data (not yet compacted), which is a bad experience. `inverted_index_ram_dir_enable = true` and `inverted_index_storage_format=V2` can be used instead to reduce IO and CPU consumption, so this option is disabled now.
1. Disable setting `skip_write_index_on_load` to `true` in create table stmt.
2. Disable setting `skip_write_index_on_load` to `true` in alter table properties stmt. You can still alter `skip_write_index_on_load` to `false`.
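
A minimal sketch of the validation this implies, shared by create table and alter table properties. The function name and the error message are hypothetical; only the property name comes from the description above.

```python
def check_skip_write_index_on_load(properties: dict) -> None:
    """Reject `true`; allow `false` or an absent property."""
    value = str(properties.get("skip_write_index_on_load", "false")).lower()
    if value == "true":
        # Hypothetical message, not the text used by the real check.
        raise ValueError("skip_write_index_on_load can not be set to true")


# Still allowed: turning the property off, or not setting it at all.
check_skip_write_index_on_load({"skip_write_index_on_load": "false"})
check_skip_write_index_on_load({})
```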
Co-authored-by: Luennng <luennng@gmail.com>
1. fix the sql cache returning a stale value after truncate partition
2. use expire_sql_cache_in_fe_second to control the expiration time of the sql cache in the NereidsSqlCacheManager
1. remove phmap for padding rows
2. add SimpleFieldVisitorToScarlarType for short-circuit type deduction
3. correct type coercion for conflicting types between integers
4. improve nullable column performance
5. remove the shared_ptr dependency for DataType and use TypeIndex instead
6. Optimize by caching the order of fields (which is almost always the same)
and doing a quick check to match the next expected field, instead of searching the hash table.
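
Item 6 can be pictured with the following sketch, written in Python for brevity rather than as the actual implementation. `FieldIndexCache` and its members are hypothetical names; the idea shown is that when rows list their fields in the same order as before, each lookup is a single comparison against the next expected field, and the hash table is only consulted on a mismatch.

```python
class FieldIndexCache:
    def __init__(self):
        self._index = {}    # field name -> column index (hash table fallback)
        self._order = []    # field names in first-seen order
        self._next = 0      # position of the field expected next
        self.fast_hits = 0  # lookups answered without the hash table

    def start_row(self) -> None:
        self._next = 0

    def lookup(self, name: str) -> int:
        # Fast path: the field is exactly the one expected next.
        if self._next < len(self._order) and self._order[self._next] == name:
            idx = self._next
            self._next += 1
            self.fast_hits += 1
            return idx
        # Slow path: search the hash table, registering new fields.
        idx = self._index.get(name)
        if idx is None:
            idx = len(self._order)
            self._index[name] = idx
            self._order.append(name)
        self._next = idx + 1
        return idx


cache = FieldIndexCache()
rows = [{"a": 1, "b": 2, "c": 3}, {"a": 4, "b": 5, "c": 6}]
resolved = []
for row in rows:
    cache.start_row()
    resolved.append([cache.lookup(name) for name in row])
```

The first row populates the cache through the slow path; the second row, having the same field order, is resolved entirely through the fast path.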
benchmark:
In clickbench data, load performance:
12m36.799s -> 7m10.934s, about a 43% latency reduction
In variant_p2/performance.groovy:
3min44s20 -> 1min15s80, about a 66% latency reduction
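
The percentages can be checked with a few lines of arithmetic. This assumes `3min44s20` and `1min15s80` mean 3 min 44.20 s and 1 min 15.80 s.

```python
def reduction_percent(before_s: float, after_s: float) -> float:
    """Latency reduction relative to the original duration, in percent."""
    return (before_s - after_s) / before_s * 100


# 12m36.799s -> 7m10.934s
clickbench = reduction_percent(12 * 60 + 36.799, 7 * 60 + 10.934)
# 3min44s20 -> 1min15s80 (read as 224.20 s -> 75.80 s, an assumption)
variant_p2 = reduction_percent(3 * 60 + 44.20, 1 * 60 + 15.80)

print(f"clickbench: {clickbench:.1f}%, variant_p2: {variant_p2:.1f}%")
```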
Decouple the MTMV from the materialization context.
Change MaterializationContext to an abstract class that describes a materialization.
It now has an AsyncMaterializationContext subclass, and other types of MaterializationContext can be added, such as
SyncMaterializationContext.
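
The resulting shape might be sketched as below. The class names come from the description; the method `materialization_name`, the constructor arguments and the `describe` helper are assumptions for illustration, and the real classes are not written in Python.

```python
from abc import ABC, abstractmethod


class MaterializationContext(ABC):
    """Abstract description of a materialization, independent of MTMV."""

    @abstractmethod
    def materialization_name(self) -> str:
        """Hypothetical accessor used by the rewrite logic."""


class AsyncMaterializationContext(MaterializationContext):
    """Context backed by an async materialized view (MTMV)."""

    def __init__(self, mtmv_name: str):
        self._mtmv_name = mtmv_name

    def materialization_name(self) -> str:
        return self._mtmv_name


class SyncMaterializationContext(MaterializationContext):
    """A possible future subclass, shown only to illustrate extensibility."""

    def __init__(self, index_name: str):
        self._index_name = index_name

    def materialization_name(self) -> str:
        return self._index_name


def describe(ctx: MaterializationContext) -> str:
    # Callers depend only on the abstract type, not on MTMV.
    return f"rewrite candidate: {ctx.materialization_name()}"


async_desc = describe(AsyncMaterializationContext("mv_orders"))
sync_desc = describe(SyncMaterializationContext("rollup_orders"))
```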