A runtime filter (RF) is effective if it can filter the target's data.
In this PR, an RF is considered effective if any one of the following conditions is satisfied:
1. A filter is applied on the RF's source, e.g. `T.A = 1`.
2. An effective RF is applied on this RF's source.
3. Denote X as the intersection of the source and target value ranges; the source's NDV with respect to X is smaller than the target's NDV.
Explanation of condition 2:
Supplier join Nation on s_nationkey = n_nationkey
join Region on n_regionkey = r_regionkey
RF(nation->supplier) is effective because nation is filtered by an effective rf: RF(region->nation)
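Condition 3 (the NDV-based check) can be sketched as follows. This is a hypothetical illustration, not the actual Doris implementation; the function name and signature are made up for clarity.

```python
# Hypothetical sketch of the NDV-based effectiveness check described
# above: an RF is effective when the source's NDV, restricted to the
# src/target range intersection X, is smaller than the target's NDV.

def rf_effective_by_ndv(src_ndv_in_intersection: float, target_ndv: float) -> bool:
    """Condition 3: src.ndv over the intersection range X < target.ndv."""
    return src_ndv_in_intersection < target_ndv

# Example: the source column has 50 distinct values inside the
# overlapping range, while the target column has 1000, so the RF
# can filter out most target rows.
print(rf_effective_by_ndv(50, 1000))    # True
print(rf_effective_by_ndv(1000, 1000))  # False
```

Intuitively, if the source contributes fewer distinct values than the target holds, the filter built from the source can reject many target rows.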
1. Analyze with sampling automatically when the table size is greater than huge_table_lower_bound_size_in_bytes (5 GB by default). Users can disable this feature via the FE option enable_auto_sample.
2. Support syntax like `ANALYZE TABLE test WITH FULL` to force a full analyze regardless of the table size.
3. Fix bugs where table stats do not get updated properly when stats are dropped or only a few columns are analyzed.
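The size-based decision in items 1 and 2 can be sketched like this. This is an illustrative model only; the names `should_sample`, `enable_auto_sample`, and `force_full` mirror the description above, not the real FE code.

```python
# Hypothetical sketch of the auto-sample decision: sample when the
# table exceeds huge_table_lower_bound_size_in_bytes (5 GB default),
# unless auto-sampling is disabled or a full analyze is forced.

HUGE_TABLE_LOWER_BOUND_SIZE_IN_BYTES = 5 * 1024 ** 3  # 5 GB default

def should_sample(table_size_bytes: int,
                  enable_auto_sample: bool = True,
                  force_full: bool = False) -> bool:
    """force_full models `ANALYZE TABLE ... WITH FULL` from item 2."""
    if force_full or not enable_auto_sample:
        return False
    return table_size_bytes > HUGE_TABLE_LOWER_BOUND_SIZE_IN_BYTES

print(should_sample(10 * 1024 ** 3))                   # True: 10 GB > 5 GB
print(should_sample(10 * 1024 ** 3, force_full=True))  # False: WITH FULL wins
```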
- feature: normalize date/datetime with leading zeros
- feature: support a bare 'HH' offset in date/datetime
- feature: normalize() fills in a missing minute/second in the time part
- feature: normalize a bare 'HH' offset to 'HH:MM'
- fix: correct DateTimeFormatterUtilsTest
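The normalizations listed above can be illustrated with a small sketch. The real parser lives in the FE; these functions only mirror the described behavior and their names are made up.

```python
# Illustrative sketch of the listed normalizations: pad leading zeros,
# fill in missing minute/second, and expand a bare 'HH' offset.

def normalize_offset(offset: str) -> str:
    """Expand a bare '+H'/'+HH' zone offset to '+HH:MM'."""
    sign, rest = offset[0], offset[1:]
    if ":" not in rest:
        rest = rest.zfill(2) + ":00"
    return sign + rest

def normalize_datetime(s: str) -> str:
    """Pad leading zeros and fill missing minute/second in the time part."""
    date, _, time = s.partition(" ")
    y, m, d = date.split("-")
    parts = time.split(":") if time else []
    while len(parts) < 3:           # add missing minute/second
        parts.append("0")
    hh, mi, ss = (p.zfill(2) for p in parts)
    return f"{int(y):04d}-{m.zfill(2)}-{d.zfill(2)} {hh}:{mi}:{ss}"

print(normalize_datetime("2022-1-2 3"))  # 2022-01-02 03:00:00
print(normalize_offset("+8"))            # +08:00
```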
```sql
CREATE EXTERNAL TABLE `dim_server` (
  `col1` varchar(50) NOT NULL,
  `col2` varchar(50) NOT NULL
);

CREATE VIEW ads_oreo_sid_report (
  `col1`,
  `col2`
)
AS
SELECT tmp.col1, tmp.col2
FROM (
  SELECT 'abc' AS col1, 'def' AS col2
) tmp
INNER JOIN dim_server ds ON tmp.col1 = ds.col1 AND tmp.col2 = ds.col2;

SELECT * FROM ads_oreo_sid_report WHERE col1 = 'abc' AND col2 = 'def';
```
Before this PR, `col1='abc'` and `col2='def'` could not be pushed down to `dim_server`. Now both predicates can be pushed down to the ODBC table.
```sql
CREATE TABLE IF NOT EXISTS t (
k1 tinyint NOT NULL,
k2 smallint NOT NULL,
k3 int NOT NULL,
k4 bigint NOT NULL,
k5 decimal(9, 3) NOT NULL,
k8 double max NOT NULL,
k9 float sum NOT NULL )
AGGREGATE KEY(k1,k2,k3,k4,k5)
PARTITION BY LIST(k1) (
PARTITION p1 VALUES IN ("1","2","3","4"),
PARTITION p2 VALUES IN ("5","6","7","8"),
PARTITION p3 )
DISTRIBUTED BY HASH(k1) BUCKETS 5 PROPERTIES("replication_num" = "1");

select * from t where k1=10;
```
The query will return 0 rows because p3 is pruned; we fix this by skipping pruning of default partitions.
TODO: prune the default partition if the filter cannot hit it.
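The fix can be sketched as follows. This is a hypothetical stand-in for the FE's pruning logic: a default list partition (one declared without `VALUES IN`, like `p3` above) is never pruned, because rows outside every explicit value list may land in it.

```python
# Hypothetical sketch: list-partition pruning that always keeps the
# default partition (modeled here as a partition whose value list is
# None), matching the fix described above.

def prune_list_partitions(partitions: dict, predicate_value: str) -> list:
    """partitions maps name -> list of values, or None for the default."""
    kept = []
    for name, values in partitions.items():
        if values is None:               # default partition: always keep
            kept.append(name)
        elif predicate_value in values:  # explicit list hit
            kept.append(name)
    return kept

parts = {"p1": ["1", "2", "3", "4"], "p2": ["5", "6", "7", "8"], "p3": None}
print(prune_list_partitions(parts, "10"))  # ['p3'] -- k1=10 may live in p3
```

Before the fix, the equivalent of this code dropped `p3` as well, so `k1=10` scanned no partition and returned 0 rows.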
* unify all Date/Datetime parsing to use one string parser
* support microsecond and ZoneOffset appearing together
* add many UT cases
* add determineScale() to get the scale of a datetime; the original code just took the length of the part after `.`
* reject more bad inputs like `2022-01-01 00:00:00.`; we don't allow a `.` without microseconds
* .....
1. CTAS should work without a distribution desc
2. CTAS should support a column name list
3. CTAS should throw an exception when execution fails
4. CTAS should convert the NULL type to tinyint
5. CTAS should support type conversion
6. CTAS should convert the first column from string to varchar
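The type-mapping rules in items 4 and 6 can be sketched as below. This is a hypothetical illustration; the function name and the varchar length are assumptions, not the actual FE mapping.

```python
# Hypothetical sketch of the CTAS type mapping described in items 4
# and 6. The varchar length used here is an assumption for the sake
# of the example, not a value taken from the real implementation.

def ctas_column_type(source_type: str, is_first_column: bool) -> str:
    if source_type == "null":
        return "tinyint"             # item 4: NULL type -> tinyint
    if source_type == "string" and is_first_column:
        return "varchar(65533)"      # item 6 (length is an assumption)
    return source_type               # item 5: other types pass through

print(ctas_column_type("null", False))    # tinyint
print(ctas_column_type("string", True))   # varchar(65533)
```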
In this case, forwarding to master throws a catalog-or-db-not-found exception:
Connect to a follower:
1. create database test
2. use test
3. drop database test
4. create database test
This is because after step 2, the default db in the follower has been set to `test`, and `drop database` does not change the default db. In step 4, the default db `test` is set and forwarded to the master, which fails to find it because it has already been dropped.
This PR sets the default catalog and db only when they exist.
The actual reason is that when the follower handles the `drop db` stmt, it forwards it to the master to execute, but cannot unset its own current db.
Fix three bugs:
1. A Hudi slice may contain only log files, so `new Path(filePath)` would throw an error.
2. Hive column names are lowercase only, so column names are matched case-insensitively.
3. Compatible with [Spark Datasource Configs](https://hudi.apache.org/docs/configurations/#Read-Options): users can add `hoodie.datasource.merge.type=skip_merge` in catalog properties to skip merging log files.
Problem:
When inferring predicates, we lose the cast on source expressions and some datatype derivation.
Example:
`a = b and cast(a as targetType) = constant`
The expression `cast(a as targetType) = constant` is defined as the source expression.
We expect to get `cast(b as targetType) = constant` instead of `b = constant`.
Reason:
When inferring a predicate, we compare the original types of `a` and `b`. If they can be cast without precision loss, a new predicate is created, but the created predicate lacks the cast to the target type.
Solution:
Add the cast to the target type, and also make other datatypes valid.
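The substitution can be sketched as below. This is a minimal Python stand-in for the FE logic, operating on predicate strings purely for illustration; the real code works on expression trees.

```python
# Minimal sketch of the fix: given `a = b` and the source expression
# `cast(a as T) = constant`, substitute b for a while keeping the
# cast, yielding `cast(b as T) = constant` rather than `b = constant`.

def infer_predicate(source_pred: str, a: str, b: str, target_type: str) -> str:
    # source_pred looks like "cast(a as T) = constant"; replace the
    # whole cast expression so the cast itself is preserved.
    cast_a = f"cast({a} as {target_type})"
    cast_b = f"cast({b} as {target_type})"
    return source_pred.replace(cast_a, cast_b)

pred = infer_predicate("cast(a as targetType) = constant", "a", "b", "targetType")
print(pred)  # cast(b as targetType) = constant
```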
Before this PR, the lambda function Expr did not implement toSqlImpl(), so it called the parent's implementation, which is not suitable for lambda functions and caused errors when creating a view.
**Support SQL cache for the HMS catalog. Both the legacy planner and the Nereids planner are supported.
Partition cache and federated queries are not supported yet.**
The original virtual node number is `Math.max(Math.min(512 / backends.size(), 32), 2)`, which is too small,
causing uneven cache distribution when file cache is enabled.
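The formula quoted above, rewritten as a standalone function, shows why the virtual node count collapses as the cluster grows: with 32 backends it is only 16, and from 256 backends upward it hits the floor of 2, which skews consistent-hash cache distribution.

```python
# The original virtual-number formula, transcribed from the Java
# expression Math.max(Math.min(512 / backends.size(), 32), 2).

def virtual_number(backend_count: int) -> int:
    return max(min(512 // backend_count, 32), 2)

for n in (4, 32, 128, 512):
    print(n, virtual_number(n))  # 4->32, 32->16, 128->4, 512->2
```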
This PR mainly has two changes:
1. Add some merge processing for partition events.
2. Add a UT for `MetastoreEventFactory`. First, add some mock classes (`MockCatalog`/`MockDatabase` ...) to simulate real HMS catalogs/databases/tables/partitions; then create an event producer that can randomly produce every kind of `MetastoreEvent`. The test uses two catalogs, one named `testCatalog` and the other `validateCatalog`. The producer generates many events; `validateCatalog` handles all of them, while `testCatalog` handles only the events merged by `MetastoreEventFactory`. Finally, check whether `validateCatalog` equals `testCatalog`.
If there are 3 or more FE nodes,
the following operations will bring all FE nodes down:
```sql
DROP USER revoke_test_user
DROP ROLE revoke_test_role
DROP DATABASE IF EXISTS revoke_test_db
CREATE DATABASE revoke_test_db
CREATE ROLE revoke_test_role
CREATE USER revoke_test_user IDENTIFIED BY 'revoke_test_pwd'
GRANT SELECT_PRIV ON revoke_test_db.* TO ROLE 'revoke_test_role'
GRANT 'revoke_test_role' TO revoke_test_user
SHOW GRANTS FOR revoke_test_user
REVOKE 'revoke_test_role' from revoke_test_user
SHOW GRANTS FOR revoke_test_user
DROP USER revoke_test_user
DROP ROLE revoke_test_role
DROP DATABASE revoke_test_db
```