Commit Graph

8289 Commits

Author SHA1 Message Date
1f0844b992 [fix](env)mock env.isCheckpointThread #24280
Issue Number: close #xxx

ShowTableStmtTest.testNoDb and DropDbStmtTest.testNoPriv are unstable cases. The error message is:

java.lang.Exception: Unexpected exception, expected<org.apache.doris.common.AnalysisException> but was<mockit.internal.expectations.invocation.MissingInvocation>
The reason is a missing mock of env.isCheckpointThread.
2023-09-14 11:37:01 +08:00
354cb3970b [feature](Nereids): normalize two-digit basic date/datetime (#24333)
normalize two-digit basic date/datetime

220201 -> 20220201
220201T010101 -> 20220201T010101
......
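
A rough Python sketch of the normalization shown above, not the Nereids implementation. The function name `normalize_basic` is hypothetical, and the century pivot (00-69 maps to 20xx, 70-99 to 19xx, the MySQL convention) is an assumption: the commit message only shows that `22` becomes `2022`.

```python
def normalize_basic(text: str) -> str:
    """Expand a two-digit year in a basic (separator-free) date or datetime.

    Hypothetical helper: accepts 'YYMMDD' or 'YYMMDDTHHMMSS' and returns the
    four-digit-year form. Any other input is returned unchanged.
    """
    date_part, sep, time_part = text.partition("T")
    if len(date_part) != 6 or not date_part.isdigit():
        return text  # already four-digit, or not a basic date at all
    year = int(date_part[:2])
    # Assumed pivot (MySQL convention): 00-69 -> 20xx, 70-99 -> 19xx.
    century = "20" if year < 70 else "19"
    return century + date_part + sep + time_part


print(normalize_basic("220201"))         # 20220201
print(normalize_basic("220201T010101"))  # 20220201T010101
```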
2023-09-14 11:25:00 +08:00
46f5988245 [fix](Nereids) set operation children output order not same (#24060)
We generate a project for each child of a set operation to ensure that
the children's output order does not change. However, some rules, such as
PushDownProjectThroughLimit, can remove these projects inadvertently.
When that happens, the column order is wrong, which leads to a BE core dump.
This PR uses a new variable in SetOperation to save the output order of
the set operation's children. The children's output order can then
change without affecting SetOperation at all.
2023-09-14 11:09:58 +08:00
64337a8698 [Improve](metadata)Start the script to set metadata_failure_recovery (#24308) 2023-09-14 10:02:35 +08:00
1a4929b59e [fix](planner) having clause analyze bug #24288 2023-09-14 09:54:09 +08:00
9847f7789f [Feature](Export) Export sql supports to export data of view and external table (#24070)
Previously, EXPORT only supported exporting OLAP tables.
This PR adds support for exporting views and external tables.
2023-09-13 22:55:19 +08:00
d7e5f97b74 [feature](Nereids): eliminate AssertNumRows (#23842) 2023-09-13 22:24:02 +08:00
dbfacdc4af [improvement](jdbc catalog) Optimize Loop Performance by Caching isNebula Method Result (#24260) 2023-09-13 21:40:28 +08:00
5238be24a2 [fix](jdbc catalog) Ensure Thread Safety by Refactoring isDoris&convertDateToNull Static Variable in JdbcMySQLClient (#24253) 2023-09-13 20:19:44 +08:00
dad671af8e [feature](nereids)prune runtime filter (tpch part) #19312
An RF (runtime filter) is effective if it can filter the target data.
In this PR, an RF is effective if any one of the following conditions is satisfied:

1. A filter is applied on the RF's src, like T.A = 1.
2. An effective RF is applied on this RF's src.
3. Denote X as the intersection range of src and target. src.ndv with respect to X is smaller than target.ndv.

Explanation of condition 2:
Supplier join Nation on s_nationkey = n_nationkey
join Region on n_regionkey = r_regionkey
RF(nation->supplier) is effective because nation is filtered by an effective RF: RF(region->nation)
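
The three conditions can be sketched as a small recursive check. This is an illustration only: the `Relation` and `RuntimeFilter` classes and all field names are hypothetical, and the filter on `region` in the usage part is an assumption (the commit message does not say why RF(region->nation) is effective).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Relation:
    name: str
    has_filter: bool = False  # condition 1: a predicate such as T.A = 1
    incoming: List["RuntimeFilter"] = field(default_factory=list)


@dataclass
class RuntimeFilter:
    src: Relation
    target: Relation
    src_ndv_in_range: Optional[int] = None  # NDV of src within the shared range X
    target_ndv: Optional[int] = None


def is_effective(rf: RuntimeFilter) -> bool:
    """Return True if any of the three conditions holds (assumes no RF cycles)."""
    if rf.src.has_filter:  # condition 1
        return True
    if any(is_effective(up) for up in rf.src.incoming):  # condition 2
        return True
    if rf.src_ndv_in_range is not None and rf.target_ndv is not None:
        return rf.src_ndv_in_range < rf.target_ndv  # condition 3
    return False


# Supplier join Nation join Region, as in the explanation of condition 2.
region = Relation("region", has_filter=True)  # assumed predicate on region
nation = Relation("nation")
supplier = Relation("supplier")
nation.incoming.append(RuntimeFilter(src=region, target=nation))
rf_nation_supplier = RuntimeFilter(src=nation, target=supplier)
print(is_effective(rf_nation_supplier))  # True
```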
2023-09-13 20:12:08 +08:00
786a721e03 [feat](stats) Support analyze with sample automatically (#23978)
1. Analyze with sampling automatically when the table size is greater than huge_table_lower_bound_size_in_bytes (5G by default). Users can disable this feature with the FE option enable_auto_sample.
2. Support grammar like `ANALYZE TABLE test WITH FULL` to force a full analyze regardless of table size.
3. Fix bugs where table stats are not updated properly when stats are dropped or when only a few columns are analyzed.
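
A minimal sketch of the decision described in items 1 and 2, under stated assumptions: the function name is hypothetical, "5G" is read as 5 GiB, and `WITH FULL` is assumed to take precedence over everything else.

```python
# "5G by default" in the commit message; treating it as 5 GiB is an assumption.
HUGE_TABLE_LOWER_BOUND_SIZE_IN_BYTES = 5 * 1024 ** 3


def choose_analyze_method(table_size_bytes: int,
                          enable_auto_sample: bool = True,
                          with_full: bool = False) -> str:
    """Pick 'FULL' or 'SAMPLE' for an ANALYZE job (hypothetical helper)."""
    if with_full:
        return "FULL"  # ANALYZE TABLE ... WITH FULL forces a full analyze
    if enable_auto_sample and table_size_bytes > HUGE_TABLE_LOWER_BOUND_SIZE_IN_BYTES:
        return "SAMPLE"  # huge table: sample automatically
    return "FULL"
```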
2023-09-13 19:42:10 +08:00
05722b4cfd [feature](Nereids): date/datetime parser support many complex case (#24287)
- feature: normalize date/datetime with leading 0
- feature: support 'HH' offset in date/datetime
- feature: normalize() add missing Minute/Second in Time part
- feature: normalize offset HH to HH:MM
- correct DateTimeFormatterUtilsTest
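
The "normalize offset HH to HH:MM" step could look roughly like this in Python. The function name is hypothetical, and the sketch assumes the offset has already been split from the datetime string; it is not the parser's actual code.

```python
import re


def normalize_offset(offset: str) -> str:
    """Expand an hour-only zone offset such as '+08' to '+08:00'.

    Offsets that already carry minutes (or are not of the form +HH / -HH)
    are returned unchanged.
    """
    match = re.fullmatch(r"([+-])(\d{2})", offset)
    if match is None:
        return offset
    sign, hours = match.groups()
    return f"{sign}{hours}:00"


print(normalize_offset("+08"))  # +08:00
```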
2023-09-13 17:30:58 +08:00
231038f050 [fix](planner)allow infer predicate for external table (#24227)
CREATE EXTERNAL TABLE `dim_server` (
    `col1` varchar(50) NOT NULL,
    `col2` varchar(50) NOT NULL
    )
create view ads_oreo_sid_report
    (
    `col1` ,
        `col2`
    )
    AS
    select
    tmp.col1,tmp.col2
    from (
    select 'abc' as col1,'def' as col2
    ) tmp
    inner join dim_server ds on tmp.col1 = ds.col1  and tmp.col2 = ds.col2;

select * from ads_oreo_sid_report where col1='abc' and col2='def';

Before this PR, col1='abc' and col2='def' could not be pushed down to dim_server. Now both predicates can be pushed down to the ODBC table.
2023-09-13 17:22:39 +08:00
d87b852e18 [enhancement](delete-handler) split Deletehandler#commitJob and add preconditions to intercept NPE(#24086) 2023-09-13 14:34:12 +08:00
335064f897 [feature](Nereids) add lambda argument and array_map function (#23598)
add array_map function

SELECT ARRAY_MAP(x->x+1, ARRAY(87, 33, -49))
+------------------------------------------------------+
| array_map([x] -> (x + 1), x#1 of array(87, 33, -49)) |
+------------------------------------------------------+
| [88, 34, -48]                                        |
+------------------------------------------------------+
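
For readers less familiar with lambda arguments, the query above has the same semantics as this Python sketch. The `array_map` helper here is a stand-in written for illustration, not the Doris function.

```python
def array_map(fn, values):
    """Apply fn to every element and return a new list."""
    return [fn(v) for v in values]


result = array_map(lambda x: x + 1, [87, 33, -49])
print(result)  # [88, 34, -48]
```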
2023-09-13 14:24:16 +08:00
f985b28ac6 [fix](Nereids) default partition be pruned by mistake (#24186)
```sql
CREATE TABLE IF NOT EXISTS t ( 
            k1 tinyint NOT NULL, 
            k2 smallint NOT NULL, 
            k3 int NOT NULL, 
            k4 bigint NOT NULL, 
            k5 decimal(9, 3) NOT NULL,
            k8 double max NOT NULL, 
            k9 float sum NOT NULL ) 
        AGGREGATE KEY(k1,k2,k3,k4,k5)
        PARTITION BY LIST(k1) ( 
            PARTITION p1 VALUES IN ("1","2","3","4"), 
            PARTITION p2 VALUES IN ("5","6","7","8"), 
            PARTITION p3 ) 
        DISTRIBUTED BY HASH(k1) BUCKETS 5 properties("replication_num" = "1")

select * from t where k1=10
```
The query returned 0 rows because p3 was pruned. We fix it by skipping pruning of default partitions.

TODO: prune the default partition if the filter does not hit it
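
The fix can be pictured with a small Python sketch of list-partition pruning for an equality filter. The data layout and the function name are assumptions made for illustration; the real rule works on Nereids plan nodes.

```python
from typing import Dict, List, Optional

# Partition name -> listed values; None marks a default partition (like p3).
PARTITIONS: Dict[str, Optional[List[int]]] = {
    "p1": [1, 2, 3, 4],
    "p2": [5, 6, 7, 8],
    "p3": None,
}


def prune_for_equals(partitions: Dict[str, Optional[List[int]]],
                     value: int) -> List[str]:
    """Return the partitions that may contain rows with k1 = value."""
    kept = []
    for name, values in partitions.items():
        if values is None:
            kept.append(name)  # default partition: never pruned
        elif value in values:
            kept.append(name)
    return kept


print(prune_for_equals(PARTITIONS, 10))  # ['p3']
print(prune_for_equals(PARTITIONS, 5))   # ['p2', 'p3']
```

The second call shows what the TODO refers to: with `k1 = 5` the default partition is still scanned even though the filter hits p2.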
2023-09-13 12:04:20 +08:00
7025293e17 [refactor](Nereids): new Date/Datetime parser to support more condition (#24224)
* unify all Date/Datetime parsing to use one string parser
* support microsecond and ZoneOffset both existing
* add many UT cases
* add determineScale() to get the scale of a datetime; the original code just took the length of the part after `.`
* reject more bad conditions like `2022-01-01 00:00:00.`; we don't allow `.` without microseconds
* .....
2023-09-13 11:20:27 +08:00
f205473426 [feat](stats) enable set auto analyze time by set global session variable (#24026) 2023-09-13 10:59:25 +08:00
1a3b70bf4a [fix](Nereids) fix ctas bugs (#24267)
1. ctas should support without distribution desc
2. ctas should support column name list
3. ctas should throw exception when execution fails
4. ctas should convert null type to tinyint
5. ctas should support type conversion
6. ctas should convert first column from string to varchar
2023-09-13 09:17:57 +08:00
ebe3749996 [fix](tvf)support s3,local compress_type and append regression test (#24055)
support s3,local compress_type and append regression test.
2023-09-13 00:32:59 +08:00
9df72a96f3 [Feature](multi-catalog) Support hadoop viewfs. (#24168)
### Feature

Support hadoop viewfs.

### Test

- Regression tests: 
  - hive viewfs test.
  - tvf viewfs test.

- Broker load with broker and with hdfs tests manually.
2023-09-13 00:20:12 +08:00
c402d48f97 [fix](query-cache) fix query cache with empty set (#24147)
Previously, if the query result set was empty, the query cache did not cache the result.
This PR fixes that.
2023-09-12 20:11:20 +08:00
d3f1388717 [Feature](partitions) Support auto-partition (#24153)
Co-authored-by: zhangstar333 <2561612514@qq.com>
2023-09-12 15:23:15 +08:00
4bb9a12038 [function](bitmap) support bitmap_remove (#24190) 2023-09-12 14:52:04 +08:00
9e0d843501 [fix](publish) publish go ahead even if quorum is not met (#23806)
Co-authored-by: Yongqiang YANG <dataroaring@gmail.com>
2023-09-12 14:29:01 +08:00
2e2e174804 [fix](forward master op)Set default catalog and db only when they exist in master FE while executing forwarded stmt (#24212)
In the following case, forwarding to master throws a catalog or db not found exception:
Connect to a follower:
1. create database test
2. use test
3. drop database test
4. create database test

This is because after step 2, the default db in the follower has been set to `test`, and drop database does not change the default db. In step 4, the default db `test` is set and forwarded to master, and master fails to find it because it has already been dropped.

This PR sets the default catalog and db only when they exist.

The actual reason is that when the Follower handles the `drop db` stmt, it forwards it to master for execution but cannot
unset its own "current db".
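
A toy Python model of the scenario, to make the fix concrete. The `Master` class, its method, and the tuple-shaped statements are all hypothetical; only the four steps and the "set the default db only when it exists" rule come from the commit message.

```python
class Master:
    """Toy stand-in for the master FE's handling of forwarded statements."""

    def __init__(self):
        self.databases = set()

    def execute_forwarded(self, stmt, default_db=None):
        # Fixed behaviour: apply the follower's default db only if it exists
        # here. Before the fix, a missing default db raised "db not found".
        current_db = default_db if default_db in self.databases else None
        verb, name = stmt
        if verb == "create":
            self.databases.add(name)
        elif verb == "drop":
            self.databases.discard(name)
        return current_db


master = Master()
master.execute_forwarded(("create", "test"))                # step 1
follower_default_db = "test"                                # step 2: use test
master.execute_forwarded(("drop", "test"), "test")          # step 3
used = master.execute_forwarded(("create", "test"), "test") # step 4
print(used)  # None: the stale default db is skipped instead of failing
```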
2023-09-12 14:12:18 +08:00
232f120edc [Improve](Job)Support other types of Job query interfaces (#24172)
- Support MTMV job
- Add create time and SQL to task info
- Optimize scheduling logic
2023-09-12 13:55:56 +08:00
5ab2aea8af add test for bindExpr (#24032)
add unit test for bindExpression rule
2023-09-12 11:00:57 +08:00
fca34ec337 [fix](multi-catalog)support bit type and hidden mc secret key (#24124)
support max compute bit type and mask mc secret key
bool type will use bit arrow vector
should mask secret key: close #24019
2023-09-12 10:36:48 +08:00
484215e1cc [fix](Nereids): datetime - offset is wrong & support Two-Digital date (#24201)
- bug: datetime - offset is wrong
- support two-digit dates
- remove useless override code
2023-09-12 10:17:56 +08:00
6e28d878b5 [fix](hudi) compatible with hudi spark configuration and support skip merge (#24067)
Fix three bugs:
1. A Hudi slice may contain only log files, in which case `new Path(filePath)` throws errors.
2. Hive column names are lowercase only, so match column names in ignore-case mode.
3. Be compatible with [Spark Datasource Configs](https://hudi.apache.org/docs/configurations/#Read-Options), so users can add `hoodie.datasource.merge.type=skip_merge` in catalog properties to skip merging log files.
2023-09-11 19:54:59 +08:00
115969c3fb [opt](nereids) improve eliminate outerjoin in cascades (#24120)
* eliminate outer join cascading
2023-09-11 19:42:05 +08:00
a538b4922c [fix](block rule) throw npe when use Nereids explain or fallback (#24182) 2023-09-11 18:03:46 +08:00
b5227af6a1 [Feature](partitions) Support auto partition FE part (#24079) 2023-09-11 17:48:19 +08:00
6384198136 [minor](fe) optimize some log info and imports issue (#24138) 2023-09-11 16:16:58 +08:00
f27f486e8d fix missing stats in physical plan (#24159) 2023-09-11 15:41:32 +08:00
be3618316f [Fix](Nereids) fix infer predicate lost cast of source expression (#23692)
Problem:
When inferring predicates, we lost the cast of source expressions and some datatype derivation.

Example:
a = b and cast(a as targetType) = constant
(cast(a as targetType) = constant) is defined as the source expression.
We expect to get cast(b as targetType) = constant instead of b = constant.

Reason:
When inferring predicates, we compare the original types of a and b. If they can be cast
without precision loss, a new predicate is created. But the created predicate forgot
to cast to the target type.

Solution:
Add a cast to the target type, and also make other datatypes valid.
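
The expected behaviour in the example can be sketched in Python with a tiny expression model. The `Cast` and `Equals` tuples and the `infer` function are hypothetical, and the type-compatibility check (cast without precision loss) is deliberately left out.

```python
from typing import NamedTuple, Optional


class Cast(NamedTuple):
    column: str
    target_type: str


class Equals(NamedTuple):
    left: object   # a column name (str) or a Cast
    right: object  # a column name (str) or a constant


def infer(equality: Equals, source: Equals) -> Optional[Equals]:
    """Given `a = b` and a source predicate on one side, derive it for the other."""
    a, b = equality.left, equality.right
    lhs = source.left
    if isinstance(lhs, Cast) and lhs.column in (a, b):
        other = b if lhs.column == a else a
        # Keep the cast on the inferred predicate; dropping it was the bug.
        return Equals(Cast(other, lhs.target_type), source.right)
    if lhs in (a, b):
        other = b if lhs == a else a
        return Equals(other, source.right)
    return None


inferred = infer(Equals("a", "b"), Equals(Cast("a", "BIGINT"), 1))
print(inferred)  # cast(b as BIGINT) = 1, not b = 1
```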
2023-09-11 14:30:31 +08:00
e847091dfe [fix](Nereids): add DateTimeFormatterUtils and fix bug (#24171)
bug
- should reject 20200219 010101
- datetime should be compatible with date
2023-09-11 14:28:03 +08:00
8b5453296e [fix](optimizer) Fix sql block when new optimizer is enabled (#23804)
The check was skipped because when checkBlockPolicy is invoked, the new optimizer has not yet produced a plan.
2023-09-11 14:27:11 +08:00
b4020a13ef [Improve](Routineload)Set the maximum timeout for obtaining partition to 60s (#24173) 2023-09-11 14:15:06 +08:00
7abd88f1b4 remove editlogport in frontends disks (#24047) 2023-09-11 12:38:56 +08:00
9c441a4a16 [feature](Nereids) support create table and ctas (#24150)
Co-authored-by: sohardforaname <organic_chemistry@foxmail.com>
2023-09-11 12:37:58 +08:00
db139cfd6e [fix](log) delete useless log (#24161)
useless log in #23635
2023-09-11 12:08:59 +08:00
d18d272ac2 [improvement](jdbc catalog) Added create jdbc catalog properties validation (#23764) 2023-09-11 10:38:53 +08:00
d2cd0c30c7 [improvement](jdbc catalog) optimize the JDBC Catalog connection error message (#23868) 2023-09-11 10:26:54 +08:00
480fcef0a1 [typo](errmsg) Improve partition error message (#23968) 2023-09-11 10:25:06 +08:00
cd13f9e8c6 [BUG](view) fix can't create view with lambda function (#23942)
Previously, the lambda function Expr did not implement the toSqlImpl() function,
so the parent's implementation was called, which is not suitable for lambda functions
and caused an error when creating a view.
2023-09-11 10:04:00 +08:00
31bffdb5fc [enhancement](stats) audit for stats collection #24074
log stats collection SQLs in the audit log
2023-09-11 08:26:12 +08:00
586492c124 [Feature](multi-catalog) Support sql cache for hms catalog (#23391)
**Support SQL cache for HMS catalog. Both the legacy planner and the Nereids planner are supported.
Partition cache and federated queries are not supported yet.**
2023-09-10 21:56:35 +08:00
f85da7d942 [improvement](jdbc) add profile for jdbc read and convert phase (#23962)
Add 2 metrics in jdbc scan node profile:
- `CallJniNextTime`: call get next from jdbc result set
- `ConvertBatchTime`: call convert jobject to column block

Also fix a potential concurrency issue when initializing the JDBC connection cache pool
2023-09-10 21:42:06 +08:00