13721 Commits

Author SHA1 Message Date
c3fe113894 rename PaloFe to DorisFE (#18167) 2023-03-29 00:30:16 +08:00
a813ad56ad [fix](multi-catalog) key and value columns of map are normal column type (#18160)
PR (#17330) changed the column type of map keys and values from array to normal column, but the ORC and Parquet readers still cast them to array columns, resulting in a cast error.
2023-03-28 23:11:40 +08:00
6b6682cd96 [Enhancement](Expr) Opt In Set by small size fixed container to improve performance. (#17976) 2023-03-28 23:10:39 +08:00
5d218388f3 [enhancement](stmt-forward) make fe follower err msg shown to client be consistent with master (#18180)
Found that the RPC timeout is so short that the RPC client closes before the execution result is returned.

Therefore, use a coefficient to prolong the RPC client timeout, so that it can wait for the real cause to be received.
2023-03-28 21:27:45 +08:00
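The timeout coefficient described in 5d218388f3 can be sketched roughly as follows. This is only an illustration: the function name, the default coefficient of 1.5, and the millisecond unit are assumptions, not values taken from the PR.

```python
import math


def prolonged_client_timeout_ms(execute_timeout_ms: int, coefficient: float = 1.5) -> int:
    """Return an RPC client timeout that outlasts the remote execution timeout.

    Both the name and the default coefficient are hypothetical; the commit only
    states that a coefficient is used to prolong the client timeout.
    """
    if execute_timeout_ms <= 0:
        raise ValueError("execute_timeout_ms must be positive")
    if coefficient < 1.0:
        # A coefficient below 1 would shorten the timeout and defeat the purpose.
        raise ValueError("coefficient must be >= 1.0")
    # Round up so the client never waits less than the scaled value.
    return math.ceil(execute_timeout_ms * coefficient)
```

With a client timeout longer than the execution timeout, the follower has a chance to receive the master's real error instead of a local RPC timeout.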
4f2135f869 [test](Nereids) Add regression test for query empty table of tpcds (#18172)
Add a regression test for querying empty TPC-DS tables; this can prevent test fallback.
2023-03-28 20:34:27 +08:00
012f7bd031 [feature](function)Add ST_Area function (#18138) 2023-03-28 19:36:09 +08:00
cff6a7195b [feature](Nereids): add bushy tree rule; (#18130) 2023-03-28 19:32:53 +08:00
d27201f331 [fix](nested_loop_join)got incorrect result from nested loop join without condition (#18139) 2023-03-28 16:20:05 +08:00
ba1b159ad2 [fix](regression) deal with output order and timeout for segcompaction p1 (#18162)
1. Add `order by` to regulate the output order to avoid false-negative
    mismatch for dup table.
2. Increase load timeout.

Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>
2023-03-28 16:00:27 +08:00
b9161295b7 [Fix](plan) fix bug that the case sensibility of column name may impact join method (#17904)
Issue Number: close #17876
2023-03-28 15:18:30 +08:00
6bd2609294 [Enhancement](multi-catalog) add config for external meta cache loade… (#18117)
Add config for external cache-loader's max thread-pool size.
2023-03-28 15:10:19 +08:00
d7dcdfcba9 [Fix](Create View) support create view from tvf (#18087)
Support create view as select * from tvf()
2023-03-28 15:07:32 +08:00
d6339b36a4 [fix](Nereids): correct the order of pushdown semi rules. (#18148) 2023-03-28 14:20:07 +08:00
1956f04aa2 [feature](multi-catalog) add specified_database_list PROPERTY for jdbc/hms/iceberg catalog (#17803)
Add the specified_database_list PROPERTY for jdbc catalog, so that users can use multiple databases specified by the jdbc catalog.
2023-03-28 14:04:41 +08:00
daeaa91dd6 [feature](function) support variadic template type in SQL function (#17985)
Inspired by the C++ function `std::vector::emplace_back()`, we can use a variadic template for this issue.

e.g.

```
[['struct'], 'STRUCT<TYPES>', ['TYPES'], 'ALWAYS_NOT_NULLABLE', ['TYPES...']]
```

`TYPES...` in template_types defines a variadic template `TYPES`. The variadic template is then expanded into multiple normal templates, based on the actual input arguments, at runtime in FE.

But make sure `TYPES...` is placed in the last position among all template type arguments.

BTW, the original template function logic is not affected.
2023-03-28 11:08:24 +08:00
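The expansion of a trailing `TYPES...` entry described in daeaa91dd6 might look like the sketch below. It is written in Python for brevity, while the real logic runs in FE. The helper name, the `TYPES_0`, `TYPES_1`, ... naming scheme, and passing the number of variadic arguments explicitly are all assumptions made for illustration.

```python
from typing import List

VARIADIC_SUFFIX = "..."


def expand_variadic_templates(template_types: List[str], variadic_count: int) -> List[str]:
    """Expand a trailing 'NAME...' template into variadic_count normal templates.

    Hypothetical helper: the generated names (NAME_0, NAME_1, ...) are an
    assumption, not the naming used by Doris.
    """
    if variadic_count < 0:
        raise ValueError("variadic_count must be non-negative")
    # The variadic template is only allowed in the last position.
    for name in template_types[:-1]:
        if name.endswith(VARIADIC_SUFFIX):
            raise ValueError("a variadic template must be the last template type")
    # Non-variadic template lists are returned unchanged.
    if not template_types or not template_types[-1].endswith(VARIADIC_SUFFIX):
        return list(template_types)
    base = template_types[-1][: -len(VARIADIC_SUFFIX)]
    expanded = [f"{base}_{i}" for i in range(variadic_count)]
    return list(template_types[:-1]) + expanded
```

For a call such as `struct(a, b, c)`, `expand_variadic_templates(['TYPES...'], 3)` yields three ordinary templates, and a list without a variadic entry passes through untouched, which mirrors the note that the original template logic is not affected.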
Pxl
d2839eb41f [Chore](Materialized-View) add some mv regression test case (#18095)
add some mv regression test case
2023-03-28 10:31:37 +08:00
Pxl
9c1e86f84f [Bug](materialized-view) add some limit for create mv on aggregate table (#18141)
Add some limits for creating an MV on an aggregate table.
```sql
CREATE TABLE t1 (   
p1 INT,   
p2 INT,   
p3 INT,   
v1 INT SUM,   
v2 INT MAX,   
v3 INT MIN ) AGGREGATE KEY (p1, p2, p3) DISTRIBUTED BY HASH (p1) BUCKETS 1 PROPERTIES ('replication_num' = '1');


CREATE MATERIALIZED VIEW mv_1 AS SELECT p1, SUM(v3) FROM t1 GROUP BY p1;  -- invalid aggregate type
CREATE MATERIALIZED VIEW mv_2 AS SELECT p1, MIN(v3+v3) FROM t1 GROUP BY p1; -- invalid expression calculated on aggregate column
CREATE MATERIALIZED VIEW mv_3 AS SELECT p1, SUM(v1) FROM t1 GROUP BY p1; -- cast v1 as bigint, ok
CREATE MATERIALIZED VIEW mv_4 AS SELECT p1, SUM(abs(v1)) FROM t1 GROUP BY p1; -- invalid expression calculated on aggregate column

```
2023-03-28 10:28:29 +08:00
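The four `mv_*` cases in 9c1e86f84f suggest two checks, sketched below under stated assumptions: the aggregate argument must be a bare value column, and the aggregate function must match that column's aggregate type. The function name, the error strings, and the restriction to value columns are hypothetical; the actual FE validation may differ.

```python
from typing import Dict, Optional

# Aggregate types of the value columns of table t1 from the example.
T1_VALUE_COLUMNS: Dict[str, str] = {"v1": "SUM", "v2": "MAX", "v3": "MIN"}


def check_mv_aggregate(func: str, argument: str, value_columns: Dict[str, str]) -> Optional[str]:
    """Return an error message for an invalid MV aggregate, or None if it is accepted.

    Hypothetical check that only models aggregates over value columns.
    """
    column = argument.strip()
    # Anything other than a bare value column (e.g. "v3+v3", "abs(v1)") is rejected.
    if column not in value_columns:
        return "invalid expression on aggregate column"
    # The MV aggregate must match the column's aggregate type in the base table.
    if func.upper() != value_columns[column]:
        return "invalid aggregate type"
    return None
```

Applied to the example, `SUM(v3)` fails the type check, `MIN(v3+v3)` and `SUM(abs(v1))` fail the bare-column check, and `SUM(v1)` is accepted.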
6af93016a8 [typo](docs) fix docs DROP-CATALOG.md (#18135) 2023-03-28 10:17:07 +08:00
60073ebc84 [typo](docs) fix docs install-deploy.md (#18132) 2023-03-28 10:16:51 +08:00
d8ab8662af [typo](docs) fix docs multi-catalog.md (#18133) 2023-03-28 10:16:38 +08:00
4e015fcfb2 [typo](docs) fix docs DROP-ROLE.md (#18143) 2023-03-28 10:16:21 +08:00
09e346e47c [fix](type) Data precision is lost when converting DOUBLE type data to DECIMAL (#17191) (#17562)
1. Fix bug when converting DOUBLE to DECIMAL;
2. Fix bug when converting DOUBLE to DECIMALV3;
2023-03-28 09:46:43 +08:00
c95b81f950 [fix](order by) fix bug of order by desc when rowsets is no overlapping (#18100)
When rowsets are non-overlapping and the sort is descending, the logic of VCollectIterator::Level0Iterator::init_for_union is followed. In this function, the row ref pos of the first level0 iterator is set to 0, and the row ref pos of all other level0 iterators is set to -1.

But in the level1 iterator, when rowsets are non-overlapping and the ordering is desc, the list of rowset iterators is reversed, so the row ref pos of the first level0 iterator in the list becomes -1, which makes the block reader think that the entire tablet has no data.
2023-03-28 09:31:37 +08:00
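The failure mode described in c95b81f950 can be reproduced with a toy model. This only illustrates the bug, not the fix applied in the PR; the Python function names and the list-of-positions representation are assumptions.

```python
from typing import List


def init_for_union(iterator_count: int) -> List[int]:
    """Model the initial row ref pos of each level0 iterator.

    The first iterator gets 0 and all others get -1, as the commit describes.
    """
    if iterator_count <= 0:
        return []
    return [0] + [-1] * (iterator_count - 1)


def reader_sees_empty_tablet(positions: List[int], descending: bool) -> bool:
    """Return True if the block reader would conclude that there is no data.

    For descending order the iterator list is reversed after initialization,
    so the iterator that ends up first may carry a row ref pos of -1.
    """
    ordered = positions[::-1] if descending else positions
    return not ordered or ordered[0] == -1
```

With three rowsets, ascending order sees position 0 first and reads data, while descending order sees -1 first and wrongly reports an empty tablet.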
99427d409d [vectorized](udaf) fix java-udaf case is unstable with fuzzy mode #18146
The reason the udaf case is unstable:
when fuzzy mode sets enable_pipeline_engine=true, the agg function has only 1 instance,
so the default value is not merged; but if instance>1, the default value is merged.
2023-03-28 09:30:49 +08:00
a7ce99d8b3 [typo](docs) fix docs SELECT.md (#18144) 2023-03-28 08:45:56 +08:00
84c6f47e4f [Feature](Nereids) add WinMagic rule to rewrite scalar sub-query to window function (#17968)
Refer to the paper: WinMagic - Subquery Elimination Using Window Aggregation

For SQL like TPC-H Q2 and Q17, which contain a correlated sub-query with only one aggregation function output, we can eliminate the sub-query and transform it into a window function. For example, TPC-H Q17 is

```sql
select
        sum(l_extendedprice) / 7.0 as avg_yearly
    from
        lineitem,
        part
    where
        p_partkey = l_partkey
        and p_brand = 'Brand#23'
        and p_container = 'MED BOX'
        and l_quantity < (
            select
                0.2 * avg(l_quantity)
            from
                lineitem
            where
                l_partkey = p_partkey
        );
```

we rewrite it to

```sql
select
        sum(l_extendedprice) / 7.0 as avg_yearly
    from (
    select
        l_extendedprice, l_quantity, avg(l_quantity) over(partition by l_partkey) avg_l_quantity
    from 
        lineitem,
        part
    where
        p_partkey = l_partkey
        and p_brand = 'Brand#23'
        and p_container = 'MED BOX' )
    where l_quantity < 0.2 * avg_l_quantity
```

For now the rule can only handle the case where the conjuncts in the outer scope contain one sub-query and the conjunct containing the sub-query is a comparison predicate. We will support compound predicates and more than one conjunct containing a sub-query later.
2023-03-27 23:58:41 +08:00
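The equivalence behind the Q17 rewrite in 84c6f47e4f can be checked on a few toy rows. The sketch below evaluates both query shapes in plain Python. The sample data is made up, and the result relies on the assumption that `p_partkey` is unique, so the join and the filters on `part` keep either all or none of the `lineitem` rows for a given part key.

```python
from collections import defaultdict

# Made-up rows: (l_partkey, l_quantity, l_extendedprice)
lineitem = [
    (1, 1.0, 100.0), (1, 10.0, 700.0), (1, 19.0, 900.0),
    (2, 5.0, 50.0), (2, 5.0, 60.0),
    (3, 1.0, 10.0), (3, 99.0, 990.0),
]
# Made-up rows: (p_partkey, p_brand, p_container)
part = [(1, "Brand#23", "MED BOX"), (2, "Brand#23", "MED BOX"), (3, "Brand#11", "MED BOX")]


def joined_rows():
    """Join lineitem with part and apply the brand and container filters."""
    return [
        (l_pk, qty, price)
        for (l_pk, qty, price) in lineitem
        for (p_pk, brand, container) in part
        if p_pk == l_pk and brand == "Brand#23" and container == "MED BOX"
    ]


def q17_with_subquery() -> float:
    """Original shape: re-scan lineitem per row to evaluate the correlated sub-query."""
    total = 0.0
    for l_pk, qty, price in joined_rows():
        quantities = [q for (pk, q, _) in lineitem if pk == l_pk]
        if qty < 0.2 * (sum(quantities) / len(quantities)):
            total += price
    return total / 7.0


def q17_with_window() -> float:
    """Rewritten shape: one pass computes avg(l_quantity) per l_partkey partition."""
    rows = joined_rows()
    acc = defaultdict(lambda: [0.0, 0])  # l_partkey -> [sum, count]
    for l_pk, qty, _ in rows:
        acc[l_pk][0] += qty
        acc[l_pk][1] += 1
    total = sum(
        price for (l_pk, qty, price) in rows
        if qty < 0.2 * (acc[l_pk][0] / acc[l_pk][1])
    )
    return total / 7.0
```

Both functions return the same value on this data. The window form avoids re-scanning `lineitem` for each outer row, which is the motivation for the rule.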
359f5be53e [refactor](cgroup) remove cgroup manager it is useless (#18124)
Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-03-27 23:02:18 +08:00
fa586c00a9 [fix](merge-on-write) fix that missed rows don't match merged rows (#18128)
Due to concurrent load, there may be duplication in the delete bitmap of historical data and incremental calculations, resulting in duplicate calculations of missed rows.
2023-03-27 23:00:54 +08:00
5191b4f473 [fix](ut)support run be-ut on release mode (#18119)
Fixed improper usage, so BE UT can now be run in release mode.
BTW, split the BE build type environment variable into be/be-ut.
2023-03-27 23:00:03 +08:00
115e52c16c [Opt](array) optimize_array_sort (#18123) 2023-03-27 22:01:24 +08:00
ac5b47e515 [bugfix](addlog) expr context is not closed and will core during deconstructor (#18134) 2023-03-27 21:59:46 +08:00
ee80c12815 [feature](json) add json_extract function (#17808) 2023-03-27 21:19:47 +08:00
785e3e3bca [Enhancement](multi catalog) Support hive meta cache TTL (#18102)
Currently, if a user modifies a file on HDFS directly, not through Hive, the change is not noticed by Doris and the user
gets wrong data. Support a TTL (Time-to-Live) config for the file cache, so that stale file info is invalidated automatically after expiring.

1. Add a configuration parameter to set the file cache TTL: "file.meta.cache.ttl-second".
2. Set the guava expireAfterAccess value to the configured value.

Co-authored-by: lexluo <lexluo@tencent.com>
2023-03-27 19:19:31 +08:00
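The expire-after-access behaviour that 785e3e3bca configures through guava can be approximated with the sketch below. It is a simplified Python stand-in, not the Doris or guava implementation. The class name, the loader callback, and the injectable clock are assumptions added so the example is self-contained.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class ExpireAfterAccessCache:
    """Tiny cache whose entries expire ttl_seconds after their last access."""

    def __init__(self, ttl_seconds: float, loader: Callable[[Hashable], Any],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._loader = loader      # fetches fresh metadata, e.g. a file listing
        self._clock = clock        # injectable so the expiry can be tested
        self._entries: Dict[Hashable, Tuple[Any, float]] = {}

    def get(self, key: Hashable) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[1] < self._ttl:
            value = entry[0]            # still fresh: serve the cached value
        else:
            value = self._loader(key)   # missing or expired: reload
        # Every access (hit or reload) restarts the TTL window.
        self._entries[key] = (value, now)
        return value
```

Because each access restarts the window, an entry is reloaded only after it has gone unread for the whole TTL.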
c8e4684578 [enhancement](nereids)support topN opt in nereids (#17741)
1. support topN opt in nereids
2. pushdown limit->proj->sort
2023-03-27 18:57:56 +08:00
2785202816 [Bug](regression-test) be coredump in pipeline when grace exit in regression test (#18131) 2023-03-27 18:36:27 +08:00
894f38a517 [fix](planner) fix conjunct planned on exchange node (#18042)
sql like: 
select k5, k6, SUM(k3) AS k3 
from ( 
    select
        k5,
        date_format(k6, '%Y-%m-%d') as k6,
        count(distinct k3) as k3 
    from t 
    group by k5, k6
) AS temp where 1=1
group by k5, k6;

This throws an exception because conjuncts are planned on the exchange node, which cannot handle conjuncts. Now we skip the exchange node when planning conjuncts, which fixes the bug.
Note: the bug occurs iff the conjunct is always true, like 1=1 above.
2023-03-27 17:50:52 +08:00
902629adb6 [fix](planner) fix targetTypeDef NPE when value is null (#18072)
sql like:
select * from (select *, null as top from v1)t where top = 5;
select * from (select *, null as top from v1)t where top is not null;
These cause an NPE because targetTypeDef is null when the value is null. Now we use the cast target type as the targetTypeDef.
2023-03-27 17:29:14 +08:00
cd85b5b262 [conf](nereids) disable new cost model since it hurts performance (#18127) 2023-03-27 16:12:15 +08:00
da8c53a831 [feat](Nereids): pushdown semijoin through agg. (#18105) 2023-03-27 15:27:44 +08:00
642c378fc7 [feature](table-valued-function) add Backends table-valued-function (#17667)
This PR implements a new metadata TVF called backends. A tutorial on the implementation process is in #17974.
2023-03-27 15:18:31 +08:00
1576130094 [ehancement](stats) Tune stats framework (#18118) 2023-03-27 14:38:10 +08:00
8b07021f5f [enhancement](regression-test) add hint to disable nereids planner for some cases (#18066) 2023-03-27 14:06:50 +08:00
f03598f214 [enhance](cooldown) no snapshot or migration action for cooldown tablet (#17658) 2023-03-27 13:35:32 +08:00
d1f34a3be4 [bugfix](inverted index)temporary disable skip read column data if it match inverted index (#18065)
The optimization that skips reading column data when the column matches an inverted index and is only used in the WHERE clause may return wrong results for complex SQL.

This PR temporarily disables the optimization; later PRs will resolve the problem fundamentally.
2023-03-27 11:29:42 +08:00
dc7b2015f5 eh (#18122) 2023-03-27 11:09:35 +08:00
2929a96224 [Refactor](inverted index cache) Use asc set instead of priority queue at the lru cache (#18033)
use asc set instead of priority queue at the LRU cache, to keep the lifecycle of the LRUHandle consistent in the sorted set and the LRU free list
2023-03-27 10:27:37 +08:00
bcf95cd920 [feature](function)Add ST_Angle_Sphere function (#17919) 2023-03-27 10:14:46 +08:00
fd5dd9a391 [Opt](Pipeline) opt pipeline code in mult tablet (#17999) 2023-03-27 10:02:48 +08:00
990479e177 [refactor](memory) Query waits for memory free in Allocator, after memory exceed limit. (#18075)
Previously, after memory exceeded the limit, the query waited for memory to be freed in the mem hook; now it waits in the Allocator.

This is more controllable and safe.
2023-03-27 09:06:03 +08:00
78abb40fdc [improvement](string) throw exception instead of log fatal if string column exceed total size limit (#17989)
Throw an exception instead of logging fatal if a string column exceeds the total size limit, so that we can catch it and let the query fail, instead of causing BE to exit.
2023-03-27 08:55:26 +08:00