2134 Commits

Author SHA1 Message Date
ce271ff382 [fix](parquet)fix can not read parquet lz4 compress. (#27383)
Fixes reading of LZ4-compressed Parquet files. Data is first decompressed as Hadoop LZ4; if that fails, the reader falls back to the standard LZ4 format.
2023-11-29 19:04:53 +08:00
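The try-then-fall-back order described in ce271ff382 can be sketched as follows. This is a minimal illustration, not the code from the commit: it assumes the commonly documented Hadoop LZ4 framing (each frame starts with a big-endian 4-byte decompressed size and a big-endian 4-byte compressed size), and `raw` is a hypothetical hook standing in for whatever raw LZ4 block decoder is available.

```python
import struct
from typing import Callable, Optional

# Hypothetical hook: (compressed bytes, expected decompressed size) -> bytes.
# Plug in a real raw LZ4 block decoder here.
RawDecoder = Callable[[bytes, int], bytes]

# Assumed Hadoop frame header: big-endian decompressed size, compressed size.
HEADER = struct.Struct(">II")


def try_hadoop_lz4(data: bytes, expected_size: int, raw: RawDecoder) -> Optional[bytes]:
    """Return the decoded bytes if `data` parses as Hadoop-framed LZ4, else None."""
    out = bytearray()
    pos = 0
    while pos < len(data):
        if len(data) - pos < HEADER.size:
            return None  # truncated header: not Hadoop framing
        decompressed_len, compressed_len = HEADER.unpack_from(data, pos)
        pos += HEADER.size
        if compressed_len > len(data) - pos:
            return None  # frame claims more input than is available
        if decompressed_len > expected_size - len(out):
            return None  # frame claims more output than the page holds
        try:
            block = raw(data[pos:pos + compressed_len], decompressed_len)
        except Exception:
            return None  # the raw decoder rejected the frame body
        if len(block) != decompressed_len:
            return None
        out += block
        pos += compressed_len
    return bytes(out) if len(out) == expected_size else None


def decompress_parquet_lz4(data: bytes, expected_size: int, raw: RawDecoder) -> bytes:
    """Try the Hadoop framing first, then fall back to a plain LZ4 block."""
    result = try_hadoop_lz4(data, expected_size, raw)
    if result is not None:
        return result
    return raw(data, expected_size)
```

The size checks matter because a plain LZ4 buffer has no header: its first eight bytes decode to arbitrary sizes, and rejecting implausible values is what lets the fallback trigger.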
573f0eaad9 [fix](regression)fix parquet data page v2 unstable case (#27753) 2023-11-29 18:58:37 +08:00
498d27c905 [improve](json_reader) add prompt when all fields is null (#27630) 2023-11-29 18:26:42 +08:00
7398c3daf1 [Feature-Variant](Variant Type) support variant type query and index (#27676) 2023-11-29 10:37:28 +08:00
d771f16b79 [fix](parquet)fix bug that can not read parquet data page v2 (#27655) 2023-11-28 22:43:46 +08:00
Pxl
d969047b50 [Refactor](join) refactor of hash join (#27557)
Improve performance on the TPC-H data set by refactoring the join-related code and its use of the hash table.

Co-authored-by: HappenLee <happenlee@hotmail.com>
Co-authored-by: BiteTheDDDDt <pxl290@qq.com>
2023-11-28 19:46:00 +08:00
b93dd1d5f7 [enhancement](load) improve error msg for load when cancelled by mem gc (#26809)
Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>
2023-11-28 17:36:11 +08:00
7087250b4a [fix](insert) txn insert and group commit should write \N string corr… (#27637) 2023-11-28 17:32:50 +08:00
91f56cefc0 [feature](Nereids): Pushdown TopN-Distinct through Union (#27628)
```
  TopN-Distinct
  -> Union All
  -> child plan1
  -> child plan2
  -> child plan3
 
  rewritten to
 
  TopN-Distinct
  -> Union All
    -> TopN-Distinct
      -> child plan1
    -> TopN-Distinct
      -> child plan2
    -> TopN-Distinct
      -> child plan3
```
2023-11-28 15:23:46 +08:00
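Why the rewrite in 91f56cefc0 preserves results can be checked on a toy model. The sketch below is an assumption-laden illustration, not Nereids code: rows are plain integers, the sort order is ascending, and the function names are made up for this example.

```python
import heapq
from typing import Iterable, List, Sequence


def topn_distinct(rows: Iterable[int], n: int) -> List[int]:
    """Keep the n smallest distinct values, in ascending order."""
    return heapq.nsmallest(n, set(rows))


def original_plan(children: Sequence[Sequence[int]], n: int) -> List[int]:
    """TopN-Distinct over Union All of the raw children."""
    union_all = [row for child in children for row in child]
    return topn_distinct(union_all, n)


def rewritten_plan(children: Sequence[Sequence[int]], n: int) -> List[int]:
    """Same plan, with TopN-Distinct also applied to each child first."""
    union_all = [row for child in children for row in topn_distinct(child, n)]
    return topn_distinct(union_all, n)
```

Any value in the global top n is also in the top n distinct values of the child it came from, so trimming each child early drops nothing the outer TopN-Distinct would keep, while the union sees at most n rows per child.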
2ea1e9db44 [fix](nereids) temp partition is always pruned (#27636) 2023-11-28 14:18:14 +08:00
f329b90696 [fix](show_variables) fix default value for special variables (#27651) 2023-11-28 11:35:46 +08:00
9903c30591 [opt](nereids)adjust distribution cost for better choice of broadcast join and shuffle join (#27113)
Add a boundary to the distribution cost factor.
2023-11-28 10:41:16 +08:00
4ea69ed390 [regression test](broker load) add case for num_as_string (#27588) 2023-11-27 21:25:59 +08:00
3d7d166355 [feature](cmd) add UNSET_VARIABLE statement to set back variables (#27552) 2023-11-27 20:30:04 +08:00
13b26ee920 [Fix](core) Fix wal space back pressure core and add regression test (#27311) 2023-11-27 15:10:26 +08:00
234aff3e78 [feature](Nereids): Pushdown TopN through Union (#27535)
```
topn
-> Union All 
  -> child plan1
  -> child plan2
  -> child plan3

rewritten to

topn
-> Union All 
 -> topn
  -> child plan1
 -> topn
  -> child plan2
 -> topn
  -> child plan3
```
2023-11-27 14:13:18 +08:00
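The tree rewrite drawn in 234aff3e78 can be sketched on a small plan structure. The `Plan` class and the operator names below are hypothetical and exist only for this illustration. The sketch also ignores sort keys and offsets, which a real rule would have to carry down to each child.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Plan:
    """A tiny logical plan node; illustrative only, not a Nereids class."""
    op: str
    children: List["Plan"] = field(default_factory=list)
    limit: int = 0  # only meaningful when op == "topn"


def push_topn_through_union(plan: Plan) -> Plan:
    """Rewrite topn -> union_all -> children by adding a topn above each child."""
    if plan.op != "topn" or len(plan.children) != 1:
        return plan  # pattern does not match: leave the plan unchanged
    union = plan.children[0]
    if union.op != "union_all":
        return plan
    # Each child gets its own topn with the same limit as the outer one.
    pushed = [Plan("topn", [child], plan.limit) for child in union.children]
    # The outer topn stays, because the union of per-child results can
    # still hold more than `limit` rows.
    return Plan("topn", [Plan("union_all", pushed)], plan.limit)
```

Applied to a `topn` over a `union_all` of three children, this produces the second tree shown in the commit message: the outer `topn` is kept and one `topn` is added above each child.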
1b4cd24b36 [opt](Nereids) support where, group by, having, order by clause without from clause in query statement (#27006)
Support WHERE, GROUP BY, HAVING, and ORDER BY clauses in a query statement that has no FROM clause.
For example:

SELECT 1 AS a, COUNT(*), SUM(2), AVG(1), RANK() OVER() AS w_rank
WHERE 1 = 1
GROUP BY a, w_rank
HAVING COUNT(*) IN (1, 2) AND w_rank = 1
ORDER BY a;

This returns:

| a  |count(*)|sum(2)|avg(1)|w_rank|
+----+--------+------+------+------+
| 1  |       1|     2|   1.0|     1|


Another example:

select 1 c1, 2 union (select "hell0", "") order by c1
The data type of the second column will be varchar(65533); 65533 is the default varchar length.

This returns:

|c1    | 2 |
+------+---+
|1     | 2 |
|hell0 |   |
2023-11-27 12:05:14 +08:00
cd6c61347d [Feature](tvf)(avro-jni) avro-jni add projection push down (#26885) 2023-11-27 10:33:27 +08:00
baadc14e60 [Enhancement](function) support unix_timestamp with float (#26827)
Co-authored-by: YangWithU <plzw8@outlook.com>
2023-11-27 09:58:53 +08:00
3791de3cfa [feature](mtmv)(6)implement cancel method (#27541)
1. Implement the cancel task method
2. Fix `show create table` not displaying `comment`
2023-11-27 09:49:46 +08:00
ff1a06abcf [test](regression) add routine load sequence and error test (#27519) 2023-11-25 23:30:20 +08:00
cc395f5428 [Fix](hive-transactional-table) Fix NPE when query empty hive transactional table. (#27563) 2023-11-25 10:29:39 +08:00
a0b1cb48a1 [Improve](regresscases) update cases for three-level nested types #27529 2023-11-24 20:53:28 +08:00
70bbaa4e56 [test](regression) add cases about datev1/datatimev1 (#27543)
All case results were tested and pass with datetime/date v2.
Cases cover:
Calculation (+, -)
Kinds of predicates(<, >, =, <>, in, not in, is null, is not null)
Load test(from csv and select into)
Runtime filter
Delete conditions
Key columns(agg/duplicate/uniq model, distributed/partition, bitmap index...)
2023-11-24 17:58:32 +08:00
553e4a8903 [feature-wip](merge-on-write) MOW table support different primary keys and sort keys (#24788) 2023-11-24 16:37:30 +08:00
dfe3a2dd01 [feature](mtmv)(3)Implementing multi table materialized views (#26146)
Introduction to the main classes:
- MTMVService: MTMV services for other modules to call
- MTMVHookService: all operations that affect the MTMV
  - MTMVJobManager: all operations that affect the MTMV job
  - MTMVCacheManager: all operations that affect the MTMV cache
- MTMVTask & MTMVJob: inherit from the job framework
2023-11-24 12:34:38 +08:00
da71fde066 [fix](build index) Fix inverted index hardlink leak and missing problem (#26903) 2023-11-24 10:30:21 +08:00
160adbaa69 [regression test](routine test) add case for num_as_string (#27436) 2023-11-24 10:15:47 +08:00
5adbe47d3a [test](regression) add stream load tvf properties regression test (#27467) 2023-11-23 23:04:10 +08:00
5d31bc99b8 [Fix](Group_commit) Fix group commit regression test failure (#27475) 2023-11-23 23:03:38 +08:00
2ea33518b0 [Opt](load) use batching to optimize auto partition (#26915)
use batching to optimize auto partition
2023-11-23 19:12:28 +08:00
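The message for 2ea33518b0 does not describe the mechanism, so the following is only a generic illustration of what batching can mean for automatic partition creation, not the approach taken in the commit. All names (`load_in_batches`, `create_partitions`, the string partition keys) are hypothetical.

```python
from typing import Callable, Iterable, List, Set


def load_in_batches(
    partition_keys: Iterable[str],
    existing: Set[str],
    create_partitions: Callable[[List[str]], None],
    batch_size: int = 4,
) -> int:
    """Issue one create request per batch of rows instead of one per new key.

    Returns the number of create requests that were sent.
    """
    requests = 0
    batch: List[str] = []

    def flush() -> None:
        nonlocal requests
        # Deduplicate within the batch and skip partitions that already exist.
        missing = sorted(set(batch) - existing)
        if missing:
            create_partitions(missing)
            existing.update(missing)
            requests += 1
        batch.clear()

    for key in partition_keys:
        batch.append(key)
        if len(batch) >= batch_size:
            flush()
    flush()  # handle the final, possibly partial, batch
    return requests
```

With a batch size of 4, six incoming rows that touch three new partitions result in two create requests rather than three, and the saving grows with the batch size.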
d9f6e51884 [fix](planner)output slot should be materialized as intermediate slot in agg node (#27282) 2023-11-23 18:41:08 +08:00
1555b11035 [fix](nereids)remove literal partition by and order by expression in window function (#26899) 2023-11-23 18:40:51 +08:00
2ec3395087 [fix](planner)the data type should be the same between input slot and sort slot (#27137) 2023-11-23 18:40:02 +08:00
4b22fc14d5 [Feature](update) Support update on current_timestamp (#25884) 2023-11-23 16:23:31 +08:00
c884e46e6c [regression test](routine test) add case for desired_concurrent_number (#27372) 2023-11-23 15:11:01 +08:00
6253f7d6c7 [test](regression) add routine load condition test (#27430) 2023-11-23 14:37:35 +08:00
699798eaa7 [fix](function) make TIMESTAMP function DEPEND_ON_ARGUMENT (#27343)
* fix

* fix nullable

* remove null

* add case
2023-11-23 14:26:19 +08:00
Pxl
301bfe4d5d [Bug](mark-join) fix mark join report error when probe block have column do not output (#27360)
Fix the error reported by mark join when the probe block has columns that are not output.
2023-11-23 11:16:02 +08:00
42c32c584b [case](regression) test invalid jsonpaths (#27359)
Co-authored-by: qinhao <qinhao@newland.com.cn>
2023-11-23 10:16:34 +08:00
5b8aaf96d2 [fix](planner)scan node should project all required expr from parent node (#26886) 2023-11-23 09:44:21 +08:00
0302a9d026 [fix](fe) slots in having clause should be set to need materialized (#27412) 2023-11-22 19:47:09 +08:00
a2a6a722eb [test](regression) add routine load command test (#27384) 2023-11-22 18:55:35 +08:00
fd3c42d8cf [fix](test) order by clause in test_map (#27390) 2023-11-22 16:43:31 +08:00
e06e976a8b [test](case) delete duplicate pipelineX cases (#27381) 2023-11-22 12:58:30 +08:00
Pxl
b541de7a03 do not push down agg on aggregate column (#27356)
do not push down agg on aggregate column
2023-11-22 10:53:29 +08:00
9b59bc14b5 [test](Export) add show export regression testes (#27140) 2023-11-22 00:13:30 +08:00
6e86bf5b1b [test](decimalv2) add some regression cases about decimalv2 (#27352)
All case results were tested and pass with decimalv3.
Cases cover:
Calculation (+, -, *, /)
Kinds of predicates(<, >, =, <>, in, not in, is null, is not null)
Load test(from csv and select into)
Runtime filter
Delete conditions
Key columns(agg/duplicate/uniq model, distributed/partition, bitmap index...)
2023-11-21 21:36:20 +08:00
c9b959d2d8 [opt](Nereids) AssertNumRows node should triger runtime filter pruning #27279
1. Optimize runtime filter (rf) pruning when column stats are not available
2. Add a regression case to check the plan and rf for tpcds_sf100 with stats
3. Add a regression case to check the plan and rf for tpcds_sf100 without stats
2023-11-21 21:00:41 +08:00
1cd1c58eee [Feature](group commit) move group_commit_interval_ms from be.conf to table property (#27116) 2023-11-21 20:50:02 +08:00