Commit Graph

6669 Commits

Author SHA1 Message Date
dbff9d7a89 [chore](fuzzy) topn_opt_limit_threshold (#27496) 2023-11-24 14:08:27 +08:00
dbbab63623 [fix](nereids)keep cast operator if cast a varchar to another longer varchar in LogicalSetOperator (#27393) 2023-11-24 14:07:04 +08:00
dfe3a2dd01 [feature](mtmv)(3)Implementing multi table materialized views (#26146)
Introduction to Main Classes:
- MTMVService:MTMV services for other modules to call
- MTMVHookService:All operations that affect the MTMV
  - MTMVJobManager:All operations that affect the MTMV job
  - MTMVCacheManager:All operations that affect the MTMV Cache
- MTMVTask&MTMVJob:Inherit from job framework
2023-11-24 12:34:38 +08:00
c24a33c857 [enhancement](audit) hide password and other sensitive information in log and audit log (#27115)
Signed-off-by: nextdreamblue <zxw520blue1@163.com>
2023-11-24 10:27:30 +08:00
17ca75f834 [chore](Nereids): add eager aggregate into rules (#27505)
Add `Eager Aggregate` rules into Rewrite rules.
2023-11-24 10:06:04 +08:00
8e74470db9 [fix](statistics)Fix auto analyze remove finished job bug (#27486)
Finished jobs must be removed from the job list; otherwise the next batch of jobs will not be scheduled.
2023-11-23 23:22:02 +08:00
eb878ad0d2 [fix](Export) add feut for Cancel Export (#27178) 2023-11-23 23:18:30 +08:00
540bce4d1b [typo](log) Let env lock msg more distinct (#27493) 2023-11-23 23:03:06 +08:00
d73b945535 [chore](Nereids): rename pushdown to push_down (#27473) 2023-11-23 21:04:40 +08:00
d04a2de3cc [fix](hms) fix compatibility issue of hive metastore client (#27327)
For Hive versions lower than 2.3.7, the enum ClientCapability.INSERT_ONLY_TABLES does not exist.
So if we send this enum to the server side, the server side will get a null,
which causes undefined behavior, e.g., failing to get table info from HMS.
2023-11-23 19:42:46 +08:00
2ea33518b0 [Opt](load) use batching to optimize auto partition (#26915)
use batching to optimize auto partition
2023-11-23 19:12:28 +08:00
511eedb4ff [fix](nereids)select base index if mv's data type is different from base table (#27387)
Normally, an mv column's data type should be the same as the base table's. This PR acts as a fail-safe: if an mv column's data type accidentally differs from the base table's, fall back to selecting the base table so that the query works.
2023-11-23 18:41:59 +08:00
d9f6e51884 [fix](planner)output slot should be materialized as intermediate slot in agg node (#27282) 2023-11-23 18:41:08 +08:00
1555b11035 [fix](nereids)remove literal partition by and order by expression in window function (#26899) 2023-11-23 18:40:51 +08:00
2ec3395087 [fix](planner)the data type should be the same between input slot and sort slot (#27137) 2023-11-23 18:40:02 +08:00
772f181e94 [fix](stats) Fix thread leaks when doing checkpoint (#27334) 2023-11-23 03:18:19 -06:00
4b22fc14d5 [Feature](update) Support update on current_timestamp (#25884) 2023-11-23 16:23:31 +08:00
5d9c555dcf [minor](stats) Fix potential npe when loading stats (#27200)
Also lowers the log level for loading stats from warning to debug, since it doesn't matter much for the workflow.
2023-11-23 01:37:58 -06:00
8e3b4e99d9 [improve](move-memtable) add switch for stream load in fe.conf (#27440) 2023-11-23 15:11:17 +08:00
c884e46e6c [regression test](routine test) add case for desired_concurrent_number (#27372) 2023-11-23 15:11:01 +08:00
97932d0381 [fix](export) the label of export should be unique with database scope (#27401)
### How to reproduce
1. create a database db1 and a table tbl1;
2. insert some data and export with label L1;
3. drop the db1 and tbl1, and recreate them with same name.
4. insert some data and export with same label L1;

Expect: export success
Actual: error: Label L1 have already been used.

This PR fixes it.
2023-11-23 14:30:57 +08:00
93cfdffb75 [regression test](routine test) add case for exec_mem_limit (#27308) 2023-11-23 14:25:54 +08:00
dbbed113cf [feature](mtmv)(4)MTMV extends Olap (#26645) 2023-11-23 14:10:36 +08:00
33de92cc61 [improve](nereids) support agg function of count(const value) pushdown #26677
Supports SQL such as `select count(1)-count(not null) from table`, where the count aggregates can be pushed down.
2023-11-23 11:26:06 +08:00
5b8aaf96d2 [fix](planner)scan node should project all required expr from parent node (#26886) 2023-11-23 09:44:21 +08:00
044a295541 [performance](Planner): optimize getStringValue() in DateLiteral (#27363)
- reduce the cost of `getStringValue()`
- the original code doesn't consider the `microsecond` part in `getStringValue()`
2023-11-22 22:42:44 +08:00
19c36dcc86 [Fix](statistics)Fix auto job start time incorrect bug (#27402)
Previously, the auto analyze job start time was the job creation time rather than the time execution began, which was inaccurate. This PR changes the start time to the time the first task starts executing.
2023-11-22 21:38:08 +08:00
0302a9d026 [fix](fe) slots in having clause should be set to need materialized (#27412) 2023-11-22 19:47:09 +08:00
c7e3d74ffc [minor](stats) Report error with more friendly message when timeout (#27197) 2023-11-22 04:50:09 -06:00
cfb6af295f [fix](stats) Fix creating too many tasks on new env #27364
If there are huge datasets with many databases, many tables, and many columns, the auto collector might submit too many jobs, which would occupy too much FE memory.

This PR limits each round to submitting up to 5 jobs.
2023-11-22 16:53:31 +08:00
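The per-round cap described in cfb6af295f can be pictured with a minimal sketch. The class and method names below are hypothetical and are not taken from the Doris code; only the limit of 5 comes from the commit message.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical illustration; not the actual Doris auto collector.
final class RoundLimitedSubmitter {
    // Cap taken from the commit message: up to 5 jobs per round.
    static final int MAX_JOBS_PER_ROUND = 5;

    // Removes at most MAX_JOBS_PER_ROUND candidates from the queue and
    // returns them; everything else stays queued for a later round.
    static <T> List<T> nextRound(Queue<T> pending) {
        List<T> batch = new ArrayList<>();
        while (batch.size() < MAX_JOBS_PER_ROUND && !pending.isEmpty()) {
            batch.add(pending.poll());
        }
        return batch;
    }
}
```

Bounding the batch size keeps the number of in-flight jobs, and therefore the memory they hold, independent of how many tables and columns exist.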
6a48abeb80 [feature](Nereids) support queries tvf (#27138) 2023-11-22 02:47:16 -06:00
732a3fa9c8 [fix](stats) fix auto collector always create sample job no matter the table size (#26968) 2023-11-22 02:42:40 -06:00
127525ebe2 [hotfix](jdbc catalog) fix realColumnNames serialize npe (#27280)
In the previous PR #27124, we used `objectMapper.readValue` for deserialization. However, this method does not handle null fields, which can lead to issues when upgrading from older versions. Specifically, if a required field is missing in the persistent data, `String realColumnNamesJson = serializeMap.get(REAL_COLUMNS);` will return null, resulting in deserialization errors and frontend startup failure. This issue is likely to occur when upgrading from an older version that uses Jdbc Catalog to a new version including PR #27124. As this represents a specific upgrade scenario involving compatibility with old version data structures, it was not covered in the regular PR test cases. Given the specificity and difficulty in replicating such a scenario, no special test cases were added for this PR.
2023-11-22 15:22:06 +08:00
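The failure mode described in 127525ebe2 comes down to deserializing a value that may be absent in older persisted data. A minimal sketch of a null guard follows; the class name, the key string, and the `parseJson` parameter are hypothetical stand-ins (the real code uses `objectMapper.readValue`), so this only illustrates the idea rather than the actual fix.

```java
import java.util.Collections;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration; not the actual Jdbc Catalog code.
final class RealColumnNamesLoader {
    // Assumed key name; the real constant's value is not shown in the commit message.
    static final String REAL_COLUMNS = "real_columns";

    // Returns an empty map when the field is missing from old metadata,
    // instead of handing null to the JSON parser.
    static Map<String, String> load(Map<String, String> serializeMap,
                                    Function<String, Map<String, String>> parseJson) {
        String realColumnNamesJson = serializeMap.get(REAL_COLUMNS);
        if (realColumnNamesJson == null) {
            return Collections.emptyMap();
        }
        return parseJson.apply(realColumnNamesJson);
    }
}
```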
Pxl
b541de7a03 do not push down agg on aggregate column (#27356)
do not push down agg on aggregate column
2023-11-22 10:53:29 +08:00
eaa1ca7143 [fix](fe) Fix show frontends npt in some situations (#27295)
```
java.lang.NullPointerException: null
    at com.sleepycat.je.rep.util.ReplicationGroupAdmin.getMasterSocket(ReplicationGroupAdmin.java:191)
    at com.sleepycat.je.rep.util.ReplicationGroupAdmin.doMessageExchange(ReplicationGroupAdmin.java:607)
    at com.sleepycat.je.rep.util.ReplicationGroupAdmin.getGroup(ReplicationGroupAdmin.java:406)
    at org.apache.doris.ha.BDBHA.getElectableNodes(BDBHA.java:132)
    at org.apache.doris.common.proc.FrontendsProcNode.getFrontendsInfo(FrontendsProcNode.java:84)
    at org.apache.doris.qe.ShowExecutor.handleShowFrontends(ShowExecutor.java:1923)
    at org.apache.doris.qe.ShowExecutor.execute(ShowExecutor.java:355)
    at org.apache.doris.qe.StmtExecutor.handleShow(StmtExecutor.java:2113)
    ...
```
2023-11-22 10:46:59 +08:00
c9b959d2d8 [opt](Nereids) AssertNumRows node should trigger runtime filter pruning #27279
1. optimize rf prune when col stats are not available
2. add regression case to check plan and rf for tpcds_sf100 with stats
3. add regression case to check plan and rf for tpcds_sf100 without stats
2023-11-21 21:00:41 +08:00
1cd1c58eee [Feature](group commit) move group_commit_interval_ms from be.conf to table property (#27116) 2023-11-21 20:50:02 +08:00
c1435c0589 [regression test](routine test) add case for send_batch_parallelism (#27333) 2023-11-21 20:43:20 +08:00
d541424936 do not check invisible column stats (#27201)
1. the forbid_unknown_col_stats check ignores invisible columns
2. a better error message when unknown col stats are encountered
2023-11-21 19:46:16 +08:00
dea40e7095 [fix](Nereids): NullSafeEqual should be in HashJoinCondition (#27127)
Originally, only `EqualTo` was put in `HashJoinCondition`; `NullSafeEqual` also needs to be allowed.
2023-11-21 19:08:14 +08:00
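For context on dea40e7095: both predicates test keys for equality, which is what makes either one usable as a hash join condition; they differ only in how NULL is treated. The sketch below models the usual SQL semantics of `=` and `<=>`. The class is hypothetical and is not Nereids code.

```java
// Hypothetical illustration of the two comparison semantics; not Nereids code.
final class JoinEquality {
    // `a = b`: the result is UNKNOWN (modeled as null) when either side is NULL,
    // so rows with NULL keys never match.
    static Boolean equalTo(Integer a, Integer b) {
        if (a == null || b == null) {
            return null;
        }
        return a.equals(b);
    }

    // `a <=> b`: never UNKNOWN; two NULLs compare as equal,
    // and a NULL never equals a non-NULL value.
    static boolean nullSafeEqual(Integer a, Integer b) {
        if (a == null || b == null) {
            return a == null && b == null;
        }
        return a.equals(b);
    }
}
```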
445484270b [fix](Nereids): fill miss slot in having subquery (#27177)
Fill missing slots in a having subquery.

For example:
```
select * from t group by k having max(k) in (select k from t2)
```

Here max(k) should be pushed down to the aggregate.
2023-11-21 18:54:58 +08:00
7e707f5d64 [fix](fe ut) Fix OlapQueryCacheTest failed (#27305)
1. 
```
java.lang.NullPointerException: null
        at org.apache.doris.catalog.Env.getCurrentSystemInfo(Env.java:793) ~[classes/:?]
        at org.apache.doris.qe.SimpleScheduler$UpdateBlacklistThread.run(SimpleScheduler.java:206) ~[classes/:?]
        at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_382]

java.lang.NullPointerException
        at org.apache.doris.qe.OlapQueryCacheTest.setUp(OlapQueryCacheTest.java:226)
```

2.
```
[ERROR] testSqlCacheKeyWithNestedViewForNereids  Time elapsed: 1.962 s  <<< FAILURE!
java.lang.AssertionError: SELECT command denied to user 'testCluster:testUser'@'192.168.1.1' for table 'internal: testCluster:testDb: appevent'
	at org.apache.doris.qe.OlapQueryCacheTest.parseSqlByNereids(OlapQueryCacheTest.java:579)
	at org.apache.doris.qe.OlapQueryCacheTest.testSqlCacheKeyWithNestedViewForNereids(OlapQueryCacheTest.java:1338)
```

3.
```
[ERROR] Tests run: 28, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 113.63 s <<< FAILURE! - in org.apache.doris.qe.OlapQueryCacheTest
[ERROR] testCacheModeTable  Time elapsed: 1.657 s  <<< ERROR!
java.lang.IllegalArgumentException: Value of type org.apache.doris.qe.QueryState incompatible with return type org.apache.doris.system.SystemInfoService of org.apache.doris.catalog.Env#getCurrentSystemInfo()
        at org.apache.doris.qe.OlapQueryCacheTest.setUp(OlapQueryCacheTest.java:156)
```
2023-11-21 16:47:57 +08:00
dd2e9f682f [Fix](statistics)Fix analyze sql including key word bug (#27321)
Fix a bug in the analyze SQL when a column name is a keyword. Column names need to be wrapped in backticks.
2023-11-21 13:15:37 +08:00
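The quoting in dd2e9f682f can be sketched as follows. The helper name is hypothetical, and doubling embedded backticks is an assumption borrowed from MySQL-style escaping; the commit message does not say how such names are handled.

```java
// Hypothetical illustration; not the actual statistics SQL builder.
final class IdentifierQuoting {
    // Wraps a column name in backticks so that a keyword such as `order`
    // is parsed as an identifier. Embedded backticks are doubled (assumption).
    static String quote(String columnName) {
        return "`" + columnName.replace("`", "``") + "`";
    }
}
```

With this helper, a column named `order` would be emitted as `` `order` `` in the generated statement instead of the bare keyword.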
Pxl
cee8cc44e2 [Bug](insert)fix insert wrong data on mv when stmt have multiple values (#27297)
fix insert wrong data on mv when stmt have multiple values
2023-11-21 12:55:30 +08:00
0b459e50fb [fix](partial update) keep case insensitivity and use the columns' origin names in partialUpdateCols in origin planner (#27223)
close: #27161
2023-11-20 21:16:28 +08:00
fec94b7278 [feature](Nereids): use session variable to enable rule (#27036) 2023-11-20 20:23:24 +08:00
19b1d5365c [minor](stats) rename stats related session variable name #26936 2023-11-20 18:13:12 +08:00
80c75b6da4 [fix](schema change) fix bug of query failure after rename column (#26300) 2023-11-20 16:54:40 +08:00
34c3cde0de Revert "[feature-wip](catalog) support deltalake catalog step1-metadata (#22493)" (#27095)
This reverts commit 5b641ebd40fff71e632ee9be4ede58b744b602b9.

Currently, Deltalake Catalog is not a usable feature. We will continue to implement it in the datalake plug-in system in the future, so we will delete it from the FE code for now.
2023-11-20 16:10:33 +08:00
d939903753 [improvement](statistics)Use count as ndv for unique/agg olap table single key column (#27186)
For a single key column of a unique/agg olap table, count and ndv have the same value, so there is
no need to calculate ndv for this kind of column; simply use count as ndv.
2023-11-20 15:49:08 +08:00
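The condition in d939903753 can be written as a small predicate. This is only a sketch: the enum and method names are hypothetical, chosen to mirror the wording of the commit message rather than the actual Doris classes.

```java
// Hypothetical illustration; not the actual statistics collector.
final class NdvShortcut {
    // Assumed table models; only "unique" and "agg" are named in the commit message.
    enum KeysType { DUP_KEYS, UNIQUE_KEYS, AGG_KEYS }

    // True when the column is the only key column of a unique/agg table.
    // Each key value then appears in exactly one row, so count equals ndv.
    static boolean canUseCountAsNdv(KeysType keysType, int keyColumnCount, boolean isKeyColumn) {
        boolean uniqueOrAgg = keysType == KeysType.UNIQUE_KEYS || keysType == KeysType.AGG_KEYS;
        return uniqueOrAgg && keyColumnCount == 1 && isKeyColumn;
    }

    // Returns count directly when the shortcut applies, else the computed ndv.
    static long ndv(KeysType keysType, int keyColumnCount, boolean isKeyColumn,
                    long rowCount, long computedNdv) {
        return canUseCountAsNdv(keysType, keyColumnCount, isKeyColumn) ? rowCount : computedNdv;
    }
}
```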