Commit Graph

6823 Commits

Author SHA1 Message Date
b7c2bf81fe [chore](planner) remove some useless code (#34430)
remove vectorAnalyze and computeOutputColumn function on Expr
remove vectorOpcode and outputColumn attribute on Expr
remove useless static LOG object on some Expr class
2024-05-10 14:39:17 +08:00
520774a24b [fix](serde) fix ipv4/v6 serde functions for arrow, orc, parquet format (#34042)
This PR is based on @sjyango's work in #32326.
#32326 was meant to be merged into the master branch, but it is still a draft and has not been maintained for a long time, so this new PR replaces it.
Co-authored-by: sjyango <sjyang2022@zju.edu.cn>
2024-05-10 14:37:04 +08:00
07207b7b51 [feature](shuffle) enable strict consistency dml by default (#32958) (#34641) 2024-05-10 14:31:50 +08:00
5a3107442a [feature](tvf) support query table value function (#34516) (#34640)
This PR supports a Table Value Function called `Query`. It can push a query directly to the catalog source for execution by specifying `catalog` and `query`, without the query being parsed by Doris. Doris only receives the results returned by the query.
Currently only JDBC Catalog is supported.

Example:

```
Doris > desc function query('catalog' = 'mysql','query' = 'select count(*) as cnt from test.test');
+-------+--------+------+------+---------+-------+
| Field | Type   | Null | Key  | Default | Extra |
+-------+--------+------+------+---------+-------+
| cnt   | BIGINT | Yes  | true | NULL    | NONE  |
+-------+--------+------+------+---------+-------+

Doris > select * from query('catalog' = 'mysql','query' = 'select count(*) as cnt from test.test');
+----------+
| cnt      |
+----------+
| 30000000 |
+----------+
```
2024-05-10 14:29:17 +08:00
60e5583b01 [fix](nereids-Branch-2.1) fix bug: try to prune a not-exist rf #34630 2024-05-10 14:28:19 +08:00
c055174483 [fix](insert) fix forget to abort txn when insert checkStrictMode failed (#34612) (#34632) 2024-05-10 11:29:11 +08:00
7a40f2a547 [branch-2.1](resource)fix check available fail when s3 aws_token is set and reset as, sk faild on be. (#34219) 2024-05-09 19:06:14 +08:00
53332eb4ba [fix](catalog) refactor the schema cache for external table (#34517) (#34599)
bp #34517
2024-05-09 18:02:18 +08:00
3ae3f9d6e1 [opt](catalog) support using loading cache for db/table list in external catalog (#33610) (#34596)
bp #33610
2024-05-09 17:50:39 +08:00
8fa1b78d7b Revert "[feature](shuffle) enable strict consistency dml by default (#32958)"
This reverts commit 400105a92182755bdd95a58a7d378d67c6b27f51.
2024-05-08 23:00:46 +08:00
7ec12eed9f Fix duplicate normal workload group when upgrade 2.0->2.1.1->2.1.2 (#34560) 2024-05-08 21:59:26 +08:00
202cdb2744 [fix](mtmv)fix refresh failed when not use db before create MTMV (#34431) (#34522)
When an MTMV is created, the current ctl and db are saved.

When the MTMV is refreshed, a ConnectContext is created and the same ctl and db are set on the ctx.

When the db or ctl is dropped, the task fails.

However, dropping a db sometimes has no actual impact on the refresh, so the task no longer fails directly. If refreshing the data does cause an error, the user is given an error message.
2024-05-08 16:03:22 +08:00
400105a921 [feature](shuffle) enable strict consistency dml by default (#32958) 2024-05-08 11:00:14 +08:00
39fdc9ba0c [refactor](executor)Rename workload schedule policy #34497 2024-05-08 08:35:20 +08:00
182177def0 [Improve](config)The stream_load label length is changed to a configurable (#34459)
pick from #33745
2024-05-07 20:43:16 +08:00
ac56255f82 [opt](inverted index) the "unicode" tokenizer can be configured to disable stop words. (#34467) 2024-05-07 18:23:43 +08:00
63cd632abe Revert "[fix](statistics) Use column update rows to decide min/max stats are valid or not (#34263)"
This reverts commit 8db4d48731688354d6ee3ae22e02041419ca73e0.
2024-05-07 08:09:37 +08:00
a33715bc1c [fix](partial update) only unique table with MOW insert with target columns can consider be a partial update (#33656)
* [fix](partial update) only unique table with MOW insert with target columns can consider be a partial update

Signed-off-by: nextdreamblue <zxw520blue1@163.com>

* fix 1

Signed-off-by: nextdreamblue <zxw520blue1@163.com>

---------

Signed-off-by: nextdreamblue <zxw520blue1@163.com>
2024-05-07 07:53:25 +08:00
92dc8ed718 [opt](mtmv) Add enable materialized view nest rewrite switch (#34197)
* [opt](mtmv) Add enable materialized view nest rewrite switch

* fix ut

* fix ut2
2024-05-07 07:51:18 +08:00
818022cadf [Fix](MethodName) fix method issue #34178 2024-05-07 07:50:54 +08:00
2b5bac3b60 [enhance](serde) expr serde support gson (#34180)
In the future, it can be easier to change to Expression to avoid metadata compatibility issues
2024-05-07 07:50:27 +08:00
e840102e99 [Feat](nereids)nereids support create table like (#34025)
nereids support create table like statement.
e.g. CREATE TABLE test1.table2 LIKE test1.table1
2024-05-07 07:50:19 +08:00
8f40882701 [fix](nereids) disable PROJECT_OTHER_JOIN_CONDITION rule if bitmap filter is enabled. #34189
This PR is a quick fix, not a complete one.
Runtime filters on NestLoopJoin hit this bug even without the PROJECT_OTHER_JOIN_CONDITION rule.

For example, when the Min_Max runtime filter is enabled, the target expression is n_regionkey, but it should be "n_regionkey - 28":
explain
select n_nationkey, nrkey
from (select n_regionkey -28 nrkey, n_nationkey from nation) T
join region on nrkey > r_regionkey;

We will refactor RuntimeFilterGenerator in a following PR to solve this issue completely.
2024-05-07 07:49:35 +08:00
74029f56d4 [BugFix](TabletInvertedIndex) fix replica not found in TabletInvertedIndex (#34117)
* fix replica not found in TabletInvertedIndex
2024-05-07 07:48:13 +08:00
fe7d2b8159 [Fix](nereids) ignore slot implements SlotNotFromChildren when check the slot from children in NormalizeAggregate (#34171) 2024-05-07 07:48:05 +08:00
f90c2f6401 fix syntax error for CreateTableLikeStmt with partition properties (#34187)
fix syntax error for CreateTableLikeStmt with partition properties
2024-05-07 07:47:39 +08:00
8ee7bc430d [fix](Nereids) should derive stats asap to avoid npe (#34238)
The derive-stats job is run eagerly so that stats are not left underived when groups are merged and optimized.
Consider:
  two groups are created in order: G1 and G2
  the jobs are then, in order: derive G2, optimize group expression in G2,
    derive G1, optimize group expression in G1
  if G1 is merged into G2, the optimize-group job for G2 may be generated before G1 is derived
  in that case, stats are read from G1's child before the child's stats are derived
  and an NPE is hit in CostModel.
2024-05-07 07:47:07 +08:00
Pxl
e66dcd0e72 [Bug](materialized-view) change nvl to ifnull when create mv (#34272)
change nvl to ifnull when create mv
2024-05-07 07:45:33 +08:00
e9064d1b94 [fix](Nereids) topn should not inherit logical properties when repace child (#34282) 2024-05-07 07:44:36 +08:00
a391cf6bfe [fix](Nereids) rewritten mv should check output set and should not return null (#34288)
1. We should check the output set, since the top project is removed and
   the result output size will differ from its child's output size if there are
   duplicate slots in the result list.
2. We should not return null; return the rewritten plan itself instead,
   because the returned result is used in many places without any
   null check.
2024-05-07 07:44:16 +08:00
8db4d48731 [fix](statistics) Use column update rows to decide min/max stats are valid or not (#34263)
This is a follow-up PR to #33685.
After #33703 was merged, update rows need to be checked at the column level instead of the table level.
2024-05-07 07:41:28 +08:00
ad35968236 [Fix](Job)Job repaly logic error (#34378) 2024-05-07 07:37:14 +08:00
3fd3dfe16f [Feat](Job) Job supports task execution statistics (#34109)
* Support statistics

* - Fix Failed task not showing up in the task list
- Task metadata add jobName
- Fix Finished job clear time error
- Job metadata add successCount, failedCount, totalTaskCount

* add test
2024-05-07 07:36:54 +08:00
956ae2f83d [opt](Nereids) let behavior of function char same with legacy planner (#34415)
1. first argument must be string like literal
2. only support utf-8 charset
2024-05-07 07:34:34 +08:00
2d4da7d177 [fix](kerberos)enable hadoop auto renew tgt (#34439) 2024-05-07 00:36:20 +08:00
f7900b53ce [enhancement](function) floor/ceil/round/round_bankers can use column as scale argument (#34391) 2024-05-06 22:18:36 +08:00
b7b843d944 [fix](load) acquire latest token instead of oldest token in TokenManager (#34424)
* [fix](load) acquire latest token instead of oldest token

* fixup
2024-05-06 20:19:36 +08:00
3cb0deae9c [opt](ranger) modify and enhance the feature of ranger access controller (#34392) (#34426)
bp #34392
2024-05-06 17:08:47 +08:00
7ae5de316b [feature](Nereids) support set and use statement syntax only (#33979) (#34409)
pick from master #33979
commit id 65fb7d43b7e838c48502d4e8a69e2541dc73aa88

This PR:
1. add a new Command type, UnsupportedCommand, to handle statements that can be parsed but not executed.
2. support the syntax of the set and use statements
3. add keyword VAULT to follow legacy planner

TODO
1. support all statement syntax in Nereids
2024-05-06 11:36:01 +08:00
7248420cfd [chore](session_variable) Add 'data_queue_max_blocks' to prevent the DataQueue from occupying too much memory. (#34017) (#34395) 2024-05-05 21:20:33 +08:00
c3096cabe2 [Fix](executor)normal group not auth #34377 2024-05-02 15:17:19 +08:00
8da260ee0d [fix](hdfs)read 'fs.defaultFS' from core-site.xml for hdfs load which has no default fs (#34217) (#34372)
bp #34217
Co-authored-by: slothever <18522955+wsjz@users.noreply.github.com>
2024-05-01 00:31:49 +08:00
581e168ee1 [Fix](executor)Fix normal workload group alter may failed #34356 2024-04-30 22:17:40 +08:00
5fc1f11cf1 [improvement](hive)add the queryid to the temporary file path (#34278) (#34368)
bp #34278

Co-authored-by: wuwenchi <wuwenchihdu@hotmail.com>
2024-04-30 22:00:05 +08:00
50855f01c7 [fix](nereids) when runtimefilter target is null, skip the rf #34358 2024-04-30 18:48:50 +08:00
35f8563a75 [feature](iceberg) support iceberg equality delete (#34223) (#34327)
bp #34223

Co-authored-by: Ashin Gau <AshinGau@users.noreply.github.com>
2024-04-30 11:51:29 +08:00
d1df0b8878 [fix](mtmv)Solving the problem of calling each other in toString() loops (#34277) (#34317)
bp #34277
2024-04-29 21:46:29 +08:00
3495ed58e0 [Enhancement](jdbc catalog) Change Jdbc connection pool to hikari (#34045) (#34310) 2024-04-29 20:22:48 +08:00
7cb00a8e54 [Feature](hive-writer) Implements s3 file committer. (#34307)
Backport #33937.
2024-04-29 19:56:49 +08:00
1bfe0f0393 [feature](iceberg)support read iceberg complex type,iceberg.orc format and position delete. (#33935) (#34256)
master #33935
2024-04-29 14:40:12 +08:00