13820 Commits

Author SHA1 Message Date
2ec50dcfc7 [log](compaction) add more stats for compaction log (#24984) 2023-09-28 15:29:15 +08:00
a574f29d76 [enhancement](Nereids): use enforcer to choose the n-th plan (#22929) 2023-09-28 15:16:24 +08:00
b50c1448df [fix](Nereids) should not replace slot by Alias when do NormalizeSlot (#24928)
When we do NormalizeToSlot, we push complex expressions down and keep only
their slots. While doing this, we collect aliases and their children,
compute each child in the bottom project, and keep the result slot in the
current node. For example:

Window(max(...), c1 as a1)

after normalization, we get

Window(max(...), a1)
+-- Project(..., c1 as a1)

But in some cases we remove a SlotReference by mistake. For example:

Window(max(...), c1, c1 as a1)

after normalization, we get

Window(max(...), a1)
+-- Project(..., c1 as a1)

We lose the SlotReference c1. This PR fixes the problem. After this PR,
we get

Window(max(...), c1, a1)
+-- Project(..., c1, c1 as a1)
2023-09-28 14:51:08 +08:00
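The NormalizeToSlot behavior described in b50c1448df can be sketched roughly as follows. This is a simplified model under stated assumptions, not the Nereids implementation: `NormalizeSlotSketch`, `Output`, and `Normalized` are hypothetical names, expressions are plain strings, and the `max(...)` window function is left out. It only shows that a bare slot and an alias of the same slot both survive normalization.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified model; these are NOT the real Nereids classes.
final class NormalizeSlotSketch {

    // One output expression: a bare slot (alias == null) or "child as alias".
    static final class Output {
        final String child;
        final String alias;

        Output(String child, String alias) {
            this.child = child;
            this.alias = alias;
        }

        // Name of the slot this expression produces.
        String slotName() {
            return alias == null ? child : alias;
        }

        // How the expression appears in the bottom project.
        String projectExpr() {
            return alias == null ? child : child + " as " + alias;
        }
    }

    // Slots kept in the current node, plus the bottom project list.
    static final class Normalized {
        final List<String> currentNode;
        final List<String> bottomProject;

        Normalized(List<String> currentNode, List<String> bottomProject) {
            this.currentNode = currentNode;
            this.bottomProject = bottomProject;
        }
    }

    static Normalized normalize(List<Output> outputs) {
        Set<String> current = new LinkedHashSet<>();
        Set<String> bottom = new LinkedHashSet<>();
        for (Output o : outputs) {
            // The bottom project computes every expression, bare slots included.
            bottom.add(o.projectExpr());
            // The current node keeps only the resulting slot.
            current.add(o.slotName());
        }
        return new Normalized(new ArrayList<>(current), new ArrayList<>(bottom));
    }

    public static void main(String[] args) {
        Normalized n = normalize(Arrays.asList(new Output("c1", null), new Output("c1", "a1")));
        System.out.println("current node:   " + n.currentNode);
        System.out.println("bottom project: " + n.bottomProject);
    }
}
```

For the input `c1, c1 as a1`, the sketch keeps both `c1` and `a1` in the current node, which mirrors the plan shown after the fix.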
377554ee1c [Fix](Job)Job Task does not display error message (#24897) 2023-09-28 14:47:12 +08:00
b6babf3af4 [pipelineX](sink) support jdbc table sink (#24970)
* [pipelineX](sink) support jdbc table sink
2023-09-28 14:39:32 +08:00
e863cfe5c7 [fix](nereids) fix multi window projection issue temporarily (#24912)
The current multi-window plan generation has a problem with the projection order, for example:

+--LogicalWindow ( windowExpressions=[avg(sum_sales#115) WindowSpec(...) AS `avg_monthly_sales`#116, rank() WindowSpec(...) AS `rn`#117], ...)
and the corresponding physical plan is:

+--PhysicalWindow[6572]@16 ( windowFrameGroup=(Funcs=[avg(sum_sales#115) WindowSpec(...) AS `avg_monthly_sales`#116], ... )
    +--PhysicalWindow[6568]@29 ( windowFrameGroup=(Funcs=[rank() WindowSpec(...) AS `rn`#117], ...] )
The final plan is then generated as follows:

MultiCastDataSinks
STREAM DATA SINK
  EXCHANGE ID: 20
  HASH_PARTITIONED: rn[#208], i_brand[#202], cc_name[#203], i_category[#201]
Until the multi-window issue is properly resolved, we add a projection as follows and force a mapping. This will not cover all potential problems.

MultiCastDataSinks
STREAM DATA SINK
  EXCHANGE ID: 20
  HASH_PARTITIONED: rn[#219], i_brand[#213], cc_name[#214], i_category[#212]
  PROJECTIONS: i_category[#184], i_brand[#185], cc_name[#186], d_year[#187], d_moy[#188], sum_sales[#189], avg_monthly_sales[#191], rn[#190]
  PROJECTION TUPLE: 20
2023-09-28 14:33:00 +08:00
f5c38b29a5 [Improve](Load)Change the response label prefix of Update and Delete to the corresponding operations (#24996)
In Doris, the label prefix is insert whether the operation is an insert, delete, or update, which may confuse users.

Change
Add update and delete label prefixes

Test

mysql>  insert into t2 (id,id_str) values (2,'test2');
Query OK, 1 row affected (0.09 sec)
{'label':'insert_b16405a387f14bfa_947dc9b2217ee3df', 'status':'VISIBLE', 'txnId':'17023'}

mysql>  insert into t1 (id,id_str) values (2,'test2');
Query OK, 1 row affected (0.09 sec)
{'label':'insert_c3acdf63bf94e87_ad65a2dca88f5576', 'status':'VISIBLE', 'txnId':'17025'}

mysql> update t2 set id_str='update2';
Query OK, 2 rows affected (5.27 sec)
{'label':'update_903a88c8defe41d5_a7fca85159c84e50', 'status':'VISIBLE', 'txnId':'17026'}


mysql> delete from  t2  where id =2;
Query OK, 0 rows affected (5.56 sec)
{'label':'delete_1ca419aa-b7a2-41f6-9cbd-e14f4c7517f4', 'status':'VISIBLE', 'txnId':'17028'}

mysql> delete from t1 where t1.id in (select id from t2);
Query OK, 1 row affected (4.41 sec)
{'label':'delete_7e2ae75fee9a42b7_9322d4ae8b80a28b', 'status':'VISIBLE', 'txnId':'17034'}
2023-09-28 14:32:35 +08:00
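A minimal sketch of the label-prefix idea in f5c38b29a5, assuming the label is the lowercase operation name followed by the two 64-bit halves of a UUID in hex, as in most of the sample labels above. `LabelPrefixSketch`, `Operation`, and `newLabel` are hypothetical names; the actual FE code may build labels differently (one of the delete labels above uses a dashed UUID instead).

```java
import java.util.Locale;
import java.util.UUID;

// Hypothetical helper; not the actual Doris FE code.
final class LabelPrefixSketch {

    enum Operation { INSERT, UPDATE, DELETE }

    // Builds "<operation>_<high 64 bits in hex>_<low 64 bits in hex>".
    static String newLabel(Operation op, UUID id) {
        return op.name().toLowerCase(Locale.ROOT)
                + "_" + Long.toHexString(id.getMostSignificantBits())
                + "_" + Long.toHexString(id.getLeastSignificantBits());
    }

    public static void main(String[] args) {
        // Each call uses a fresh random UUID, so the hex parts differ per run.
        System.out.println(newLabel(Operation.UPDATE, UUID.randomUUID()));
        System.out.println(newLabel(Operation.DELETE, UUID.randomUUID()));
    }
}
```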
b35171b582 [pipelineX](bug) fix distinct streaming agg (#24995) 2023-09-28 14:01:26 +08:00
5ffb01a068 [doc](doc)Update doc: delete Z-order (#24955) 2023-09-28 13:45:48 +08:00
11d03a3ab0 [thirdyparty] new thirdy party dragonbox (#24979) 2023-09-28 13:42:44 +08:00
bf808e9aa6 [fix](Nereids): tolerate DateLike overflow in SQL CAST/CONVERT (#24943)
- explicit type cast: tolerate overflow and convert the value to NULL
- implicit type cast: throw an exception
2023-09-28 12:11:50 +08:00
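The explicit/implicit distinction in bf808e9aa6 can be illustrated with a small sketch. It assumes `java.time.LocalDate`'s range check stands in for the DateLike overflow check; `DateCastSketch` and `castToDate` are hypothetical names, not Nereids code.

```java
import java.time.DateTimeException;
import java.time.LocalDate;

// Hypothetical sketch; LocalDate's range check stands in for Doris's own.
final class DateCastSketch {

    // Explicit cast: an out-of-range date becomes null (SQL NULL).
    // Implicit cast: an out-of-range date raises an exception.
    static LocalDate castToDate(int year, int month, int day, boolean explicit) {
        try {
            return LocalDate.of(year, month, day);
        } catch (DateTimeException e) {
            if (explicit) {
                return null;
            }
            throw new IllegalArgumentException("date out of range in implicit cast", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(castToDate(2023, 9, 28, true));
        System.out.println(castToDate(2023, 13, 1, true));
    }
}
```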
f0fad61db4 [pipelineX](bug) Fix file scan operator (#24989) 2023-09-28 11:12:27 +08:00
188d9ab94e [enhancement](statistics) collect table level loaded rows on BE to make RPC light weight (#24609) 2023-09-28 10:51:50 +08:00
42207df89f [refactor](nereids)update NormalizeRepeat comments (#24893)
update NormalizeRepeat comments
2023-09-28 10:42:16 +08:00
4ff1ab7a4d [fix](regression-test) regenerate test_http_stream_properties.out file (#24946) 2023-09-28 10:39:15 +08:00
21d6f41492 [fix](regresion-test) Fix the problem of occasional failure of test_outfile_exception regression-test case (#24937) 2023-09-28 10:05:43 +08:00
584646c054 [improvement](nereids)dphyper GraphSimplifier should consider missed edges when estimating join cost (#21747) 2023-09-28 09:30:57 +08:00
430634367a [pipelineX](node)support file scan operator (#24924) 2023-09-27 22:10:43 +08:00
68087f6c82 [fix](json function) Fix the slow performance of get_json_path when processing JSONB (#24631)
When processing JSONB, automatically convert to jsonb_extract_string
2023-09-27 21:17:39 +08:00
732f821c15 [Fix](inverted index) make parser mode coarse grained by default (#24949) 2023-09-27 21:04:41 +08:00
d4e823950a [bug](json)Fix some problems of json function on Nereids (#24898)
Fix some problems with the json_length and json_contains functions on Nereids
Fix the wrong result of the json_contains function
Enable Nereids in regression test jsonb_p0
2023-09-27 21:01:45 +08:00
391a4e29eb [fix](schema) Table column order is changed if add a column and do truncate (#24981) 2023-09-27 20:59:11 +08:00
63b283a848 [fix](Nereids) init Date/DateV2Literal should check non-zero time fields (#24971) 2023-09-27 20:48:36 +08:00
947b116318 [pipelineX](fix) Fix BE crash due to ES scan operator (#24983) 2023-09-27 20:45:38 +08:00
1fb9022d07 [pipelineX](bug) Fix meta scan operator (#24963) 2023-09-27 20:34:47 +08:00
671b5f0a0a [Bug](pipeline) Fix block reusing for union source operator (#24977)
[CANCELLED][INTERNAL_ERROR]Merge block not match, self:[String], input:[String, Nullable(String), Nullable(String), Nullable(String), Nullable(String), DateV2]
2023-09-27 19:41:56 +08:00
bb7f8d18a8 [fix](nereids) push down filter through partition topn (#24944)
Support pushing down a filter through partition topn if the filter can pass through the window.
Fix a CreatePartitionTopNFromWindow bug that may unexpectedly generate two partition topn nodes.
Case:
select * from (select c2, row_number() over (partition by c2) as rn from t1) T where rn<=1 and c2 = 1;
Before this PR:
| PhysicalResultSink                       |
| --PhysicalDistribute                     |
| ----filter((rn <= 1))                    |
| ------PhysicalWindow                     |
| --------PhysicalQuickSort                |
| ----------PhysicalDistribute             |
| ------------PhysicalPartitionTopN        |
| --------------filter((T.c2 = 1))         |
| ----------------PhysicalPartitionTopN    |
| ------------------PhysicalProject        |
| --------------------PhysicalOlapScan[t1] |
+------------------------------------------+
After this PR:

| PhysicalResultSink                     |
| --PhysicalDistribute                   |
| ----filter((rn <= 1))                  |
| ------PhysicalWindow                   |
| --------PhysicalQuickSort              |
| ----------PhysicalDistribute           |
| ------------PhysicalPartitionTopN      |
| --------------PhysicalProject          |
| ----------------filter((T.c2 = 1))     |
| ------------------PhysicalOlapScan[t1] |
+----------------------------------------+
2023-09-27 19:38:04 +08:00
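As a rough illustration of the pushdown condition in bb7f8d18a8: the sketch below assumes a conjunct "can pass through the window" when every column it references is a partition key. That rule and all names (`PartitionTopnFilterSketch`, `Conjunct`, `pushable`) are assumptions made for illustration, not the actual Nereids rule.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model; not the real Nereids rule or classes.
final class PartitionTopnFilterSketch {

    // One conjunct of the filter, with the column names it references.
    static final class Conjunct {
        final String text;
        final Set<String> slots;

        Conjunct(String text, String... slots) {
            this.text = text;
            this.slots = new HashSet<>(Arrays.asList(slots));
        }
    }

    // Assumed rule: a conjunct may move below the window only if every
    // column it references is a partition key.
    static boolean canPassThroughWindow(Conjunct c, Set<String> partitionKeys) {
        return !c.slots.isEmpty() && partitionKeys.containsAll(c.slots);
    }

    // Returns the conjuncts that can be pushed below the partition topn.
    static List<String> pushable(List<Conjunct> conjuncts, Set<String> partitionKeys) {
        List<String> result = new ArrayList<>();
        for (Conjunct c : conjuncts) {
            if (canPassThroughWindow(c, partitionKeys)) {
                result.add(c.text);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Set<String> keys = new HashSet<>(Arrays.asList("c2"));
        List<Conjunct> filter = Arrays.asList(
                new Conjunct("rn <= 1", "rn"),
                new Conjunct("c2 = 1", "c2"));
        System.out.println(pushable(filter, keys)); // prints [c2 = 1]
    }
}
```

With the query from the case above, only `c2 = 1` is pushed down, while `rn <= 1` stays above the window, which matches the plan after the PR.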
00786a3295 [fix](Nereids) could not prune datev1 partition column (#24959)
Because the storage engine cannot process date comparison predicates,
we convert them to datetime comparison predicates.
However, the partition pruner cannot process cast(slot) cp literal,
so we convert back in the partition pruner to let it work.

TODO:
Move the date-to-datetime conversion to the translate stage
and only convert predicates for the storage engine.
2023-09-27 18:41:56 +08:00
a6bc0e7668 Revert "[chore](clang-tidy) Apply uninitialized variables check of clang-tidy (#23497)" (#24976) 2023-09-27 18:28:24 +08:00
5d138b6928 [remove](function) make execute_impl const and remove running_difference function (#24935) 2023-09-27 18:17:28 +08:00
100d76510c [Fix](HttpServer) Refactor API Endpoints to Only Allow GET Requests for Enhanced Security (#24855) 2023-09-27 17:10:11 +08:00
00e8d1c3b4 [Fix](Planner) disable bitmap type in compare expression (#24792)
Problem:
BE cores because of bitmap calculation.

Reason:
When a BE check fails, it cores directly.

Example:
SELECT id_bitmap FROM test_bitmap WHERE id_bitmap IN (NULL) LIMIT 20;

Solution:
Forbid this kind of expression in FE during analysis, and also forbid bitmap type comparison in other unsupported expressions.
2023-09-27 16:57:06 +08:00
0227292c85 [bug](profile) query profile api of fe cann't get result if non-root user query on the other fe #24858 (#24914)
Issue Number: #24858

If isAllNode is true, the API should only distribute the query to all FEs and should not run checkAuthByUserAndQueryId.
If isAllNode is false, the API queries the profile on the FE, and in this case it should run checkAuthByUserAndQueryId.
2023-09-27 16:50:41 +08:00
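The branching described in 0227292c85 might look like the sketch below. Only `isAllNode` and `checkAuthByUserAndQueryId` come from the commit message; `ProfileApiSketch`, `AuthChecker`, `queryProfile`, and the returned strings are hypothetical placeholders for the real HTTP handler.

```java
// Hypothetical sketch; names other than isAllNode and
// checkAuthByUserAndQueryId are illustrative assumptions.
final class ProfileApiSketch {

    interface AuthChecker {
        // Expected to throw if the user may not read this query's profile.
        void checkAuthByUserAndQueryId(String queryId);
    }

    static String queryProfile(boolean isAllNode, String queryId, AuthChecker auth) {
        if (isAllNode) {
            // Only distribute the request to all FEs; no auth check here.
            return "DISTRIBUTED_TO_ALL_FE";
        }
        // Serving the profile from this FE: check auth first.
        auth.checkAuthByUserAndQueryId(queryId);
        return "LOCAL_PROFILE";
    }

    public static void main(String[] args) {
        AuthChecker allowAll = queryId -> { };
        System.out.println(queryProfile(true, "q1", allowAll));
        System.out.println(queryProfile(false, "q1", allowAll));
    }
}
```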
c04078f3b8 [improvement](compaction) output tablet_id when be core dumped. (#24952) 2023-09-27 16:50:18 +08:00
19cff5d167 [fix](compile) failed on arm platform, with clang compiler and pch on (#24636)
failed on arm platform, with clang compiler and pch on
2023-09-27 16:47:02 +08:00
9562e280af [enhancement](Nereids): remove stats derivation in CostAndEnforce job (#24945)
1. remove stats derivation in CostAndEnforce job
2. enforce validity of each stats after estimating
2023-09-27 16:31:03 +08:00
Pxl
5fc04b6aeb [Improvement](hash) some refactor of process hash table probe impl (#24461)
some refactor of process hash table probe impl
2023-09-27 16:14:49 +08:00
83f5ff7b22 [typo](doc)modify error result of explode_split function. (#24185) 2023-09-27 03:04:18 -05:00
41dfbdac14 [typo](doc)modify the wrong parameter in the example. (#24307) 2023-09-27 03:04:07 -05:00
53025ce3fc [typo](doc)modify error description of GROUP_CONCAT fiction. (#24619) 2023-09-27 03:03:56 -05:00
a6d1da0db9 [typo](doc)modify error link description in fe-config-template (#24676) 2023-09-27 02:51:47 -05:00
aa4dbbedc7 [pipelineX](bug) Fix dead lock in exchange sink operator (#24947) 2023-09-27 15:40:25 +08:00
87a30dc41d [feature-wip](arrow-flight)(step3) Support authentication and user session (#24772) 2023-09-27 14:53:58 +08:00
26818de9c8 [feature](jni) support complex types in jni framework (#24810)
Support complex types in jni framework, and successfully run end-to-end on hudi.
### How to Use
Other scanners only need to implement three interfaces in `ColumnValue`:
```
// Get array elements and append into values
void unpackArray(List<ColumnValue> values);

// Get map key array&value array, and append into keys&values
void unpackMap(List<ColumnValue> keys, List<ColumnValue> values);

// Get the struct fields specified by `structFieldIndex`, and append into values
void unpackStruct(List<Integer> structFieldIndex, List<ColumnValue> values);
```
Developers can take `HudiColumnValue` as an example.
2023-09-27 14:47:41 +08:00
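To make the three `ColumnValue` methods from 26818de9c8 concrete, here is a minimal in-memory sketch. The `ColumnValue` interface below is a stand-in that declares only the three methods quoted in the commit message (the real interface has more), and `InMemoryColumnValue` is a hypothetical implementation backed by plain Java collections, not `HudiColumnValue`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Stand-in for the JNI framework's ColumnValue: only the three methods
// quoted in the commit message are declared here.
interface ColumnValue {
    void unpackArray(List<ColumnValue> values);

    void unpackMap(List<ColumnValue> keys, List<ColumnValue> values);

    void unpackStruct(List<Integer> structFieldIndex, List<ColumnValue> values);
}

// Hypothetical scanner-side value backed by plain Java objects:
// a List for arrays and structs, a Map for maps.
final class InMemoryColumnValue implements ColumnValue {
    final Object data;

    InMemoryColumnValue(Object data) {
        this.data = data;
    }

    @Override
    public void unpackArray(List<ColumnValue> values) {
        for (Object element : (List<?>) data) {
            values.add(new InMemoryColumnValue(element));
        }
    }

    @Override
    public void unpackMap(List<ColumnValue> keys, List<ColumnValue> values) {
        for (Map.Entry<?, ?> entry : ((Map<?, ?>) data).entrySet()) {
            keys.add(new InMemoryColumnValue(entry.getKey()));
            values.add(new InMemoryColumnValue(entry.getValue()));
        }
    }

    @Override
    public void unpackStruct(List<Integer> structFieldIndex, List<ColumnValue> values) {
        List<?> fields = (List<?>) data;
        // Append only the requested fields, in the requested order.
        for (int index : structFieldIndex) {
            values.add(new InMemoryColumnValue(fields.get(index)));
        }
    }

    public static void main(String[] args) {
        List<ColumnValue> out = new ArrayList<>();
        new InMemoryColumnValue(Arrays.asList("a", "b", "c")).unpackStruct(Arrays.asList(2, 0), out);
        System.out.println(out.size());
    }
}
```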
a1ab8f96a1 [fix](nereids) mark two phase partition topn global to notice be passthrough logic (#24886)
Mark the partition topn phase to notify BE so that it handles the passthrough logic correctly. This PR is the FE part of the change.
BE-side logic: if the phase equals PTopNPhase.TWO_PAHSE_GLOBAL, it should skip the bypass logic and do the second-phase ptopn operation anyway.
2023-09-27 14:08:59 +08:00
1b0e3246ea [pipelineX](fix) Fix exception reporting and Nereids plan (#24936) 2023-09-27 13:15:40 +08:00
c04e5bac39 [bug](pipelineX) fix java-udaf failed with open pipelineX (#24939) 2023-09-27 13:14:10 +08:00
452318a9fc [Enhancement](streamload) stream tvf support user specified label (#24219)
The stream TVF supports a user-specified label.
Example:

curl -v --location-trusted -u root: -H "sql: insert into test.t1 WITH LABEL label1 select c1,c2 from http_stream(\"format\" = \"CSV\", \"column_separator\" = \",\")" -T example.csv http://127.0.0.1:8030/api/_http_stream
Returns:

{
    "TxnId": 2064,
    "Label": "label1",
    "Comment": "",
    "TwoPhaseCommit": "false",
    "Status": "Success",
    "Message": "OK",
    "NumberTotalRows": 2,
    "NumberLoadedRows": 2,
    "NumberFilteredRows": 0,
    "NumberUnselectedRows": 0,
    "LoadBytes": 27,
    "LoadTimeMs": 152,
    "BeginTxnTimeMs": 0,
    "StreamLoadPutTimeMs": 83,
    "ReadDataTimeMs": 92,
    "WriteDataTimeMs": 41,
    "CommitAndPublishTimeMs": 24
}
2023-09-27 12:09:35 +08:00
Pxl
18b5f70a7c [Bug](materialized-view) enable rewrite on select materialized index with aggregate mode (#24691)
enable rewrite on select materialized index with aggregate mode
2023-09-27 11:30:36 +08:00
6b64b7fec7 [typo](doc)Add flink to read the doris table and use doris.filter.query to configure the display (#24736) 2023-09-27 11:22:58 +08:00