6fe060b79e
[fix](streamload) fix http_stream retry mechanism ( #24978 )
...
If a failure occurs, Doris may retry. Because `ctx->is_read_schema` is a long-lived variable that was not reset in a timely manner, the retry could hit exceptions.
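The stale-state pattern behind this fix can be sketched as below; all names here (`LoadContext`, `firstAttempt`, `handleWithRetry`) are illustrative stand-ins, not Doris's actual http_stream code:

```java
// Hypothetical sketch: request-scoped state kept on a long-lived context
// must be cleared before a retry, or the second attempt observes stale state.
public class RetrySketch {
    static class LoadContext {
        boolean isReadSchema = false; // stand-in for ctx->is_read_schema
    }

    // First attempt sets the flag; a failure would leave it dangling.
    static boolean firstAttempt(LoadContext ctx) {
        ctx.isReadSchema = true;
        return false; // simulate a failure that triggers a retry
    }

    // The fix: reset request-scoped state before retrying.
    static void handleWithRetry(LoadContext ctx) {
        if (!firstAttempt(ctx)) {
            ctx.isReadSchema = false; // clear stale flag, then retry
            // ... retry the load without schema reading ...
        }
    }

    public static void main(String[] args) {
        LoadContext ctx = new LoadContext();
        handleWithRetry(ctx);
        assert !ctx.isReadSchema;
    }
}
```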
---------
Co-authored-by: yiguolei <676222867@qq.com >
2023-10-08 11:16:21 +08:00
feb1cbe9ed
[bug](partition_sort)partition sort need sort all data in two phase global ( #24960 )
...
#24886 marked the phase in FE; this PR adds those changes in BE.
Partition sort needs to sort all data in the two-phase global mode.
2023-10-08 10:46:43 +08:00
4e8cde127c
[Enhance](catalog)add table cache in paimon jni ( #25014 )
...
- fix get old schema after refresh paimon table
- add table cache in paimon jni
2023-10-08 10:36:18 +08:00
239df5860b
[enhancement](tablet_meta_lock) add more trace for write lock of tablet's _meta_lock ( #25095 )
2023-10-08 10:28:10 +08:00
f66708db0e
[log](load) PUBLISH_TIMEOUT should not print stacktrace ( #25080 )
2023-10-08 10:16:25 +08:00
0df32c8e3e
[Fix](Outfile) Use data_type_serde to export data to csv file format ( #24721 )
...
Modify the outfile logic, use the data type serde framework.
2023-10-07 22:50:44 +08:00
8953179c11
[fix](multi-table) fix multi table task cannot end ( #25056 )
...
When executing a multi-table task, it could not end if the exec-plan step errored, which prevented other routine load tasks from being submitted.
2023-10-07 19:45:42 +08:00
59261174d5
[chore](unused) Remove unused variable CPU_HARD_LIMIT in task_group.cc ( #25076 )
...
Signed-off-by: Jack Drogon <jack.xsuperman@gmail.com >
2023-10-07 03:36:13 -05:00
335804bb25
[fix](pipelinex) fix multi cast sink without init ( #25066 )
2023-10-07 15:49:03 +08:00
7b2ff38401
query cpu hard limit based on doris scheduler ( #24844 )
2023-10-07 12:03:07 +08:00
0631ed61b0
[feature](profilev2) Preliminary support for profilev2. ( #24881 )
...
You can set the level of counters on the backend using ADD_COUNTER_WITH_LEVEL/ADD_TIMER_WITH_LEVEL. The profile can then merge counters at level 1.
set profile_level = 1;
For example:
select count(*) from customer join item on c_customer_sk = i_item_sk
Simple profile:
PLAN FRAGMENT 0
OUTPUT EXPRS:
count(*)
PARTITION: UNPARTITIONED
VRESULT SINK
MYSQL_PROTOCAL
7:VAGGREGATE (merge finalize)
| output: count(partial_count(*))[#44 ]
| group by:
| cardinality=1
| TotalTime: avg 725.608us, max 725.608us, min 725.608us
| RowsReturned: 1
|
6:VEXCHANGE
offset: 0
TotalTime: avg 52.411us, max 52.411us, min 52.411us
RowsReturned: 8
PLAN FRAGMENT 1
PARTITION: HASH_PARTITIONED: c_customer_sk
STREAM DATA SINK
EXCHANGE ID: 06
UNPARTITIONED
TotalTime: avg 106.263us, max 118.38us, min 81.403us
BlocksSent: 8
5:VAGGREGATE (update serialize)
| output: partial_count(*)[#43 ]
| group by:
| cardinality=1
| TotalTime: avg 679.296us, max 739.395us, min 554.904us
| BuildTime: avg 33.198us, max 48.387us, min 28.880us
| ExecTime: avg 27.633us, max 40.278us, min 24.537us
| RowsReturned: 8
|
4:VHASH JOIN
| join op: INNER JOIN(PARTITIONED)[]
| equal join conjunct: c_customer_sk = i_item_sk
| runtime filters: RF000[bloom] <- i_item_sk(18000/16384/1048576)
| cardinality=17,740
| vec output tuple id: 3
| vIntermediate tuple ids: 2
| hash output slot ids: 22
| RowsReturned: 18.0K (18000)
| ProbeRows: 18.0K (18000)
| ProbeTime: avg 862.308us, max 1.576ms, min 666.28us
| BuildRows: 18.0K (18000)
| BuildTime: avg 3.8ms, max 3.860ms, min 2.317ms
|
|----1:VEXCHANGE
| offset: 0
| TotalTime: avg 48.822us, max 67.459us, min 30.380us
| RowsReturned: 18.0K (18000)
|
3:VEXCHANGE
offset: 0
TotalTime: avg 33.162us, max 39.480us, min 28.854us
RowsReturned: 18.0K (18000)
PLAN FRAGMENT 2
PARTITION: HASH_PARTITIONED: c_customer_id
STREAM DATA SINK
EXCHANGE ID: 03
HASH_PARTITIONED: c_customer_sk
TotalTime: avg 753.954us, max 1.210ms, min 499.470us
BlocksSent: 64
2:VOlapScanNode
TABLE: default_cluster:tpcds.customer(customer), PREAGGREGATION: ON
runtime filters: RF000[bloom] -> c_customer_sk
partitions=1/1, tablets=12/12, tabletList=1550745,1550747,1550749 ...
cardinality=100000, avgRowSize=0.0, numNodes=1
pushAggOp=NONE
TotalTime: avg 18.417us, max 41.319us, min 10.189us
RowsReturned: 18.0K (18000)
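The level-gated counter idea above can be sketched as follows; `Counter` and `simpleProfile` are hypothetical names for this sketch, not the actual RuntimeProfile API:

```java
import java.util.ArrayList;
import java.util.List;

// Hedged sketch of level-gated profile counters: only counters whose level
// is at or below the requested profile_level appear in the simple profile.
public class ProfileLevelSketch {
    static class Counter {
        final String name; final int level; final long value;
        Counter(String name, int level, long value) {
            this.name = name; this.level = level; this.value = value;
        }
    }

    // Keep only counters visible at the given profile level.
    static List<Counter> simpleProfile(List<Counter> all, int profileLevel) {
        List<Counter> out = new ArrayList<>();
        for (Counter c : all) {
            if (c.level <= profileLevel) out.add(c);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Counter> all = new ArrayList<>();
        all.add(new Counter("RowsReturned", 1, 18000L)); // level-1: kept
        all.add(new Counter("MemoryUsage", 2, 4096L));   // level-2: dropped
        List<Counter> shown = simpleProfile(all, 1);
        assert shown.size() == 1 && shown.get(0).name.equals("RowsReturned");
    }
}
```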
---------
Co-authored-by: yiguolei <676222867@qq.com >
2023-10-07 11:16:53 +08:00
83a9d07288
[refactor](segment iterator) remove some code to make the logic more clear ( #25050 )
...
Co-authored-by: yiguolei <yiguolei@gmail.com >
2023-10-07 11:14:28 +08:00
bd582aee75
[pipelineX](minor) refine code ( #25015 )
2023-10-07 10:45:33 +08:00
a9d12f7b82
[Debug](float) Add clang debug tune float accuracy ( #25041 )
2023-10-07 09:34:50 +08:00
c2b46e4df7
[fix](move-memtable) exclude rpc memory in flush mem-tracker ( #24722 )
2023-10-05 22:10:53 +08:00
db6c16058a
[improve](move-memtable) always share load streams ( #24763 )
2023-10-05 22:09:59 +08:00
93eedaff62
[opt](function) Use Dict to opt the function of time_round ( #25029 )
...
Before:
select hour_floor(`@timestamp`, 7) as t, count() as cnt from httplogs_date group by t order by t limit 10;
+---------------------+--------+
| t | cnt |
+---------------------+--------+
| 1998-04-30 21:00:00 | 324 |
| 1998-05-01 04:00:00 | 286156 |
| 1998-05-01 11:00:00 | 266130 |
| 1998-05-01 18:00:00 | 483765 |
| 1998-05-02 01:00:00 | 276706 |
| 1998-05-02 08:00:00 | 169945 |
| 1998-05-02 15:00:00 | 223593 |
| 1998-05-02 22:00:00 | 272616 |
| 1998-05-03 05:00:00 | 188689 |
| 1998-05-03 12:00:00 | 184405 |
+---------------------+--------+
10 rows in set (3.39 sec)
after:
select hour_floor(`@timestamp`, 7) as t, count() as cnt from httplogs_date group by t order by t limit 10;
+---------------------+--------+
| t | cnt |
+---------------------+--------+
| 1998-04-30 21:00:00 | 324 |
| 1998-05-01 04:00:00 | 286156 |
| 1998-05-01 11:00:00 | 266130 |
| 1998-05-01 18:00:00 | 483765 |
| 1998-05-02 01:00:00 | 276706 |
| 1998-05-02 08:00:00 | 169945 |
| 1998-05-02 15:00:00 | 223593 |
| 1998-05-02 22:00:00 | 272616 |
| 1998-05-03 05:00:00 | 188689 |
| 1998-05-03 12:00:00 | 184405 |
+---------------------+--------+
10 rows in set (2.19 sec)
2023-10-04 23:34:24 +08:00
4ce5213b1c
[fix](insert) Fix test_group_commit_stream_load and add more regression in test_group_commit_http_stream ( #24954 )
2023-10-03 20:56:24 +08:00
6e836fe381
[fix](jdbc catalog) fix jdbc catalog read bitmap data crash ( #25034 )
2023-10-03 20:52:47 +08:00
10f0c63896
[FIX](complex-type) fix agg table with complex type with replace state ( #24873 )
...
fix agg table with complex type with replace state
2023-10-03 16:32:58 +08:00
f8a3034dca
[Opt](performance) refactor and opt time round floor function ( #25026 )
...
refactor and opt time round floor function
2023-10-01 11:51:26 +08:00
642e5cdb69
[Fix](Status) Make Status [[nodiscard]] and handle returned Status correctly ( #23395 )
2023-09-29 22:38:52 +08:00
d23bedf170
[fix](single-replica-load) fix duplicated done run in request_slave_tablet_pull_rowset ( #25013 )
...
BE crashed because `done` ran twice when try_offer() failed in
request_slave_tablet_pull_rowset.
Signed-off-by: freemandealer <freeman.zhang1992@gmail.com >
2023-09-28 21:08:18 +08:00
864a0f9bcb
[opt](pipeline) Make pipeline fragment context send_report asynchronous ( #23142 )
2023-09-28 17:55:53 +08:00
2ec50dcfc7
[log](compaction) add more stats for compaction log ( #24984 )
2023-09-28 15:29:15 +08:00
b6babf3af4
[pipelineX](sink) support jdbc table sink ( #24970 )
...
* [pipelineX](sink) support jdbc table sink
2023-09-28 14:39:32 +08:00
b35171b582
[pipelineX](bug) fix distinct streaming agg ( #24995 )
2023-09-28 14:01:26 +08:00
f0fad61db4
[pipelineX](bug) Fix file scan operator ( #24989 )
2023-09-28 11:12:27 +08:00
188d9ab94e
[enhancement](statistics) collect table level loaded rows on BE to make RPC light weight ( #24609 )
2023-09-28 10:51:50 +08:00
430634367a
[pipelineX](node)support file scan operator ( #24924 )
2023-09-27 22:10:43 +08:00
68087f6c82
[fix](json function) Fix the slow performance of get_json_path when processing JSONB ( #24631 )
...
When processing JSONB, the call is automatically converted to jsonb_extract_string
2023-09-27 21:17:39 +08:00
d4e823950a
[bug](json)Fix some problems of json function on Nereids ( #24898 )
...
- Fix some problems of the json_length and json_contains functions on Nereids
- Fix wrong result of the json_contains function
- Enable Nereids in the jsonb_p0 regression test
2023-09-27 21:01:45 +08:00
947b116318
[pipelineX](fix) Fix BE crash due to ES scan operator ( #24983 )
2023-09-27 20:45:38 +08:00
1fb9022d07
[pipelineX](bug) Fix meta scan operator ( #24963 )
2023-09-27 20:34:47 +08:00
671b5f0a0a
[Bug](pipeline) Fix block reusing for union source operator ( #24977 )
...
[CANCELLED][INTERNAL_ERROR]Merge block not match, self:[String], input:[String, Nullable(String), Nullable(String), Nullable(String), Nullable(String), DateV2]
2023-09-27 19:41:56 +08:00
5d138b6928
[remove](function) make execute_impl const and remove running_difference function ( #24935 )
2023-09-27 18:17:28 +08:00
c04078f3b8
[improvement](compaction) output tablet_id when be core dumped. ( #24952 )
2023-09-27 16:50:18 +08:00
19cff5d167
[fix](compile) failed on arm platform, with clang compiler and pch on ( #24636 )
...
failed on arm platform, with clang compiler and pch on
2023-09-27 16:47:02 +08:00
5fc04b6aeb
[Improvement](hash) some refactor of process hash table probe impl ( #24461 )
...
some refactor of process hash table probe impl
2023-09-27 16:14:49 +08:00
aa4dbbedc7
[pipelineX](bug) Fix dead lock in exchange sink operator ( #24947 )
2023-09-27 15:40:25 +08:00
87a30dc41d
[feature-wip](arrow-flight)(step3) Support authentication and user session ( #24772 )
2023-09-27 14:53:58 +08:00
26818de9c8
[feature](jni) support complex types in jni framework ( #24810 )
...
Support complex types in the jni framework, and successfully run end-to-end on Hudi.
### How to Use
Other scanners only need to implement three interfaces in `ColumnValue`:
```
// Get array elements and append into values
void unpackArray(List<ColumnValue> values);
// Get map key array&value array, and append into keys&values
void unpackMap(List<ColumnValue> keys, List<ColumnValue> values);
// Get the struct fields specified by `structFieldIndex`, and append into values
void unpackStruct(List<Integer> structFieldIndex, List<ColumnValue> values);
```
Developers can take `HudiColumnValue` as an example.
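As a minimal sketch of how a scanner might satisfy one of these interfaces (the trimmed-down `ColumnValue`, `ArrayValue`, and `IntValue` below are invented for illustration; real scanners such as `HudiColumnValue` wrap their own storage):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class JniComplexTypeSketch {
    // Trimmed-down stand-in for the ColumnValue interface described above.
    interface ColumnValue {
        void unpackArray(List<ColumnValue> values);
    }

    // Hypothetical leaf value; scalars have no array to unpack.
    static class IntValue implements ColumnValue {
        final int v;
        IntValue(int v) { this.v = v; }
        public void unpackArray(List<ColumnValue> values) {
            throw new UnsupportedOperationException("not an array");
        }
    }

    // Hypothetical array value: unpack appends each element in order.
    static class ArrayValue implements ColumnValue {
        final List<ColumnValue> elements;
        ArrayValue(List<ColumnValue> elements) { this.elements = elements; }
        public void unpackArray(List<ColumnValue> values) {
            values.addAll(elements); // append array elements into values
        }
    }

    public static void main(String[] args) {
        ColumnValue arr = new ArrayValue(
                Arrays.asList(new IntValue(1), new IntValue(2)));
        List<ColumnValue> out = new ArrayList<>();
        arr.unpackArray(out);
        assert out.size() == 2;
    }
}
```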
2023-09-27 14:47:41 +08:00
1b0e3246ea
[pipelineX](fix) Fix exception reporting and Nereids plan ( #24936 )
2023-09-27 13:15:40 +08:00
c04e5bac39
[bug](pipelineX) fix java-udaf failed with open pipelineX ( #24939 )
2023-09-27 13:14:10 +08:00
452318a9fc
[Enhancement](streamload) stream tvf support user specified label ( #24219 )
...
stream tvf support user specified label
example:
curl -v --location-trusted -u root: -H "sql: insert into test.t1 WITH LABEL label1 select c1,c2 from http_stream(\"format\" = \"CSV\", \"column_separator\" = \",\")" -T example.csv http://127.0.0.1:8030/api/_http_stream
return:
{
"TxnId": 2064,
"Label": "label1",
"Comment": "",
"TwoPhaseCommit": "false",
"Status": "Success",
"Message": "OK",
"NumberTotalRows": 2,
"NumberLoadedRows": 2,
"NumberFilteredRows": 0,
"NumberUnselectedRows": 0,
"LoadBytes": 27,
"LoadTimeMs": 152,
"BeginTxnTimeMs": 0,
"StreamLoadPutTimeMs": 83,
"ReadDataTimeMs": 92,
"WriteDataTimeMs": 41,
"CommitAndPublishTimeMs": 24
}
2023-09-27 12:09:35 +08:00
24ee3607e1
[Bug](pipeline) nullptr may close the sink if init failed ( #24926 )
2023-09-27 09:11:06 +08:00
a689a2fbb1
[pipelineX](fix) Fix projection expression ( #24923 )
2023-09-26 21:48:28 +08:00
55d1090137
[feature](insert) Support group commit stream load ( #24304 )
2023-09-26 20:57:02 +08:00
fe2879d8fe
[fix](merge-on-write) MergeIndexDeleteBitmapCalculator stack overflow ( #24913 )
2023-09-26 20:32:23 +08:00
77e864df12
[enhancement](delete) use column id in delete push task instead of column name ( #24549 )
2023-09-26 19:54:55 +08:00