doris

Author	SHA1	Message	Date
lihangyu	913282b29b	[refactor](column) remove `get_data_type` in IColumn (#25242 )	2023-10-10 20:27:15 +08:00
zhiqqqq	62a6b132be	[Fix](func numbers) Remove backend_nums argument of numbers function (#25200 )	2023-10-10 20:25:58 +08:00
morrySnow	5f95e97c56	[fix](function) array distance should return null when result is nan (#25214 )	2023-10-10 04:41:51 -05:00
Gabriel	6ca0f3fa5f	[Bug](writer) Fix ub in async writer (#25218 )	2023-10-10 16:00:45 +08:00
Gabriel	7434f80300	[pipelineX](refactor) Refactor pending finish dependency (#25181 )	2023-10-10 11:56:02 +08:00
HappenLee	880d0d7e70	[Bug](pipeline) Support the auto partition in pipeline load (#25176 )	2023-10-10 11:51:12 +08:00
Jerry Hu	f5b826b66d	[fix](mark join) mark join column should be nullable (#24910 )	2023-10-10 10:10:36 +08:00
amory	b8621364d2	[FIX](serde)fix scale with decimalv2 in mysql writer which get real scale #25190	2023-10-10 09:09:57 +08:00
Mingyu Chen	b58010c48e	[fix](export) BufferWritable must be committed before deconstruct (#25185 ) F20231009 16:03:47.659968 3342535 string_buffer.hpp:48] Check failed: _now_offset == 0 *** Check failure stack trace: *** @ 0x561a6f8e21e6 google::LogMessage::SendToLog() @ 0x561a6f8de7b0 google::LogMessage::Flush() @ 0x561a6f8e2a29 google::LogMessageFatal::~LogMessageFatal() @ 0x561a4a409233 doris::vectorized::BufferWritable::~BufferWritable() @ 0x561a6e202853 doris::vectorized::VCSVTransformer::write() @ 0x561a6e1f19ba doris::vectorized::VFileResultWriter::_write_file() @ 0x561a6e1f1522 doris::vectorized::VFileResultWriter::append_block() @ 0x561a6e121bed The error occurs in DEBUG mode, and the export then produces invalid data. It has been covered by a Baidu case.	2023-10-09 22:39:45 +08:00
amory	53b46b7e6c	[FIX](filter) update for filter_by_select logic (#25007 ) This PR updates the filter_by_select logic and restricts the WHERE condition of DELETE statements to scalar types. Only nullable columns and predicate columns support the filter_by_select logic, because non-scalar types cannot be pushed down to the storage layer and packed into a predicate column for filtering.	2023-10-09 21:27:40 +08:00
yiguolei	4de3df6a46	[refactor](column) remove unused method and column definitions (#25152 ) Remove unused method and column definitions. Use the primitive type in the predicate column to check datev1 and datev2.	2023-10-09 17:14:35 +08:00
zhangstar333	d7b6fe57df	[Bug](java-udf) fix java-udf memory leak (#25151 )	2023-10-09 15:10:56 +08:00
Mryange	451e299151	[Opt](performance) Optimize timeround with minute / second (#25073 )	2023-10-08 23:14:23 +08:00
HappenLee	5c020be4d2	[Bug](join) corner case cause the mark join + null aware left join core dump in regression test in pipeline query engine (#25087 )	2023-10-08 22:50:12 +08:00
qiye	7e9ffad933	[fix](ES catalog)Doris cannot parse ES date field without time zone (#24864 ) 1. Add support for Doris to parse ES date fields without time zone info, e.g. `2023-04-17T23:01:18.151`. Such a time is treated as UTC, since ES assumes UTC for time fields that carry no time zone. 2. Change the local time zone conversion from the system local time zone to the session variable time zone.	2023-10-08 19:28:08 +08:00
yiguolei	b91335dbb8	[refactor](columndecimal) is_decimal_v2 member is useless because column decimal could detect by itself (#25110 ) Co-authored-by: yiguolei <yiguolei@gmail.com>	2023-10-08 18:09:19 +08:00
HHoflittlefish777	c3d9f42a3e	[fix](scanner) fix load cannot end when set exec_mem_limit (#25090 )	2023-10-08 17:07:30 +08:00
zzzzzzzs	6fe060b79e	[fix](streamload) fix http_stream retry mechanism (#24978 ) If a failure occurs, Doris may retry. Because ctx->is_read_schema is a global variable that is not reset in time, the retry may cause exceptions. --------- Co-authored-by: yiguolei <676222867@qq.com>	2023-10-08 11:16:21 +08:00
zhangstar333	feb1cbe9ed	[bug](partition_sort)partition sort need sort all data in two phase global (#24960 ) PR #24886 marked the phase in FE; this PR adds the corresponding changes in BE. Partition sort needs to sort all data in the two-phase global case.	2023-10-08 10:46:43 +08:00
zhangdong	4e8cde127c	[Enhance](catalog)add table cache in paimon jni (#25014 ) - fix get old schema after refresh paimon table - add table cache in paimon jni	2023-10-08 10:36:18 +08:00
Tiewei Fang	0df32c8e3e	[Fix](Outfile) Use data_type_serde to export data to csv file format (#24721 ) Modify the outfile logic, use the data type serde framework.	2023-10-07 22:50:44 +08:00
wangbo	7b2ff38401	query cpu hard limit based on doris scheduler (#24844 )	2023-10-07 12:03:07 +08:00
Mryange	0631ed61b0	[feature](profilev2) Preliminary support for profilev2. (#24881 ) You can set the level of counters on the backend using ADD_COUNTER_WITH_LEVEL/ADD_TIMER_WITH_LEVEL. The profile can then merge counters with level 1. set profile_level = 1; such as sql select count() from customer join item on c_customer_sk = i_item_sk profile Simple profile PLAN FRAGMENT 0 OUTPUT EXPRS: count() PARTITION: UNPARTITIONED VRESULT SINK MYSQL_PROTOCAL 7:VAGGREGATE (merge finalize) \| output: count(partial_count())[#44] \| group by: \| cardinality=1 \| TotalTime: avg 725.608us, max 725.608us, min 725.608us \| RowsReturned: 1 \| 6:VEXCHANGE offset: 0 TotalTime: avg 52.411us, max 52.411us, min 52.411us RowsReturned: 8 PLAN FRAGMENT 1 PARTITION: HASH_PARTITIONED: c_customer_sk STREAM DATA SINK EXCHANGE ID: 06 UNPARTITIONED TotalTime: avg 106.263us, max 118.38us, min 81.403us BlocksSent: 8 5:VAGGREGATE (update serialize) \| output: partial_count()[#43] \| group by: \| cardinality=1 \| TotalTime: avg 679.296us, max 739.395us, min 554.904us \| BuildTime: avg 33.198us, max 48.387us, min 28.880us \| ExecTime: avg 27.633us, max 40.278us, min 24.537us \| RowsReturned: 8 \| 4:VHASH JOIN \| join op: INNER JOIN(PARTITIONED)[] \| equal join conjunct: c_customer_sk = i_item_sk \| runtime filters: RF000[bloom] <- i_item_sk(18000/16384/1048576) \| cardinality=17,740 \| vec output tuple id: 3 \| vIntermediate tuple ids: 2 \| hash output slot ids: 22 \| RowsReturned: 18.0K (18000) \| ProbeRows: 18.0K (18000) \| ProbeTime: avg 862.308us, max 1.576ms, min 666.28us \| BuildRows: 18.0K (18000) \| BuildTime: avg 3.8ms, max 3.860ms, min 2.317ms \| \|----1:VEXCHANGE \| offset: 0 \| TotalTime: avg 48.822us, max 67.459us, min 30.380us \| RowsReturned: 18.0K (18000) \| 3:VEXCHANGE offset: 0 TotalTime: avg 33.162us, max 39.480us, min 28.854us RowsReturned: 18.0K (18000) PLAN FRAGMENT 2 PARTITION: HASH_PARTITIONED: c_customer_id STREAM DATA SINK EXCHANGE ID: 03 HASH_PARTITIONED: c_customer_sk 
TotalTime: avg 753.954us, max 1.210ms, min 499.470us BlocksSent: 64 2:VOlapScanNode TABLE: default_cluster:tpcds.customer(customer), PREAGGREGATION: ON runtime filters: RF000[bloom] -> c_customer_sk partitions=1/1, tablets=12/12, tabletList=1550745,1550747,1550749 ... cardinality=100000, avgRowSize=0.0, numNodes=1 pushAggOp=NONE TotalTime: avg 18.417us, max 41.319us, min 10.189us RowsReturned: 18.0K (18000) --------- Co-authored-by: yiguolei <676222867@qq.com>	2023-10-07 11:16:53 +08:00
HappenLee	a9d12f7b82	[Debug](float) Add clang debug tune float accuracy (#25041 )	2023-10-07 09:34:50 +08:00
Kaijie Chen	c2b46e4df7	[fix](move-memtable) exclude rpc memory in flush mem-tracker (#24722 )	2023-10-05 22:10:53 +08:00
Kaijie Chen	db6c16058a	[improve](move-memtable) always share load streams (#24763 )	2023-10-05 22:09:59 +08:00
HappenLee	93eedaff62	[opt](function) Use Dict to opt the function of time_round (#25029 ) Before: select hour_floor(`@timestamp`, 7) as t, count() as cnt from httplogs_date group by t order by t limit 10; +---------------------+--------+ \| t \| cnt \| +---------------------+--------+ \| 1998-04-30 21:00:00 \| 324 \| \| 1998-05-01 04:00:00 \| 286156 \| \| 1998-05-01 11:00:00 \| 266130 \| \| 1998-05-01 18:00:00 \| 483765 \| \| 1998-05-02 01:00:00 \| 276706 \| \| 1998-05-02 08:00:00 \| 169945 \| \| 1998-05-02 15:00:00 \| 223593 \| \| 1998-05-02 22:00:00 \| 272616 \| \| 1998-05-03 05:00:00 \| 188689 \| \| 1998-05-03 12:00:00 \| 184405 \| +---------------------+--------+ 10 rows in set (3.39 sec) After: select hour_floor(`@timestamp`, 7) as t, count() as cnt from httplogs_date group by t order by t limit 10; +---------------------+--------+ \| t \| cnt \| +---------------------+--------+ \| 1998-04-30 21:00:00 \| 324 \| \| 1998-05-01 04:00:00 \| 286156 \| \| 1998-05-01 11:00:00 \| 266130 \| \| 1998-05-01 18:00:00 \| 483765 \| \| 1998-05-02 01:00:00 \| 276706 \| \| 1998-05-02 08:00:00 \| 169945 \| \| 1998-05-02 15:00:00 \| 223593 \| \| 1998-05-02 22:00:00 \| 272616 \| \| 1998-05-03 05:00:00 \| 188689 \| \| 1998-05-03 12:00:00 \| 184405 \| +---------------------+--------+ 10 rows in set (2.19 sec)	2023-10-04 23:34:24 +08:00
amory	10f0c63896	[FIX](complex-type) fix agg table with complex type with replace state (#24873 )	2023-10-03 16:32:58 +08:00
HappenLee	f8a3034dca	[Opt](performance) refactor and opt time round floor function (#25026 )	2023-10-01 11:51:26 +08:00
bobhan1	642e5cdb69	[Fix](Status) Make `Status` `[[nodiscard]]` and handle returned `Status` correctly (#23395 )	2023-09-29 22:38:52 +08:00
Lijia Liu	864a0f9bcb	[opt](pipeline) Make pipeline fragment context send_report asynchronized (#23142 )	2023-09-28 17:55:53 +08:00
Mryange	430634367a	[pipelineX](node)support file scan operator (#24924 )	2023-09-27 22:10:43 +08:00
Chenyang Sun	68087f6c82	[fix](json function) Fix the slow performance of get_json_path when processing JSONB (#24631 ) When processing JSONB, automatically convert to jsonb_extract_string	2023-09-27 21:17:39 +08:00
Gabriel	947b116318	[pipelineX](fix) Fix BE crash due to ES scan operator (#24983 )	2023-09-27 20:45:38 +08:00
Mryange	5d138b6928	[remove](function) make execute_impl const and remove running_difference function (#24935 )	2023-09-27 18:17:28 +08:00
Pxl	5fc04b6aeb	[Improvement](hash) some refactor of process hash table probe impl (#24461 )	2023-09-27 16:14:49 +08:00
Gabriel	aa4dbbedc7	[pipelineX](bug) Fix dead lock in exchange sink operator (#24947 )	2023-09-27 15:40:25 +08:00
Ashin Gau	26818de9c8	[feature](jni) support complex types in jni framework (#24810 ) Support complex types in jni framework, and successfully run end-to-end on hudi. ### How to Use Other scanners only need to implement three interfaces in `ColumnValue`: ``` // Get array elements and append into values void unpackArray(List<ColumnValue> values); // Get map key array&value array, and append into keys&values void unpackMap(List<ColumnValue> keys, List<ColumnValue> values); // Get the struct fields specified by `structFieldIndex`, and append into values void unpackStruct(List<Integer> structFieldIndex, List<ColumnValue> values); ``` Developers can take `HudiColumnValue` as an example.	2023-09-27 14:47:41 +08:00
zhangstar333	c04e5bac39	[bug](pipelineX) fix java-udaf failed with open pipelineX (#24939 )	2023-09-27 13:14:10 +08:00
HappenLee	24ee3607e1	[Bug](pipeline) nullprt may be close the sink if init failed (#24926 )	2023-09-27 09:11:06 +08:00
meiyi	55d1090137	[feature](insert) Support group commit stream load (#24304 )	2023-09-26 20:57:02 +08:00
Tiewei Fang	28869b0f82	[fix](Outfile) Use data_type_serde to export data to orc file format (#24812 )	2023-09-26 19:46:42 +08:00
airborne12	94082ae59c	[Fix](inverted index) fix tokenize function coredump (#24896 )	2023-09-26 17:31:10 +08:00
huanghaibin	082bcd820b	[feature](insert) Support wal for group commit insert (#23053 )	2023-09-26 14:46:24 +08:00
Gabriel	a3427cb822	[pipelineX](fix) Fix nested loop join operator (#24885 )	2023-09-26 13:27:34 +08:00
zhangstar333	513e37bdbf	[pipelineX](node)support jdbc scan operator (#24851 )	2023-09-26 10:02:51 +08:00
plat1ko	8191cd1dad	[Bug](ScanNode) Fix potential incorrect query result caused by concurrent NewOlapScanNode initialization and Compaction (#24638 ) * Optimize fetch delete predicates * Fix incorrect query result when compaction eliminate delete predicates between `NewOlapScanNode::_init_scanners` and `NewOlapScanner::init` * Fix be ut	2023-09-25 22:24:35 +08:00
Gabriel	b38b8b4494	[pipelineX](fix) Fix BE crash caused by join and constant expr (#24862 )	2023-09-25 21:01:09 +08:00
Gabriel	3b4d8b4ac8	[pipelineX](feature) Support schema scan operator (#24850 )	2023-09-25 14:42:25 +08:00
wangbo	9412775686	remove useless variable in scanctx (#24849 )	2023-09-25 14:36:18 +08:00

2321 Commits