The performance of ClickBench Q30 is affected by batch_size:
| batch_size | 1024 | 4096 | 20480 |
| -- | -- | -- | -- |
| Q30 query time (s) | 2.27 | 1.08 | 0.62 |
The aggregation operator creates a new result block for each input batch, and Q30 has 90 columns, so creating these blocks is time-consuming. A larger batch_size reduces the number of aggregation blocks and therefore improves performance.
The Doris internal reader reads at least 4064 rows even if batch_size < 4064, so this PR makes reading an external table follow the same process as reading an internal table.
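As a usage note, `batch_size` is a session variable in Doris, so the behavior above can be tuned per session; a minimal sketch (the query to re-run is whatever workload you are measuring):

```sql
-- Enlarge the batch size for the current session; larger batches mean
-- fewer result blocks created by the aggregation operator.
SET batch_size = 20480;

-- Re-run the affected query (e.g. ClickBench Q30) to observe the effect.
```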
Fix the following error when reading a parquet file:
```
ERROR 1105 (HY000): errCode = 2, detailMessage = [INTERNAL_ERROR]Read parquet file xxx failed, reason = [CORRUPTION]The number of rows are not equal among parquet columns
```
This error may be thrown when reading non-predicate columns during lazy read. For example:
A row group with 1000 rows has two non-predicate columns.
Column A has one page; column B has two pages with 500 rows each.
The read range of `ParquetColumnReader` is [0, 400), and all rows in [0, 450) are filtered out by the predicate columns.
Column A can therefore skip its first page and reach EOF, while column B also skips its first page but does not reach EOF, so the two columns report different row counts and trigger the error above.
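For context, lazy materialization applies when a query filters on some columns and merely reads others. A hedged sketch of a query shape that exercises this path (catalog, table, and column names are hypothetical):

```sql
-- `a` is a predicate column: it is read eagerly to evaluate the filter.
-- `b` is a non-predicate column: it is read lazily, only for surviving rows.
-- If the filter discards every row covered by one of b's pages, the
-- page-skip bookkeeping described above must stay consistent across columns.
SELECT b
FROM hive_catalog.db.parquet_table
WHERE a > 100;
```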
Support setting the number of lines to skip for stream load when loading a CSV file.
Usage (`-H skip_lines:<number>`):
```
curl --location-trusted -u root: -T test.csv -H skip_lines:5 -XPUT http://127.0.0.1:8030/api/testDb/testTbl/_stream_load
```
The number of lines to skip can also be specified in MySQL load via `IGNORE n LINES`, as below:
```sql
LOAD DATA
LOCAL
INFILE '${mysql_load_skip_lines}'
INTO TABLE ${tableName}
COLUMNS TERMINATED BY ','
IGNORE 2 LINES
PROPERTIES ("auth" = "root:");
```
Temporarily forbid `to_quantile_state` to avoid a core dump, until [Feature] support QuantileState in vectorized engine #15868 gets the implementation rolling.
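For reference, a call of the following shape is now expected to be rejected with an error instead of crashing the BE (the table and column names are hypothetical, and the two-argument form is an assumption based on the documented `TO_QUANTILE_STATE` signature):

```sql
-- Temporarily forbidden: previously this could core dump the BE under
-- the vectorized engine; it should error out until #15868 lands.
SELECT to_quantile_state(score, 2048) FROM example_tbl;
```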
BufferControlBlock may block all fragment handling threads, leaving them unable to do any work.
Modifications include:
1. BufferControlBlock is cancelled after the max timeout.
2. StmtExecutor notifies the BE to cancel the fragment when an unexpected error occurs.
See issue #16203 for more details.
Column names of external HMS catalog tables are all lower case in Doris, while an Iceberg table, or a Hive table created by Spark SQL, may contain upper-case column names, which caused queries to return empty results. This PR fixes that bug (see the illustration after this list):
1. For parquet files, convert all column names to lower case when parsing the parquet metadata.
2. For ORC files, store the original column names and the lower-case column names in two vectors, and use whichever form each code path needs.
3. On the FE side, map column names back to the original Iceberg names when doing `convertToIcebergExpr`.
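A hedged illustration of the bug (catalog, table, and column names are hypothetical): suppose an Iceberg table was created by Spark SQL with a mixed-case column `UserId`; Doris exposes it in lower case:

```sql
-- Doris stores external column names in lower case, so the column
-- created as `UserId` is queried as `userid`.
SELECT userid
FROM iceberg_catalog.db.events
WHERE userid > 0;
-- Before this fix, the lower-case name failed to match the file schema's
-- `UserId` (and the predicate converted by convertToIcebergExpr kept the
-- wrong case), so the query returned an empty result.
```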
Remove json functions code.
Remove string functions code.
Remove math functions code.
Move `MatchPredicate` to olap, since it is only used in storage predicate processing.
Remove some code in Tuple; the Tuple structure should be removed entirely in the future.
Remove a large amount of unused code in the collection value structure.
PR #15836 changed the way the parquet reader is used: first call `open()`, then `init_reader()`. However, we forgot to call `open()` for Iceberg delete files, which caused a coredump; this PR adds the missing call.