doris

Author	SHA1	Message	Date
starocean999	4cbb6ece10	[fix](fe)ordering exprs should be substituted in the same way as select part (#20091 )	2023-05-27 21:00:57 +08:00
airborne12	ac8599fedb	[Fix](single replica load) fix indices_size key not found core (#20047 )	2023-05-27 13:28:07 +08:00
Yanko	f54a068d82	[feature](function) add json->operator convert to json_extract (#19899 )	2023-05-27 12:45:45 +08:00
lihangyu	f3d8af330a	[Bug](point query) check point query before check two phase read (#20055 ) * [Bug](point query) checkAndSetPointQuery before checkEnableTwoPhaseRead 1. checkEnableTwoPhaseRead rely on thr short circuit flag 2. add more metric to display lookup profile * fix rebase	2023-05-27 12:38:58 +08:00
Bin	b12250f9e8	[typo](docs)Data partition document correction. (#20103 ) * correct the wrongly conveyed meaning. * delete the item which should not be there anymore.	2023-05-27 12:37:50 +08:00
Adonis Ling	16c46974c5	[chore](build) Fix compilation errors reported by GCC-13 (#20114 )	2023-05-27 08:25:52 +08:00
HappenLee	9539bbf8ae	Revert "[test](executor)add crud regression test for resource group (#19659 )" (#20121 ) This reverts commit 8b9813663d87afa7b359b31782f3864dc54881df.	2023-05-27 08:25:00 +08:00
zhangdong	51ca645c3f	[fix](mtmv)Fix tablet not found when restart fe (#20095 ) The replayCreateTable restriction must be olapTable. If mv is used, nothing will be done, resulting in no call to invertedIndex.addReplica	2023-05-27 08:20:06 +08:00
lihangyu	23c95d15da	[regression-test](sort) Fix unstable sorting (#20125 )	2023-05-26 23:42:05 +08:00
Jack Drogon	93933308e6	[Feature-WIP](CCR): Add ccr doris interface (WIP) (#17881 )	2023-05-26 23:40:49 +08:00
HappenLee	e5b0d7a5cd	[CTE](eof) Support cte reuse reduce counter by eof status and pipeline task mem can release (#20056 )	2023-05-26 22:03:29 +08:00
plat1ko	3c6227a900	[fix](filesystem) Fix core caused by using moved variable in batch_delete_impl #20033	2023-05-26 21:39:27 +08:00
Qi Chen	860e28a3a3	[Fix](multi-catalog) Fix db name is not lower case when jdbc catalog configuration `lower_case_table_names` is `true`. (#20021 ) Fix db name is not lower case when jdbc catalog configuration lower_case_table_names is true. Fix regression-test test_oracle_jdbc_catalog.	2023-05-26 21:35:38 +08:00
Gabriel	23ad72e734	[Bug](runtime filter) Fix min/max filter for decimalv3 (#20005 )	2023-05-26 21:35:21 +08:00
Qi Chen	cb4a57f44f	[Opt](orc-reader) Support merge small IO facility in orc reader. (#20092 ) #18976 introduced merge small IO facility to optimize performance, and used by parquet reader. This PR support this facility in orc reader. Current ORC reader implementation need to reposition parent present stream when reading lazy columns in lazy materialization facility. So let it works by removing `DCHECK_GE(offset, cached_data.end_offset)`.	2023-05-26 21:06:12 +08:00
zclllyybb	346c51faa2	[fix](expr) Make VExprContext exit gracefully (#19984 )	2023-05-26 20:21:53 +08:00
starocean999	dcdc81844f	[fix](nereids)use same decimalv3 type for params and return types (#20101 )	2023-05-26 20:15:51 +08:00
Ashin Gau	9458a24cd7	[fix](multi-catalog) values in sqlserver should be enclosed by single quotes (#19971 ) Fix errors when inserting string/date/datetime values into SQLServer: ERROR 1105 (HY000): errCode = 2, detailMessage = (172.21.0.101)[INTERNAL_ERROR]UdfRuntimeException: JDBC executor sql has error: CAUSED BY: SQLServerException: Invalid column name '2021-10-30'. When using double quotes enclose string values, it will be parsed as column name, so we should enclose string values with single quotes.	2023-05-26 20:04:45 +08:00
amory	ce45d6119d	[FIX](regress-test) fix struct_export out data (#20111 ) fix struct_export out data	2023-05-26 19:57:51 +08:00
Xinyi Zou	a928b21434	[improvement](exception-safe) sort node is completely exception safe #20041	2023-05-26 18:29:02 +08:00
Pxl	c287e308ab	[Chore](java-udf) add some java-udf function name to asan_suppr #20093	2023-05-26 18:05:31 +08:00
Chengpeng Yan	d6c2ef4727	[opt](Nereids) support use the string as the hint name key (#20053 ) We can not use the string as the variable key to use in the hint. Before this PR mysql> SET enable_nereids_planner=true; Query OK, 0 rows affected (0.01 sec) mysql> set enable_fallback_to_original_planner=false; Query OK, 0 rows affected (0.10 sec) mysql> explain select /+ SET_var("enable_nereids_planner" = "false") / 1; ERROR 1105 (HY000): Exception, msg: Nereids cannot parse the SQL, and fallback disabled. caused by: no viable alternative at input 'select /+ SET_var("enable_nereids_planner"'(line 1, pos 27) After this PR mysql> SET enable_nereids_planner=true; Query OK, 0 rows affected (0.01 sec) mysql> set enable_fallback_to_original_planner=false; Query OK, 0 rows affected (0.10 sec) mysql> select /+ SET_var("enable_nereids_planner" = "false") */ 1; +------+ \| 1 \| +------+ \| 1 \| +------+ 1 row in set (0.00 sec) Describe your changes. Support the string for the hint key in the Parser.	2023-05-26 18:04:04 +08:00
LiBinfeng	b7fd481dcb	[Fix](Nereids) Using switch to control minidump input serialize (#20049 ) Before change, when doing optimize use Nereids planner, input will serialize to memory first. And when bug happen, it would be dump to minidump file when catching the exception. We found that serialization process will cause the performance when statistic message too large or when optimization time be small enough. So the user minidump using should change to ONLY YOU OPEN MINIDUMP SWITCH(set enable_minidump=true;) can you use it.	2023-05-26 18:03:34 +08:00
qiye	9e70a9ef84	[opt](compaction) add pick rowset to compact interval config (#19868 )	2023-05-26 17:39:02 +08:00
Pxl	041081f081	[Chore](decimal) make decimal value parse fail information readable #20057	2023-05-26 16:17:40 +08:00
minghong	a842b9787f	[fix](Nereids) should allow identity project when generate bitmap runtime filter (#20062 )	2023-05-26 16:08:57 +08:00
lihangyu	317338913c	[Bug](topn) Fix topn fetch set real default value (#20074 ) 1. Before this PR if rowset does not contain column which should be read for related SlotDescriptor will call `insert_default` to column, but it's not this real defautl value.Real default value relevant information should be provided by the frontend side. 2. Support fetch when light schema change is not enabled, but disable for AGG or UNIQUE MOR model	2023-05-26 16:06:55 +08:00
zy-kkk	50ced3c3a6	[improve] (jdbc catalog) better handling of postresql bit(1) types with bool type (#20022 ) When the postgresql bit type size is 1, it reads as a java.lang.boolean via jdbc, and if we match against string, it will display true or false. But the normal display should be a number, so when I detect that the size of bit is 1, I will match it with boolean	2023-05-26 16:06:38 +08:00
TengJianPing	488c9ba7c2	[improvement](exchange) test: data stream sender stop sending data to receiver if it returns eos early (#20081 )	2023-05-26 16:05:38 +08:00
zhangdong	b481045372	[improvement](k8s)when the IP of be is different from the IP specified by the master, obey it (#19951 ) When FE is deployed on a virtual machine and CN is deployed on k8s, FE needs to use a proxy IP to communicate with CN nodes, and BE cannot resolve the proxy IP from the local network card. We change the previous verification rules and obey the IP assigned by the master	2023-05-26 15:40:29 +08:00
ZhouZhou	635a9f7a0e	[fix](ui)(fe-system) fix fe System Info query error when the fe server run in Windows. (#20072 ) (#20073 ) 1. Fix duplicate '/' in front-end request URI. 2. When the FileSystemSeparator is '\\', replace '\\' as '/' Co-authored-by: labuladuo <labuladuo@douyu.tv>	2023-05-26 15:25:44 +08:00
yiguolei	0ed817ed1a	[improvement](status) should send query timeout status to be, instead of internal error (#20016 ) If a query is cancelled, the reason is very unclear and we do not know the call stack. --------- Co-authored-by: yiguolei <yiguolei@gmail.com>	2023-05-26 15:11:17 +08:00
luozenglin	b81e9e2521	[fix](resource-group) Fix resource group memory isolation may release too much memory (#20066 ) Suppose three queries are executed in a resource group with a memory_limit of 8G, and they consume memory of query_a = 3G, query_b = 3G, and query_c = 3G. The total memory used is counted as 9G when the resource group GC is executed, which exceeds the resource group limit and cancels query_a. When the resource group is next GC, the memory of query_a may not be freed yet, and it will be counted again in the total memory consumed by that resource group, which again exceeds the resource group limit and cancels query_b. From the user's perspective, it is fine to execute query_a and query_b at the same time, but executing query_ a, query_b and query_c will be cancelled for two queries, which is not as expected. This pr skips the queries that are cancelled when counting the memory used by the resource group. If this causes the process memory to grow, the process gc will handle it.	2023-05-26 14:50:12 +08:00
Pxl	43aa062fb1	[Chore](hash-join) remove useless conditions and add some case (#20050 )	2023-05-26 14:45:24 +08:00
Mingyu Chen	047311171d	[deps](libhdfs) change hadoop libhdfs to source compile (#20058 ) compile it as other deps, instead of downloading prebuild binary. Only for linux platform	2023-05-26 14:37:49 +08:00
TengJianPing	315b30c23d	[testcase](union) add test case for union of decimal (#20080 )	2023-05-26 14:12:14 +08:00
amory	ee34b6de2d	[Refact] (serde) refact mysql serde with data type (#19543 ) refact mysql output (de)serialize with data type serde , avoid accoriding switch case Primitive type writed in mysqlWriter	2023-05-26 14:11:17 +08:00
Liqf	9bc01d3968	[doc](fix)Supplementary array type description #20079	2023-05-26 14:10:18 +08:00
Pxl	15a7420661	[Chore](ub) fix some undefined behaviors (#19986 ) /home/zcp/repo_center/doris_master/doris/be/src/olap/rowset/segment_v2/column_reader.cpp:895:21: runtime error: load of value 423208544, which is not a valid value for type 'doris::ReaderType' /home/zcp/repo_center/doris_master/doris/be/src/vec/columns/column_decimal.cpp:260:33: runtime error: load of misaligned address 0x7fa3348b301c for type 'int64_t' (aka 'long'), which requires 8 byte alignment /home/zcp/repo_center/doris_master/doris/be/src/olap/block_column_predicate.cpp:82:24: runtime error: variable length array bound evaluates to non-positive value 0 /home/zcp/repo_center/doris_master/doris/be/src/vec/columns/column_string.h:225:26: runtime error: null pointer passed as argument 2, which is declared to never be null	2023-05-26 14:08:40 +08:00
Chengpeng Yan	dee9c2240f	[feature](Nereids) pushdown filter through window (#18784 ) Support the operator `PartitionTopN`, which can partition first and do the topn operation later in each partition. It used in the following case ``` -- Support push the filter down to the window and generate the PartitionTopN. -- The plan change from `window -> filter` to `partitionTopN -> window -> filter`. explain select * from (select * , row_number() over(partition by b order by a) as num from t ) tt where num <= 10; -- Support push the limit down to the window and generate the PartitionTopN. -- The plan change from `window -> limit` to `partitionTopN -> window -> limit `. explain select row_number() over(partition by b order by a) as num from t limit 10; -- Support push the topn down to the window and generate the PartitionTopN. -- The plan change from `window -> topn` to `partitionTopN -> window -> topn `. explain select row_number() over(partition by b order by a) as num from t order by num limit 10; ``` The FE part detail design: 1. Add the following rewrite rules: - PUSHDOWN_FILTER_THROUGH_WINDOW - PUSH_LIMIT_THROUGH_PROJECT_WINDOW - PUSH_LIMIT_THROUGH_WINDOW - PUSHDOWN_TOP_N_THROUGH_PROJECTION_WINDOW - PUSHDOWN_TOP_N_THROUGH_WINDOW 2. Add the PartitionTopN node(LogicalPlan/ PhysicalPlan/ TranslatorPlan) 3. For the rewrite plan, there are several requests that need to meet: - For the `Filter` part, only consider `</ <=/ =` conditions. And the filter conditions will be stored. - For the `Window` part, we only support one window function. And we support the `row_number`, `rank`, `dense_rank` window functions. And the `partition by` key and `order by` key can not be empty at the same time. The `Window Frame` should be `UNBOUNDED to CURRENT`. 4. For the `PhysicalPartitionTopN`, the requested property is `Any`and the output property is its children's property. That's the main details that are very important. For the other part, you can directly check the code. Issue Number #18646 BE Part #19708	2023-05-26 11:23:48 +08:00
starocean999	558f625d3b	[fix](planner) The group by part should be substituted in the same way as select part (#20019 )	2023-05-26 11:05:02 +08:00
Mryange	92a6122f74	[feature](profile)Add the filtering information of the Bloom filter in profile. (#19789 )	2023-05-26 10:56:58 +08:00
Jibing-Li	9c22fc4130	[fix](multi catalog)Support Hive partiton manually removed (#20024 ) If the user manually removed a hive partition (remove the partition dir through hdfs), doris will failed to query the hive table with an error message get file split failed for table. That is because the Hive metadata still contains the removed partition. This pr is to fix this bug. Skip the not exist dirs.	2023-05-26 10:32:45 +08:00
Jibing-Li	281a0971c8	[Fix](multi catalog, metadata)Init logType in ExternalCatalog while replay meta data to avoid NPE. Remove type variable in ExternaCatalog (#20009 ) The variable logType in ExternalCatalog is not persistent to disk, after refresh, it will be set to NULL and cause NPE. This pr is to fix the bug. Also, remove the old type variable in ExternalCatalog, use logType instead.	2023-05-26 10:31:34 +08:00
Jibing-Li	8a8d3bcb59	[improvement](multi catalog, nereids)Support collect hive table statistics by sql (#19955 ) Support collect hive external table statistics by running sql against hive table. By running sql, we could collect all the statistics collected for Olap table, including the min, max value of String column. With 3 BE (16 core, 64 GB), it cost less than 2 minutes to collect TPCH 100GB statistics for all columns of all tables. Also less than 2 minutes to collect all columns statistics for SSB 100GB tables.	2023-05-26 10:31:02 +08:00
Chengpeng Yan	5621ae08e6	[fix](Nereids) function ABS return type not same between constant folding and function signature (#20059 ) The abs return the wrong type for the integer type. Return the int type when the arg's type is integer	2023-05-26 10:24:32 +08:00
morrySnow	f1b949ad59	[fix](Nereids) local sort should not translate to unpartitioned partition (#20031 ) 1. local sort should not update current fragment partition to UNPARTITIONED 2. should set input fragment dest exchange node after create dest fragment	2023-05-26 10:18:56 +08:00
morrySnow	dca0ebb281	[fix](Nereids) constant folding to null should retain data type (#20070 )	2023-05-26 10:14:08 +08:00
Xinyi Zou	56360ba04a	[fix](memory) Load flush memtable no check memory exceed #20036	2023-05-26 09:57:00 +08:00
AlexYue	bc4e0e97f2	[enhance](S3FileWriter) abort when s3 file writer abnormally quite and optimize s3 buffer pool (#19944 ) 1. reduce s3 buffer pool's ctor cost 2. before this pr, if one s3 file writer return err when calling append or close function, the caller will not call abort function which result in one confusing DCHECK failed like the following picture	2023-05-26 09:14:38 +08:00

1 2 3 4 5 ...

10784 Commits