doris

Author	SHA1	Message	Date
Jerry Hu	6404277795	[fix](json) Add . after $ in JSON path to support correct token parsing (#52543 ) (#52544 ) Boost tokenizer requires explicit "." after "$" to correctly extract JSON path tokens. Without this, expressions like "$[0].key" cannot be properly split, causing issues in downstream logic. This commit ensures a "." is automatically added after "$" to maintain consistent token parsing behavior.	2025-07-03 14:36:53 +08:00
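The normalization described in this commit can be sketched roughly as follows. This is a minimal Python illustration, not the Doris C++ code: `normalize_json_path` and `split_tokens` are hypothetical names, and a plain split on "." stands in for the Boost tokenizer.

```python
def normalize_json_path(path):
    """Insert '.' after a leading '$' when it is missing (hypothetical helper)."""
    if len(path) > 1 and path.startswith("$") and not path.startswith("$."):
        return "$." + path[1:]
    return path


def split_tokens(path):
    """Simplified stand-in for the tokenizer: split the normalized path on '.'."""
    return normalize_json_path(path).split(".")


# Without normalization, "$[0].key" would split into ["$[0]", "key"],
# leaving the root marker fused with the first array index.
print(split_tokens("$[0].key"))
```

With the "." inserted, the root "$" becomes its own token, so later stages can treat "[0]" and "key" uniformly.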
github-actions[bot]	18d2f93120	branch-2.1: [fix](function) JSON_EXTRACT_STRING should return NULL instead of the string 'null' when encountering a NULL value #51516 (#51566 ) Cherry-picked from #51516 --------- Co-authored-by: Jerry Hu <hushenggang@selectdb.com>	2025-06-13 11:07:31 +08:00
Mryange	ebcec779ec	branch-2.1: [fix](function) fix error result when input utf8 in url_encode, strright, append_trailing_char_if_absent #49127 (#50660 ) The url_encode function previously performed a modulus operation on a signed number. Converting it to an unsigned number fixes the issue. ``` before mysql> select url_encode('编码'); +----------------------+ \| url_encode('编码') \| +----------------------+ \| %5.%23%0-%5.%10%/( \| +----------------------+ now mysql> select url_encode('编码'); +----------------------+ \| url_encode('编码') \| +----------------------+ \| %E7%BC%96%E7%A0%81 \| +----------------------+ ``` The strright function did not calculate the length according to the number of UTF-8 characters. ``` before mysql> select strright("你好世界",5); +----------------------------+ \| strright("你好世界",5) \| +----------------------------+ \| \| +----------------------------+ now mysql> select strright("你好世界",5); +----------------------------+ \| strright("你好世界",5) \| +----------------------------+ \| 你好世界 \| +----------------------------+ ``` In append_trailing_char_if_absent, the case of inputting a UTF-8 character was not considered. ``` before mysql> select append_trailing_char_if_absent('中文', '文'); +-------------------------------------------------+ \| append_trailing_char_if_absent('中文', '文') \| +-------------------------------------------------+ \| NULL \| +-------------------------------------------------+ now mysql> select append_trailing_char_if_absent('中文', '文'); +-------------------------------------------------+ \| append_trailing_char_if_absent('中文', '文') \| +-------------------------------------------------+ \| 中文 \| +-------------------------------------------------+ ```	2025-05-07 22:37:50 +08:00
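Two of the fixes in this commit can be illustrated with a short Python sketch. It is not the Doris implementation: the set of characters left unescaped (RFC 3986 unreserved) and the handling of non-positive counts in `strright` are assumptions made for the example. The point is that each byte is treated as an unsigned value in 0..255 and that lengths are counted in UTF-8 characters rather than bytes.

```python
HEX_DIGITS = "0123456789ABCDEF"
# Assumption: RFC 3986 unreserved characters pass through unchanged.
UNRESERVED = set(b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_.~")


def url_encode(text):
    """Percent-encode the UTF-8 bytes of text, treating every byte as unsigned."""
    encoded = []
    for byte in text.encode("utf-8"):  # iterating bytes yields ints in 0..255
        if byte in UNRESERVED:
            encoded.append(chr(byte))
        else:
            encoded.append("%" + HEX_DIGITS[byte >> 4] + HEX_DIGITS[byte & 0x0F])
    return "".join(encoded)


def strright(text, count):
    """Return the last `count` UTF-8 characters of text (not the last bytes)."""
    data = text.encode("utf-8")
    # A byte starts a character unless it is a continuation byte (10xxxxxx).
    starts = [i for i, byte in enumerate(data) if (byte & 0xC0) != 0x80]
    if count <= 0:
        return ""  # simplification: non-positive counts are not modeled here
    if count >= len(starts):
        return text
    return data[starts[-count]:].decode("utf-8")


print(url_encode("编码"))
print(strright("你好世界", 5))
```

A signed byte such as 0xE7 would be read as a negative number, so any index or modulus computed from it falls outside the hex digit table, which is consistent with the garbled "before" output quoted in the commit.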
Benjaminwei	cf72fa82e2	[Improve](explode) explode function support multi param (#50310 ) backport: https://github.com/apache/doris/pull/48537	2025-04-23 23:27:07 +08:00
github-actions[bot]	aa4b54952c	branch-2.1: [enhancement]Optimize GeoFunctions for const columns #34396 (#50067 ) Cherry-picked from #34396 Co-authored-by: koarz <66543806+koarz@users.noreply.github.com>	2025-04-16 14:05:46 +08:00
Mryange	d8a274251e	branch-2.1: [feature](function) support utf8 input in initcap #49846 (#49977 )	2025-04-11 15:06:23 +08:00
Mryange	aad189cf40	[feature](function) upper lower support utf8 input (#49756 ) https://github.com/apache/doris/pull/49231	2025-04-07 12:00:31 +08:00
Jerry Hu	c0bc16d88f	[fix](function) wrong result of arrays_overlap (#49403 ) (#49707 ) Pick #49403 If the two arrays share at least one non-null element, they are considered overlapping, and the result is 1. If the two arrays have no common non-null elements and either array contains a null element, the result is null. Otherwise, the result is 0. ``` select arrays_overlap([1, 2, 3], [1, null]); -- result should be 1 select arrays_overlap([2, 3], [1, null]); -- result should be null select arrays_overlap([2, 3], [1]); -- result should be 0 ```	2025-04-04 20:58:01 +08:00
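The three-valued rule stated in this commit can be written down as a small Python sketch. This only models the semantics described in the message, with Python `None` standing in for SQL NULL; it says nothing about how the Doris BE implements the function.

```python
def arrays_overlap(left, right):
    """Return 1, None (SQL NULL), or 0 following the rule in the commit message."""
    left_values = {item for item in left if item is not None}
    right_values = {item for item in right if item is not None}
    if left_values & right_values:
        return 1  # a common non-null element decides the result
    if None in left or None in right:
        return None  # no common element, but a null could have matched
    return 0


print(arrays_overlap([1, 2, 3], [1, None]))
print(arrays_overlap([2, 3], [1, None]))
print(arrays_overlap([2, 3], [1]))
```

The order of the checks matters: a definite match wins over a null, and a null only turns a would-be 0 into an unknown result.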
Mryange	145e393d3d	branch-2.1: [fix](function) check return type is nullptr in FunctionBasePtr::build #49737 (#49761 )	2025-04-02 20:23:41 +08:00
github-actions[bot]	1259ee5088	branch-2.1: [Feature](function) support year of week #48870 (#49012 )	2025-03-29 11:24:45 +08:00
github-actions[bot]	ce49f37a5e	branch-2.1: [fix](core) fix subreplace when inputting a large number of empty strings #49241 (#49303 ) Cherry-picked from #49241 Co-authored-by: Mryange <yanxuecheng@selectdb.com>	2025-03-20 22:56:44 +08:00
Jerry Hu	3b61f840f4	[fix](function) Undefined behavior in parse_url (#49149 ) (#49226 )	2025-03-19 17:32:47 +08:00
zzzxl	02f15a8ef0	[fix](inverted index) Fix Null Pointer Exception in function match(#45456 )(#45774 ) pick: https://github.com/apache/doris/pull/45456	2024-12-24 11:27:13 +08:00
zclllyybb	79662fcc94	[branch-2.1](functions) clean some ip functions code and make IS_IP_ADDRESS_IN_RANGE DEPENDS_ON_ARGUMENT (#45358 ) pick https://github.com/apache/doris/pull/35239 and add special logic to handle smooth upgrades. The original PR is https://github.com/apache/doris/pull/35239. On branch-3.0 it was merged in 3.0.0, but registering the old version was forgotten. On branch-3.0 this is now fixed in https://github.com/apache/doris/pull/45428, which must be merged in 3.0.4, and this PR does the same thing and must be merged in 2.1.8. Then: ``` FROM TO result 217- 218+ ✅ 217- 303- 💥 218+ 303- ✅ 218+ 304+ ✅ 303- 304+ ✅ ``` This is the best result we can achieve.	2024-12-17 11:51:07 +08:00
Jerry Hu	1101fbaf04	[fix](column_complex) wrong type of Field returned by ColumnComplex (#43515 ) (#43860 )	2024-11-13 19:07:00 +08:00
zclllhhjj	cecd214345	[branch-2.1](Column) refactor ColumnNullable to provide flags safety (#40769 ) (#40848 ) pick https://github.com/apache/doris/pull/40769 Co-authored-by: Jerry Hu <mrhhsg@gmail.com>	2024-09-14 16:27:43 +08:00
Socrates	bb687bd69c	[cherry-pick](branch-2.1) add function regexp_extract_or_null (#39561 ) pick https://github.com/apache/doris/pull/38296	2024-08-21 09:14:58 +08:00
yangshijie	5f77f909d9	[cherry-pick](branch-2.1) Pick "[feature](function) support ip functions named ipv4_to_ipv6 and cut_ipv6" (#39058 ) pick https://github.com/apache/doris/pull/36883 and https://github.com/apache/doris/pull/35239	2024-08-10 18:37:11 +08:00
lihangyu	773008d6fa	[Fix](Json) fix some cast issue (#38683 ) (#39025 ) #38683	2024-08-07 22:05:43 +08:00
zclllhhjj	79a6496bb6	[branch-2.1](function) fix wrong result when convert_tz is out of bound (#37358 ) (#38313 ) ## Proposed changes pick https://github.com/apache/doris/pull/37358 before: ```sql mysql> select CONVERT_TZ(cast('0000-01-01 00:00:00.00001' as DATETIMEV1), cast('Asia/Shanghai' as VARCHAR(65533)), cast('America/Los_Angeles' as VARCHAR(65533))); +---------------------------------------------------------------------------------------------------------------------------------------------------+ \| convert_tz(cast('0000-01-01 00:00:00.00001' as DATETIME), cast('Asia/Shanghai' as VARCHAR(65533)), cast('America/Los_Angeles' as VARCHAR(65533))) \| +---------------------------------------------------------------------------------------------------------------------------------------------------+ \| q535-12-31 08:01:19 \| +---------------------------------------------------------------------------------------------------------------------------------------------------+ 1 row in set (0.12 sec) ``` now: ```sql mysql> select CONVERT_TZ(cast('0000-01-01 00:00:00.00001' as DATETIMEV1), cast('Asia/Shanghai' as VARCHAR(65533)), cast('America/Los_Angeles' as VARCHAR(65533))); +---------------------------------------------------------------------------------------------------------------------------------------------------+ \| convert_tz(cast('0000-01-01 00:00:00.00001' as DATETIME), cast('Asia/Shanghai' as VARCHAR(65533)), cast('America/Los_Angeles' as VARCHAR(65533))) \| +---------------------------------------------------------------------------------------------------------------------------------------------------+ \| NULL \| +---------------------------------------------------------------------------------------------------------------------------------------------------+ 1 row in set (0.09 sec) ```	2024-07-25 11:32:44 +08:00
zhiqiang	02716598d4	[Fix](sql function) memory overflow to the left of string address when do_money_format has small negative value #36226 (#37870 ) cherry pick from #36226 Co-authored-by: sparrow <38098988+biohazard4321@users.noreply.github.com>	2024-07-16 15:04:42 +08:00
zhangstar333	967173d7d0	[cherry-pick-2.1](table-function) pick some table functions exec performance (#34090 ) (#37778 ) pick from master: https://github.com/apache/doris/pull/33904 https://github.com/apache/doris/pull/34090 Co-authored-by: HappenLee <happenlee@hotmail.com>	2024-07-15 17:15:56 +08:00
zclllyybb	2759383365	[branch-2.1](timezone) refactor tzdata load to accelerate and unify timezone parsing (#37062 ) (#37269 ) pick https://github.com/apache/doris/pull/37062 1. revert https://github.com/apache/doris/pull/25097. we decide to rely on OS. not maintain independent tzdata anymore to keep result consistency 2. refactor timezone load. removed rwlock. before: ```sql mysql [optest]>select count(convert_tz(d, 'Asia/Shanghai', 'America/Los_Angeles')), count(convert_tz(dt, 'America/Los_Angeles', '+00:00')) from dates; +-------------------------------------------------------------------------------------+--------------------------------------------------------+ \| count(convert_tz(cast(d as DATETIMEV2(6)), 'Asia/Shanghai', 'America/Los_Angeles')) \| count(convert_tz(dt, 'America/Los_Angeles', '+00:00')) \| +-------------------------------------------------------------------------------------+--------------------------------------------------------+ \| 16000000 \| 16000000 \| +-------------------------------------------------------------------------------------+--------------------------------------------------------+ 1 row in set (6.88 sec) ``` now: ```sql mysql [optest]>select count(convert_tz(d, 'Asia/Shanghai', 'America/Los_Angeles')), count(convert_tz(dt, 'America/Los_Angeles', '+00:00')) from dates; +-------------------------------------------------------------------------------------+--------------------------------------------------------+ \| count(convert_tz(cast(d as DATETIMEV2(6)), 'Asia/Shanghai', 'America/Los_Angeles')) \| count(convert_tz(dt, 'America/Los_Angeles', '+00:00')) \| +-------------------------------------------------------------------------------------+--------------------------------------------------------+ \| 16000000 \| 16000000 \| +-------------------------------------------------------------------------------------+--------------------------------------------------------+ 1 row in set (2.61 sec) ``` 3. 
Timezone offset format strings like 'UTC+8' are no longer supported, as already stated in https://doris.apache.org/docs/dev/query/query-variables/time-zone/#usage 4. Support case-insensitive timezone parsing in Nereids. 5. Fix a bug when parsing timezones using Nereids: DST should be checked based on the input time, but was wrongly checked based on the current time. doc pr: https://github.com/apache/doris-website/pull/810	2024-07-15 10:56:48 +08:00
Mingyu Chen	0cff539810	[feature](function) support new function replace_empty (#36283 ) (#36656 ) #36283	2024-06-21 16:46:22 +08:00
zhiqiang	c8f2a3f952	[fix](eq_for_null) fix incorrect logic in function eq_for_null #36004 (#36124 ) cherry pick from #36004 cherry pick from #36164	2024-06-21 14:31:21 +08:00
Mingyu Chen	b75533e72b	[branch-2.1](beut) fix BE UT (#36147 ) only for branch-2.1	2024-06-12 08:21:38 +08:00
lihangyu	e3e5f18f26	[Fix](Json type) correct cast result for json type (#34764 )	2024-05-18 18:40:17 +08:00
zhiqiang	eb7eaee386	[fix](function) money format (#34680 )	2024-05-18 18:35:29 +08:00
yangshijie	9b712b03b4	[FIX]fix is_ip_address_in_range func with const param (#34266 )	2024-05-10 14:37:20 +08:00
Chester	f7900b53ce	[enhancement](function) floor/ceil/round/round_bankers can use column as scale argument (#34391 )	2024-05-06 22:18:36 +08:00
Jensen	26d9082b9a	[Feature](function) Add function strcmp (#33272 )	2024-04-12 15:09:25 +08:00
Uniqueyou	31984bb4f0	[feature](function) support quote string function #33055	2024-04-12 15:09:25 +08:00
zclllyybb	c61d6ad1e2	[Feature] support function uuid_to_int and int_to_uuid #33005	2024-04-10 14:53:56 +08:00
zhiqiang	bf022f9d8d	[enhancement](function truncate) truncate can use column as scale argument (#32746 ) Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2024-04-10 14:53:56 +08:00
zhangstar333	7486e96b12	[improve](function) add error msg if exceeded maximum default value in repeat function (#32219 ) Add an error message to the repeat function so the user knows when the count exceeds the default maximum.	2024-03-21 14:07:49 +08:00
yangshijie	8f77e6363a	[Feature](function) Support xxhash function like murmur hash function (#31193 )	2024-02-23 19:03:28 +08:00
koarz	6cf7468073	[enhancement](function) change some function nullable mode (#30991 )	2024-02-18 14:45:25 +08:00
zclllyybb	68102fd531	[Fix](auto-partition) fix a concurrent bug of extremely long values (#31005 )	2024-02-18 14:45:25 +08:00
zclllyybb	3315c16383	[enhance](function) refactor from_format_str and support more format (#30452 )	2024-02-01 19:08:37 +08:00
TengJianPing	a525d5c5a3	[refactor](decimal) change type name Decimal128 to Decimal128V2, Decimal128I to Decimal128V3 to avoid confusion (#29265 )	2023-12-29 10:11:44 +08:00
zclllyybb	e1587537bc	[Fix](status) fix unhandled status in exprs #28218 which marked static_cast<void> in https://github.com/apache/doris/pull/23395/files partially fixed #28160	2023-12-11 11:04:58 +08:00
wangbo	035e593b26	remove useless hash function (#26955 )	2023-11-15 20:37:21 +08:00
zhiqiang	4ebb517af0	[fix](be-ut) Fix compilation errors caused by missing opentelemetry headers (#26739 )	2023-11-10 14:58:46 +08:00
Adonis Ling	1e2a614a46	[fix](workflow) Fix failure test cases in BE UT (macOS) (#26425 ) 1. Fix memory issues in LoadStreamMgrTest. 2. Skip S3FileWriterTest by default because it depends on the environment in teamcity. 3. Fix VTimestampFunctionsTest.convert_tz_test.	2023-11-06 10:44:44 +08:00
zhiqiang	0449a240f4	[Fix](from_unixtime) Keep consistent with MySQL & bug fix (#25966 ) Bug fix: the implicit conversion from int32 to int64 made negative timestamps valid, so the signature is changed to int64. Consistency: keep behavior consistent with MySQL.	2023-10-31 14:31:24 +08:00
zhangstar333	da4de17d5c	[improvement](function) improve date_trunc function performance when timeunit is const (#25824 ) PR #22602 already added a check so that only date_trunc(column, const) is supported; the second argument must therefore be a constant literal, and there is no need to check the time unit for every row.	2023-10-26 09:51:21 +08:00
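The idea behind this optimization can be sketched in Python: when the time unit is a constant, it is resolved once per column instead of once per row. The `TRUNCATORS` table and `date_trunc_column` function are hypothetical names for illustration only, and the set of supported units here is an assumption, not the list Doris accepts.

```python
from datetime import datetime

# Hypothetical lookup table: one truncation routine per supported unit.
TRUNCATORS = {
    "year": lambda v: v.replace(month=1, day=1, hour=0, minute=0, second=0, microsecond=0),
    "month": lambda v: v.replace(day=1, hour=0, minute=0, second=0, microsecond=0),
    "day": lambda v: v.replace(hour=0, minute=0, second=0, microsecond=0),
    "hour": lambda v: v.replace(minute=0, second=0, microsecond=0),
}


def date_trunc_column(values, unit):
    """Truncate every value in a column; the constant unit is validated only once."""
    try:
        truncate = TRUNCATORS[unit.lower()]  # resolved once, outside the row loop
    except KeyError:
        raise ValueError("unsupported time unit: " + unit)
    return [truncate(value) for value in values]


rows = [datetime(2023, 10, 26, 9, 51, 21), datetime(2023, 11, 6, 10, 44, 44)]
print(date_trunc_column(rows, "month"))
```

The per-row loop then contains only the truncation itself, which is where the saving comes from when the column is large.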
zclllyybb	cbc5c91aec	[fix](datetime) fix unstable str_to_date function result (#25707 )	2023-10-23 11:52:08 +08:00
zclllyybb	9a675fcdfc	[chore](be) Add default timezone files (#25097 )	2023-10-20 13:12:24 +08:00
Guangdong Liu	9e31cb26bb	[fix](parse_url) fix `parse_url` is not working in some case to extract the HOST (#25040 ) Issue Number: close #24452	2023-10-09 00:14:58 +08:00
bobhan1	642e5cdb69	[Fix](Status) Make `Status` `[[nodiscard]]` and handle returned `Status` correctly (#23395 )	2023-09-29 22:38:52 +08:00


168 Commits