d933956449
[branch-2.1](timezone) Preload time offset in datetime ( #42395 ) ( #42607 )
...
pick https://github.com/apache/doris/pull/42395
2024-11-10 00:30:28 +08:00
9d7bc5b765
[pick](branch-2.1) pick #38215 ( #43386 )
...
pick #38215
---------
Co-authored-by: Zou Xinyi <zouxinyi@selectdb.com >
2024-11-09 22:13:05 +08:00
90da65c7b8
[fix](block-reader) Make rowsets union iterating work ( #40877 ) ( #43175 )
...
pick: #40877
2024-11-08 10:05:10 +08:00
46afbfca01
branch-2.1: [fix](ip) fix datatype serde for ipv6 with rowstore ( #43252 )
...
Cherry-picked from #43065
Co-authored-by: amory <wangqiannan@selectdb.com >
2024-11-05 20:09:14 +08:00
72bb6e79e4
[fix](index compaction)Skip writing terms with a doc frequency of 0( #43113 ) ( #43115 )
...
bp #43113
2024-11-04 17:49:56 +08:00
25d7d0b255
[fix](move-memtable) abstract multi-streams to one logical stream ( #42039 ) ( #42250 )
...
backport #42039
2024-10-22 20:26:42 +08:00
38e529cd29
[cherry-pick](branch-2.1) support decimal256 for parquet reader ( #42241 )
...
## Proposed changes
pick pr: https://github.com/apache/doris/pull/41526
2024-10-22 19:42:09 +08:00
7eec0f8fbb
[branch-2.1](datetime) Fix date floor functions overflow ( #35477 ) ( #42238 )
...
pick https://github.com/apache/doris/pull/35477
2024-10-22 15:54:53 +08:00
d5fef266ec
[fix](inverted index) Fix incorrect exception handling ( #42094 )
...
https://github.com/apache/doris/pull/41874
2024-10-19 10:45:32 +08:00
1b901f6fcc
[cherry-pick](branch-2.1) add parquet tvf cases and fix some parquet bug ( #41931 )
...
## Proposed changes
pick pr:
https://github.com/apache/doris/pull/41683
https://github.com/apache/doris/pull/41506
https://github.com/apache/doris/pull/41338
https://github.com/apache/doris/pull/39326
---------
Co-authored-by: morningman <morningman@163.com >
2024-10-17 14:20:58 +08:00
5bd33fc88c
[pick](branch-2.1) pick #41292 #41350 #41589 #41628 #41743 #41601 #41667 #41751 ( #41927 )
...
## Proposed changes
pick #41292 #41350 #41589 #41628 #41743 #41601 #41667 #41751
---------
Co-authored-by: Pxl <pxl290@qq.com >
2024-10-16 15:41:28 +08:00
6dddd4c499
[function](cast)Make string casting to integers more like MySQL's behavior ( #38847 ) ( #41541 )
...
https://github.com/apache/doris/pull/38847
## Proposed changes
There are two issues here. First, the results of casting are
inconsistent between FE and BE.
```
FE
mysql [(none)]>select cast('3.000' as int);
+----------------------+
| cast('3.000' as INT) |
+----------------------+
| 3 |
+----------------------+
mysql [(none)]>set debug_skip_fold_constant = true;
BE
mysql [(none)]>select cast('3.000' as int);
+----------------------+
| cast('3.000' as INT) |
+----------------------+
| NULL |
+----------------------+
```
The second issue is that casting on BE converts '3.0' to NULL. This PR
unifies the casting logic of FE and BE.
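The unified behavior can be illustrated with a minimal sketch. This is not the Doris implementation: `cast_string_to_int` is a hypothetical helper, and discarding a non-zero fractional part (rather than rounding) is an assumption based on how MySQL treats such strings. The commit message itself only confirms that '3.000' should become 3 and that '3.0' should not become NULL.

```python
import re
from typing import Optional

# Hypothetical pattern, not Doris code: optional sign, digits,
# then an optional fractional part that is ignored.
_DECIMAL_STRING = re.compile(r"([+-]?\d+)(?:\.\d*)?")


def cast_string_to_int(text: str) -> Optional[int]:
    """Return the integer part of a decimal string, or None (SQL NULL)."""
    match = _DECIMAL_STRING.fullmatch(text.strip())
    if match is None:
        # Not a decimal number at all: the cast yields NULL.
        return None
    # Assumption: the fractional digits are dropped, not rounded.
    return int(match.group(1))
```

With this sketch, `cast_string_to_int('3.000')` and `cast_string_to_int('3.0')` both yield 3, while a non-numeric string yields `None`.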
---------
Co-authored-by: Xinyi Zou <zouxinyi02@gmail.com >
2024-10-11 09:32:00 +08:00
0b4552f74b
[cherry-pick](branch-2.1) pick hive text write from master ( #40537 )
...
## Proposed changes
pick prs:
https://github.com/apache/doris/pull/38549
https://github.com/apache/doris/pull/40183
https://github.com/apache/doris/pull/40315
---------
Co-authored-by: Calvin Kirs <kirs@apache.org >
2024-09-27 20:57:07 +08:00
d4c1b39d03
[fix](multi table) restrict the multi tables load memory under high concurrency with a large number of tables ( #39992 ) ( #41131 )
...
pick (#39992 )
The BE node was killed by the OOM killer when multi-table load ran under
high concurrency with a large number of tables (128 concurrent loads,
each loading 200 tables).
This PR restricts multi-table load memory in that scenario. If memory
reaches the hard limit, new tasks are rejected and return immediately.
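The reject-at-hard-limit policy can be sketched as a small admission gate. `LoadMemoryGate`, its method names, and the byte accounting are hypothetical and only illustrate the policy described above. The actual BE code tracks memory differently, and this sketch is not thread-safe.

```python
class LoadMemoryGate:
    """Hypothetical gate that rejects new load tasks at a hard memory limit."""

    def __init__(self, hard_limit_bytes: int) -> None:
        self.hard_limit_bytes = hard_limit_bytes
        self.used_bytes = 0

    def try_admit(self, task_bytes: int) -> bool:
        # Reject immediately instead of queueing once the limit is reached.
        if self.used_bytes >= self.hard_limit_bytes:
            return False
        self.used_bytes += task_bytes
        return True

    def release(self, task_bytes: int) -> None:
        # Called when a task finishes, so later tasks can be admitted again.
        self.used_bytes = max(0, self.used_bytes - task_bytes)
```

Rejecting rather than blocking keeps a burst of concurrent loads from piling up memory while they wait.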
2024-09-24 16:34:32 +08:00
b52b572ade
[branch-2.1](memory) When Load ends, check memory tracker value returns is equal to 0 ( #40850 )
...
pick
#38960
#39908
#40043
#40092
#40016
#40439
---------
Co-authored-by: hui lai <1353307710@qq.com >
Co-authored-by: yiguolei <676222867@qq.com >
2024-09-15 23:47:53 +08:00
cecd214345
[branch-2.1](Column) refactor ColumnNullable to provide flags safety ( #40769 ) ( #40848 )
...
pick https://github.com/apache/doris/pull/40769
Co-authored-by: Jerry Hu <mrhhsg@gmail.com >
2024-09-14 16:27:43 +08:00
873f70c262
[fix] (compaction) fix compaction score in time series policy ( #40242 ) ( #40779 )
...
## Proposed changes
pick from master #40242
2024-09-13 14:35:16 +08:00
023317e8a0
[fix](ut) fix SegmentTest and SegmentMoWTest asan core ( #40287 ) ( #40622 )
...
## Proposed changes
pick #40287
2024-09-10 18:37:00 +08:00
354967c09f
[branch-2.1](memory) pick reserve memory and workload group ( #40543 )
...
1. pick #38494
2. pick #39862
3. remove vdata_stream_test, which has already been removed on master
2024-09-09 21:16:06 +08:00
87ac378c4a
[branch-2.1](be-ut) wait lazy open in ut ( #40453 )
...
## Proposed changes
The LRUFileCache test needs to wait for lazy open to finish.
2024-09-06 09:47:47 +08:00
cc20ecd738
Revert "[fix](compaction) fix the longest continuous rowsets cannot be selected when missing rowsets ( #38728 ) ( #39262 )" ( #40375 )
...
This reverts commit c9949f24e5c15e9529285f0e99b7ffdb1095558b.
This PR may increase the probability of full clone failure, so revert it
for now.
2024-09-05 00:01:03 +08:00
ca07a00c93
Revert "[branch-2.1](hive) support hive write text table ( #38549 ) ( #40063 )" ( #40157 )
...
This reverts commit c6df7c21a3c09ae1664deabacb88dfcea9d94b68.
Co-authored-by: yiguolei <yiguolei@gmail.com >
2024-08-30 10:25:38 +08:00
c6df7c21a3
[branch-2.1](hive) support hive write text table ( #38549 ) ( #40063 )
...
1. Support writing Hive text tables
2. Add SessionVariable `hive_text_compression` for writing compressed
Hive text tables
3. Supported compression types: gzip, bzip2, snappy, lz4, zstd
pick from https://github.com/apache/doris/pull/38549
2024-08-29 16:50:40 +08:00
ac8cff34ce
[fix](ut)fix be enable_http_auth ut ( #40071 ) ( #40088 )
...
bp #40071
## Proposed changes
before pr #39577
2024-08-29 16:42:53 +08:00
9d5468d198
[branch-2.1](memory) BE memory info compatible with CgroupV2 ( #39799 )
...
pick #39256
2024-08-23 02:03:00 +08:00
8ce8887b75
[branch-2.1](memory) Refactor refresh workload groups weighted memory ratio and record refresh interval memory growth ( #39760 )
...
pick #38168
This overwrites the changes from #37221 in workload_group_manager.cpp. If
#37221 needs to be picked later, ignore its changes to this file.
2024-08-22 17:33:11 +08:00
0e694f19db
[fix](merge-on-write) segcompaction should process delete bitmap if necessary ( #38369 ) ( #39707 )
...
## Proposed changes
cherry-pick #38369 and #38800
2024-08-22 00:42:56 +08:00
bb687bd69c
[cherry-pick](branch-2.1) add function regexp_extract_or_null ( #39561 )
...
## Proposed changes
pick https://github.com/apache/doris/pull/38296
2024-08-21 09:14:58 +08:00
fb17f204d7
[fix](http) fix http url with incorrect character notation ( #38420 ) ( #39535 )
...
## Proposed changes
pick from master #38420
2024-08-19 15:03:19 +08:00
021678c7c3
[fix](window_funnel) fix wrong result of window_funnel #38954 ( #39270 )
...
## Proposed changes
BP #38954
2024-08-16 09:59:31 +08:00
a44a274563
[Fix](parquet-reader) Fix and optimize parquet min-max filtering. ( #39375 )
...
Backport #38277 .
2024-08-15 14:12:54 +08:00
c9949f24e5
[fix](compaction) fix the longest continuous rowsets cannot be selected when missing rowsets ( #38728 ) ( #39262 )
...
pick master #38728
2024-08-13 17:41:11 +08:00
5f77f909d9
[cherry-pick](branch-2.1) Pick "[feature](function) support ip functions named ipv4_to_ipv6 and cut_ipv6" ( #39058 )
...
## Proposed changes
pick https://github.com/apache/doris/pull/36883 and
https://github.com/apache/doris/pull/35239
2024-08-10 18:37:11 +08:00
8a682d43ec
[fix](ut) repair segcompaction ut ( #38165 ) ( #38225 )
...
cherry-pick #38165
2024-08-09 15:52:18 +08:00
773008d6fa
[Fix](Json) fix some cast issue ( #38683 ) ( #39025 )
...
#38683
2024-08-07 22:05:43 +08:00
0603ec1d9d
[enhancement](compaction) optimizing memory usage for compaction ( #37099 ) ( #37486 )
2024-08-04 10:49:18 +08:00
79a6496bb6
[branch-2.1](function) fix wrong result when convert_tz is out of bound ( #37358 ) ( #38313 )
...
## Proposed changes
pick https://github.com/apache/doris/pull/37358
before:
```sql
mysql> select CONVERT_TZ(cast('0000-01-01 00:00:00.00001' as DATETIMEV1), cast('Asia/Shanghai' as VARCHAR(65533)), cast('America/Los_Angeles' as VARCHAR(65533)));
+---------------------------------------------------------------------------------------------------------------------------------------------------+
| convert_tz(cast('0000-01-01 00:00:00.00001' as DATETIME), cast('Asia/Shanghai' as VARCHAR(65533)), cast('America/Los_Angeles' as VARCHAR(65533))) |
+---------------------------------------------------------------------------------------------------------------------------------------------------+
| q535-12-31 08:01:19 |
+---------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.12 sec)
```
now:
```sql
mysql> select CONVERT_TZ(cast('0000-01-01 00:00:00.00001' as DATETIMEV1), cast('Asia/Shanghai' as VARCHAR(65533)), cast('America/Los_Angeles' as VARCHAR(65533)));
+---------------------------------------------------------------------------------------------------------------------------------------------------+
| convert_tz(cast('0000-01-01 00:00:00.00001' as DATETIME), cast('Asia/Shanghai' as VARCHAR(65533)), cast('America/Los_Angeles' as VARCHAR(65533))) |
+---------------------------------------------------------------------------------------------------------------------------------------------------+
| NULL |
+---------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.09 sec)
```
2024-07-25 11:32:44 +08:00
c30c1d2436
[branch-2.1] Picks "[opt](delete) Delete job should retry for failure that is not DELETE_INVALID_XXX #37834 " ( #38032 )
...
## Proposed changes
picks https://github.com/apache/doris/pull/37834 and
https://github.com/apache/doris/pull/38043
2024-07-18 14:50:30 +08:00
02716598d4
[Fix](sql function) memory overflow to the left of string address when do_money_format has small negative value #36226 ( #37870 )
...
cherry pick from #36226
Co-authored-by: sparrow <38098988+biohazard4321@users.noreply.github.com >
2024-07-16 15:04:42 +08:00
d7e84b7ee3
[Enchancement](bitmap) optimize bitmap deserialize and remove some unused code ( #37623 )
...
## Proposed changes
pick from #35789
2024-07-16 11:21:54 +08:00
967173d7d0
[cherry-pick-2.1](table-function) pick some table functions exec performance ( #34090 ) ( #37778 )
...
## Proposed changes
pick from master:
https://github.com/apache/doris/pull/33904
https://github.com/apache/doris/pull/34090
Co-authored-by: HappenLee <happenlee@hotmail.com >
2024-07-15 17:15:56 +08:00
2759383365
[branch-2.1](timezone) refactor tzdata load to accelerate and unify timezone parsing ( #37062 ) ( #37269 )
...
pick https://github.com/apache/doris/pull/37062
1. Revert https://github.com/apache/doris/pull/25097. We decided to rely
on the OS and no longer maintain an independent tzdata, to keep results
consistent.
2. Refactor timezone loading and remove the rwlock.
before:
```sql
mysql [optest]>select count(convert_tz(d, 'Asia/Shanghai', 'America/Los_Angeles')), count(convert_tz(dt, 'America/Los_Angeles', '+00:00')) from dates;
+-------------------------------------------------------------------------------------+--------------------------------------------------------+
| count(convert_tz(cast(d as DATETIMEV2(6)), 'Asia/Shanghai', 'America/Los_Angeles')) | count(convert_tz(dt, 'America/Los_Angeles', '+00:00')) |
+-------------------------------------------------------------------------------------+--------------------------------------------------------+
| 16000000 | 16000000 |
+-------------------------------------------------------------------------------------+--------------------------------------------------------+
1 row in set (6.88 sec)
```
now:
```sql
mysql [optest]>select count(convert_tz(d, 'Asia/Shanghai', 'America/Los_Angeles')), count(convert_tz(dt, 'America/Los_Angeles', '+00:00')) from dates;
+-------------------------------------------------------------------------------------+--------------------------------------------------------+
| count(convert_tz(cast(d as DATETIMEV2(6)), 'Asia/Shanghai', 'America/Los_Angeles')) | count(convert_tz(dt, 'America/Los_Angeles', '+00:00')) |
+-------------------------------------------------------------------------------------+--------------------------------------------------------+
| 16000000 | 16000000 |
+-------------------------------------------------------------------------------------+--------------------------------------------------------+
1 row in set (2.61 sec)
```
3. Timezone offset strings such as 'UTC+8' are no longer supported, as
already stated in
https://doris.apache.org/docs/dev/query/query-variables/time-zone/#usage
4. Support case-insensitive timezone parsing in Nereids.
5. Fix a bug in Nereids timezone parsing: DST should be checked against
the input time, but was wrongly checked against the current time.
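To illustrate item 3, here is a minimal sketch of telling a plain numeric offset such as '+00:00' (used in the query above) apart from a string such as 'UTC+8'. The helper `is_offset_timezone` and its exact pattern are assumptions for illustration. The precise set of formats Doris accepts is defined in the linked documentation, not here.

```python
import re

# Assumed shape of a numeric offset: sign, two-digit hour, colon, two-digit minute.
_OFFSET = re.compile(r"[+-](\d{2}):(\d{2})")


def is_offset_timezone(text: str) -> bool:
    """Return True only for strings shaped like '+08:00' or '-07:30'."""
    match = _OFFSET.fullmatch(text)
    # Minutes must be a valid minute value; hour bounds are left unchecked here.
    return match is not None and int(match.group(2)) < 60
```

Under this sketch `'+08:00'` passes the check while `'UTC+8'` does not.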
doc pr: https://github.com/apache/doris-website/pull/810
2024-07-15 10:56:48 +08:00
8930df3b31
[Feature](iceberg-writer) Implements iceberg partition transform. ( #37692 )
...
## Proposed changes
Cherry-pick iceberg partition transform functionality. #36289 #36889
---------
Co-authored-by: kang <35803862+ghkang98@users.noreply.github.com >
Co-authored-by: lik40 <lik40@chinatelecom.cn >
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Mingyu Chen <morningman@163.com >
2024-07-13 16:07:50 +08:00
cf2fb6945a
[branch-2.1](memory) Refactor LRU cache policy memory tracking ( #37658 )
...
pick
#36235
#35965
2024-07-11 21:04:01 +08:00
62e0230523
[branch-2.1](memory) Add ThreadMemTrackerMgr BE UT ( #37654 )
...
## Proposed changes
pick #35518
2024-07-11 21:03:49 +08:00
fed632bf4a
[fix](move-memtable) check segment num when closing each tablet ( #36753 ) ( #37536 )
...
cherry-pick #36753 and #37660
2024-07-11 20:33:44 +08:00
9f4e7346fb
[fix](compaction) fixing the inaccurate statistics of concurrent compaction tasks ( #37318 ) ( #37496 )
2024-07-10 22:23:25 +08:00
afcc6170f6
[fix](txn_manager) Add ingested rowsets to unused rowsets when removing txn ( #37417 )
...
Generally speaking, as long as a rowset has a version, it can be
considered not to be in a pending state. However, if the rowset was
created through ingesting binlogs, it will have a version but should
still be considered in a pending state because the ingesting txn has not
yet been committed.
This PR updates the condition for determining the pending state. If a
rowset is COMMITTED, the txn should be allowed to roll back even if a
version exists.
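The updated condition can be sketched as follows. The `Rowset` type, the state names, and `is_pending` are illustrative stand-ins rather than the actual BE types. The sketch only encodes the rule stated above: a COMMITTED rowset still counts as pending even when it already has a version.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class RowsetState(Enum):
    # Illustrative state names, assumed for this sketch.
    PREPARED = "PREPARED"
    COMMITTED = "COMMITTED"
    VISIBLE = "VISIBLE"


@dataclass
class Rowset:
    state: RowsetState
    version: Optional[int] = None  # None means no version assigned yet


def is_pending(rowset: Rowset) -> bool:
    # Previous rule: pending only when the rowset has no version.
    # Updated rule: a COMMITTED rowset (e.g. one created by ingesting
    # binlogs) is still pending even though it carries a version.
    return rowset.version is None or rowset.state is RowsetState.COMMITTED
```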
Cherry-pick #36551
2024-07-10 14:25:44 +08:00
5280e277e7
[chore](be) Acquire and check MD5 digest of the file to download ( #37418 )
...
Cherry-pick #35807 , #36621 , #36726
2024-07-08 18:55:35 +08:00
ceef9ee123
[feature](serde) support presto compatible output format ( #37039 ) ( #37253 )
...
bp #37039
2024-07-04 13:56:05 +08:00