Commit Graph

1273 Commits

Author SHA1 Message Date
17016b9797 [improvement](decimal) use new way for decimal arithmetic precision promotion (#27787)
* [DNM](decimal) use new way for decimal arithmetic precision promotion

* [improvement](decimal) [DNM](decimal) use new way for decimal arithmetic precision promotion
1. [DNM](decimal) use new way for decimal arithmetic precision promotion
2. throw exception if it overflows for decimal arithmetics
3. throw exception if it overflows when casting among number types

* fix compile error of gcc

* improvement

---------

Co-authored-by: morrySnow <morrysnow@126.com>
2023-12-05 12:54:40 +08:00
358d73a0ae [FIX](complextype) fix empty quote with complex type (#27942) 2023-12-05 12:25:26 +08:00
1ed99c4d8a [Improvement](inverted index) improve inverted index bkd performance in high concurrent scenario (#27820)
Improve BKD performance by enable bkd reader cache and improvement of fast compare and visit in compressed values in BKD tree.
2023-12-05 11:39:53 +08:00
1afdbfe723 [enhance](BE) Refactor TaskWorkerPool (#27555) 2023-12-04 21:46:10 +08:00
4d1aa131ee [Feature](datatype) add be ut codes for IPv4/v6 (#26534)
Add unit test codes for IP types
2023-12-04 15:25:02 +08:00
a64656748b [Enhancenment](wal) disable group commit when streamload size is too large (#27781) 2023-12-03 23:05:11 +08:00
Pxl
f65103e2a6 [Chore](runtime-filter) unify interfaces of bloom filter and remove some unused code (#27822)
* unify interfaces of bloom filter and remove some unused code
2023-12-02 07:42:55 +08:00
Pxl
d969047b50 [Refactor](join) refactor of hash join (#27557)
Improve the performance under the tpch data set by reconstructing the join related code and the use of hash table

Co-authored-by: HappenLee <happenlee@hotmail.com>
Co-authored-by: BiteTheDDDDt <pxl290@qq.com>
2023-11-28 19:46:00 +08:00
f565f60bc3 [refactor](standard)BE:Initialize pointer variables in the class to nullptr by default (#27587) 2023-11-28 13:02:30 +08:00
13b26ee920 [Fix](core) Fix wal space back pressure core and add regression test (#27311) 2023-11-27 15:10:26 +08:00
553e4a8903 [feature-wip](merge-on-write) MOW table support different primary keys and sort keys (#24788) 2023-11-24 16:37:30 +08:00
dd65cc1d14 [opt](MergedIO) no need to merge large columns (#27315)
1. Fix a profile bug of `MergeRangeFileReader`, and add a profile `ApplyBytes` to show the total bytes  of ranges.
2. There's no need to merge large columns, because `MergeRangeFileReader` will increase the copy time.
2023-11-23 19:15:47 +08:00
5d548935e0 [improvement](insert) support schema change and decommission for group commit (#26359) 2023-11-17 21:41:38 +08:00
0eabe9a651 [Test](orc-reader) Add orc submodule's unit tests. (#26878) 2023-11-16 09:53:42 +08:00
3ad865fef9 [refactor](storage) Expressing the types of computation layer and storage layer in PrimitiveTypeTraits (#26191) 2023-11-15 21:34:49 +08:00
035e593b26 remove useless hash function (#26955) 2023-11-15 20:37:21 +08:00
d2eea9b3ae [chore](macOS) Reduce the size of executables on macOS arm64 (#26894)
Like #15641, we should reduce the size of executables on macOS arm64. Otherwise, we can not run doris_be and doris_be_test with ASAN build type on macOS arm64 now.
2023-11-14 12:21:08 +08:00
f6a9914bc7 [feature](move-memtable) support auto partition in sink v2 (#26914) 2023-11-14 11:39:44 +08:00
de6ecd2035 [fix](tls) Manually track memory in Allocator instead of mem hook and ThreadContext life cycle to manual control (#26904)
Manually track query/load/compaction/etc. memory in Allocator instead of mem hook.
Can still use Mem Hook when cannot manually track memory code segments and find memory locations during debugging.
This will cause memory tracking loss for Query, loss less than 10% compared to the past, but this is expected to be more controllable.
Similarly, Mem Hook will no longer track unowned memory to the orphan mem tracker by default, so the total memory of all MemTrackers will be less than before.
Not need to get memory size from jemalloc in Mem Hook each memory alloc and free, which would lose performance in the past.
Not require caching bthread local in pthread local for memory hook, in the past this has caused core dumps inside bthread, seems to be a bug in bthread.
ThreadContext life cycle to manual control
In the past, ThreadContext was automatically created when it was used for the first time (this was usually in the Jemalloc Hook when the first malloc memory), and was automatically destroyed when the thread exited.
Now instead of manually controlling the create and destroy of ThreadContext, it is mainly created manually when the task thread start and destroyed before the task thread end.
Run 43 clickbench query tests.
Use MemHook in the past:
2023-11-14 10:30:42 +08:00
b19abac5e2 [fix](move-memtable) pass num local sink to backends (#26897) 2023-11-14 08:28:49 +08:00
8cf360fff7 [refactor](closure) remove ref count closure using auto release closure (#26718)
1. closure should be managed by a unique ptr and released by brpc , should not hold by our code. If hold by our code, we need to wait brpc finished during cancel or close.
2. closure should be exception safe, if any exception happens, should not memory leak.
3. using a specific callback interface to be implemented by Doris's code, we could write any code and doris should manage callback's lifecycle.
4. using a weak ptr between callback and closure. If callback is deconstruted before closure'Run, should not core.
2023-11-12 11:57:46 +08:00
4ebb517af0 [fix](be-ut) Fix compilation errors caused by missing opentelemetry headers (#26739) 2023-11-10 14:58:46 +08:00
d767804815 [feature](merge-cloud) Decouple rowset id generator and local rowsets gc implementation (#25921) 2023-11-10 10:07:02 +08:00
5f62a4462d [Enhancement](wal) Add wal space back pressure (#26483) 2023-11-09 12:29:05 +08:00
74e452f19c [bug](bitmap) fix bitmap value copy operator not call reset (#26451)
when a empty bitmap assign to other bitmap
the other bitmap should reset self firstly, and then set empty type.
2023-11-09 10:05:09 +08:00
58bf79f79e [fix](move-memtable) pass load stream num to backends (#26198) 2023-11-08 16:16:33 +08:00
519b48648e [fix](move-memtable) handle status when possible (#26526) 2023-11-08 10:09:06 +08:00
32b36d3c9c [refactor](move-memtable) rename proto OpenStreamSink to OpenLoadStream (#26527) 2023-11-07 22:41:20 +08:00
1e2a614a46 [fix](workflow) Fix failure test cases in BE UT (macOS) (#26425)
1. Fix memory issues in LoadStreamMgrTest.
2. Skip S3FileWriterTest by default because it depends on the environment in teamcity.
3. Fix VTimestampFunctionsTest.convert_tz_test.
2023-11-06 10:44:44 +08:00
8bd06aff7e [Chore](MoW) remove unused code about rowset tree (#26282) 2023-11-02 20:25:27 +08:00
a5ef90dacc [enhancement](recover) support skipping missing version in select by session variable (#25654) 2023-11-02 20:01:51 +08:00
a4e415ab09 [feature](hive)Support hive tables after alter type. (#25138)
1.Reconstruct the logic of decode to read parquet. The parquet  reader first reads the data according to the parquet physical type, and then performs a type conversion.

2.Support hive alter table.
2023-11-02 00:24:21 +08:00
3e10e5af39 [Fix](Serde) Fix content displayed by complex types in MySQL Client (#25946)
This pr makes three changes to the display of complex types:
1. NULL value in complex types refers to being displayed as `null`, not `NULL`
2. struct type is displayed as "column_name": column_value
3. Time types such as `datetime` and `date`, are displayed with double quotes in complex types. like
    `{1, "2023-10-26 12:12:12"}`

This pr also do a code refactor:
1. nesting_level is set to a member variable of the `DataTypeSerDe`, rather than a parameter in methods.

What's more, this pr fix a bug that fileSize is not correct, introduced by this pr: #25854
2023-11-01 23:48:55 +08:00
bfca1bf206 [Enhancement] Implement format methods for Status (#26133) 2023-11-01 16:55:25 +08:00
0449a240f4 [Fix](from_unixtime) Keep consistent with MySQL & bug fix (#25966)
Bug fix: implicit convert from int32 -> int64 makes negative time stamp valid, so change signature to int64
Consistent: keep consistent with mysql.
2023-10-31 14:31:24 +08:00
8f320944a8 [fix](move-memtable) fix DeltaWriterV2 profile use-after-free (#26110)
The sink who creates the delta writer may be closed while other sinks still using this delta writer.
The parent profile is deconstructed and when the last sink trying to update the profile, it will meet use-after-free.

To address this issue, we record the profile number in delta writer,
and the last sink who close the delta writer will create and update the profile.
2023-10-31 13:52:18 +08:00
6761dc4113 [coverage](test) improve test coverage (#26096)
improve test coverage
2023-10-30 18:01:55 +08:00
6a85f46ff3 [refactor](move-memtable) rename open_stream_sink rpc to open_load_stream (#25883) 2023-10-29 10:07:14 +08:00
cedab51676 [enhancement](UT) add unit test cases about bitmap (#25867)
* [fix](bitmap) incorrect result of operator ==

* [enhancement](UT) add unit test cases about bitmap
2023-10-27 11:27:14 +08:00
2679fa4ea7 [improvement](tablet clone) furthur repair replicas should be check even if they are versions catchup (#25551) 2023-10-26 18:14:40 +08:00
da4de17d5c [improvement](function) improve date_trunc function performance when timeunit is const (#25824)
this PR #22602 have check function.
only support date_trunc(column, const), so the second must be const literal
and no need to check time unit every row.
2023-10-26 09:51:21 +08:00
693982fd1a [feature](decimal) support decimal256 (#25386) 2023-10-25 15:47:51 +08:00
9160834606 [FIX](resize) fix array and map offsets resize with default value (#25669) 2023-10-24 02:50:01 -05:00
08832d9f3a [Fix](exec) Fix date dict dead loop. (#25570) 2023-10-24 02:51:43 +08:00
cbc5c91aec [fix](datetime) fix unstable str_to_date function result (#25707)
fix unstable str_to_date function result
2023-10-23 11:52:08 +08:00
9519d7ede9 [enhancement](be-ut)Add more indexed column reader be unit test (#25652)
Added more unit tests
1. key exists or does not exist in a single page
2. key exists or does not exist in multiple pages
3. key is between two pages.
2023-10-23 10:12:53 +08:00
Pxl
642c149e6a remove datetime_value and move vecdatetime_value to doris namespace (#25695)
remove datetime_value and move vecdatetime_value to doris namespace
2023-10-20 22:08:17 +08:00
9b64286f2d [enhancement](mow-ut)Add compaction commit delete bitmap unit test (#25569) 2023-10-20 19:54:12 +08:00
9a675fcdfc [chore](be) Add default timezone files (#25097) 2023-10-20 13:12:24 +08:00
b964ab76b3 [refactor](shuffle) Simplify hash partitioning strategy (#25596) 2023-10-19 19:28:22 +08:00