Commit Graph

4545 Commits

Author SHA1 Message Date
ff54b45775 [fix](partial-update) should hold tablet meta lock before calling lookup_row_key() (#19964) 2023-05-24 16:37:27 +08:00
Pxl
3ba7c2336b [Chore](build) change CMAKE_CXX_STANDARD from 17 to 20 #19987 2023-05-24 16:16:42 +08:00
e5eed53b89 [improvement](bitmap) Use shared_ptr in BitmapValue to avoid deep copying (#19101)
Currently, BitmapValue is deep-copied when it moves between columns, which costs a lot of memory. Use a shared_ptr inside BitmapValue to avoid copying the data.
2023-05-24 16:13:01 +08:00
c730033595 [improvement](exchange) data stream sender stop sending data to receiver if it returns eos early (#19847)
For a broadcast join, only one build fragment instance builds the hash table; the other fragment instances just receive the build-side data and throw it away, which wastes memory and CPU.

This PR improves the situation: the data stream receiver tells the sender that it does not need any more data, and the sender stops sending data to it.
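
The early-EOS behaviour described above can be illustrated with a small sketch. It is not the Doris implementation; the class and method names (`Receiver`, `Sender`, `add_block`, `broadcast`) are hypothetical and only model the idea that a receiver's reply can tell the sender to stop.

```python
class Receiver:
    """Hypothetical receiver; needs_data=False models a non-building instance."""

    def __init__(self, needs_data):
        self.needs_data = needs_data
        self.received = []

    def add_block(self, block):
        # Return True to signal "EOS, stop sending to me".
        if not self.needs_data:
            return True
        self.received.append(block)
        return False


class Sender:
    """Hypothetical broadcast sender that skips receivers that returned EOS."""

    def __init__(self, receivers):
        self.receivers = list(receivers)
        self.finished = set()  # indices of receivers that asked us to stop
        self.sent = 0          # number of blocks actually transmitted

    def broadcast(self, block):
        for i, receiver in enumerate(self.receivers):
            if i in self.finished:
                continue  # this receiver already returned EOS
            self.sent += 1
            if receiver.add_block(block):
                self.finished.add(i)


builder = Receiver(needs_data=True)
idle = Receiver(needs_data=False)
sender = Sender([builder, idle])
for block in ["b0", "b1", "b2"]:
    sender.broadcast(block)
```

In this sketch the idle receiver is contacted once, after which the sender only transmits to the instance that builds the hash table.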
2023-05-24 15:11:32 +08:00
14b4c7abf9 [fix](hashtable) Check query cancel status during build hash table #19970
Check the cancel status during the hash table build stage and stop the build if the query has been cancelled.
2023-05-24 14:24:03 +08:00
cf7a74f6ec [fix](memory) query check cancel while waiting for memory in Allocator, and optimize log (#19967)
When a query finds in the Allocator that process memory exceeds the limit, it waits up to 5s for memory.
Previously, the Allocator did not check whether the query had been cancelled while waiting for memory, so a cancelled query could not end quickly.
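
A minimal sketch of the waiting loop described above, assuming a polling design. The function name, the callbacks, and the 0.1s poll interval are hypothetical; only the 5s upper bound comes from the commit message.

```python
import time


def wait_for_memory(has_memory, is_cancelled, max_wait_s=5.0, poll_s=0.1,
                    sleep=time.sleep):
    """Wait until memory is available, the query is cancelled, or time runs out.

    has_memory and is_cancelled are hypothetical zero-argument callbacks.
    Returns "ok", "cancelled", or "timeout".
    """
    waited = 0.0
    while waited < max_wait_s:
        # Checking cancellation on every poll lets a cancelled query end early.
        if is_cancelled():
            return "cancelled"
        if has_memory():
            return "ok"
        sleep(poll_s)
        waited += poll_s
    return "timeout"


# Usage: the query is cancelled on the third poll, long before the 5s limit.
polls = {"n": 0}


def cancelled_on_third_poll():
    polls["n"] += 1
    return polls["n"] >= 3


result = wait_for_memory(lambda: False, cancelled_on_third_poll,
                         sleep=lambda _: None)
```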
2023-05-24 11:08:48 +08:00
08ec5e2eb5 [fix](function) fix result column is nullable type when fast execute (#19889) 2023-05-24 10:27:50 +08:00
5247e2866f [fix](status) fix function code_as_string to private (#19960)
* rebase

* update format

---------

Co-authored-by: ZI-MA <chime316@qq.com>
2023-05-23 23:27:55 +08:00
e3929820d9 [performance](load) use vector instead of skiplist when insert agg keys (#19099) 2023-05-23 20:11:50 +08:00
a434a49f71 [Bug](decimal) fix mod function (#19925)
Bug:
select id, kdcml * ktint, kdcml / ktint, kdcml % ktint from expr_test order by id;
+------+-------------------+-------------------+-----------------------+
| id | kdcml * ktint | kdcml / ktint | kdcml % ktint |
+------+-------------------+-------------------+-----------------------+
| NULL | NULL | NULL | NULL |
| 1 | 24.395 | 24.395 | -4702111234474983.74 |
| 2 | 68.968 | 17.242 | -4702111234474983.74 |
| 3 | 146.268 | 16.252 | -4702111234474983.74 |
| 4 | 275.772 | 17.235 | -4702111234474983.74 |
| 5 | 487.470 | 19.498 | -4702111234474983.74 |
| 6 | 827.244 | 22.979 | -4702111234474983.74 |
| 7 | 1364.860 | 27.854 | -4702111234474983.74 |
| 8 | 2205.928 | 34.467 | -4702111234474983.74 |
| 9 | 3509.595 | 43.328 | -4702111234474983.74 |
| 10 | 5514.790 | 55.147 | -4702111234474983.74 |
| 11 | 8578.988 | 70.900 | -4702111234474983.74 |
| 12 | 13235.484 | 91.913 | -4702111234474983.74 |
| 13 | 24.395 | 24.395 | -4702111234474983.74 |
| 14 | 68.968 | 17.242 | -4702111234474983.74 |
| 15 | 146.268 | 16.252 | -4702111234474983.74 |
| 16 | 275.772 | 17.235 | -4702111234474983.74 |
| 17 | 487.470 | 19.498 | -4702111234474983.74 |
| 18 | 827.244 | 22.979 | -4702111234474983.74 |
| 19 | 1364.860 | 27.854 | -4702111234474983.74 |
| 20 | 2205.928 | 34.467 | -4702111234474983.74 |
| 21 | 3509.595 | 43.328 | -4702111234474983.74 |
| 22 | 5514.790 | 55.147 | -4702111234474983.74 |
| 23 | 8578.988 | 70.900 | -4702111234474983.74 |
| 24 | 13235.484 | 91.913 | -4702111234474983.74 |
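
For comparison, the remainders one would expect for the first two rows can be checked with Python's decimal module. The input values are not shown in the commit message; they are inferred here from the product and quotient columns (24.395 with 1, and 34.484 with 2), so treat them as assumptions. Returning None for a zero divisor is likewise an assumption about SQL-style behaviour.

```python
from decimal import Decimal


def decimal_mod(a, b):
    """Remainder of a / b; None stands in for SQL NULL (assumed semantics)."""
    if a is None or b is None or b == 0:
        return None
    # Decimal's % keeps the sign of the dividend and is exact for these inputs.
    return a % b


# Inputs inferred from the table above, not taken from the test data itself.
row1 = decimal_mod(Decimal("24.395"), Decimal(1))
row2 = decimal_mod(Decimal("34.484"), Decimal(2))
```

Under these assumptions the remainders are small positive fractions, nowhere near the repeated -4702111234474983.74 in the buggy output.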
2023-05-23 18:24:31 +08:00
6efe6ef6e8 [Enhancement](scanner) allocate blocks in scanner_context on demand and free them on close (#19389)
First, to reduce memory usage, blocks are no longer pre-allocated; instead, a block is allocated lazily when the upper layer calls get_free_block. When the upper layer calls return_free_block to hand a block back, the block is added to a queue for reuse, and the blocks in the queue are freed when the scanner_context is closed rather than when it is destructed.
Second, to limit the memory usage of the scanner, a variable _free_blocks_capacity is introduced to indicate the current number of free blocks available to the scanners. The number of scanners that can be scheduled is calculated from this value.
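
The two points above can be sketched roughly as follows. This is an illustration, not the Doris code: the class name and the `schedulable_scanners` formula (capacity divided by a per-scanner block count) are hypothetical, since the commit message does not give the actual calculation.

```python
from collections import deque


class ScannerContextSketch:
    """Hypothetical model of lazy block allocation with a reuse queue."""

    def __init__(self, free_blocks_capacity):
        self.free_blocks = deque()          # returned blocks kept for reuse
        self.free_blocks_capacity = free_blocks_capacity
        self.allocated = 0                  # blocks created so far

    def get_free_block(self):
        # Reuse a returned block if one is queued; otherwise allocate lazily.
        if self.free_blocks:
            return self.free_blocks.popleft()
        self.allocated += 1
        return []

    def return_free_block(self, block):
        block.clear()
        self.free_blocks.append(block)

    def schedulable_scanners(self, blocks_per_scanner):
        # Assumed formula, for illustration only.
        return self.free_blocks_capacity // blocks_per_scanner

    def close(self):
        # Free queued blocks at close time rather than at destruction.
        self.free_blocks.clear()


ctx = ScannerContextSketch(free_blocks_capacity=8)
first = ctx.get_free_block()
ctx.return_free_block(first)
second = ctx.get_free_block()   # reuses the block that was just returned
ctx.return_free_block(second)
ctx.close()
```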

ssb flat test
previous
lineorder 1.2G:
load time: 3s, query time: 0.355s
lineorder 5.8G:
load time: 330s, query time: 0.970s
load time: 349s, query time: 0.949s
load time: 349s, query time: 0.955s
load time: 360s, query time: 0.889s (pipeline enabled)
after
lineorder 1.2G:
load time: 3s, query time: 0.349s
lineorder 5.8G:
load time: 342s, query time: 0.929s
load time: 337s, query time: 0.913s
load time: 345s, query time: 0.946s
load time: 346s, query time: 0.865s (pipeline enabled)
2023-05-23 18:17:21 +08:00
c0ad588801 [enhancement](page cache) use separate pk index cache (#19864) 2023-05-23 14:02:12 +08:00
da66a64e09 [fix](merge-on-write) return error st if check_pk_in_pre_segments failed (#19736) 2023-05-23 11:04:15 +08:00
fe111207a9 [Fix](lazy_open) Fix lazy open null point (#19829) 2023-05-23 09:17:46 +08:00
30417e06d4 [enhance](fs) use bvar to monitor s3 file reader& writer (#19607)
remove useless prefix
2023-05-22 23:01:21 +08:00
3dcdadcea6 [Improvement](function) support decimalv3 for function least and greatest (#19931) 2023-05-22 22:48:44 +08:00
53ba46e404 [Fix][Refactor] Fix 'not member call on null pointer of type 'doris::TextConverter' error in ubsan env and refactor text converter. (#19849)
Fix 'not member call on null pointer of type doris::TextConverter' error in ubsan env and refactor text converter.
2023-05-22 21:00:19 +08:00
6762af3c9b [Improve](struct)improve struct support into outfile (#19894)
support select into outfile for struct type
2023-05-22 18:45:56 +08:00
Pxl
9945067e3c [Bug](function) make VcompoundPred optimization work well (#19870)
make VcompoundPred optimization work well
PR #19818 tried to enable the VcompoundPred optimization but produced a wrong result on tpcds q28.
The reason is that some nullable logic in MySQL needs special handling:

mysql [regression_test_tpcds_sf1_p1]>select null and false;
+----------------+
| NULL AND FALSE |
+----------------+
|              0 |
+----------------+
1 row in set (0.00 sec)

mysql [regression_test_tpcds_sf1_p1]>select null and true;
+---------------+
| NULL AND TRUE |
+---------------+
| NULL          |
+---------------+
1 row in set (0.00 sec)

mysql [regression_test_tpcds_sf1_p1]>select null or false;
+---------------+
| NULL OR FALSE |
+---------------+
| NULL          |
+---------------+
1 row in set (0.00 sec)

mysql [regression_test_tpcds_sf1_p1]>select null or true;
+--------------+
| NULL OR TRUE |
+--------------+
|            1 |
+--------------+
1 row in set (0.00 sec)
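
The four results above follow SQL three-valued logic. A minimal sketch, with None standing in for NULL; the helper names `and3` and `or3` are hypothetical and unrelated to the VcompoundPred code.

```python
def and3(a, b):
    """Three-valued AND: False dominates, then NULL (None), else True."""
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True


def or3(a, b):
    """Three-valued OR: True dominates, then NULL (None), else False."""
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False
```

A short-circuit optimization therefore cannot treat NULL like FALSE: `and3(None, False)` is False, but `and3(None, True)` stays NULL.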
2023-05-22 18:32:17 +08:00
Pxl
d64be9565d [Bug](function) fix function in get wrong result when input const column (#19791)
fix function in get wrong result when input const column
2023-05-22 10:58:29 +08:00
f57b6adba9 [improvement](libhdfs) Use keytab and principal to login kerberos (#19841)
Use keytab and principal to log in to Kerberos.
Users no longer need to run kinit manually.
2023-05-22 10:54:21 +08:00
a6bd014b8a [FIX](serde)pb ut is not stable #19907 2023-05-22 09:03:02 +08:00
1c950d6930 [fix](config) fix memory config enable_query_memroy_overcommit spell problem #19898 2023-05-22 00:32:20 +08:00
76c358b3e3 [revert](memory) revert page no use Allocator && default disable ChunkAllocator (#19905)
The default chunk allocator reserve is 0. With that setting, enabling the chunk allocator is meaningless and only wastes memory.
2023-05-21 22:16:41 +08:00
33fd965b5c [feature-wip](resouce-group) Supports memory soft isolation of resource group (#19802)
create resource groups name properties(
    'enable_memory_overcommit' = 'true' // whether to enable memory soft isolation
)
2023-05-21 19:33:57 +08:00
512806f902 [fix](ubsan) UBSAN avoid thread local switch 2023-05-20 07:14:43 +08:00
47509e65c9 [fix](memory)page no use Allocator, avoid ckbench oom (#19877) 2023-05-19 21:23:31 +08:00
5547bbbaef [decimalv3](function) support function width_bucket (#19806) 2023-05-19 20:28:59 +08:00
abde8bf26a [chore](build) Fix the compilation errors on macOS (arm64) (#19859)
Some errors are raised when building the codebase on macOS (arm64).
2023-05-19 18:50:47 +08:00
65807f888b [fix](memory) Remind log if vm/overcommit_memory=2 when be start (#19795)
The value of vm/overcommit_memory is expected to be 1. With that setting the system no longer throws bad_alloc and memory allocations are always accepted,
so the memory limit check is handed over to the Doris Allocator, which keeps the position where exceptions are thrown controllable.
Otherwise bad_alloc can be thrown anywhere, and exception safety is difficult to achieve.
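
A rough sketch of such a startup check, assuming a Linux host where the setting is exposed at /proc/sys/vm/overcommit_memory. The function names and the warning text are hypothetical and are not the log message Doris prints.

```python
def read_overcommit(path="/proc/sys/vm/overcommit_memory"):
    """Read the raw setting; the path is the standard Linux procfs location."""
    with open(path) as f:
        return f.read()


def overcommit_warning(raw_value):
    """Return a reminder string when the value is 2, else None."""
    value = int(raw_value.strip())
    if value == 2:
        # Hypothetical wording, for illustration only.
        return ("vm/overcommit_memory=2: allocations may fail with bad_alloc "
                "anywhere; a value of 1 is expected")
    return None


# Usage with literal values, so the example does not depend on the host.
warning_for_2 = overcommit_warning("2\n")
warning_for_1 = overcommit_warning("1\n")
```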
2023-05-19 15:01:08 +08:00
7d1844d380 [FIX](Array)fix be master comapitible with fe1.2 #19850
If BE is upgraded first while FE is still 1.2, the array contains_nulls field is not set in thrift, which causes a core dump in BE.
2023-05-19 14:10:29 +08:00
c4900eb658 [Bug](DecimalV3) fix decimalv3 functions (#19801) 2023-05-19 14:10:01 +08:00
609b20bd02 [Feature](planner) use partial update in update from & delete from (#19262) 2023-05-19 09:46:29 +08:00
3e010bbee7 [improvement](profile) add profile counter 'BytesSent' for VDataBufferSender (#19826) 2023-05-19 08:46:50 +08:00
1d01136b1b [Fix](parquet-reader) Fix partition field conjuncts not work. (#19837)
Fix partition field conjuncts not work.
Add the predicate_partition_columns in _slot_id_to_filter_conjuncts (single-slot conjuncts) to _filter_conjuncts; the others should already have been added from not_single_slot_filter_conjuncts.
2023-05-19 08:44:02 +08:00
f32deb18e9 [Update](build) change clucene from thirdparty to git module (#19352) 2023-05-19 08:25:51 +08:00
3d6a13605d [improvement](stacktrace) do not captute stack trace for txn error codes (#19817) 2023-05-18 23:58:56 +08:00
481e9aebdb [Refactor](spark load) remove parquet scanner (#19251) 2023-05-18 19:19:13 +08:00
ef0657c072 [Bug](pipeline) RegressionTest failed release resouce cause DCHECK failed (#19783)
RegressionTest failed release resouce cause DCHECK failed
2023-05-18 18:57:25 +08:00
e242d7dfcc [refactor-WIP](TaskWorkerPool) add DropTableTaskPool for DROP_TABLE task (#19793) 2023-05-18 18:25:13 +08:00
07bbf741fb [enhence](memory) gc inverted index cache when there is not enough memory (#19622)
Support GC of the inverted index cache when there is not enough memory.
Previous problem: the inverted index cache (InvertedIndexSearcherCache and InvertedIndexQueryCache) may use 20% of memory, which cannot be released.
2023-05-18 16:41:51 +08:00
fd4fa5c64e [Optimize](row store) optimize serialization and deserialization (#19691)
1. Get the DataTypeSerde in advance to avoid creating a temporary DataTypeSerde for each column during iteration.
2. Iterating the original row once is enough for deserialization, thanks to a map that records the index of each column's unique id.
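
The second point can be sketched as follows, assuming a stored row is a sequence of (unique id, value) pairs. The function names and the row representation are hypothetical; the sketch only shows how a precomputed map makes a single pass over the row sufficient.

```python
def build_uid_index(schema_uids):
    """Map each column's unique id to its position in the output schema."""
    return {uid: idx for idx, uid in enumerate(schema_uids)}


def deserialize_row(row_pairs, uid_to_index, num_columns):
    """Fill the output columns in one pass over the stored row."""
    out = [None] * num_columns
    for uid, value in row_pairs:
        idx = uid_to_index.get(uid)
        if idx is not None:      # ignore ids that are not in the schema
            out[idx] = value
    return out


uid_to_index = build_uid_index([10, 20, 30])
row = [(30, "c"), (10, "a"), (99, "x")]
columns = deserialize_row(row, uid_to_index, num_columns=3)
```

Without the map, each output column would need its own search through the row; with it, the cost is one dictionary lookup per stored field.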
2023-05-18 16:22:38 +08:00
294599ee45 [feature](jsonb) rename JSONB type name and function name to JSON (#19774)
To be more compatible with MySQL, rename JSONB type name and function name to JSON.

The old JSONB type name and jsonb_xx function can still be used for backward compatibility.

The function jsonb_extract remains because json_extract is already used by the JSON string functions, and changing it needs more work. It will be changed later.
2023-05-18 16:16:52 +08:00
068a32bc49 [Improvement](memory) faststring use Allocator #19762
Provided an outer caller catches the exception, faststring resize, reserve, and build may throw a memory allocation failure exception from the Allocator.

Currently, page body compression catches the memory allocation failure exception.
2023-05-18 15:00:49 +08:00
7c8b7878cd [fix](memory) Print all query/load memory before memory GC when memory_debug=true (#19720) 2023-05-18 14:55:47 +08:00
303bee6fa3 [Fix](single replica load) add inverted index copy for single replica load (#19663)
* [Fix](single replica load) add inverted index copy for single replica load
2023-05-18 14:13:41 +08:00
851886cc18 [minor](datev2) remove datev2 because datev2 is used by default (#19777) 2023-05-18 13:36:11 +08:00
943e5fb7e5 [improvement](MOW) use seperated cache for mow pk cache (#19686)
In MOW, the primary key cache has a big impact on load performance, so we add a new cache type to separate
it from the page cache and make it more flexible in some cases.
2023-05-18 13:27:09 +08:00
62458ed0f4 [enhancement](compaction) not core when init failed (#19754) 2023-05-18 12:06:22 +08:00
6a5b590873 [refactor-WIP](TaskWorkerPool) add CreateTableTaskPool class for CREATE_TABLE task (#19734) 2023-05-18 11:43:09 +08:00