Commit Graph

4999 Commits

Author SHA1 Message Date
Pxl
f87fad97e1 [Bug](storage) add lock on base tablet when create_tablet #21915 2023-07-19 00:47:19 +08:00
d2b199955a [bugfix](deserialize) pack struct to avoid parsing wrong content for file header (#21907)
Recently we hit a strange bug while running the restore P2 case, where the log reported file length is not match. file=/mnt/hdd01/master/NO_AVX2/doris.HDD/snapshot/20230713122303.26.72000/45832/536215111/45832.hdr, file_length=, real_file_length=0. After checking the file on the remote storage, we suspected that deserializing the local file header caused the mismatch.
We then compared the struct's memory layout against the content of the hdr file and confirmed that the mismatched layout made the header fields be read from the wrong offsets, so the struct is now packed (a minimal illustration follows below).
2023-07-18 22:32:41 +08:00
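A minimal sketch of the effect described above, with made-up field names rather than the real Doris header struct: under default alignment the compiler pads the struct, so a header written with a packed layout is read back from the wrong offsets unless the reading struct is packed too.

#include <cstdint>
#include <cstdio>

// Hypothetical on-disk header written with a packed, 12-byte layout.
#pragma pack(push, 1)
struct PackedHeader {
    uint32_t magic;       // offset 0
    uint64_t file_length; // offset 4 when packed
};
#pragma pack(pop)

// The same fields without packing: 4 bytes of padding are inserted after
// `magic`, so `file_length` starts at offset 8 instead of 4.
struct PaddedHeader {
    uint32_t magic;
    uint64_t file_length;
};

int main() {
    // If the file holds the 12-byte packed layout but is memcpy'd into the
    // 16-byte padded struct, `file_length` is read from the wrong offset and
    // comes back as garbage -- matching the "real_file_length=0" symptom.
    std::printf("packed size = %zu, padded size = %zu\n",
                sizeof(PackedHeader), sizeof(PaddedHeader));
    return 0;
}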
a9ea138caf [fix](two level hash table) fix dead loop when converting to two level hash table for zero value (#21899)
When the two-level hash table is enabled and the existing one-level hash table contains a zero value, converting to the two-level hash table falls into a dead loop, because the PartitionedHashTable::_is_partitioned flag is not set correctly during the conversion (a toy sketch of the flag's role follows below).
2023-07-18 19:50:30 +08:00
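A toy sketch of the flag's role during conversion; the class, threshold, and partition count below are illustrative and not the Doris PartitionedHashTable code.

#include <array>
#include <cstdio>
#include <unordered_map>

// Illustrative stand-in for a hash table that converts from a one-level to a
// two-level (partitioned) layout once it grows past a small threshold.
struct ToyPartitionedTable {
    bool is_partitioned = false;
    std::unordered_map<int, int> level_one;
    std::array<std::unordered_map<int, int>, 16> partitions;

    void insert(int key, int value) {
        if (!is_partitioned && level_one.size() >= 4) {
            convert_to_two_level();
        }
        if (is_partitioned) {
            partitions[static_cast<unsigned>(key) % partitions.size()][key] = value;
        } else {
            level_one[key] = value;
        }
    }

    void convert_to_two_level() {
        // Setting the flag *before* re-inserting the existing entries is what
        // keeps insert() from deciding, again and again, that it still needs
        // to convert -- the dead loop described in the commit message.
        is_partitioned = true;
        for (const auto& [k, v] : level_one) {
            insert(k, v);
        }
        level_one.clear();
    }
};

int main() {
    ToyPartitionedTable t;
    for (int k = 0; k <= 8; ++k) {
        t.insert(k, k * 10); // key 0 exercises the "zero value" path
    }
    std::printf("partitioned=%d\n", t.is_partitioned);
    return 0;
}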
c6063ed92f [Revert](lazy open) revert lazy open and add case (#21821) 2023-07-18 19:41:33 +08:00
2013dcd0e9 [refactor](load) cleanup segment flush logic in beta rowset writer (#21635) 2023-07-18 18:17:57 +08:00
c36d225a27 [feature](profile) add process hashtable time in join node (#21878)
add process hashtable time in join node
2023-07-18 18:09:42 +08:00
Pxl
3089e4b3b6 [Bug](execution) fix query failure when ScannerContext is done (#21923)
fix query failure when ScannerContext is done
2023-07-18 17:58:00 +08:00
Pxl
19492b06c1 [Bug](decimalv3) fix failure on test_dup_tab_decimalv3 due to wrong precision (#21890)
fix failure on test_dup_tab_decimalv3 due to wrong precision
2023-07-18 12:53:09 +08:00
Pxl
b3d3ffa2de [Bug](pipeline) adjust scanner scheduler submit() and the maintenance of _num_scheduling_ctx (#21843)
adjust scanner scheduler submit() and the maintenance of _num_scheduling_ctx
2023-07-18 11:55:21 +08:00
726e0d5ebf [fix](load) fix dead loop in _handle_mem_exceed_limit function when reduce memory for load (#21886) 2023-07-18 09:49:36 +08:00
37ca133845 [feature](profile) add monotonic timer for pipeline task #21791
Add monotonic timers for pipeline tasks (a minimal steady-clock sketch follows below):

WaitBfTime

Task:
1. BeginExecuteTime
2. EosTime
3. SrcPendingFinishOverTime
4. DstPendingFinishOverTime
5. TotalTime
6. ClosePipelineTime
2023-07-18 07:57:14 +08:00
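A small, generic illustration of what "monotonic" buys here (not the Doris timer implementation): a stopwatch built on std::chrono::steady_clock never goes backwards when the wall clock is adjusted, so it is safe for measuring task phases like the ones listed above.

#include <chrono>
#include <cstdio>
#include <thread>

// A tiny monotonic stopwatch, illustrative only.
class MonotonicTimer {
public:
    void start() { _begin = std::chrono::steady_clock::now(); }

    long long elapsed_ns() const {
        return std::chrono::duration_cast<std::chrono::nanoseconds>(
                       std::chrono::steady_clock::now() - _begin)
                .count();
    }

private:
    std::chrono::steady_clock::time_point _begin;
};

int main() {
    MonotonicTimer timer;
    timer.start();
    std::this_thread::sleep_for(std::chrono::milliseconds(5)); // stand-in for task execution
    std::printf("elapsed = %lld ns\n", timer.elapsed_ns());
    return 0;
}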
12784f863d [fix](Export) Fix a core dump when exporting large amounts of data (#21761)
A heap-buffer-overflow occurs when exporting large amounts of data in ORC format.
Reserving 50 bytes of headroom in the buffer avoids the problem (the pattern is sketched below).
2023-07-18 00:06:38 +08:00
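A schematic sketch of the "keep a safety margin free" pattern; the buffer class and sizes are illustrative, not the actual ORC export code.

#include <cstdio>
#include <cstring>
#include <memory>

// Illustrative fixed-capacity buffer: before each write it checks that the
// requested bytes *plus a small reserved margin* still fit, so a writer that
// appends slightly more than estimated cannot run past the allocation.
class MarginBuffer {
public:
    MarginBuffer(size_t capacity, size_t margin)
            : _capacity(capacity), _margin(margin), _data(new char[capacity]) {}

    bool append(const char* src, size_t len) {
        if (_size + len + _margin > _capacity) {
            return false; // caller must flush or grow instead of overflowing
        }
        std::memcpy(_data.get() + _size, src, len);
        _size += len;
        return true;
    }

    size_t size() const { return _size; }

private:
    size_t _capacity;
    size_t _margin;
    size_t _size = 0;
    std::unique_ptr<char[]> _data;
};

int main() {
    MarginBuffer buf(/*capacity=*/4096, /*margin=*/50);
    const char chunk[64] = {};
    while (buf.append(chunk, sizeof(chunk))) {
    }
    std::printf("stopped at %zu bytes, margin kept free\n", buf.size());
    return 0;
}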
29146c680e [refactor](profile)add node id info in pipeline profile (#21823) 2023-07-17 15:24:02 +08:00
c98d2bf0d3 Fix incorrect priority queue refactor (#21857) 2023-07-17 14:56:24 +08:00
5fc0a84735 [improvement](catalog) reduce the size of thrift params for external table query (#21771)
### 1
In the previous implementation, each FileSplit produced a `TFileScanRange`, and each `TFileScanRange`
contained a list of `TFileRangeDesc` plus a `TFileScanRangeParams`.
So with thousands of FileSplits there were thousands of `TFileScanRange`s, which made the thrift
data sent to the BE too large, resulting in:

1. the RPC that sends the fragment may fail due to timeout
2. the FE may OOM

For a given query request, the `TFileScanRangeParams` is the common part and is the same across all `TFileScanRange`s,
so I moved it to the `TExecPlanFragmentParams`.
After that, each FileSplit only carries a list of `TFileRangeDesc` (a simplified sketch follows below).

In my test, querying a hive table with 100000 partitions, the size of the thrift data dropped from 151MB to 15MB,
and the above 2 issues are gone.

### 2
When `max_external_file_meta_cache_num` is set <= 0, the file meta cache for the parquet footer will
not be used.
I found that for some wide tables the footer is too large (1MB after compaction, and much more after being
deserialized to thrift), so it consumes too much BE memory when there are many files.

This will be optimized later; here I just add support for disabling this cache.
2023-07-17 13:37:02 +08:00
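A much-simplified sketch of the payload argument in part 1, using plain C++ structs as stand-ins for the Thrift definitions (field names are invented, not the real TFileScanRange/TExecPlanFragmentParams): hoisting the shared params out of every range means they are serialized once per fragment instead of once per split.

#include <cstdio>
#include <string>
#include <vector>

// Simplified stand-ins for the Thrift structs mentioned above.
struct FileRangeDesc {               // small, one per split
    std::string path;
    long start_offset = 0;
    long size = 0;
};

struct FileScanRangeParams {         // large, identical for every split in a query
    std::vector<std::string> required_slots;
    std::string format_properties;   // pretend this is big (schema, serde props, ...)
};

// Before: every scan range carried its own copy of the params.
struct ScanRangeBefore {
    FileScanRangeParams params;
    std::vector<FileRangeDesc> ranges;
};

// After: the params live once on the fragment; ranges only carry descriptors.
struct ExecPlanFragmentAfter {
    FileScanRangeParams common_params;
    std::vector<FileRangeDesc> ranges;
};

int main() {
    const int num_splits = 5;
    std::vector<ScanRangeBefore> before(num_splits);  // num_splits copies of the params
    ExecPlanFragmentAfter after;                      // exactly one copy of the params
    after.ranges.resize(num_splits);
    // With N splits the "before" layout serializes N param copies, the "after"
    // layout exactly one -- the source of the 151MB -> 15MB drop reported above.
    std::printf("param copies: before=%zu, after=1 (ranges=%zu)\n",
                before.size(), after.ranges.size());
    return 0;
}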
03b575842d [Feature](table function) support explode_json_array_json (#21795) 2023-07-17 11:40:02 +08:00
c409fa0f58 [Feature](Compaction)Support full compaction (#21177) 2023-07-16 13:21:15 +08:00
a7eb186801 [Bug](CSVReader) fix null pointer coredump in CSVReader in p2 (#20811) 2023-07-15 22:50:10 +08:00
d2ff68ac2d [fix](unique-key) fix query results show duplicate key for unique key table after upgrading (#21814) 2023-07-15 17:17:41 +08:00
7f50c07219 [Opt](exec) opt the outer join performance in TPCDS Q95 (#21806) 2023-07-14 18:42:08 +08:00
7a6ae12ebb [improve](bloomfilter) refactor runtime_filter_mgr with bloomfilter and fix bug in change_to_bloom_filter (#21783) 2023-07-14 17:47:32 +08:00
b013f8006d [enhancement](multi-table) enable multi table routine load on pipeline engine (#21729) 2023-07-14 12:16:32 +08:00
c07e2ada43 [improve](udaf) refactor java-udaf executor by using a for loop (#21713)
refactor java-udaf executor by using a for loop
2023-07-14 11:37:19 +08:00
ebe771d240 [refactor](executor) remove unused variable 2023-07-14 10:35:59 +08:00
ca6e33ec0c [feature](table-value-functions) add catalogs table-value-function (#21790)
mysql> select * from catalogs() order by CatalogId;
2023-07-14 10:25:16 +08:00
cbddff0694 [FIX](map) fix map key-column nullable for arrow serde #21762
Arrow does not support null elements in a map's key column, but the Doris map key column is nullable by default, so we need to handle it: if a Doris map row has a null key element, we put null into arrow for that row (a small sketch of the rule follows below).
2023-07-14 00:30:07 +08:00
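A small sketch of one plausible reading of that rule, with stand-in types rather than the Doris/Arrow column classes: if any key in a map row is null, the whole row is written as null on the Arrow side.

#include <cstdio>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Illustrative stand-ins: a Doris map row whose keys may contain nulls,
// converted for a format that forbids null map keys.
using NullableKey = std::optional<std::string>;
using MapRow = std::vector<std::pair<NullableKey, int>>;

// Returns std::nullopt to represent writing a null row to the Arrow builder
// whenever any key in the row is null, mirroring the rule described above.
std::optional<MapRow> convert_row_for_arrow(const MapRow& row) {
    for (const auto& [key, value] : row) {
        if (!key.has_value()) {
            return std::nullopt; // whole row becomes null on the Arrow side
        }
    }
    return row;
}

int main() {
    MapRow ok = {{std::string("a"), 1}, {std::string("b"), 2}};
    MapRow bad = {{std::nullopt, 3}};
    std::printf("ok row kept: %d, bad row nulled: %d\n",
                convert_row_for_arrow(ok).has_value(),
                !convert_row_for_arrow(bad).has_value());
    return 0;
}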
254f76f61d [Agg](exec) support aggregation_node limit short circuit (#21767) 2023-07-14 00:29:19 +08:00
6fd8f5cd2f [Fix](parquet-reader) Fix parquet string column min/max statistics issue which caused incorrect query results. (#21675)
In parquet, the legacy min and max statistics may not handle UTF8 correctly.
The current processing method uses the min_value and max_value statistics introduced by PARQUET-1025 when they are present.
If not, the statistics are temporarily ignored. A better way is to read the legacy min and max statistics only when they contain
only ASCII characters; I will improve this in a future PR (that policy is sketched below).
2023-07-14 00:09:41 +08:00
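A sketch of the statistics policy described above, including the mentioned ASCII fallback; the struct and helper names are illustrative, not the Doris parquet reader API.

#include <cstdio>
#include <optional>
#include <string>
#include <utility>

// Illustrative stand-in for a parquet column chunk's string statistics.
struct StringStats {
    std::optional<std::string> min_value;   // PARQUET-1025 fields (correctly ordered)
    std::optional<std::string> max_value;
    std::optional<std::string> legacy_min;  // old fields, unreliable for non-ASCII UTF-8
    std::optional<std::string> legacy_max;
};

bool is_pure_ascii(const std::string& s) {
    for (unsigned char c : s) {
        if (c > 0x7F) return false;
    }
    return true;
}

// Mirrors the policy above: trust min_value/max_value when present; otherwise
// only fall back to the legacy fields when they are pure ASCII (the "future
// improvement" mentioned in the commit), else skip the statistics entirely.
std::optional<std::pair<std::string, std::string>> usable_min_max(const StringStats& s) {
    if (s.min_value && s.max_value) {
        return std::make_pair(*s.min_value, *s.max_value);
    }
    if (s.legacy_min && s.legacy_max && is_pure_ascii(*s.legacy_min) && is_pure_ascii(*s.legacy_max)) {
        return std::make_pair(*s.legacy_min, *s.legacy_max);
    }
    return std::nullopt; // ignore the statistics rather than risk wrong pruning
}

int main() {
    StringStats only_legacy{std::nullopt, std::nullopt, std::string("abc"), std::string("xyz")};
    std::printf("legacy ASCII stats usable: %d\n", usable_min_max(only_legacy).has_value());
    return 0;
}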
fd6553b218 [Fix](MoW) Fix bug about calculating all committed rowsets' delete bitmaps when doing compaction (#21760) 2023-07-13 21:10:15 +08:00
2c83e5a538 [fix](merge-on-write) fix be core and delete unused pending publish info for async publish when tablet dropped (#21793) 2023-07-13 21:09:51 +08:00
35fa9496e7 [fix](merge-on-write) fix wrong result when query with prefix key predicate (#21770) 2023-07-13 19:56:00 +08:00
abc21f5d77 [bugfix](ngram bf index) process differently for normal bloom filter index and ngram bf index (#21310)
* process differently for normal bloom filter index and ngram bf index

* fix review comments for readability

* add test case

* add testcase for delete condition
2023-07-13 17:31:45 +08:00
e167394dc1 [Fix](pipeline) close sink when fragment context destructs (#21668)
Co-authored-by: airborne12 <airborne12@gmail.com>
2023-07-13 11:52:24 +08:00
9cad929e96 [Fix](rowset) When a rowset is cooled down, it is directly deleted. This can result in data query misses in the second phase of a two-phase query. (#21741)
* [Fix](rowset) When a rowset is cooled down, it is directly deleted. This can result in data query misses in the second phase of a two-phase query.

related pr #20732

There are two reasons for moving the logic of delayed deletion from the Tablet to the StorageEngine. The first reason is to consolidate the logic and unify the delayed operations. The second reason is that delayed garbage collection during queries can cause rowsets to remain in the "stale rowsets" state, preventing the timely deletion of rowset metadata, which may cause the rowset metadata to grow too large.

* not use unused rowsets
2023-07-13 11:46:12 +08:00
cf016f210d Revert "[imporve](bloomfilter) refactor runtime_filter_mgr with bloomfilter (#21715)" (#21763)
This reverts commit 925da90480f60afc0e5333a536d41e004234874e.
2023-07-13 10:44:20 +08:00
00c48f7d46 [opt](regression case) add more index change case (#21734) 2023-07-12 21:52:48 +08:00
7f133b7514 [fix](partial-update) transient rowset writer should not trigger segcompaction when build rowset (#21751) 2023-07-12 21:47:07 +08:00
be55cb8dfc [Improve](jsonb_extract) support jsonb_extract with multiple parse paths (#21555)
support jsonb_extract with multiple parse paths
2023-07-12 21:37:36 +08:00
da67d08bca [fix](compile) fix be compile error (#21765)
* [fix](compile) fix be compile error

* remove warning
2023-07-12 21:14:04 +08:00
3163841a3a [FIX](serde) Fix decimal for arrow serde (#21716) 2023-07-12 19:15:48 +08:00
f0d08da97c [enhancement](merge-on-write) split delete bitmap from tablet meta (#21456) 2023-07-12 19:13:36 +08:00
9d96e18614 [fix](multi-table-load) fix memory leak when processing multi-table routine load (#21611)
* use a raw pointer to prevent a reference loop

* add comments
2023-07-12 17:32:56 +08:00
d86c67863d Remove unused code (#21735) 2023-07-12 14:48:13 +08:00
ff42cd9b49 [feature](hive) add support for reading the array type in hive textfile-format tables (#21514) 2023-07-11 22:37:48 +08:00
925da90480 [improve](bloomfilter) refactor runtime_filter_mgr with bloomfilter (#21715)
Reduced the granularity of the lock; in the past, the entire map was locked (a sharded-lock sketch follows below).
Changed the map key from string to int: map(string) --> map(int).
The bloom filter does not need to call init_with_fixed_length.
2023-07-11 22:35:30 +08:00
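An illustrative sketch of the finer-grained locking idea (not the runtime_filter_mgr code): sharding the map so each shard has its own mutex, instead of locking the entire map for every access.

#include <array>
#include <cstdio>
#include <mutex>
#include <string>
#include <unordered_map>

// Illustrative filter registry: instead of one mutex guarding the whole map,
// the key space is split into shards that each carry their own mutex, so
// producers touching different filters no longer serialize on a single lock.
class ShardedRegistry {
    static constexpr size_t kShards = 16;

    struct Shard {
        std::mutex mu;
        std::unordered_map<int, std::string> filters;
    };
    std::array<Shard, kShards> _shards;

    Shard& shard_for(int filter_id) {
        return _shards[static_cast<size_t>(filter_id) % kShards];
    }

public:
    void put(int filter_id, std::string payload) {
        Shard& s = shard_for(filter_id);
        std::lock_guard<std::mutex> lock(s.mu); // only this shard is blocked
        s.filters[filter_id] = std::move(payload);
    }

    bool contains(int filter_id) {
        Shard& s = shard_for(filter_id);
        std::lock_guard<std::mutex> lock(s.mu);
        return s.filters.count(filter_id) > 0;
    }
};

int main() {
    ShardedRegistry registry;
    registry.put(7, "bloom filter bytes");
    std::printf("filter 7 registered: %d\n", registry.contains(7));
    return 0;
}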
4b30485d62 [improvement](memory) Refactor doris cache GC (#21522)
Abstract a CachePolicy, which controls the GC of all caches (a rough sketch of such an interface follows below, after the log excerpt).
Add a stale sweep to all LRU caches, including page caches, etc.
I0710 18:32:35.729460 2945318 mem_info.cpp:172] End Full GC Free, Memory 3866389992 Bytes. cost(us): 112165339, details: FullGC:
  FreeTopMemoryQuery:
     - CancelCostTime: 1m51s
     - CancelTasksNum: 1
     - FindCostTime: 0.000ns
     - FreedMemory: 2.93 GB
  WorkloadGroup:
  Cache name=DataPageCache:
     - CostTime: 15.283ms
     - FreedEntrys: 9.56K
     - FreedMemory: 691.97 MB
     - PruneAllNumber: 1
     - PruneStaleNumber: 1
2023-07-11 20:21:31 +08:00
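A rough sketch of what a common cache-policy abstraction can look like; the interface and numbers are illustrative, not the Doris CachePolicy classes.

#include <cstdio>
#include <memory>
#include <string>
#include <vector>

// Illustrative cache-policy interface: each cache type decides how to prune
// itself, and the memory GC only talks to the base class.
class CachePolicy {
public:
    virtual ~CachePolicy() = default;
    virtual std::string name() const = 0;
    virtual size_t prune_stale() = 0; // drop entries unused for a sweep window
    virtual size_t prune_all() = 0;   // drop everything under memory pressure
};

class ToyPageCache : public CachePolicy {
public:
    std::string name() const override { return "DataPageCache"; }
    size_t prune_stale() override { size_t n = _stale; _entries -= n; _stale = 0; return n; }
    size_t prune_all() override { size_t n = _entries; _entries = 0; _stale = 0; return n; }

private:
    size_t _entries = 9560;
    size_t _stale = 1200;
};

// A full GC walks every registered cache and clears it; a lighter sweep would
// call prune_stale() instead.
void full_gc(const std::vector<std::shared_ptr<CachePolicy>>& caches) {
    for (const auto& cache : caches) {
        std::printf("Cache name=%s: freed %zu entries\n", cache->name().c_str(), cache->prune_all());
    }
}

int main() {
    std::vector<std::shared_ptr<CachePolicy>> caches = {std::make_shared<ToyPageCache>()};
    full_gc(caches);
    return 0;
}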
da86d2ff65 [fix](mow) fix flush_single_block core in calc_segment_delete_bitmap (#21619) 2023-07-11 15:56:57 +08:00
d3317aa33b [Fix](executor) Fix scan entity core #21696
After the last call to scan_task.scan_func(), the task should be ended, which means the PipelineFragmentContext could be released.
Once the PipelineFragmentContext is released, visiting its fields such as query_ctx or _state may cause a core dump.
But this can only explain core 2.

void ScannerScheduler::_task_group_scanner_scan(ScannerScheduler* scheduler,
                                                taskgroup::ScanTaskTaskGroupQueue* scan_queue) {
    while (!_is_closed) {
        taskgroup::ScanTask scan_task;
        auto success = scan_queue->take(&scan_task);
        if (success) {
            int64_t time_spent = 0;
            {
                SCOPED_RAW_TIMER(&time_spent);
                scan_task.scan_func();
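                // After this call the task may be ended and the
                // PipelineFragmentContext released (see the description above);
                // touching query_ctx or _state past this point risks a use-after-free.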
            }
            scan_queue->update_statistics(scan_task, time_spent);
        }
    }
}
2023-07-11 15:56:13 +08:00
d0eb4d7da3 [Improve](hash-fun) improve nested hash with range #21699

1. When calculating an array hash, the element size does not need to be hashed for every element:
hash = HashUtil::zlib_crc_hash(reinterpret_cast<const char*>(&elem_size),
                                                   sizeof(elem_size), hash);
But we need to be careful with [[], [1]] vs [[1], []]: when an array nests another array and the nested array is empty, we should still mix its size into the hash seed to
keep the two distinct (a small sketch follows below).
2. Use a range for one hash value to avoid virtual function calls in the loop,
which doubles the performance. I measured it in a unit test:

column: array[int64]
50 rows, and each array has 100,000 elements
2023-07-11 14:40:40 +08:00
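A small sketch of the seeding point in item 1, using a stand-in hash combiner instead of zlib_crc_hash: mixing each nested array's size into the hash is what keeps [[], [1]] and [[1], []] distinct even though an empty nested array contributes no element hashes of its own.

#include <cstdint>
#include <cstdio>
#include <vector>

// Simple stand-in hash combiner (not zlib_crc_hash); the point is only where
// the nested-array size gets mixed in.
uint32_t mix(uint32_t hash, uint64_t value) {
    return hash * 31u + static_cast<uint32_t>(value ^ (value >> 32));
}

// Hash one row of an array-of-arrays column.
uint32_t hash_nested(const std::vector<std::vector<int64_t>>& row) {
    uint32_t hash = 0;
    for (const auto& nested : row) {
        hash = mix(hash, nested.size()); // seed with the size, even when empty
        for (int64_t elem : nested) {
            hash = mix(hash, static_cast<uint64_t>(elem));
        }
    }
    return hash;
}

int main() {
    std::vector<std::vector<int64_t>> a = {{}, {1}};
    std::vector<std::vector<int64_t>> b = {{1}, {}};
    std::printf("hash(a)=%u hash(b)=%u differ=%d\n", hash_nested(a), hash_nested(b),
                hash_nested(a) != hash_nested(b));
    return 0;
}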
Pxl
ca71048f7f [Chore](status) avoid empty error msg on status (#21454)
avoid empty error msg on status
2023-07-11 13:48:16 +08:00