789210bc38
[chore](format) Refactor BaseTablet _full_name by using fmt replacing stringstream ( #25400 )
...
Signed-off-by: Jack Drogon <jack.xsuperman@gmail.com>
2023-10-13 03:59:03 -05:00
ac8fbdd53c
[pipelineX](fix) Fix use-after-free in shuffling ( #25409 )
2023-10-13 16:57:34 +08:00
37dbda6209
[pipelineX](refactor) Use class template to simplify join ( #25369 )
2023-10-13 16:51:55 +08:00
f4e2eb6564
remove unused code and adjust clang-tidy checks ( #25405 )
...
remove unused code and adjust clang-tidy checks
2023-10-13 16:27:37 +08:00
2ec53ff60e
[fix](multi-table) fix single stream multi table load can not finish ( #25379 )
2023-10-13 15:47:16 +08:00
283bd59eba
[improvement](scanner) Remove the predicate that is always true for the segment ( #25366 )
...
By utilizing the zonemap index of the segment, we can ascertain if a predicate is always true. For example, if the segment’s maximum value is 100 and the predicate is col < 101, then this predicate is always true for this segment.
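The check described above can be sketched as follows. This is a minimal illustration, not the Doris implementation: `ZoneMap`, `is_always_true`, and `prune_predicates` are hypothetical names, the column is assumed to hold integers, and all predicates are assumed to refer to the same column.

```python
from dataclasses import dataclass


@dataclass
class ZoneMap:
    """Min/max summary of one column within one segment (hypothetical type)."""
    min_value: int
    max_value: int
    has_null: bool = False


def is_always_true(op: str, literal: int, zone: ZoneMap) -> bool:
    """Return True if `col <op> literal` holds for every row of the segment."""
    # A NULL never satisfies a comparison, so stay conservative.
    if zone.has_null:
        return False
    if op == "<":
        return zone.max_value < literal
    if op == "<=":
        return zone.max_value <= literal
    if op == ">":
        return zone.min_value > literal
    if op == ">=":
        return zone.min_value >= literal
    if op == "=":
        return zone.min_value == zone.max_value == literal
    return False  # unknown operator: keep the predicate


def prune_predicates(predicates, zone):
    """Drop the predicates that cannot filter out any row of this segment."""
    return [(op, lit) for op, lit in predicates if not is_always_true(op, lit, zone)]


# Segment with values in [0, 100]: `col < 101` is dropped, `col > 50` is kept.
remaining = prune_predicates([("<", 101), (">", 50)], ZoneMap(0, 100))
```

Skipping such predicates saves per-row evaluation without changing the result.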
2023-10-13 15:25:38 +08:00
9cc0e9526a
[enhancement](merge-on-write) consider version count on size-based cu compaction policy ( #25352 )
2023-10-13 14:52:21 +08:00
fc40788018
[enhancement](merge-on-write) refine tablet meta_lock usage and add some trace log ( #25124 )
2023-10-13 14:22:07 +08:00
6757d2f361
Revert "[Enhancement](show-backends-disks) Add show backends disks ( #24229 )" ( #25389 )
...
This reverts commit 21223e65c59c23cfcb9e8ab610ea321168bcb75a.
2023-10-13 14:08:45 +08:00
6f9a084d99
[Fix](Outfile) Use data_type_serde to export data to parquet file format ( #24998 )
2023-10-13 13:58:34 +08:00
26f50f4f0f
fix heap-use-after-free on map_agg ( #25380 )
...
fix heap-use-after-free on map_agg
2023-10-13 00:19:25 +08:00
1073ef22f3
[fix](insert) improve group_commit related tests ( #25319 )
2023-10-12 21:19:29 +08:00
21223e65c5
[Enhancement](show-backends-disks) Add show backends disks ( #24229 )
...
* Add a statement to query disk information for the data directories of a BE node
[mysql]->'show backends disks;'
+-----------+-------------+------------------------------+---------+----------+---------------+-------------+-------------------+---------+
| BackendId | Host        | RootPath                     | DirType | DiskState| TotalCapacity | UsedCapacity| AvailableCapacity | UsedPct |
+-----------+-------------+------------------------------+---------+----------+---------------+-------------+-------------------+---------+
| 10002     | 10.xx.xx.90 | /home/work/output/be/storage | STORAGE | ONLINE   | 7.049 TB      | 2.478 TB    | 4.571 TB          | 35.16 % |
| 10002     | 10.xx.xx.90 | /home/work/output/be         | DEPLOY  | ONLINE   | 7.049 TB      | 2.478 TB    | 4.571 TB          | 35.16 % |
| 10002     | 10.xx.xx.90 | /home/work/output/be/log     | LOG     | ONLINE   | 7.049 TB      | 2.478 TB    | 4.571 TB          | 35.16 % |
+-----------+-------------+------------------------------+---------+----------+---------------+-------------+-------------------+---------+
2023-10-12 20:24:45 +08:00
1a0344df16
[Improvement](hash) refactor of hash map context ( #24966 )
...
refactor of hash map context
2023-10-12 18:10:21 +08:00
be27d4d921
[fix](broker-load) fix use_count() issue when doing broker load in debug mode ( #25288 )
...
When executing broker load in ASAN mode, BE may crash with error:
```
F20231010 18:18:17.044978 185490 block.cpp:694] Check failed: d.column->use_count() == 1 (3 vs. 1)
*** Check failure stack trace: ***
@ 0x55e9d94c4e46 google::LogMessage::SendToLog()
@ 0x55e9d94c1410 google::LogMessage::Flush()
@ 0x55e9d94c5689 google::LogMessageFatal::~LogMessageFatal()
@ 0x55e9c509f80d doris::vectorized::Block::clear_column_data()
@ 0x55e9b6c170b3 doris::PlanFragmentExecutor::get_vectorized_internal()
@ 0x55e9b6c147e6 doris::PlanFragmentExecutor::open_vectorized_internal()
@ 0x55e9b6c12d9a doris::PlanFragmentExecutor::open()
@ 0x55e9b6c18426 doris::PlanFragmentExecutor::execute()
@ 0x55e9b6945cca doris::FragmentMgr::_exec_actual()
@ 0x55e9b696456c doris::FragmentMgr::exec_plan_fragment()::$_0::operator()()
```
It may happen when there is a column mapping like:
```
(k1,v2,v3,v4,v5,v6,v7,v8)
set (k2=v4,k3=v4,k4=v4)
```
in the load statement.
This case is covered by Baidu test cases.
2023-10-12 17:04:29 +08:00
013eafc1d7
[Enhancement](filter) support only min/max runtime filter in BE ( #25290 )
...
PR #25193 implemented the FE side of this.
For example: select count() from lineorder join supplier on lo_partkey < s_suppkey;
This query gets a max filter once the hash table is built, which can then be used to filter the probe table data.
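A rough sketch of the idea, under the assumption that the filter simply records the largest build-side key; `build_max_filter` and `apply_less_than_filter` are hypothetical helpers and do not mirror the BE code. For a join condition `probe < build`, a probe row can only match if its key is smaller than the maximum build key, so every other row can be discarded before the join runs.

```python
def build_max_filter(build_keys):
    """Return the largest build-side key, or None if the build side is empty."""
    return max(build_keys, default=None)


def apply_less_than_filter(probe_keys, max_build_key):
    """Keep the probe keys that can satisfy `probe < build` for some build row."""
    if max_build_key is None:
        return []  # empty build side: an inner join produces no rows
    return [key for key in probe_keys if key < max_build_key]


# s_suppkey is the build side, lo_partkey the probe side.
s_suppkey = [3, 7, 5]
lo_partkey = [1, 6, 7, 9]
max_filter = build_max_filter(s_suppkey)
survivors = apply_less_than_filter(lo_partkey, max_filter)
```

Rows with `lo_partkey` 7 and 9 are dropped early because no `s_suppkey` value is larger than them, so the join result is unchanged.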
2023-10-12 16:59:52 +08:00
e17f3b72dd
[fix](load) handle Status in beta rowset writer ( #25293 )
2023-10-12 16:58:53 +08:00
bdb64eab73
[feature](meta) queries as table valued function ( #25052 ) ( #25052 )
...
1. Add queries view as table function.
2. Proxy result to other FEs and return merged results back to BE.
Co-authored-by: yiguolei <676222867@qq.com>
2023-10-12 16:26:14 +08:00
22684dedff
[pipelineX](pick) pick PRs from pipeline ( #25340 )
2023-10-12 14:35:32 +08:00
2664d1cffb
[chore](vec) Make this copy constructor of StringRef explicit ( #25337 )
2023-10-12 14:12:46 +08:00
f14e4311c4
[Chore](check) add length check for BufferWritable ( #25322 )
...
add length check for BufferWritable
2023-10-12 10:51:50 +08:00
7447ac71b5
[minor](format) fix BE code format ( #25328 )
2023-10-12 10:34:36 +08:00
022762d5f0
[fix](memory) Fix work load group GC and add logs to locate slow GC #24975
...
Fix work load group GC, add cancel load, and add logs.
Unify the format of the GC logs and make them all lowercase, to avoid unnecessary trouble when using grep or less.
Add logs to help locate the cause of slow GC.
2023-10-12 10:33:56 +08:00
2014e16cfb
[fix](es catalog)fix es http timeout ( #25273 )
2023-10-12 10:21:55 +08:00
7ca63665b4
[fix](agg) garbled characters in result of map_agg ( #25318 )
2023-10-12 10:10:55 +08:00
58d96ecdbf
[Improve](status) avoid print too may stack log for DATA_QUALITY_ERROR code ( #25292 )
2023-10-12 09:58:51 +08:00
46ab4346ca
[Opt](parquet reader) Optimize the performance of reading decimal in parquet reader. ( #25012 )
...
Optimize the performance of reading decimal in parquet reader.
- Static dispatch `DecimalScaleParams`.
- Optimize `memcpy`, static dispatch copy size in fixed length cases.
- Use right shift bit operator to convert decimals.
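The commit message does not show the conversion itself. One plausible reading of the right-shift step, stated here as an assumption, is that a fixed-length big-endian two's-complement value is placed at the most significant end of a wider word and then shifted right, so that the arithmetic shift performs the sign extension. The helper name `decode_fixed_len_decimal` and the 16-byte default width are illustrative only.

```python
def decode_fixed_len_decimal(buf: bytes, width: int = 16) -> int:
    """Decode big-endian two's-complement bytes into an unscaled integer."""
    length = len(buf)
    if not 0 < length <= width:
        raise ValueError("unsupported decimal byte length")
    # Place the bytes at the most significant end of a `width`-byte word ...
    word = int.from_bytes(buf + b"\x00" * (width - length), "big", signed=True)
    # ... then shift right; the arithmetic shift sign-extends the value.
    return word >> ((width - length) * 8)


positive = decode_fixed_len_decimal(b"\x00\x7b")   # unscaled value 123
negative = decode_fixed_len_decimal(b"\xff\x85")   # unscaled value -123
```

The result is the unscaled integer; applying the decimal scale is a separate step not shown here.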
2023-10-12 09:53:08 +08:00
e41b03e530
[Fix](multi-catalog) delete hdfs hedged configs at BE side. ( #25094 )
...
Issue Number: close #25093
We can set the HDFS hedged read configs when creating a catalog, like this:
```
CREATE CATALOG `test_ctl` PROPERTIES (
...
"dfs.client.hedged.read.threadpool.size" = "128",
"dfs.client.hedged.read.threshold.millis" = "500",
...
);
```
Setting these configs on the BE side is redundant, and it causes the occasional bug reported in #25093.
2023-10-11 23:25:30 +08:00
73c3e3ab55
[Feature](x-load) support config min replica num for loading data ( #21118 )
2023-10-11 21:07:35 +08:00
ba87f7d3a3
[fix](pipelineX) add table sink and some fix in pipelineX ( #25314 )
2023-10-11 20:18:08 +08:00
e94fca4949
[enhancement](bvar) add metrics to monitor load throughput ( #25189 )
...
to monitor realtime load throughput of the BE
Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>
2023-10-11 19:56:33 +08:00
f960b8c989
[bugfix](stream receiver) be will core during stop because receiver is not closed ( #25298 )
...
Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-10-11 19:49:40 +08:00
c6b1c903e4
[fix](Regression-test) fix that the String type in a nested type should contain double quotes and add regression-test ( #25115 )
2023-10-11 18:30:26 +08:00
e514d52232
[fix](point-query) Support mow table with sequence column ( #25308 )
2023-10-11 18:22:16 +08:00
73f632a4e3
[fix](move-memtable) handle error in LoadStreamWriter::close ( #24805 )
2023-10-11 16:54:42 +08:00
3db33207d4
[pipelineX](fix) Fix nullable types for set operator ( #25294 )
2023-10-11 16:50:54 +08:00
bb670118f5
[coverage](test) Delete unused function to improve test coverage ( #25233 )
2023-10-11 11:50:51 +08:00
cdf5f0fe68
[fix](pipelineX) mark join column should be nullable ( #25275 )
2023-10-11 11:35:43 +08:00
2f706cc84b
[compile](simdjson reader) use __AVX2__ macro to decide whether use simdjson to parse ( #25165 )
2023-10-11 10:50:13 +08:00
8e66dbc4a8
[enhancement](log) add some decheck log to debug ( #25210 )
2023-10-11 10:33:13 +08:00
5be29f859a
[enhancement](node) add filter in partition sort node in BE #25188
...
add filter in partition sort node in BE
2023-10-11 10:30:15 +08:00
2ed5245014
[FIX](array_function) fix array_map function with array index function without checkout arg… #25226
2023-10-11 10:23:33 +08:00
7b22ae0c80
[pipelineX](feature) Support set operation operator ( #25251 )
...
Co-authored-by: zhaochangle <zhaochangle@selectdb.com>
2023-10-11 10:22:45 +08:00
be11b48407
[fix](load) fix MemTableWriter::active_memtable_mem_consumption ( #25207 )
2023-10-10 22:33:50 +08:00
fb3b888ff1
[prune](partition)support prune partition when is auto partition with function call ( #24747 )
...
Tables can now be created with auto partitioning on a function call:
AUTO PARTITION BY RANGE date_trunc(event_day, 'day')
so each value of event_day is inserted into the partition for date_trunc(event_day, 'day').
For example: select * from partition_range where date_trunc(event_day,"day")= "2023-08-07 11:00:00";
Here we can prune partitions by evaluating date_trunc("2023-08-07 11:00:00","day").
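The lookup step can be sketched like this. It is only an illustration of the idea, not the planner logic: `date_trunc_day` and `prune_partitions` are hypothetical helpers, and partitions are assumed to be half-open `[start, end)` day ranges. Truncating the literal with the same function as the partition expression yields the partition key, and only the partition whose range contains that key needs to be scanned.

```python
from datetime import datetime


def date_trunc_day(ts: datetime) -> datetime:
    """Truncate a timestamp to the start of its day."""
    return ts.replace(hour=0, minute=0, second=0, microsecond=0)


def prune_partitions(partitions, literal: str):
    """Return the names of the partitions whose range contains the truncated literal.

    `partitions` maps a partition name to a half-open (start, end) range.
    """
    key = date_trunc_day(datetime.strptime(literal, "%Y-%m-%d %H:%M:%S"))
    return [name for name, (start, end) in partitions.items() if start <= key < end]


partitions = {
    "p20230806": (datetime(2023, 8, 6), datetime(2023, 8, 7)),
    "p20230807": (datetime(2023, 8, 7), datetime(2023, 8, 8)),
    "p20230808": (datetime(2023, 8, 8), datetime(2023, 8, 9)),
}
selected = prune_partitions(partitions, "2023-08-07 11:00:00")
```

Here only `p20230807` remains, so the other two partitions are never read.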
2023-10-10 20:39:43 +08:00
913282b29b
[refactor](column) remove get_data_type in IColumn ( #25242 )
2023-10-10 20:27:15 +08:00
62a6b132be
[Fix](func numbers) Remove backend_nums argument of numbers function ( #25200 )
2023-10-10 20:25:58 +08:00
ba1edcf2dc
[fix](stack trace) Optimize stack trace output ( #24933 )
...
When Status prints the stack trace, the first four frame pointers are now removed because they are not meaningful.
The order of the stack trace fields is also optimized.
example:
0# doris::PlanFragmentExecutor::cancel(doris::PPlanFragmentCancelReason const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) at doris/core/be/src/common/status.h:0
1# doris::FragmentMgr::cancel_query_unlocked(doris::TUniqueId const&, doris::PPlanFragmentCancelReason const&, std::unique_lock<std::mutex> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) at doris/core/be/src/runtime/fragment_mgr.cpp:984
2# doris::FragmentMgr::cancel_query(doris::TUniqueId const&, doris::PPlanFragmentCancelReason const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) at doris/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/x86_64-linux-gnu/c++/11/bits/gthr-default.h:778
3# long doris::MemTrackerLimiter::free_top_memory_query<doris::MemTrackerLimiter::TrackerLimiterGroup>(long, doris::MemTrackerLimiter::Type, std::vector<doris::MemTrackerLimiter::TrackerLimiterGroup, std::allocator<doris::MemTrackerLimiter::TrackerLimiterGroup> >&, std::function<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > (long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)> const&, doris::RuntimeProfile*) at doris/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.h:187
4# doris::MemTrackerLimiter::free_top_memory_query(long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, doris::RuntimeProfile*, doris::MemTrackerLimiter::Type) at doris/core/be/src/runtime/memory/mem_tracker_limiter.cpp:362
5# doris::MemInfo::process_full_gc() at doris/core/be/src/util/mem_info.cpp:198
6# doris::Daemon::memory_gc_thread() at doris/core/be/src/common/daemon.cpp:0
7# doris::Thread::supervise_thread(void*) at doris/ldb_toolchain/bin/../usr/include/pthread.h:562
8# start_thread
9# __clone
2023-10-10 18:23:07 +08:00
5f95e97c56
[fix](function) array distance should return null when result is nan ( #25214 )
2023-10-10 04:41:51 -05:00
6ca0f3fa5f
[Bug](writer) Fix ub in async writer ( #25218 )
2023-10-10 16:00:45 +08:00