913282b29b
[refactor](column) remove get_data_type in IColumn ( #25242 )
2023-10-10 20:27:15 +08:00
62a6b132be
[Fix](func numbers) Remove backend_nums argument of numbers function ( #25200 )
2023-10-10 20:25:58 +08:00
ba1edcf2dc
[fix](stack trace) Optimize stack trace output ( #24933 )
...
When Status prints the stack trace, the first four frames are now removed because they carry no useful information.
The order of the stack trace fields is also optimized.
Example:
0# doris::PlanFragmentExecutor::cancel(doris::PPlanFragmentCancelReason const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) at doris/core/be/src/common/status.h:0
1# doris::FragmentMgr::cancel_query_unlocked(doris::TUniqueId const&, doris::PPlanFragmentCancelReason const&, std::unique_lock<std::mutex> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) at doris/core/be/src/runtime/fragment_mgr.cpp:984
2# doris::FragmentMgr::cancel_query(doris::TUniqueId const&, doris::PPlanFragmentCancelReason const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) at doris/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/x86_64-linux-gnu/c++/11/bits/gthr-default.h:778
3# long doris::MemTrackerLimiter::free_top_memory_query<doris::MemTrackerLimiter::TrackerLimiterGroup>(long, doris::MemTrackerLimiter::Type, std::vector<doris::MemTrackerLimiter::TrackerLimiterGroup, std::allocator<doris::MemTrackerLimiter::TrackerLimiterGroup> >&, std::function<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > (long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)> const&, doris::RuntimeProfile*) at doris/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.h:187
4# doris::MemTrackerLimiter::free_top_memory_query(long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, doris::RuntimeProfile*, doris::MemTrackerLimiter::Type) at doris/core/be/src/runtime/memory/mem_tracker_limiter.cpp:362
5# doris::MemInfo::process_full_gc() at doris/core/be/src/util/mem_info.cpp:198
6# doris::Daemon::memory_gc_thread() at doris/core/be/src/common/daemon.cpp:0
7# doris::Thread::supervise_thread(void*) at doris/ldb_toolchain/bin/../usr/include/pthread.h:562
8# start_thread
9# __clone
2023-10-10 18:23:07 +08:00
5f95e97c56
[fix](function) array distance should return null when result is nan ( #25214 )
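As a rough illustration of the behaviour named in the subject line, the sketch below computes an L2 distance and maps a NaN result to "null". It is not the Doris implementation: `l2_distance_or_null` is a hypothetical helper, `std::nullopt` stands in for SQL NULL, and returning null on mismatched lengths is an assumption made only to keep the sketch safe.

```cpp
#include <cmath>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical helper, not the Doris implementation: L2 distance that
// reports a NaN result as "null" (std::nullopt).
std::optional<double> l2_distance_or_null(const std::vector<double>& a,
                                          const std::vector<double>& b) {
    // Assumption for this sketch only: mismatched lengths also yield null.
    if (a.size() != b.size()) {
        return std::nullopt;
    }
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double diff = a[i] - b[i];
        sum += diff * diff;
    }
    const double result = std::sqrt(sum);
    // NaN propagates from NaN inputs (or from inf - inf); report it as null.
    if (std::isnan(result)) {
        return std::nullopt;
    }
    return result;
}
```

A caller would check `has_value()` before using the distance, which mirrors a nullable result column.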
2023-10-10 04:41:51 -05:00
6ca0f3fa5f
[Bug](writer) Fix ub in async writer ( #25218 )
2023-10-10 16:00:45 +08:00
7434f80300
[pipelineX](refactor) Refactor pending finish dependency ( #25181 )
2023-10-10 11:56:02 +08:00
880d0d7e70
[Bug](pipeline) Support the auto partition in pipeline load ( #25176 )
2023-10-10 11:51:12 +08:00
6ad22721cb
fix: ubsan compile bug ( #25199 )
2023-10-10 11:46:33 +08:00
39669c6df2
[feature](pipelineX) add runtimefliter in pipelineX multicast sink ( #25120 )
2023-10-10 10:41:08 +08:00
f5b826b66d
[fix](mark join) mark join column should be nullable ( #24910 )
2023-10-10 10:10:36 +08:00
b8621364d2
[FIX](serde)fix scale with decimalv2 in mysql writer which get real scale #25190
2023-10-10 09:09:57 +08:00
90ad48cdb7
[feature](pipelineX) add node id and profilev2 in pipelineX ( #25084 )
2023-10-10 09:09:26 +08:00
b58010c48e
[fix](export) BufferWritable must be committed before deconstruct ( #25185 )
...
F20231009 16:03:47.659968 3342535 string_buffer.hpp:48] Check failed: _now_offset == 0
*** Check failure stack trace: ***
@ 0x561a6f8e21e6 google::LogMessage::SendToLog()
@ 0x561a6f8de7b0 google::LogMessage::Flush()
@ 0x561a6f8e2a29 google::LogMessageFatal::~LogMessageFatal()
@ 0x561a4a409233 doris::vectorized::BufferWritable::~BufferWritable()
@ 0x561a6e202853 doris::vectorized::VCSVTransformer::write()
@ 0x561a6e1f19ba doris::vectorized::VFileResultWriter::_write_file()
@ 0x561a6e1f1522 doris::vectorized::VFileResultWriter::append_block()
@ 0x561a6e121bed
The check fails in DEBUG mode, and an export in this state writes invalid data.
This is already covered by a Baidu test case.
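The "commit before destruct" contract behind the `_now_offset == 0` check can be sketched as follows. This is a simplified stand-in, not the real `BufferWritable`: the class `SketchBufferWritable`, its pending buffer, and the helper `write_row` are assumptions for illustration. Only the destructor check mirrors the log above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Simplified stand-in for a writable buffer that must be committed
// before it is destroyed. Names and layout are hypothetical.
class SketchBufferWritable {
public:
    explicit SketchBufferWritable(std::string& out) : _out(out) {}

    // Mirrors the check in the log: uncommitted bytes at destruction are a bug.
    // With NDEBUG defined the assert compiles away, so the data is silently lost.
    ~SketchBufferWritable() {
        assert(_now_offset == 0 && "commit() must be called before destruction");
    }

    void write(const char* data, std::size_t len) {
        _pending.append(data, len);
        _now_offset += len;
    }

    // Publishes the pending bytes and resets the offset.
    void commit() {
        _out += _pending;
        _pending.clear();
        _now_offset = 0;
    }

private:
    std::string& _out;
    std::string _pending;
    std::size_t _now_offset = 0;
};

// Correct usage: every write is followed by commit() before the scope ends.
std::string write_row(const std::string& field) {
    std::string out;
    {
        SketchBufferWritable buf(out);
        buf.write(field.data(), field.size());
        buf.commit();
    }
    return out;
}
```

Dropping the `commit()` call in `write_row` would trip the assertion in a debug build and lose the row in a release build, which matches the two symptoms described above.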
2023-10-09 22:39:45 +08:00
53b46b7e6c
[FIX](filter) update for filter_by_select logic ( #25007 )
...
This PR updates the filter_by_select logic and changes the restrictions on delete:
- Only scalar types are supported in the WHERE condition of a DELETE statement.
- Only nullable columns and predicate columns support the filter_by_select logic, because non-scalar types cannot be pushed down to the storage layer and packed into a predicate column, but they still need to be filtered.
2023-10-09 21:27:40 +08:00
4de3df6a46
[refactor](column) remove unused method and column definitions ( #25152 )
...
Remove unused method and column definitions.
Use the primitive type in the predicate column to check datev1 and datev2.
2023-10-09 17:14:35 +08:00
af707e5244
[pipelineX](fix) fix external table scan operator ( #25166 )
2023-10-09 16:33:27 +08:00
e1b9854f90
[bugfix](thirdparty) Upgrade aws s3 sdk to prevent mem leak ( #25106 )
...
During the use of the AWS S3 SDK, we found that there is a memory leak. According to the official issue, upgrading the SDK should resolve the issue.
2023-10-09 16:08:50 +08:00
d7b6fe57df
[Bug](java-udf) fix java-udf memory leak ( #25151 )
2023-10-09 15:10:56 +08:00
5a55e47acd
[Enhancement](Load) stream tvf support two phase commit ( #23800 )
2023-10-09 14:15:56 +08:00
9e31cb26bb
[fix](parse_url) fix parse_url is not working in some case to extract the HOST ( #25040 )
...
Issue Number: close #24452
2023-10-09 00:14:58 +08:00
451e299151
[Opt](performance) Optimize timeround with minute / second ( #25073 )
2023-10-08 23:14:23 +08:00
5c020be4d2
[Bug](join) corner case cause the mark join + null aware left join core dump in regression test in pipeline query engine ( #25087 )
2023-10-08 22:50:12 +08:00
9d8b993c51
[fix](fs) fix remove error log failed ( #25108 )
2023-10-08 22:15:37 +08:00
7e9ffad933
[fix](ES catalog)Doris cannot parse ES date field without time zone ( #24864 )
...
1. Add support for parsing ES date fields without time zone info, e.g. `2023-04-17T23:01:18.151`. Such a value is treated as UTC, because ES assumes UTC for time fields that carry no time zone.
2. Change the local time zone conversion from the system local time zone to the session variable time zone.
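Item 1 can be illustrated with a small sketch that reads a zone-less timestamp as UTC and returns epoch seconds. This is not the Doris code: `es_naive_date_to_epoch_utc` and `days_from_civil` are hypothetical names, fractional seconds are ignored, and values that carry a zone suffix are out of scope here. Converting the result to the session time zone (item 2) would be a separate step.

```cpp
#include <cstdint>
#include <cstdio>
#include <optional>
#include <string>

// Days since 1970-01-01 for a proleptic Gregorian date
// (standard civil-date arithmetic).
std::int64_t days_from_civil(std::int64_t y, unsigned m, unsigned d) {
    y -= (m <= 2) ? 1 : 0;
    const std::int64_t era = (y >= 0 ? y : y - 399) / 400;
    const unsigned yoe = static_cast<unsigned>(y - era * 400);             // [0, 399]
    const unsigned doy = (153 * (m > 2 ? m - 3 : m + 9) + 2) / 5 + d - 1;  // [0, 365]
    const unsigned doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;            // [0, 146096]
    return era * 146097 + static_cast<std::int64_t>(doe) - 719468;
}

// Hypothetical helper: parse "YYYY-MM-DDTHH:MM:SS[.fff]" that has no
// time zone and interpret it as UTC. Fractional seconds are ignored.
std::optional<std::int64_t> es_naive_date_to_epoch_utc(const std::string& text) {
    int y = 0, mo = 0, d = 0, h = 0, mi = 0, s = 0;
    if (std::sscanf(text.c_str(), "%d-%d-%dT%d:%d:%d", &y, &mo, &d, &h, &mi, &s) != 6) {
        return std::nullopt;
    }
    if (mo < 1 || mo > 12 || d < 1 || d > 31 || h < 0 || h > 23 ||
        mi < 0 || mi > 59 || s < 0 || s > 59) {
        return std::nullopt;
    }
    const std::int64_t days =
            days_from_civil(y, static_cast<unsigned>(mo), static_cast<unsigned>(d));
    return days * 86400 + h * 3600 + mi * 60 + s;
}
```

With the example value from the message, `es_naive_date_to_epoch_utc("2023-04-17T23:01:18.151")` yields 1681772478, i.e. 23:01:18 on 2023-04-17 in UTC.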
2023-10-08 19:28:08 +08:00
b91335dbb8
[refactor](columndecimal) is_decimal_v2 member is useless because column decimal could detect by itself ( #25110 )
...
Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-10-08 18:09:19 +08:00
c3d9f42a3e
[fix](scanner) fix load cannot end when set exec_mem_limit ( #25090 )
2023-10-08 17:07:30 +08:00
6fe060b79e
[fix](streamload) fix http_stream retry mechanism ( #24978 )
...
If a failure occurs, Doris may retry. Because ctx->is_read_schema is a global variable that is not reset in time, the retry may cause exceptions.
---------
Co-authored-by: yiguolei <676222867@qq.com>
2023-10-08 11:16:21 +08:00
feb1cbe9ed
[bug](partition_sort)partition sort need sort all data in two phase global ( #24960 )
...
PR #24886 marked the phase in FE; this PR adds the corresponding changes in BE.
Partition sort needs to sort all data in the two-phase global stage.
2023-10-08 10:46:43 +08:00
4e8cde127c
[Enhance](catalog)add table cache in paimon jni ( #25014 )
...
- fix a stale schema being returned after refreshing a paimon table
- add table cache in paimon jni
2023-10-08 10:36:18 +08:00
239df5860b
[enhancement](tablet_meta_lock) add more trace for write lock of tablet's _meta_lock ( #25095 )
2023-10-08 10:28:10 +08:00
f66708db0e
[log](load) PUBLISH_TIMEOUT should not print stacktrace ( #25080 )
2023-10-08 10:16:25 +08:00
0df32c8e3e
[Fix](Outfile) Use data_type_serde to export data to csv file format ( #24721 )
...
Modify the outfile logic to use the data type serde framework.
2023-10-07 22:50:44 +08:00
cb0076e585
[fix](insert) fix group commit be ut ( #24968 )
2023-10-07 19:50:05 +08:00
8953179c11
[fix](multi-table) fix multi table task cannot end ( #25056 )
...
When executing a multi-table task, the task cannot end if plan execution fails, which prevents other routine load tasks from being submitted.
2023-10-07 19:45:42 +08:00
59261174d5
[chore](unused) Remove unused variable CPU_HARD_LIMIT in task_group.cc ( #25076 )
...
Signed-off-by: Jack Drogon <jack.xsuperman@gmail.com>
2023-10-07 03:36:13 -05:00
335804bb25
[fix](pipelinex) fix multi cast sink without init ( #25066 )
2023-10-07 15:49:03 +08:00
7b2ff38401
query cpu hard limit based on doris scheduler ( #24844 )
2023-10-07 12:03:07 +08:00
0631ed61b0
[feature](profilev2) Preliminary support for profilev2. ( #24881 )
...
You can set the level of counters on the backend using ADD_COUNTER_WITH_LEVEL/ADD_TIMER_WITH_LEVEL. The profile can then merge counters with level 1.
set profile_level = 1;
For example, the SQL:
select count(*) from customer join item on c_customer_sk = i_item_sk
produces the profile:
Simple profile
PLAN FRAGMENT 0
OUTPUT EXPRS:
count(*)
PARTITION: UNPARTITIONED
VRESULT SINK
MYSQL_PROTOCAL
7:VAGGREGATE (merge finalize)
| output: count(partial_count(*))[#44 ]
| group by:
| cardinality=1
| TotalTime: avg 725.608us, max 725.608us, min 725.608us
| RowsReturned: 1
|
6:VEXCHANGE
offset: 0
TotalTime: avg 52.411us, max 52.411us, min 52.411us
RowsReturned: 8
PLAN FRAGMENT 1
PARTITION: HASH_PARTITIONED: c_customer_sk
STREAM DATA SINK
EXCHANGE ID: 06
UNPARTITIONED
TotalTime: avg 106.263us, max 118.38us, min 81.403us
BlocksSent: 8
5:VAGGREGATE (update serialize)
| output: partial_count(*)[#43 ]
| group by:
| cardinality=1
| TotalTime: avg 679.296us, max 739.395us, min 554.904us
| BuildTime: avg 33.198us, max 48.387us, min 28.880us
| ExecTime: avg 27.633us, max 40.278us, min 24.537us
| RowsReturned: 8
|
4:VHASH JOIN
| join op: INNER JOIN(PARTITIONED)[]
| equal join conjunct: c_customer_sk = i_item_sk
| runtime filters: RF000[bloom] <- i_item_sk(18000/16384/1048576)
| cardinality=17,740
| vec output tuple id: 3
| vIntermediate tuple ids: 2
| hash output slot ids: 22
| RowsReturned: 18.0K (18000)
| ProbeRows: 18.0K (18000)
| ProbeTime: avg 862.308us, max 1.576ms, min 666.28us
| BuildRows: 18.0K (18000)
| BuildTime: avg 3.8ms, max 3.860ms, min 2.317ms
|
|----1:VEXCHANGE
| offset: 0
| TotalTime: avg 48.822us, max 67.459us, min 30.380us
| RowsReturned: 18.0K (18000)
|
3:VEXCHANGE
offset: 0
TotalTime: avg 33.162us, max 39.480us, min 28.854us
RowsReturned: 18.0K (18000)
PLAN FRAGMENT 2
PARTITION: HASH_PARTITIONED: c_customer_id
STREAM DATA SINK
EXCHANGE ID: 03
HASH_PARTITIONED: c_customer_sk
TotalTime: avg 753.954us, max 1.210ms, min 499.470us
BlocksSent: 64
2:VOlapScanNode
TABLE: default_cluster:tpcds.customer(customer), PREAGGREGATION: ON
runtime filters: RF000[bloom] -> c_customer_sk
partitions=1/1, tablets=12/12, tabletList=1550745,1550747,1550749 ...
cardinality=100000, avgRowSize=0.0, numNodes=1
pushAggOp=NONE
TotalTime: avg 18.417us, max 41.319us, min 10.189us
RowsReturned: 18.0K (18000)
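The avg/max/min figures in the profile above suggest a per-counter merge across fragment instances. A minimal sketch of such a merge follows, assuming each instance reports a flat list of counters and that only counters whose level equals the requested one are merged. `SketchCounter`, `MergedCounter` and `merge_counters` are hypothetical names, and this is not how `ADD_COUNTER_WITH_LEVEL` is implemented. The sketch needs C++17.

```cpp
#include <algorithm>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// One counter as reported by a single fragment instance (hypothetical layout).
struct SketchCounter {
    std::string name;
    int level;
    std::int64_t value;
};

// Aggregate of one counter over all instances.
struct MergedCounter {
    std::int64_t min;
    std::int64_t max;
    std::int64_t sum;
    std::int64_t count;
    double avg() const { return static_cast<double>(sum) / static_cast<double>(count); }
};

// Merge counters of the given level by name; counters of other levels are skipped.
std::map<std::string, MergedCounter> merge_counters(
        const std::vector<std::vector<SketchCounter>>& instances, int level) {
    std::map<std::string, MergedCounter> merged;
    for (const auto& instance : instances) {
        for (const auto& counter : instance) {
            if (counter.level != level) {
                continue;
            }
            // First sighting seeds min/max with this value; sum and count start at 0.
            auto it = merged.try_emplace(counter.name,
                                         MergedCounter{counter.value, counter.value, 0, 0})
                              .first;
            MergedCounter& m = it->second;
            m.min = std::min(m.min, counter.value);
            m.max = std::max(m.max, counter.value);
            m.sum += counter.value;
            ++m.count;
        }
    }
    return merged;
}
```

Calling `merge_counters(instances, 1)` would keep only the level-1 counters and give one min, max and average per counter name.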
---------
Co-authored-by: yiguolei <676222867@qq.com>
2023-10-07 11:16:53 +08:00
83a9d07288
[refactor](segment iterator) remove some code to make the logic more clear ( #25050 )
...
Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-10-07 11:14:28 +08:00
bd582aee75
[pipelineX](minor) refine code ( #25015 )
2023-10-07 10:45:33 +08:00
a9d12f7b82
[Debug](float) Add clang debug tune float accuracy ( #25041 )
2023-10-07 09:34:50 +08:00
c2b46e4df7
[fix](move-memtable) exclude rpc memory in flush mem-tracker ( #24722 )
2023-10-05 22:10:53 +08:00
db6c16058a
[improve](move-memtable) always share load streams ( #24763 )
2023-10-05 22:09:59 +08:00
93eedaff62
[opt](function) Use Dict to opt the function of time_round ( #25029 )
...
Before:
select hour_floor(`@timestamp`, 7) as t, count() as cnt from httplogs_date group by t order by t limit 10;
+---------------------+--------+
| t | cnt |
+---------------------+--------+
| 1998-04-30 21:00:00 | 324 |
| 1998-05-01 04:00:00 | 286156 |
| 1998-05-01 11:00:00 | 266130 |
| 1998-05-01 18:00:00 | 483765 |
| 1998-05-02 01:00:00 | 276706 |
| 1998-05-02 08:00:00 | 169945 |
| 1998-05-02 15:00:00 | 223593 |
| 1998-05-02 22:00:00 | 272616 |
| 1998-05-03 05:00:00 | 188689 |
| 1998-05-03 12:00:00 | 184405 |
+---------------------+--------+
10 rows in set (3.39 sec)
After:
select hour_floor(`@timestamp`, 7) as t, count() as cnt from httplogs_date group by t order by t limit 10;
+---------------------+--------+
| t | cnt |
+---------------------+--------+
| 1998-04-30 21:00:00 | 324 |
| 1998-05-01 04:00:00 | 286156 |
| 1998-05-01 11:00:00 | 266130 |
| 1998-05-01 18:00:00 | 483765 |
| 1998-05-02 01:00:00 | 276706 |
| 1998-05-02 08:00:00 | 169945 |
| 1998-05-02 15:00:00 | 223593 |
| 1998-05-02 22:00:00 | 272616 |
| 1998-05-03 05:00:00 | 188689 |
| 1998-05-03 12:00:00 | 184405 |
+---------------------+--------+
10 rows in set (2.19 sec)
2023-10-04 23:34:24 +08:00
4ce5213b1c
[fix](insert) Fix test_group_commit_stream_load and add more regression in test_group_commit_http_stream ( #24954 )
2023-10-03 20:56:24 +08:00
6e836fe381
[fix](jdbc catalog) fix jdbc catalog read bitmap data crash ( #25034 )
2023-10-03 20:52:47 +08:00
10f0c63896
[FIX](complex-type) fix agg table with complex type with replace state ( #24873 )
...
fix agg table with complex type with replace state
2023-10-03 16:32:58 +08:00
f8a3034dca
[Opt](performance) refactor and opt time round floor function ( #25026 )
...
refactor and opt time round floor function
2023-10-01 11:51:26 +08:00
642e5cdb69
[Fix](Status) Make Status [[nodiscard]] and handle returned Status correctly ( #23395 )
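For readers unfamiliar with the attribute, the sketch below shows what marking a status type `[[nodiscard]]` means in practice. `SketchStatus`, `open_segment` and `load_segment` are hypothetical names and the class is far simpler than Doris's `Status`; only the `[[nodiscard]]` attribute itself is standard C++17.

```cpp
#include <string>
#include <utility>

// Minimal stand-in for a status type. Marking the class [[nodiscard]]
// asks the compiler to warn whenever a returned value is ignored.
class [[nodiscard]] SketchStatus {
public:
    static SketchStatus OK() { return SketchStatus(true, ""); }
    static SketchStatus Error(std::string msg) { return SketchStatus(false, std::move(msg)); }

    bool ok() const { return _ok; }
    const std::string& message() const { return _msg; }

private:
    SketchStatus(bool ok, std::string msg) : _ok(ok), _msg(std::move(msg)) {}

    bool _ok;
    std::string _msg;
};

// Hypothetical operation that can fail.
SketchStatus open_segment(bool exists) {
    return exists ? SketchStatus::OK() : SketchStatus::Error("segment not found");
}

bool load_segment(bool exists) {
    // open_segment(exists);  // discarding the result here would draw a compiler warning
    SketchStatus st = open_segment(exists);
    return st.ok();
}
```

Once the type carries the attribute, every call site has to either inspect the result or discard it explicitly, which is presumably why the same change also touches callers.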
2023-09-29 22:38:52 +08:00
d23bedf170
[fix](single-replica-load) fix duplicated done run in request_slave_tablet_pull_rowset ( #25013 )
...
BE crashes because done is run twice when try_offer() fails in request_slave_tablet_pull_rowset.
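One common way to guarantee that a completion callback runs exactly once is to keep it behind a guard and hand it over only when the task is accepted. The sketch below shows that pattern under stated assumptions. `RunOnceGuard` and `handle_request` are hypothetical, the `offer_succeeds` flag stands in for the result of `try_offer()`, and this is not necessarily how the actual fix is written.

```cpp
#include <functional>
#include <utility>

// Runs the callback at scope exit unless ownership was released first.
class RunOnceGuard {
public:
    explicit RunOnceGuard(std::function<void()> done) : _done(std::move(done)) {}
    ~RunOnceGuard() {
        if (_done) {
            _done();
        }
    }
    RunOnceGuard(const RunOnceGuard&) = delete;
    RunOnceGuard& operator=(const RunOnceGuard&) = delete;

    // Hands the callback to the caller; the guard will no longer run it.
    std::function<void()> release() { return std::exchange(_done, nullptr); }

private:
    std::function<void()> _done;
};

// Hypothetical handler: offer_succeeds stands in for try_offer()'s result.
bool handle_request(bool offer_succeeds, int& done_calls) {
    RunOnceGuard guard([&done_calls] { ++done_calls; });
    if (!offer_succeeds) {
        // Rejected: do not call done manually, the guard runs it once on return.
        return false;
    }
    // Accepted: the task now owns the callback and runs it when finished.
    std::function<void()> done = guard.release();
    done();  // stands in for the worker completing the task later
    return true;
}
```

In both branches the callback fires once: through the guard when the offer is rejected, or through the task that took ownership when it is accepted.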
Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>
2023-09-28 21:08:18 +08:00