doris

Author	SHA1	Message	Date
airborne12	5d2739b5c5	[Fix](submodule) revert clucene version wrong rollback (#21523 )	2023-07-05 19:10:15 +08:00
Mingyu Chen	242a35fa80	[fix](s3) fix s3 fs benchmark tool (#21401 ) 1. fix concurrency bug of s3 fs benchmark tool, to avoid crash on multi thread. 2. Add `prefetch_read` operation to test prefetch reader. 3. add `AWS_EC2_METADATA_DISABLED` env in `start_be.sh` to avoid call ec2 metadata when creating s3 client. 4. add `AWS_MAX_ATTEMPTS` env in `start_be.sh` to avoid warning log of s3 sdk.	2023-07-05 16:20:58 +08:00
HappenLee	39590f95b0	[pipeline](load) return error status in pipeline load (#21303 )	2023-07-05 16:13:32 +08:00
abmdocrt	d8a549fe61	[Fix](Comment) Comment should be in English (#20964 )	2023-07-05 15:41:34 +08:00
Xinyi Zou	38c8657e5e	[improve](memory) more grace logging for memory exceed limit (#21311 ) more grace logging for Allocator and MemTracker when memory exceed limit fix bthread grace exit.	2023-07-05 14:59:06 +08:00
Pxl	f02bec8ad1	[Chore](runtime filter) change runtime filter dcheck to error status or exception (#21475 ) change runtime filter dcheck to error status or exception	2023-07-05 14:03:55 +08:00
DeadlineFen	0469c02202	[Test](regression) Temporarily disable quickTest for SHOW CREATE TABLE to adapt to enable_feature_binlog=true (#21247 )	2023-07-05 10:12:02 +08:00
zhangstar333	122f5f6c2d	[enchanment](udf) add more info when download jar package failed (#21440 ) when download jar package, some times show the checksum is not equal, but the root reason is unknown, now add some error msg if failed.	2023-07-04 20:35:35 +08:00
Xinyi Zou	3b73604f74	[fix](memory) fix jemalloc purge arena dirty pages core dump (#21486 ) Issue Number: close #xxx jemalloc/jemalloc#2470 Occasional core dump during stress test.	2023-07-04 20:35:13 +08:00
Mryange	81ee4d7402	[performance](group_concat) avoid extra copy in group_concat (#21432 ) avoid extra copy in group_concat	2023-07-04 20:21:44 +08:00
Mingyu Chen	13fb69550a	[improvement](kerberos) disable hdfs fs handle cache to renew kerberos ticket at fix interval (#21265 ) Add a new BE config `kerberos_ticket_lifetime_seconds`, default is 86400. Better set it same as the value of `ticket_lifetime` in `krb5.conf` If a HDFS fs handle in cache is live longer than HALF of this time, it will be set as invalid and recreated. And the kerberos ticket will be renewed.	2023-07-04 17:13:34 +08:00
Ashin Gau	9adbca685a	[opt](hudi) use spark bundle to read hudi data (#21260 ) Use spark-bundle to read hudi data instead of using hive-bundle to read hudi data. Advantage for using spark-bundle to read hudi data: 1. The performance of spark-bundle is more than twice that of hive-bundle 2. spark-bundle using `UnsafeRow` can reduce data copying and GC time of the jvm 3. spark-bundle support `Time Travel`, `Incremental Read`, and `Schema Change`, these functions can be quickly ported to Doris Disadvantage for using spark-bundle to read hudi data: 1. More dependencies make hudi-dependency.jar very cumbersome(from 138M -> 300M) 2. spark-bundle only provides `RDD` interface and cannot be used directly	2023-07-04 17:04:49 +08:00
morrySnow	90dd8716ed	[refactor](multicast) change the way multicast do filter, project and shuffle (#21412 ) Co-authored-by: Jerry Hu <mrhhsg@gmail.com> 1. Filtering is done at the sending end rather than the receiving end 2. Projection is done at the sending end rather than the receiving end 3. Each sender can use different shuffle policies to send data	2023-07-04 16:51:07 +08:00
hqx871	09f414e0f4	fix lru cache handle field order (#21435 ) For LRUHandle, all fields should be put ahead of key_data. The LRUHandle is allocated using malloc and starting from field key_data is for key data.	2023-07-04 16:10:05 +08:00
Jerry Hu	b5da3f74f5	[improvement](join) avoid unnecessary copying in _build_output_block (#21360 ) If the source columns are mutually exclusive within a temporary block, there is no need to duplicate the data.	2023-07-04 12:13:49 +08:00
Xinyi Zou	b86dd11a7d	[fix](pipeline) refactor olap table sink close (#20771 ) For pipeline, olap table sink close is divided into three stages, try_close() --> pending_finish() --> close() only after all node channels are done or canceled, pending_finish() will return false, close() will start. this will avoid block pipeline on close(). In close, check the index channel intolerable failure status after each node channel failure, if intolerable failure is true, the close will be terminated in advance, and all node channels will be canceled to avoid meaningless blocking.	2023-07-04 11:27:51 +08:00
Kaijie Chen	b1c16b96d6	[refactor](load) move validator out of VOlapTableSink (#21460 )	2023-07-04 10:16:56 +08:00
TengJianPing	938c0765cd	[improvement](memory) improve inserting sparse rows into string column (#21420 ) For the following test, which simulate hash join outputing 435699854 rows from 5131 buiding rows: { auto col = doris::vectorized::ColumnString::create(); constexpr int build_rows = 5131; constexpr int output_rows = 435699854; std::string str("01234567"); for (int i = 0; i < build_rows; ++i) { col->insert_data(str.data(), str.size()); } int indices[output_rows]; for (int i = 0; i < output_rows; ++i) { indices[i] = i % build_rows; } auto col2 = doris::vectorized::ColumnString::create(); doris::MonotonicStopWatch watch; watch.start(); col2->insert_indices_from(*col, indices, indices + output_rows); watch.stop(); LOG(WARNING) << "string column insert_indices_from, rows: " << output_rows << ", time: " << doris::PrettyPrinter::print(watch.elapsed_time(), doris::TUnit::TIME_NS); } The ColumnString::insert_indices_from inserting time improve from 6s665ms to 3s158ms: W0702 23:08:39.672044 1277989 doris_main.cpp:545] string column insert_indices_from, rows: 435699854, time: 3s153ms W0702 23:09:36.368853 1282061 doris_main.cpp:545] string column insert_indices_from, rows: 435699854, time: 3s158ms W0703 00:30:26.093307 1468640 doris_main.cpp:545] string column insert_indices_from, rows: 435699854, time: 6s761ms W0703 00:31:21.043638 1472937 doris_main.cpp:545] string column insert_indices_from, rows: 435699854, time: 6s665ms	2023-07-04 09:34:10 +08:00
ZenoYang	790b771a49	[improvement](execute) Eliminate virtual function calls when serializing and deserializing aggregate functions (#21427 ) Eliminate virtual function calls when serializing and deserializing aggregate functions. For example, in AggregateFunctionUniq::deserialize_and_merge method, calling read_pod_binary(ref, buf) in the for loop generates a large number of virtual function calls. void deserialize_and_merge(AggregateDataPtr __restrict place, BufferReadable& buf, Arena* arena) const override { auto& set = this->data(place).set; UInt64 size; read_var_uint(size, buf); set.rehash(size + set.size()); for (size_t i = 0; i < size; ++i) { KeyType ref; read_pod_binary(ref, buf); set.insert(ref); } } template <typename Type> void read_pod_binary(Type& x, BufferReadable& buf) { buf.read(reinterpret_cast<char*>(&x), sizeof(x)); } BufferReadable has only one subclass, VectorBufferReader, so it is better to implement the BufferReadable class directly. The following sql was tested on SSB-flat dataset: SELECT COUNT (DISTINCT lo_partkey), COUNT (DISTINCT lo_suppkey) FROM lineorder_flat; before: MergeTime: 415.398ms after opt: MergeTime: 174.660ms	2023-07-04 09:26:37 +08:00
Pxl	f7c724f8a3	[Bug](excution) avoid core dump on filter_block_internal and add debug information (#21433 ) avoid core dump on filter_block_internal and add debug information	2023-07-03 18:10:30 +08:00
TengJianPing	7e02566333	[fix](pipeline) fix coredump caused by uncaught exception (#21387 ) For pipeline engine, ExecNode::create_tree may throw exception sometimes, e.g. SELECT MIN(-3.40282347e+38) FROM t0; will throw a exception bacause of invalid decimal precision. * Query id: 346886bf48494e77-96eeea5361233618 * * Aborted at 1688101183 (unix time) try "date -d @1688101183" if you are using GNU date * * Current BE git commitID: 2fcb0e090b * * SIGABRT unknown detail explain (@0x13ef42) received by PID 1306434 (TID 1306918 OR 0x7ff0763e1700) from PID 1306434; stack trace: * terminate called recursively 0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t, void) at /home/zcp/repo_center/doris_master/doris/be/src/common/signal_handler.h:413 1# 0x00007FFA8780E090 in /lib/x86_64-linux-gnu/libc.so.6 2# raise at ../sysdeps/unix/sysv/linux/raise.c:51 3# abort at /build/glibc-SzIz7B/glibc-2.31/stdlib/abort.c:81 4# __gnu_cxx::__verbose_terminate_handler() [clone .cold] at ../../../../libstdc++-v3/libsupc++/vterminate.cc:75 5# __cxxabiv1::__terminate(void ()()) at ../../../../libstdc++-v3/libsupc++/eh_terminate.cc:48 6# 0x000055B6C30C7401 in /mnt/ssd01/doris-master/VEC_ASAN/be/lib/doris_be 7# 0x000055B6C30C7554 in /mnt/ssd01/doris-master/VEC_ASAN/be/lib/doris_be 8# doris::vectorized::create_decimal(unsigned long, unsigned long, bool) at /home/zcp/repo_center/doris_master/doris/be/src/vec/data_types/data_type_decimal.cpp:167 9# doris::vectorized::DataTypeFactory::create_data_type(doris::TypeDescriptor const&, bool) at /home/zcp/repo_center/doris_master/doris/be/src/vec/data_types/data_type_factory.cpp:185 10# doris::vectorized::AggFnEvaluator::AggFnEvaluator(doris::TExprNode const&) at /home/zcp/repo_center/doris_master/doris/be/src/vec/exprs/vectorized_agg_fn.cpp:79 11# std::unique_ptr<doris::vectorized::AggFnEvaluator, std::default_delete<doris::vectorized::AggFnEvaluator> > doris::vectorized::AggFnEvaluator::create_unique<doris::TExprNode const&>(doris::TExprNode const&) at /home/zcp/repo_center/doris_master/doris/be/src/vec/exprs/vectorized_agg_fn.h:49 12# doris::vectorized::AggFnEvaluator::create(doris::ObjectPool, doris::TExpr const&, doris::TSortInfo const&, doris::vectorized::AggFnEvaluator*) at /home/zcp/repo_center/doris_master/doris/be/src/vec/exprs/vectorized_agg_fn.cpp:92 13# doris::vectorized::AggregationNode::init(doris::TPlanNode const&, doris::RuntimeState) at /home/zcp/repo_center/doris_master/doris/be/src/vec/exec/vaggregation_node.cpp:158 14# doris::ExecNode::create_tree_helper(doris::RuntimeState, doris::ObjectPool, std::vector<doris::TPlanNode, std::allocator<doris::TPlanNode> > const&, doris::DescriptorTbl const&, doris::ExecNode, int, doris::ExecNode*) at /home/zcp/repo_center/doris_master/doris/be/src/exec/exec_node.cpp:276 15# doris::ExecNode::create_tree(doris::RuntimeState, doris::ObjectPool, doris::TPlan const&, doris::DescriptorTbl const&, doris::ExecNode) at /home/zcp/repo_center/doris_master/doris/be/src/exec/exec_node.cpp:231 16# doris::pipeline::PipelineFragmentContext::prepare(doris::TPipelineFragmentParams const&, unsigned long) at /home/zcp/repo_center/doris_master/doris/be/src/pipeline/pipeline_fragment_context.cpp:253 17# doris::FragmentMgr::exec_plan_fragment(doris::TPipelineFragmentParams const&, std::function<void (doris::RuntimeState, doris::Status)> const&)::$_1::operator()(int) const at /home/zcp/repo_center/doris_master/doris/be/src/runtime/fragment_mgr.cpp:895 18# doris::FragmentMgr::exec_plan_fragment(doris::TPipelineFragmentParams const&, std::function<void (doris::RuntimeState, doris::Status)> const&)::$_0::operator()() const at /home/zcp/repo_center/doris_master/doris/be/src/runtime/fragment_mgr.cpp:926 19# void std::__invoke_impl<void, doris::FragmentMgr::exec_plan_fragment(doris::TPipelineFragmentParams const&, std::function<void (doris::RuntimeState, doris::Status)> const&)::$_0&>(std::__invoke_other, doris::FragmentMgr::exec_plan_fragment(doris::TPipelineFragmentParams const&, std::function<void (doris::RuntimeState, doris::Status*)> const&)::$_0&) at /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:61	2023-07-03 10:58:13 +08:00
Gabriel	a3d34e1e08	[decimalv2](compatibility) add config to allow invalid decimalv2 literal (#21327 )	2023-07-03 10:55:27 +08:00
Pxl	59c1bbd163	[Feature](materialized view) support query match mv with agg_state on nereids planner (#21067 ) * support create mv contain aggstate column * update * update * update * support query match mv with agg_state on nereids planner update * update * update	2023-07-03 10:19:31 +08:00
Pxl	f90e8fcb26	[Chore](storage) add debug info for TabletColumn::get_aggregate_function (#21408 )	2023-07-03 10:02:44 +08:00
Jerry Hu	ca0953ea51	[improvement](join) Serialize build keys in a vectorized (columnar) way (#21361 ) There is a significant performance improvement in serializing keys in the aggregate node through vectorization. Now, applying it to the join node also brings performance improvement.	2023-07-03 09:29:10 +08:00
Kaijie Chen	1c961f2272	[refactor](load) move generate_delete_bitmap from memtable to beta rowset writer (#21329 )	2023-07-01 17:22:45 +08:00
Mingyu Chen	4ad3a7a8de	[fix](exec) run exec_plan_fragment in pthread to avoid BE crash (#21343 ) If there is only one fragment of a query plan, FE will call `exec_plan_fragment` rpc to BE. And on BE side, the `exec_plan_fragment()` will be executed directly in bthread, but it may call some JNI method like `AttachCurrentThread()`, which will return error in bthread. So I modify the `exec_plan_fragment` to make sure it will be executed in pthread pool.	2023-07-01 12:29:22 +08:00
ZhangYu0123	1fe04b7242	[Chore](metrics) remove trace metrics code using runtime profile instead (#21394 ) * commit * fix * format	2023-07-01 12:18:23 +08:00
Pxl	88cbea2b56	[Bug](agg-state) fix core dump on not nullable argument for aggstate's nested argument (#21331 ) fix core dump on not nullable argument for aggstate's nested argument	2023-06-30 18:20:25 +08:00
amory	b7d6a70868	[FIX](datatype) Implement hash func with array/map/struct type (#21334 ) we do not Implement any hash functions in array/map/struct column , so we use sql like this will make be core select * from ( select bdp.nc_num, collect_list(distinct(bd.catalog_name)) as catalog_name, material_qty from dataease.bu_delivery_product bdp left join dataease.bu_trans_transfer btt on bdp.delivery_product_id = btt.delivery_product_id left join dataease.bu_delivery bd on bdp.delivery_id = bd.delivery_id where bd.val_status in ('10', '20', '30', '90') and bd.delivery_type in (0, 1, 2) group by nc_num, material_qty union ALL select bdp.nc_num, collect_list(distinct(bd.catalog_name)) as catalog_name, material_qty from dataease.bu_trans_transfer btt left join dataease.bu_delivery_product bdp on bdp.delivery_product_id = btt.delivery_product_id left join dataease.bu_delivery bd on bdp.delivery_id = bd.delivery_id where bd.val_status in ('10', '20', '30', '90') and bd.delivery_type in (0, 1, 2) group by nc_num, material_qty ) aa; core :	2023-06-30 17:11:35 +08:00
Kaijie Chen	53f90cb2e3	[fix](load) fix tablet id in RowsetWriterContext (#21336 )	2023-06-30 14:59:43 +08:00
Xinyi Zou	25b5bab22d	[fix](memory) Fix hash table buf initialize null pointer (#21315 ) When compiling FunctionArrayEnumerateUniq::_execute_by_hash, AllocatorWithStackMemory::free(buf) will be called when delete HashMapContainer. the gcc compiler will think that size > N and buf is not heap memory, and report an error ' void free(void*)' called on unallocated object 'hash_map' This only fails on doris docker + gcc 11.1, no problem on doris docker + clang 16.0.1, no problem on ldb_toolchanin gcc 11.1 and clang 16.0.1.	2023-06-30 14:50:53 +08:00
AlexYue	1ac724c2dd	[enhance](BufferedReader) don't blocking wait on buffered reader's condition variable (#21153 )	2023-06-30 14:34:27 +08:00
Liqf	d76fa427a3	[improve](jsonb)Invalid json path prompts an error instead of null (#19646 ) 1. Invalid json path prompts an error instead of null： before： ```sql mysql> SELECT jsonb_extract('[{"k1":"v41","k2":400},1,"a",3.14]', '$[a]'); +-------------------------------------------------------------+ \| jsonb_extract('[{"k1":"v41","k2":400},1,"a",3.14]', '$[a]') \| +-------------------------------------------------------------+ \| NULL \| +-------------------------------------------------------------+ 1 row in set (0.01 sec) ``` now ```sql mysql> SELECT jsonb_extract('[{"k1":"v41","k2":400},1,"a",3.14]', '$[a]'); ERROR 1105 (HY000): errCode = 2, detailMessage = (127.0.0.1)[INVALID_ARGUMENT]Json path error: Invalid Json Path for value: $[a] ``` 2. fix some problem: https://github.com/apache/doris/pull/19185 a. support negative numbers ```sql mysql> SELECT jsonb_extract('[{"k1":"v41","k2":400},1,"a",3.14]', '$[-2]'); +--------------------------------------------------------------+ \| jsonb_extract('[{"k1":"v41","k2":400},1,"a",3.14]', '$[-2]') \| +--------------------------------------------------------------+ \| "a" \| +--------------------------------------------------------------+ 1 row in set (0.02 sec) ``` b. Avoid using unnecessary memory 3. Supplementary regression test	2023-06-30 14:29:21 +08:00
yongjinhou	df23ab3f29	[Enhancement](tvf) Add authentication for workload group tvf (#21323 )	2023-06-30 12:56:23 +08:00
abmdocrt	2fcb0e090b	[Fix](Snapshot) Shoule use false instead of 0 in while loop (#20966 )	2023-06-30 10:22:51 +08:00
TengJianPing	33fa5dd1e9	[fix](cast) fix coredump of cast string of invalid datetime (#21350 ) For sql like select cast("627492340" as datetime); the string is an invalid datetime, function DateV2Value<T>::from_date_str cast it as datetime 2062-74-92 23:40:00, with an out-of-range month and day value, which cause memory violation in function DateV2Value<T>::format_datetime when trying to access s_days_in_month. ==256444==ERROR: AddressSanitizer: global-buffer-overflow on address 0x55a7c1a5cff8 at pc 0x55a7e5aa3d2a bp 0x7f3b805f0370 sp 0x7f3b805f0368 READ of size 4 at 0x55a7c1a5cff8 thread T390 (FragmentMgrThre) #0 0x55a7e5aa3d29 in doris::vectorized::DateV2Value<doris::vectorized::DateTimeV2ValueType>::format_datetime(unsigned int, bool) const /home/zcp/repo_center/doris_master/doris/be/src/vec/runtime/vdatetime_value.cpp:1821:31 #1 0x55a7e5aa3052 in doris::vectorized::DateV2Value<doris::vectorized::DateTimeV2ValueType>::from_date_str(char const, int, int) /home/zcp/repo_center/doris_master/doris/be/src/vec/runtime/vdatetime_value.cpp:1968:5 #2 0x55a7d48f0c49 in bool doris::vectorized::read_datetime_v2_text_impl<unsigned long>(unsigned long&, doris::vectorized::ReadBuffer&, unsigned int) /home/zcp/repo_center/doris_master/doris/be/src/vec/io/io_helper.h:309:19 #3 0x55a7ddb21642 in bool doris::vectorized::try_read_datetime_v2_text<unsigned long>(unsigned long&, doris::vectorized::ReadBuffer&, unsigned int) /home/zcp/repo_center/doris_master/doris/be/src/vec/io/io_helper.h:409:12 #4 0x55a7ddb215ec in bool doris::vectorized::try_parse_impl<doris::vectorized::DataTypeDateTimeV2, unsigned int, void>(doris::vectorized::DataTypeDateTimeV2::FieldType&, doris::vectorized::ReadBuffer&, DateLUTImpl const, unsigned int) /home/zcp/repo_center/doris_master/doris/be/src/vec/functions/function_cast.h:839:16 #5 0x55a7ddb21c84 in auto doris::Status doris::vectorized::ConvertThroughParsing<doris::vectorized::DataTypeString, doris::vectorized::DataTypeDateTimeV2, doris::vectorized::NameCast>::execute<void>(doris::FunctionContext, doris::vectorized::Block&, std::vector<unsigned long, std::allocator<unsigned long>> const&, unsigned long, unsigned long, bool, void)::'lambda'(void, auto)::operator()<std::integral_constant<bool, false>, std::integral_constant<bool, true>>(void, auto) const /home/zcp/repo_center/doris_master/doris/be/src/vec/functions/function_cast.h:1340:38 #6 0x55a7ddb216f7 in void* std::__invoke_impl<doris::Status, doris::Status doris::vectorized::ConvertThroughParsing<doris::vectorized::DataTypeString, doris::vectorized::DataTypeDateTimeV2, doris::vectorized::NameCast>::execute<void>(doris::FunctionContext, doris::vectorized::Block&, std::vector<unsigned long, std::allocator<unsigned long>> const&, unsigned long, unsigned long, bool, void)::'lambda'(void, auto), std::integral_constant<bool, false>, std::integral_constant<bool, true>>(std::__invoke_other, auto&&, std::integral_constant<bool, false>&&, std::integral_constant<bool, true>&&) /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:61:14 #7 0x55a7ddb2167f in std::__invoke_result<void, std::integral_constant<bool, false>, std::integral_constant<bool, true>>::type std::__invoke<doris::Status doris::vectorized::ConvertThroughParsing<doris::vectorized::DataTypeString, doris::vectorized::DataTypeDateTimeV2, doris::vectorized::NameCast>::execute<void>(doris::FunctionContext, doris::vectorized::Block&, std::vector<unsigned long, std::allocator<unsigned long>> const&, unsigned long, unsigned long, bool, void)::'lambda'(void, auto), std::integral_constant<bool, false>, std::integral_constant<bool, true>>(void&&, std::integral_constant<bool, false>&&, std::integral_constant<bool, true>&&) /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:96:14 #8 0x55a7ddb20d14 in std::__detail::__variant::__gen_vtable_impl<std::__detail::__variant::_Multi_array<std::__detail::__variant::__deduce_visit_result<doris::Status> ()(doris::Status doris::vectorized::ConvertThroughParsing<doris::vectorized::DataTypeString, doris::vectorized::DataTypeDateTimeV2, doris::vectorized::NameCast>::execute<void>(doris::FunctionContext, doris::vectorized::Block&, std::vector<unsigned long, std::allocator<unsigned long>> const&, unsigned long, unsigned long, bool, void)::'lambda'(void, auto)&&, std::variant<std::integral_constant<bool, false>, std::integral_constant<bool, true>>&&, std::variant<std::integral_constant<bool, false>, std::integral_constant<bool, true>>&&)>, std::integer_sequence<unsigned long, 0ul, 1ul>>::__visit_invoke(doris::Status doris::vectorized::ConvertThroughParsing<doris::vectorized::DataTypeString, doris::vectorized::DataTypeDateTimeV2, doris::vectorized::NameCast>::execute<void>(doris::FunctionContext, doris::vectorized::Block&, std::vector<unsigned long, std::allocator<unsigned long>> const&, unsigned long, unsigned long, bool, void)::'lambda'(void, auto)&&, std::variant<std::integral_constant<bool, false>, std::integral_constant<bool, true>>&&, std::variant<std::integral_constant<bool, false>, std::integral_constant<bool, true>>&&) /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/variant:1013:11 #9 0x55a7ddb20c15 in decltype(auto) std::__do_visit<std::__detail::__variant::__deduce_visit_result<doris::Status>, doris::Status doris::vectorized::ConvertThroughParsing<doris::vectorized::DataTypeString, doris::vectorized::DataTypeDateTimeV2, doris::vectorized::NameCast>::execute<void>(doris::FunctionContext, doris::vectorized::Block&, std::vector<unsigned long, std::allocator<unsigned long>> const&, unsigned long, unsigned long, bool, void)::'lambda'(void, auto), std::variant<std::integral_constant<bool, false>, std::integral_constant<bool, true>>, std::variant<std::integral_constant<bool, false>, std::integral_constant<bool, true>>>(auto&&, std::variant<std::integral_constant<bool, false>, std::integral_constant<bool, true>>&&, std::variant<std::integral_constant<bool, false>, std::integral_constant<bool, true>>&&) /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/variant:1714:14 #10 0x55a7ddb20b6a in decltype(auto) std::visit<doris::Status doris::vectorized::ConvertThroughParsing<doris::vectorized::DataTypeString, doris::vectorized::DataTypeDateTimeV2, doris::vectorized::NameCast>::execute<void>(doris::FunctionContext, doris::vectorized::Block&, std::vector<unsigned long, std::allocator<unsigned long>> const&, unsigned long, unsigned long, bool, void)::'lambda'(void, auto), std::variant<std::integral_constant<bool, false>, std::integral_constant<bool, true>>, std::variant<std::integral_constant<bool, false>, std::integral_constant<bool, true>>>(void&&, std::variant<std::integral_constant<bool, false>, std::integral_constant<bool, true>>&&, std::variant<std::integral_constant<bool, false>, std::integral_constant<bool, true>>&&) /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/variant:1769:9 #11 0x55a7ddb205ff in doris::Status doris::vectorized::ConvertThroughParsing<doris::vectorized::DataTypeString, doris::vectorized::DataTypeDateTimeV2, doris::vectorized::NameCast>::execute<void>(doris::FunctionContext, doris::vectorized::Block&, std::vector<unsigned long, std::allocator<unsigned long>> const&, unsigned long, unsigned long, bool, void) /home/zcp/repo_center/doris_master/doris/be/src/vec/functions/function_cast.h:1321:23 #12 0x55a7ddb1f2c7 in doris::vectorized::FunctionConvertFromString<doris::vectorized::DataTypeDateTimeV2, doris::vectorized::NameCast>::execute_impl(doris::FunctionContext, doris::vectorized::Block&, std::vector<unsigned long, std::allocator<unsigned long>> const&, unsigned long, unsigned long) /home/zcp/repo_center/doris_master/doris/be/src/vec/functions/function_cast.h:1417:20	2023-06-30 10:12:31 +08:00
airborne12	a3033bff42	[Fix](s3FileWriter) fix bytes_appended bug for s3_file_writer (#21348 )	2023-06-29 22:06:49 +08:00
Xin Liao	2643f3a167	[fix](merge-on-write) fix dead lock when publish (#21339 )	2023-06-29 20:58:47 +08:00
airborne12	d00326549f	[Fix](inverted index) fix a bundle of inverted index exception process errors (#21328 ) 1. fix alloc/free mismatch problem. 2. fix buffer use after free problem.	2023-06-29 19:47:55 +08:00
zhangdong	41ccf77c7d	[feature][fix](fs)(s3)add fs_s3 benchmark tool and fix s3 file writer bug (#20926 ) 1. Fix bug that the field of s3_file_write_bufferpool is not initialized, causing undefined behavior. 2. add fs_s3 benchmark tool，Reference to the usage of tools https://github.com/apache/doris/pull/20770 And opt the output: `sh bin/run-fs-benchmark.sh --conf=conf/s3.conf --fs_type=s3 --operation=single_read --threads=1 --iterations=1` ``` ------------------------------------------------------------------------------------------------------------------------------ Benchmark Time CPU Iterations UserCounters... ------------------------------------------------------------------------------------------------------------------------------ S3ReadBenchmark/iterations:1/repeats:3/manual_time/threads:1 7366 ms 123 ms 1 ReadRate(B/S)=12.1823M/s ReadTime(S)=7.36572 ReadTotal(B)=89.7314M S3ReadBenchmark/iterations:1/repeats:3/manual_time/threads:1 6163 ms 116 ms 1 ReadRate(B/S)=14.5597M/s ReadTime(S)=6.16299 ReadTotal(B)=89.7314M S3ReadBenchmark/iterations:1/repeats:3/manual_time/threads:1 6048 ms 110 ms 1 ReadRate(B/S)=14.8366M/s ReadTime(S)=6.04796 ReadTotal(B)=89.7314M S3ReadBenchmark/iterations:1/repeats:3/manual_time/threads:1_mean 6526 ms 116 ms 3 ReadRate(B/S)=13.8596M/s ReadTime(S)=6.52556 ReadTotal(B)=89.7314M S3ReadBenchmark/iterations:1/repeats:3/manual_time/threads:1_median 6163 ms 116 ms 3 ReadRate(B/S)=14.5597M/s ReadTime(S)=6.16299 ReadTotal(B)=89.7314M S3ReadBenchmark/iterations:1/repeats:3/manual_time/threads:1_stddev 730 ms 6.68 ms 3 ReadRate(B/S)=1.45914M/s ReadTime(S)=0.729876 ReadTotal(B)=0 S3ReadBenchmark/iterations:1/repeats:3/manual_time/threads:1_cv 11.18 % 5.75 % 3 ReadRate(B/S)=10.53% ReadTime(S)=11.18% ReadTotal(B)=0.00% S3ReadBenchmark/iterations:1/repeats:3/manual_time/threads:1_max 7366 ms 123 ms 3 ReadRate(B/S)=14.8366M/s ReadTime(S)=7.36572 ReadTotal(B)=89.7314M S3ReadBenchmark/iterations:1/repeats:3/manual_time/threads:1_min 6048 ms 110 ms 3 ReadRate(B/S)=12.1823M/s ReadTime(S)=6.04796 ReadTotal(B)=89.7314M ```	2023-06-29 19:03:49 +08:00
Jack Drogon	8c532e8808	[fix](restore) work around, ingest binlog after backup/restore which local_tablet.partition_id is not correct, use req.partition_id (#21288 ) * work around, ingest binlog after backup/restore which local_tablet.partition_id is not correct, use by req.partition_id Signed-off-by: Jack Drogon <jack.xsuperman@gmail.com>	2023-06-29 17:19:02 +08:00
Pxl	45f1909bc3	[Bug](lateral-view) make lateral view function's nullable mode work (#21242 ) make lateral view function's nullable mode work	2023-06-29 10:50:07 +08:00
Jerry Hu	7f0e37069f	[improvement](olap) filter the whole segment by dictionary (#21239 )	2023-06-29 10:34:29 +08:00
Jack Drogon	3f99b91ddf	[fix](gc_binlog) Fix tablet gc_binlogs nullptr (#21158 )	2023-06-29 10:10:33 +08:00
Pxl	f8cfe5e579	[Bug](pipeline) add DCHECK for _instance_to_sending_by_pipeline = false on _send_rpc (#21169 ) add DCHECK for _instance_to_sending_by_pipeline = false on _send_rpc	2023-06-29 10:03:57 +08:00
Xiangyu Wang	86af533e83	[Enhancement](heartbeat) make heartbeat ok when config repeated host-ip pairs (#21228 )	2023-06-28 23:12:06 +08:00
DongLiang-0	a6b51ec19a	[Feature](avro) Support Apache Avro file format (#19990 ) support read avro file by hdfs() or s3() . ```sql select * from s3( "uri" = "http://127.0.0.1:9312/test2/person.avro", "ACCESS_KEY" = "ak", "SECRET_KEY" = "sk", "FORMAT" = "avro"); +--------+--------------+-------------+-----------------+ \| name \| boolean_type \| double_type \| long_type \| +--------+--------------+-------------+-----------------+ \| Alyssa \| 1 \| 10.0012 \| 100000000221133 \| \| Ben \| 0 \| 5555.999 \| 4009990000 \| \| lisi \| 0 \| 5992225.999 \| 9099933330 \| +--------+--------------+-------------+-----------------+ select * from hdfs( "uri" = "hdfs://127.0.0.1:9000/input/person2.avro", "fs.defaultFS" = "hdfs://127.0.0.1:9000", "hadoop.username" = "doris", "format" = "avro"); +--------+--------------+-------------+-----------+ \| name \| boolean_type \| double_type \| long_type \| +--------+--------------+-------------+-----------+ \| Alyssa \| 1 \| 8888.99999 \| 89898989 \| +--------+--------------+-------------+-----------+ ``` current avro reader only support common data type, the complex data types will be supported later.	2023-06-28 21:15:35 +08:00
Xinyi Zou	d2c42ec638	[fix](memory) Purge Jemalloc arena dirty pages when memory insufficient (#21237 ) Jemalloc dirty page only use madvise MADV_FREE, memory is not release back to system, RSS won't reduce in time, So when the process memory exceed limit or system available memory is insufficient, manually transfer dirty page to the muzzy page, which will call MADV_DONTNEED to release the physical memory back to the system. https://jemalloc.net/jemalloc.3.html#opt.dirty_decay_ms	2023-06-28 16:49:45 +08:00
Xinyi Zou	0396f78590	[fix](memory) Remove ChunkAllocator & fix Allocator no use mmap (#21259 )	2023-06-28 16:10:24 +08:00

1 2 3 4 5 ...

4770 Commits