Commit Graph

2703 Commits

Author SHA1 Message Date
c75e63a2a5 [Improvement](scan) Use scanner to do projection of scan node (#29124) 2023-12-27 16:00:52 +08:00
2d2f14bc75 [fix](paimon) use SlotDescriptor to parse the required fields (#28990)
Before this PR, Paimon has created the schema of `VectorTable` by accessing meta information. However, once the schema of `VectorTable` in java is not same as `Block` in c++, BE will crashed, and there is no good way to troubleshoot errors.
2023-12-27 15:45:53 +08:00
cfed36afbf [Fix](topn opt) prevent from merge __TEMP__ column in segment iterator (#29121) 2023-12-27 15:42:48 +08:00
9ff8bd2e9c [Enhancement](Wal)Support dynamic wal space limit (#27726) 2023-12-27 11:51:32 +08:00
6d26aca4ca [fix](pipeline) sort_merge should throw exception in has_next_block if got failed status (#29076)
Test in regression-test/suites/datatype_p0/decimalv3/test_decimalv3_overflow.groovy::249 sometimes failed when there are multiple BEs and FE process report status slowly for some reason.

explain select k1, k2, k1 * k2 from test_decimal128_overflow2 order by 1,2,3
--------------

+----------------------------------------------------------------------------------------------------------------------------+
| Explain String(Nereids Planner)                                                                                            |
+----------------------------------------------------------------------------------------------------------------------------+
| PLAN FRAGMENT 0                                                                                                            |
|   OUTPUT EXPRS:                                                                                                            |
|     k1[#5]                                                                                                                 |
|     k2[#6]                                                                                                                 |
|     (k1 * k2)[#7]                                                                                                          |
|   PARTITION: UNPARTITIONED                                                                                                 |
|                                                                                                                            |
|   HAS_COLO_PLAN_NODE: false                                                                                                |
|                                                                                                                            |
|   VRESULT SINK                                                                                                             |
|      MYSQL_PROTOCAL                                                                                                        |
|                                                                                                                            |
|   111:VMERGING-EXCHANGE                                                                                                    |
|      offset: 0                                                                                                             |
|                                                                                                                            |
| PLAN FRAGMENT 1                                                                                                            |
|                                                                                                                            |
|   PARTITION: HASH_PARTITIONED: k1[#0], k2[#1]                                                                              |
|                                                                                                                            |
|   HAS_COLO_PLAN_NODE: false                                                                                                |
|                                                                                                                            |
|   STREAM DATA SINK                                                                                                         |
|     EXCHANGE ID: 111                                                                                                       |
|     UNPARTITIONED                                                                                                          |
|                                                                                                                            |
|   108:VSORT                                                                                                                |
|   |  order by: k1[#5] ASC, k2[#6] ASC, (k1 * k2)[#7] ASC                                                                   |
|   |  offset: 0                                                                                                             |
|   |                                                                                                                        |
|   102:VOlapScanNode                                                                                                        |
|      TABLE: regression_test_datatype_p0_decimalv3.test_decimal128_overflow2(test_decimal128_overflow2), PREAGGREGATION: ON |
|      partitions=1/1 (test_decimal128_overflow2), tablets=8/8, tabletList=22841,22843,22845 ...                             |
|      cardinality=6, avgRowSize=0.0, numNodes=1                                                                             |
|      pushAggOp=NONE                                                                                                        |
|      projections: k1[#0], k2[#1], (k1[#0] * k2[#1])                                                                        |
|      project output tuple id: 1                                                                                            |
+----------------------------------------------------------------------------------------------------------------------------+
36 rows in set (0.03 sec)
Why failed:

Multiple BEs
Fragments 0 and 1 are MUST on different BEs
Pipeline task of VOlapScanNode which executes k1*k2 failed sets query status to cancelled
Pipeline task of VSort call try close, send Cancelled status to VMergeExchange
sort_curso did not throw exception when it meets error
2023-12-27 10:06:01 +08:00
6440fbfab6 [feature](scan) Implement parallel scanning by dividing the tablets based on the row range (#28967)
* [feature](scan) parallel scann on dup/mow mode

* fix bugs
2023-12-26 17:18:41 +08:00
f30e50676e [opt](scanner) optimize the number of threads of scanners (#28640)
1. Remove `doris_max_remote_scanner_thread_pool_thread_num`, use `doris_scanner_thread_pool_thread_num` only.
2. Set the default value `doris_scanner_thread_pool_thread_num` as `std::max(48, CpuInfo::num_cores() * 4)`
2023-12-26 10:24:12 +08:00
137f785698 [fix](parquet_reader) misused bool pointer (#28986)
Signed-off-by: pengyu <pengyu@selectdb.com>
2023-12-25 22:58:08 +08:00
c2c5df9341 [opt](assert_num_rows) support filter in AssertNumRows operator and fix some explain (#28935)
* NEED

* Update pipeline x

* fix pipelinex compile
2023-12-25 22:47:23 +08:00
0af9371a96 [fix](hash join) fix column ref DCHECK failure of hash join node block mem reuse (#28991)
Introduced by #28851, after evaluating build side expr, some columns in resulting block may be referenced more than once in the same block.

e.g. coalesce(col_a, 'string') if col_a is nullable but actually contains no null values, in this case funcition coalesce will insert a new nullable column which references the original col_a.
2023-12-25 22:19:01 +08:00
7081139bdc [fix](block) fix be core while mutable block merge may cause different row size between columns in origin block (#27943) 2023-12-25 20:35:22 +08:00
e9e1e2894b [performance](variant) support topn 2phase read for variant column (#28318)
[performance](variant) support topn 2phase read for variant column
2023-12-25 11:50:41 +08:00
f374beaa4e [fix](log) regularise some BE error type and fix a load task check #28729 2023-12-25 10:45:19 +08:00
e326ebb63e [feature](pipelineX) control exchange sink by memory usage (#28814) 2023-12-25 10:31:50 +08:00
b7ae7a07c7 [fix](join) incorrect result of left semi/anti join with empty build side (#28898) 2023-12-25 09:07:38 +08:00
1545c36d16 Revert "[bugfix](scannercore) scanner will core in deconstructor during collect profile (#28727)" (#28931)
This reverts commit 4066de375efe6ff8e156a61df4f9316b3d9eaa4e.
2023-12-24 20:37:33 +08:00
db1da161f5 [optimize](zonemap) skip zonemap if predicate does not support_zonemap (#28595)
* [optimize](zonemap) skip zonemap if predicate does not support_zonemap #27608 (#28506)
2023-12-24 19:34:13 +08:00
96d4778f2e [fix](parquet) the end offset of column chunk may be wrong in parquet metadata (#28891) 2023-12-23 22:21:04 +08:00
2014396707 [fix](block) add block columns size dcheck (#28539) 2023-12-23 15:21:53 +08:00
e51f75e424 [FIX](map)fix map with rowstore table (#28877) 2023-12-23 12:11:06 +08:00
4066de375e [bugfix](scannercore) scanner will core in deconstructor during collect profile (#28727) 2023-12-23 11:09:46 +08:00
fa0ad56817 [exec](compress) use FragmentTransmissionCompressionCodec control the exchange compress behavior (#28818) 2023-12-22 19:50:57 +08:00
aca8406e31 [refactor](executor)remove scan group #28847 2023-12-22 17:05:50 +08:00
d75300f166 [fix](hash join) fix stack overflow caused by evaluate case expr on huge build block (#28851) 2023-12-22 15:45:12 +08:00
012e66729a [improvement](executor) Add tvf and regression test for Workload Scheduler (#28733)
1 Add select workload schedule policy tvf
2 Add reg test
2023-12-22 12:09:51 +08:00
0af6bd6390 [fix](group-commit) check if wal need recovery is abnormal (#28769) 2023-12-22 11:06:11 +08:00
0b9b1be1f1 [fix](function) Fix from_second functions overflow and wrong result (#28685) 2023-12-22 10:22:49 +08:00
e74ff95087 [fix](compaction) compaction should catch exception when vertical block reader read next block (#28625) 2023-12-21 20:30:37 +08:00
0070909d30 [fix](group commit)Fix the issue of duplicate addition of wal path when encouter exception (#28691) 2023-12-21 20:27:33 +08:00
4f1aebb8e8 (topN)runtime_predicate is only triggered when the column name is obtained (#28419)
Issue Number: close #27485
2023-12-21 18:08:23 +08:00
bcf2683b9d [fix](scanner) fix concurrency bugs when scanner is stopped or finished (#28650)
`ScannerContext` will schedule scanners even after stopped, and confused with `_is_finished` and `_should_stop`.
 Only Fix the concurrency bugs when scanner is stopped or finished reported in https://github.com/apache/doris/pull/28384
2023-12-21 10:37:58 +08:00
970e1c8475 [fix](group_commit) fix group commit cancel stuck (#28749) 2023-12-21 10:32:21 +08:00
007f152e5e [Improve](compile) add __AVX2__ macro for JsonbParser (#28754)
* [Improve](compile) add `__AVX2__` macro for JsonbParser

* throw exception instead of CHECK
2023-12-21 10:25:26 +08:00
18ad8562f2 [refactor](broadcastbuffer) using a queue to remove ref and unref codes (#28698)
Co-authored-by: yiguolei <yiguolei@gmail.com>Add a new class broadcastbufferholderqueue to manage holders
Using shared ptr to manage holders, not use ref and unref, it is too difficult to maintain.
2023-12-20 21:23:25 +08:00
280a01b815 [pipelineX](improvement) Support global runtime filter (#28692) 2023-12-20 20:06:26 +08:00
504693be7f [bug](coredump) Fix coredump in aggregation node's destruction(#28684)
fix coredump in aggregation node's destruction
2023-12-20 20:02:48 +08:00
36857006cd [Fix](json reader) fix json reader crash due to fmt::format_to (#28737)
```
4# __gnu_cxx::__verbose_terminate_handler() [clone .cold] at ../../../../libstdc++-v3/libsupc++/vterminate.cc:75
5# __cxxabiv1::__terminate(void (*)()) at ../../../../libstdc++-v3/libsupc++/eh_terminate.cc:48
6# 0x00005622F33D22B1 in /mnt/ssd01/pipline/OpenSourceDoris/clusterEnv/P0/Cluster0/be/lib/doris_be
7# 0x00005622F33D2404 in /mnt/ssd01/pipline/OpenSourceDoris/clusterEnv/P0/Cluster0/be/lib/doris_be
8# fmt::v7::detail::error_handler::on_error(char const*) in /mnt/ssd01/pipline/OpenSourceDoris/clusterEnv/P0/Cluster0/be/lib/doris_be
9# char const* fmt::v7::detail::parse_replacement_field<char, fmt::v7::detail::format_handler<fmt::v7::detail::buffer_appender<char>, char, fmt::v7::basic_format_context<fmt::v7::detail::buffer_appender<char>, char> >&>(char const*, char const*, fmt::v7::detail::format_handler<fmt::v7::detail::buffer_appender<char>, char, fmt::v7::basic_format_context<fmt::v7::detail::buffer_appender<char>, char> >&) in /mnt/ssd01/pipline/OpenSourceDoris/clusterEnv/P0/Cluster0/be/lib/doris_be
10# void fmt::v7::detail::vformat_to<char>(fmt::v7::detail::buffer<char>&, fmt::v7::basic_string_view<char>, fmt::v7::basic_format_args<fmt::v7::basic_format_context<fmt::v7::detail::buffer_appender<fmt::v7::type_identity<char>::type>, fmt::v7::type_identity<char>::type> >, fmt::v7::detail::locale_ref) in /mnt/ssd01/pipline/OpenSourceDoris/clusterEnv/P0/Cluster0/be/lib/doris_be
11# doris::vectorized::NewJsonReader::_append_error_msg(rapidjson::GenericValue<rapidjson::UTF8<char>, rapidjson::MemoryPoolAllocator<rapidjson::CrtAllocator> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool*) at /root/doris/be/src/vec/exec/format/json/new_json_reader.cpp:924
12# doris::vectorized::NewJsonReader::_set_column_value
```
2023-12-20 19:58:30 +08:00
7b96730e87 [fix](block) fix nullptr in MutableBlock::allocated_bytes (#28738) 2023-12-20 19:46:13 +08:00
e8d0569d8b [refine](pipelineX)Make the 'set ready' logic of SenderQueue in pipelineX the same as that in the pipeline (#28488) 2023-12-20 19:26:00 +08:00
2b2d3d0eb1 [fix](meta_scanner) fix meta_scanner process ColumnNullable (#28711) 2023-12-20 17:41:38 +08:00
0c9c32c52d [Feature](datatype) update be ut codes and fix bugs for IPv4/v6 (#28670) 2023-12-20 14:38:46 +08:00
bcc32b5b26 [feature](invert index) match_regexp feature added (#28257) 2023-12-20 14:30:35 +08:00
c72191eb9e [refactor](profile&names) using dst_id in pipelinex profile to be same as non pipeline; rename some function names (#28626)
Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-12-19 17:44:29 +08:00
111185407c [Improve](tvf)jni-avro support split file (#27933) 2023-12-19 16:37:34 +08:00
b142ade69e [refactor](renamefile) rename some files according to the class names (#28606) 2023-12-19 14:10:11 +08:00
e362bf674f [regression-test](memtable) test memtable flush is high priority for vtable writerV1 (#28502) 2023-12-19 12:33:12 +08:00
Pxl
d6514618b2 [Improvement](decimal) reduce overhead on disable check decimal overflow (#28249)
reduce overhead on disable check decimal overflow
2023-12-19 10:12:30 +08:00
Pxl
89d728290d [Chore](execute) remove some unused code and adjust check_row_nums #28576 2023-12-19 09:55:50 +08:00
6503aaf7db [feature](planner) allow HLL and QUANTILE_STATE types on duplicate and unique table (#28546) 2023-12-19 09:54:24 +08:00
66fbb22ad7 [fix](group commit) Fix some wal problems on group commit (#28554) 2023-12-19 09:51:03 +08:00