doris

Author	SHA1	Message	Date
TengJianPing	0436013baf	[fix](decimal) fix cast decimal overflow and add test cases for casting decimalv2 to decimalv3 (#29165 )	2023-12-27 20:58:37 +08:00
TengJianPing	5f71691401	[fix](read) fix unexpected overflow of uninitialized column data in VStatisticsIterator::next_batch (#29141 )	2023-12-27 20:58:02 +08:00
lihangyu	6d817bc253	[fix](topn opt) avoid using topn runtime predicate which segment does not contain such column(column unique id) when pruning segment (#29148 )	2023-12-27 20:31:03 +08:00
Gabriel	c75e63a2a5	[Improvement](scan) Use scanner to do projection of scan node (#29124 )	2023-12-27 16:00:52 +08:00
Gabriel	cd1e109cc3	[debug string](pipeline) Add necessary debug info (#29119 )	2023-12-27 15:57:22 +08:00
Ashin Gau	2d2f14bc75	[fix](paimon) use SlotDescriptor to parse the required fields (#28990 ) Before this PR, Paimon created the schema of `VectorTable` by accessing meta information. However, if the schema of `VectorTable` in Java is not the same as that of `Block` in C++, BE will crash, and there is no good way to troubleshoot the error.	2023-12-27 15:45:53 +08:00
lihangyu	cfed36afbf	[Fix](topn opt) prevent from merge __TEMP__ column in segment iterator (#29121 )	2023-12-27 15:42:48 +08:00
airborne12	6f5672f318	[Refact](inverted index) refactor inverted index writer init (#29072 )	2023-12-27 12:49:26 +08:00
TengJianPing	3e5c8d9949	[fix](read) remove logic of estimating count of rows to read in segment iterator to avoid wrong result of unique key. (#29109 )	2023-12-27 12:25:14 +08:00
abmdocrt	9ff8bd2e9c	[Enhancement](Wal)Support dynamic wal space limit (#27726 )	2023-12-27 11:51:32 +08:00
zhiqiang	6d26aca4ca	[fix](pipeline) sort_merge should throw exception in has_next_block if got failed status (#29076 ) Test in regression-test/suites/datatype_p0/decimalv3/test_decimalv3_overflow.groovy::249 sometimes failed when there are multiple BEs and the FE process reports status slowly for some reason. Query: explain select k1, k2, k1 * k2 from test_decimal128_overflow2 order by 1,2,3. Explain String (Nereids Planner): PLAN FRAGMENT 0: OUTPUT EXPRS: k1[#5], k2[#6], (k1 * k2)[#7]; PARTITION: UNPARTITIONED; HAS_COLO_PLAN_NODE: false; VRESULT SINK, MYSQL_PROTOCAL; 111:VMERGING-EXCHANGE, offset: 0. PLAN FRAGMENT 1: PARTITION: HASH_PARTITIONED: k1[#0], k2[#1]; HAS_COLO_PLAN_NODE: false; STREAM DATA SINK, EXCHANGE ID: 111, UNPARTITIONED; 108:VSORT, order by: k1[#5] ASC, k2[#6] ASC, (k1 * k2)[#7] ASC, offset: 0; 102:VOlapScanNode, TABLE: regression_test_datatype_p0_decimalv3.test_decimal128_overflow2(test_decimal128_overflow2), PREAGGREGATION: ON, partitions=1/1 (test_decimal128_overflow2), tablets=8/8, tabletList=22841,22843,22845 ..., cardinality=6, avgRowSize=0.0, numNodes=1, pushAggOp=NONE, projections: k1[#0], k2[#1], (k1[#0] * k2[#1]), project output tuple id: 1. 36 rows in set (0.03 sec). Why it failed: 1. There are multiple BEs. 2. Fragments 0 and 1 MUST be on different BEs. 3. The pipeline task of VOlapScanNode, which executes k1*k2, fails and sets the query status to cancelled. 4. The pipeline task of VSort calls try close and sends the Cancelled status to VMergeExchange. 5. sort_cursor did not throw an exception when it met the error.	2023-12-27 10:06:01 +08:00
Guangming Lu	a8e6676640	[Bug](security) BE download_files function exists log print sensitive msg #28592 (#28594 )	2023-12-26 21:59:47 +08:00
Jerry Hu	6440fbfab6	[feature](scan) Implement parallel scanning by dividing the tablets based on the row range (#28967 ) * [feature](scan) parallel scann on dup/mow mode * fix bugs	2023-12-26 17:18:41 +08:00
Kaijie Chen	4a60d01dc7	[improve](move-memtable) increase load_stream_flush_token_max_tasks (#29011 )	2023-12-26 17:08:49 +08:00
zhengyu	1964a77d6c	[enhencement](config) change default memtable size & loadStreamPerNode & default load parallelism (#28977 ) We change the memtable size from 200MB to 100MB to achieve smoother flush performance. We change loadStreamPerNode from 20 to 60 to keep stream RPC from becoming the bottleneck when memtable_on_sink_node is enabled. We change the default S3 and broker load parallelism to make the most of the CPUs on modern multi-core systems. Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>	2023-12-26 16:22:52 +08:00
Kaijie Chen	31db633624	[improve](load) add profile for WaitFlushLimitTime (#29013 )	2023-12-26 15:41:54 +08:00
zzzxl	52eeee347f	[opt](compound) Optimize by deleting the compound expr after obtaining the final result (#28934 )	2023-12-26 14:10:53 +08:00
plat1ko	c8ed14f11c	[enhance](tablet) Reduce log in tablet meta (#28719 )	2023-12-26 13:37:30 +08:00
lihangyu	92660bb1b2	[chore](config) modify `variant_ratio_of_defaults_as_sparse_column` from 0.95 to 1 (#28984 ) since sparse column is not stable at present	2023-12-26 10:24:43 +08:00
Ashin Gau	f30e50676e	[opt](scanner) optimize the number of threads of scanners (#28640 ) 1. Remove `doris_max_remote_scanner_thread_pool_thread_num`, use `doris_scanner_thread_pool_thread_num` only. 2. Set the default value `doris_scanner_thread_pool_thread_num` as `std::max(48, CpuInfo::num_cores() * 4)`	2023-12-26 10:24:12 +08:00
lihangyu	75a45484b6	[chore](config) modify `tablet_schema_cache_recycle_interval` from 24h to 1h (#28980 ) To prevent from too many tablet schema cache in memory and lead to performance issue when hold lock to erase item	2023-12-26 00:34:58 +08:00
plat1ko	cefae3dc90	[bug](storage) Fix gc rowset bug (#28979 )	2023-12-26 00:29:03 +08:00
py023	137f785698	[fix](parquet_reader) misused bool pointer (#28986 ) Signed-off-by: pengyu <pengyu@selectdb.com>	2023-12-25 22:58:08 +08:00
zhiqiang	c2c5df9341	[opt](assert_num_rows) support filter in AssertNumRows operator and fix some explain (#28935 ) * NEED * Update pipeline x * fix pipelinex compile	2023-12-25 22:47:23 +08:00
TengJianPing	0af9371a96	[fix](hash join) fix column ref DCHECK failure of hash join node block mem reuse (#28991 ) Introduced by #28851: after evaluating the build side expr, some columns in the resulting block may be referenced more than once in the same block. For example, with coalesce(col_a, 'string'), if col_a is nullable but actually contains no null values, the function coalesce will insert a new nullable column which references the original col_a.	2023-12-25 22:19:01 +08:00
caiconghui	7081139bdc	[fix](block) fix be core while mutable block merge may cause different row size between columns in origin block (#27943 )	2023-12-25 20:35:22 +08:00
walter	91e5b47439	[fix](hdfs) Fix HdfsFileSystem::exists_impl crash (#28952 ) Calling hdfsGetLastExceptionRootCause without initializing ThreadLocalState will crash. This PR modifies the condition for determining the existence of an HDFS file: because hdfsExists sets errno to ENOENT when the file does not exist, we can use this condition to check whether a file exists rather than checking for the existence of the root cause.	2023-12-25 19:18:01 +08:00
Kaijie Chen	c2eabbd441	[fix](load) fix nullptr when getting memtable flush running count (#28942 ) * [fix](load) fix nullptr when getting memtable flush running count * style	2023-12-25 13:49:18 +08:00
lihangyu	e9e1e2894b	[performance](variant) support topn 2phase read for variant column (#28318 ) [performance](variant) support topn 2phase read for variant column	2023-12-25 11:50:41 +08:00
zclllyybb	f374beaa4e	[fix](log) regularise some BE error type and fix a load task check #28729	2023-12-25 10:45:19 +08:00
Mryange	3273e0e635	[refactor](pipelineX)do not override dependency() function in pipelineX (#28848 )	2023-12-25 10:36:31 +08:00
Mryange	24b1b4d96b	[fix](pipelineX) fix use global rf when there no shared_scans (#28869 )	2023-12-25 10:35:22 +08:00
Mryange	e326ebb63e	[feature](pipelineX) control exchange sink by memory usage (#28814 )	2023-12-25 10:31:50 +08:00
zzzxl	d42fd68d6b	[opt](invert index) Empty strings are not written to the index in the case of TOKENIZED (#28822 )	2023-12-25 10:23:07 +08:00
Jerry Hu	b7ae7a07c7	[fix](join) incorrect result of left semi/anti join with empty build side (#28898 )	2023-12-25 09:07:38 +08:00
Gavin Chou	bade50db56	[chore](test) Add testing util sync point (#28924 )	2023-12-24 21:59:11 +08:00
huanghaibin	145683ccdb	[improvement](group commit) make get column function more reliable when replaying wal (#28900 )	2023-12-24 21:17:39 +08:00
yiguolei	1545c36d16	Revert "[bugfix](scannercore) scanner will core in deconstructor during collect profile (#28727 )" (#28931 ) This reverts commit 4066de375efe6ff8e156a61df4f9316b3d9eaa4e.	2023-12-24 20:37:33 +08:00
Kang	db1da161f5	[optimize](zonemap) skip zonemap if predicate does not support_zonemap (#28595 ) * [optimize](zonemap) skip zonemap if predicate does not support_zonemap #27608 (#28506)	2023-12-24 19:34:13 +08:00
Xin Liao	dfbf082e06	[fix](merge-on-write) migration may cause duplicate keys for mow table (#28923 )	2023-12-23 23:37:00 +08:00
Ashin Gau	96d4778f2e	[fix](parquet) the end offset of column chunk may be wrong in parquet metadata (#28891 )	2023-12-23 22:21:04 +08:00
zhannngchen	de6c7a792e	[fix](chore) update dcheck to avoid core during stress test (#28895 )	2023-12-23 18:49:57 +08:00
nanfeng	2014396707	[fix](block) add block columns size dcheck (#28539 )	2023-12-23 15:21:53 +08:00
amory	e51f75e424	[FIX](map)fix map with rowstore table (#28877 )	2023-12-23 12:11:06 +08:00
yiguolei	4066de375e	[bugfix](scannercore) scanner will core in deconstructor during collect profile (#28727 )	2023-12-23 11:09:46 +08:00
zhengyu	43776465d9	[fix](segcompaction) disable segcompaction by default (#28906 ) Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>	2023-12-23 07:43:41 +08:00
Kaijie Chen	3b830f89a7	[improve](move-memtable) avoid using heavy work pool during append data (#28745 )	2023-12-22 22:51:30 +08:00
Kaijie Chen	f781f0cf24	[improve](load) limit delta writer flush task parallelism (#28883 )	2023-12-22 21:50:56 +08:00
Kaijie Chen	b1c5747f56	[improve](load) remove extra layer of heavy work pool in tablet_writer_add_block (#28550 )	2023-12-22 20:10:50 +08:00
Kaijie Chen	18c9ebce95	[improve](move-memtable) tweak load stream flush token num and max tasks (#28884 )	2023-12-22 20:08:47 +08:00

6430 Commits