doris

Author	SHA1	Message	Date
Adonis Ling	8bfb89c100	[feature-wip](array-type) Add some regression tests for nested array (#12322 ) #11392 made the _input_block in each BetaRowsetReader shareable. However, for some types (e.g. nested arrays more than one level deep), the _column_vector_batches in RowBlockV2 can be nested, meaning that one ColumnVectorBatch sits inside another. In this case, the data of the inner ColumnVectorBatch may be corrupted because the data of _input_block is shallowly copied to the _output_block.	2022-09-05 14:05:24 +08:00
Jerry Hu	7b352c93ff	[improvement](sink) avoid frequent allocation and deallocation when serializing block (#12310 )	2022-09-05 12:23:43 +08:00
TaoZex	7929500608	[typo](docs)The table_function calling reset() function should set _eos to false #12323	2022-09-05 08:29:19 +08:00
morrySnow	7f10fa9768	[fix](compile)compile error when use clang on aarch64 platform (#12319 )	2022-09-05 08:28:51 +08:00
Gabriel	d5e5afe437	[Bug](function) disable LUT for yearweek (#12324 )	2022-09-05 08:27:43 +08:00
xy720	62561834a8	[Feature](array-type) Support is-null-predicate for array type (#12237 )	2022-09-03 11:37:57 +08:00
xy720	e7303c12c7	[Enhancement](array-type) Support Floating/Decimal type for array aggregation functions (#12271 )	2022-09-03 09:55:56 +08:00
Pxl	a8c8ebf5cf	[Enhancement](compaction) empty string optimize for binary dict code (#12259 ) Improve the performance of writing empty strings.	2022-09-02 14:25:19 +08:00
Ashin Gau	202ad5c659	[feature-wip](parquet-reader) bug fix, the number of rows are different among columns in a block (#12228 ) 1. `ExprContext` is deleted in `ParquetReader::close()` without having been closed, so the `DCHECK` in `~ExprContext()` fails. The lifetime of `ExprContext` is managed by the scan node, so we should not delete its pointer in `ParquetReader::close()`. 2. `RowGroupReader::next_batch` updates `_read_rows` in every column loop and does not ensure that every column has the same number of rows. 3. The skipped row ranges are stack variables that are released when calling `ArrayColumnReader::read_column_data`, so we should copy them out.	2022-09-02 09:50:25 +08:00
Mingyu Chen	3ce305134a	[fix](scan) fix potential wrong cancel when sql has limit (#12224 )	2022-09-01 19:11:40 +08:00
Gabriel	3bcab8bbef	[feature](function) support now/current_timestamp functions with precision (#12219 )	2022-09-01 14:35:12 +08:00
pengxiangyu	c5481dfdf7	[fix](remote)Fix bug for Segment::open() in case: config::file_cache_type (#12249 )	2022-09-01 14:16:41 +08:00
TengJianPing	f294d33332	[bugfix](index) index page should not be bitshuffle decoded (#12231 ) * minor change	2022-09-01 11:56:44 +08:00
camby	fc05d54f0d	[fix](array-type) array_sort function with empty input #12175 Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>	2022-09-01 10:54:09 +08:00
HappenLee	8c8078ad28	[fix](projections) get error row_descriptor when have projections on ExecNode (#12232 ) When an ExecNode's projections are not empty, it uses the output row descriptor to initialize the block before doing projection, but it should use the original row descriptor. This PR fixes it.	2022-09-01 10:48:10 +08:00
yixiutt	60a2fa7dea	[Improvement](compaction) copy row in batch in VCollectIterator&VGenericIterator (#12214 ) In VCollectIterator&VGenericIterator, use insert_range_from to copy contiguous rows of a block in one batch to save CPU cost. If rows in rowsets and segments are non-overlapping, this improves compaction throughput by 30%. If rows are completely overlapping, such as when loading two identical files, the throughput is nearly the same as before. Co-authored-by: yixiutt <yixiu@selectdb.com>	2022-09-01 10:20:17 +08:00
Gabriel	90fb3b7783	[Improvement](load) accelerate tablet sink (#12174 )	2022-09-01 10:08:09 +08:00
Jibing-Li	ec4863b63a	[feature-wip](new-scan)Add new file scan node (#12048 ) Related pr: #11582 This is the new file scan node and scanner for external hms catalog.	2022-09-01 10:01:20 +08:00
luozenglin	65051d67cf	[fix](yearweek) fixed the yearweek result error when mode is set to 1 (#12234 )	2022-09-01 09:46:38 +08:00
Ashin Gau	1cc9eeeb1a	[feature-wip](parquet-reader) read and generate array column (#12166 ) Read and generate parquet array column. D=1, R=0 represents an empty array. An empty array is not a null value, so the NullMap for this row is false and the offset range for this row is [offset_start, offset_end) with `offset_start == offset_end`; offset_end is also the start offset of the next row, so the nested primitive column holds no value for this row. D=0, R=0 represents a null array, and the NullMap for this row is true.	2022-08-31 17:08:12 +08:00
HappenLee	573e5476dd	[Opt](load) Speed up the vectorized load (#12146 )	2022-08-31 16:23:36 +08:00
zxealous	254cb321b9	[optimize](remote) Optimize cache reader use a pre-created buffer when downloading the cache (#12165 ) * optimize cache reader * add description for config	2022-08-31 10:15:40 +08:00
Xinyi Zou	f72d2559cf	[fix](compile) Fix compile error '<unknown>' may be used uninitialized in PODArray::insert_prepare #12202	2022-08-31 09:12:28 +08:00
Mingyu Chen	22430cd7bb	[feature](stmt) add ADMIN COPY TABLET stmt for local debug (#12176 ) Add a new stmt ADMIN COPY TABLET to easily copy a tablet to a local environment to reproduce a problem. See the documentation for more details.	2022-08-31 09:06:49 +08:00
Zhengguo Yang	fdd236cc9b	[fix](simdjson) disable simdjson by default, wait for fix the bug (#12179 )	2022-08-30 20:24:46 +08:00
xy720	370b92a2ea	[Bug](compile) fix clang build for be (#12183 )	2022-08-30 17:58:54 +08:00
Kikyou1997	9a74ad1702	[feature](Nereids)add the ability of projection on each ExecNode and add column prune on OlapScan (#11842 ) We have added logical project before, but to actually finish the prune to reduce the data IO, we need to add related supports in translator and BE. This PR: - add projections on each ExecNode in BE - translate PhysicalProject into projections on PlanNode in FE - do column prune on ScanNode in FE Co-authored-by: HappenLee <happenlee@hotmail.com>	2022-08-30 16:17:10 +08:00
Mingyu Chen	a16cf0e2c8	[feature-wip](scan) add profile for new olap scan node (#12042 ) Copy most of the profiles from VOlapScanNode and VOlapScanner to NewOlapScanNode and NewOlapScanner. Fix some blocking bugs of the new scan framework. TODO: Memtracker, OpenTelemetry span. The new framework is still disabled by default, so it will not affect other features.	2022-08-30 10:55:48 +08:00
Xinyi Zou	8370115cf6	[enhancement](memtracker) Improve performance of tracking real physical memory of PODArray #12168	2022-08-30 10:22:12 +08:00
yiguolei	2f192019d3	[bugfix](delete handler) delete predicate is merged and could not find schema cause core dump (#12161 ) Co-authored-by: yiguolei <yiguolei@gmail.com>	2022-08-30 09:18:21 +08:00
Pxl	67e94d2aea	[Enhancement](compaction) add compaction use time count (#12141 )	2022-08-30 09:18:02 +08:00
Xinyi Zou	09b8d32421	[fix](memtracker) Fix mem limit exceed return wrong format (#12139 )	2022-08-29 21:07:02 +08:00
Zhengguo Yang	ed131b8eb0	[Bugfix](coredump) fix coredump caused by malformed fmt::format param (#12138 )	2022-08-29 12:45:22 +08:00
Gabriel	af09c1f4eb	[Improvement](window funnel) restrict timestamp to datetime type in window funnel (#12123 )	2022-08-29 12:14:04 +08:00
Gabriel	5f7d6e8f2b	[Refactor](predicate) Unify Conditions and ColumnPredicate (#11985 )	2022-08-29 12:11:22 +08:00
minghong	62e3bd338e	[refactor](BE) return error status when vslot_ref contains invalid slot_id (#12106 ) In the current implementation, invalid slots are detected at the execute phase, where it is hard to get useful information for further debugging. This PR moves error detection ahead to the prepare phase, so that we can log the related tuple descriptors.	2022-08-29 12:07:08 +08:00
plat1ko	db07e51cd3	[refactor](status) Refactor status handling in agent task (#11940 ) Refactor TaggableLogger. Refactor status handling in agent task: unify log format in TaskWorkerPool; pass Status to the top caller, and replace some OLAPInternalError with more detailed error message Status; return early with the opposite condition to reduce indentation.	2022-08-29 12:06:01 +08:00
pengxiangyu	ac425d4bf3	[fix](remote)Fix bug for cache reader (#12104 )	2022-08-29 11:28:17 +08:00
carlvinhust2012	44c4a45f72	[fix](array-type) fix the wrong data when use stream load to import '\N' (#12102 ) Co-authored-by: hucheng01 <hucheng01@baidu.com>	2022-08-29 09:53:37 +08:00
Ashin Gau	dec576a991	[feature-wip](parquet-reader) generate null values and NullMap for parquet column (#12115 ) Generate null values and NullMap for the nullable column by analyzing the definition levels.	2022-08-29 09:30:32 +08:00
Gabriel	6e6269c682	[Improvement](load) accelerate streamload and compaction (#12119 )	2022-08-28 23:10:47 +08:00
pengxiangyu	a6e2e2f3bc	[feature](remote)Add cache files cleaner for remote olap files (#11959 )	2022-08-26 23:59:36 +08:00
Ashin Gau	0b5bb565a7	[feature-wip](parquet-reader) parquet dictionary decoder (#11981 ) Parse parquet data with dictionary encoding. Using the PLAIN_DICTIONARY enum value is deprecated in the Parquet 2.0 specification. Prefer using RLE_DICTIONARY in a data page and PLAIN in a dictionary page for Parquet 2.0+ files. refer: https://github.com/apache/parquet-format/blob/master/Encodings.md	2022-08-26 19:24:37 +08:00
Zhengguo Yang	f3f17eb222	[Bugfix](load) fix be will coredump when parsing malformed json file using simdjson (#12062 )	2022-08-26 18:01:19 +08:00
carlvinhust2012	fba2658a1d	[fix](array-type) fix the be core dump when use collect_list result to insert (#12045 ) Co-authored-by: hucheng01 <hucheng01@baidu.com>	2022-08-26 18:00:43 +08:00
Pxl	3af0745c8f	[Bug](function) fix aggFnParams set not correct (#12006 )	2022-08-26 14:29:56 +08:00
Xinyi Zou	22157077e9	[fix](memtracker) Optimize the return msg of process memory limit exceed #12086 Return the real process memory information when the process exceeds the mem limit. Optimize the log printing logic for memory limit exceeded. The process tracker does not participate in the process memory limit.	2022-08-26 14:28:46 +08:00
Xinyi Zou	9caaa4bfbd	[fix](memory) fix set disable_chunk_allocator_in_vec=false performance #12092	2022-08-26 14:28:12 +08:00
yiguolei	ccff3f5711	[bugfix](light weight schema change) support delete condition in schema change (#11869 ) Co-authored-by: yiguolei <yiguolei@gmail.com>	2022-08-26 11:45:55 +08:00
Xinyi Zou	82ca62dfcc	[fix](memory) Fix disable_mem_pools to disable cache #12087	2022-08-26 11:43:19 +08:00

2709 Commits