doris

Author	SHA1	Message	Date
walter	d1dbe7bfc8	[fix](reader) fix leak in Level1Iterator (#23612 ) _merge_next() and _normal_next() leak _cur_child when _cur_child->next() returns failure.	2023-08-29 23:32:24 +08:00
zy-kkk	030df6db35	[fix](odbc) fix odbc insert string data to sqlserver (#23364 )	2023-08-29 21:47:50 +08:00
Siyang Tang	1ac0ff0ea9	[feature](delete-predicate) support delete sub predicate v2 (#22442 ) New structure for delete sub predicate. The delete sub predicate currently uses a string-typed condition_str for temporary storage, and fields are extracted from it using std::regex, which may cause a stack overflow when matching an extremely large string (a bug of libc). We now use a new PB structure to hold the delete sub predicate to avoid that problem. message DeleteSubPredicatePB { optional int32 column_unique_id = 1; optional string column_name = 2; optional string op = 3; optional string cond_value = 4; } Currently, both versions of the sub predicate are filled. Queries use v2, while compaction still uses v1. Old rowset meta whose delete predicates carry sub predicate v1 will be converted to v2, where possible, when read from PB. Moreover, efforts will be made to rewrite these meta with the new delete sub predicate. This prepares for using the column unique id to specify a column globally. Using the column unique id rather than the column name to identify a column is vital for flexible schema change. The rewritten delete predicate will carry the column unique id.	2023-08-29 19:37:23 +08:00
zhangstar333	94a8fa6bc9	[bug](function) fix explode_number function return wrong rows (#23603 ) Previously, the explode_number function returned random results with a const value: because _cur_size was reset, values could not be inserted into the column.	2023-08-29 19:02:49 +08:00
huanghaibin	82a4f114e4	[improvement](compaction) add an option on delete stale rowset by judging _stale_rs_metas size when doing compaction (#23448 )	2023-08-29 17:40:37 +08:00
huanghaibin	1410a15a61	[fix](compaction) print column name when checking block ColumnPtr is nullptr on get block byte (#23338 )	2023-08-29 17:24:48 +08:00
lihangyu	0cece561f9	[refactor](segment iterator) remove std::map in iterator, use std::vector instead and do not rely on unique id to identify position (#23505 )	2023-08-29 16:43:32 +08:00
amory	f7a3d2778a	[FIX](array)update array olapconvertor and support array nested other complex type (#23489 ) * update array olapconvertor and support array nested other complex type * update for inverted index	2023-08-29 16:18:11 +08:00
amory	993659cd0b	[FIX](serde) fix handle serde error #23565	2023-08-29 14:55:35 +08:00
Qi Chen	97eb2b9172	[Fix](multi-catalog) Fix broker load reader and hdfs reader issue. (#23529 ) Broker load with broker sometimes will throw 'Invalid orc post script length'. hdfs query sometimes will throw 'Invalid orc post script length'.	2023-08-29 13:45:48 +08:00
Gabriel	7dcde4d529	[bug](decimal) Use max value as result if overflow (#23602 ) * [bug](decimal) Use max value as result if overflow * update	2023-08-29 13:26:25 +08:00
Pxl	7913354f78	add column number check for vsorted_run_merger (#23584 )	2023-08-29 10:41:59 +08:00
yuxuan-luo	0128dd42d9	[fix](regexp_extract_all) fix be OOM when querying with regexp_extrac… (#23284 )	2023-08-29 10:34:12 +08:00
abmdocrt	da9eb79ac4	[Enhancement](Schema hash) Remove schema hash in tablet info (#23516 )	2023-08-29 10:05:12 +08:00
Kaijie Chen	d863cc3a12	[fix](move-memtable) fix tablets to commit (#23577 )	2023-08-29 09:49:07 +08:00
Yongqiang YANG	9c65b7ab96	[improvement](column_reader) move load once to index reader to reduce (#23537 ) memory footprint of column reader	2023-08-29 09:34:27 +08:00
huanghaibin	fbf8499999	[improvement](compaction) reduce the memory usage of vertical compaction (#23388 )	2023-08-28 21:54:21 +08:00
HHoflittlefish777	35a1404bbe	[fix](load) add error handle when load data dir (#23457 )	2023-08-28 19:33:50 +08:00
HHoflittlefish777	392437008c	[Improvement](ColumnReader) optimize memory usage of ColumnReader meta (#23528 )	2023-08-28 17:57:59 +08:00
Siyang Tang	650cc25ea4	[fix](light-schema-change) fix schema consistency check failed (#23283 )	2023-08-28 16:40:30 +08:00
Gabriel	29b94c4ed7	[pipeline](refactor) refine pipeline fragment context (#23478 )	2023-08-28 15:55:02 +08:00
TengJianPing	7e7cfd17bf	[fix](tablet sink) check data valid of tablet sink data (#23530 ) Co-authored-by: Yongqiang YANG <98214048+dataroaring@users.noreply.github.com>	2023-08-28 15:54:12 +08:00
Pxl	3049533e63	[Bug](materialized-view) fix core dump on create materialized view when different mv column have same reference base column (#23425 ) * Remove redundant predicates on scan node update fix core dump on create materialized view when different mv column have same reference base column Revert "update" This reverts commit d9ef8dca123b281dc8f1c936ae5130267dff2964. Revert "Remove redundant predicates on scan node" This reverts commit f24931758163f59bfc47ee10509634ca97358676. * update * fix * update * update	2023-08-28 14:40:51 +08:00
Gabriel	28a2e71084	[pipelineX](refactor) refine codes (#23521 ) * [pipelineX](refactor) refine codes * update * update	2023-08-28 14:38:07 +08:00
Jerry Hu	c05319b8eb	[fix](agg) incorrect result of bitmap_agg and bitmap_union (#23558 )	2023-08-28 14:22:19 +08:00
Jerry Hu	5be8d57f52	[fix](be-ut) fix ColumnFixedLenghtObjectTest on 32 bits system (#23519 )	2023-08-28 14:02:05 +08:00
TengJianPing	962221cb18	[test](log) add log for debug case failure (#23506 )	2023-08-28 10:45:25 +08:00
Kang	981586155c	[Improvement][json] optimize performance of json_extract by reusing json path object (#23430 ) * reuse json path to speed up json function * fix typo * clang format * path reentry safe * fix compile error * fix bug of continue	2023-08-27 17:39:10 +08:00
Adonis Ling	e0bf621fe0	[chore](build) Fix compilation errors for BE UT (#23535 ) Issue Number: close #23536 This issue was introduced by #23414 .	2023-08-27 11:52:13 +08:00
Chenyang Sun	153e8f0f72	[improvement](table property) support for alter table property: skip write index, single compaction (#23475 )	2023-08-26 23:52:09 +08:00
Xinyi Zou	ba351af452	[enhancement](thirdparty) upgrade thirdparty libs - again (#23414 ) Resubmission of #23290 (brpc is not upgraded, because bthread local has an error). Upgraded: protobuf 3.15.0 -> 21.11; glog 0.4.0 -> 0.6.0; lz4 1.9.3 -> 1.9.4; curl 7.79.0 -> 8.2.1; zstd 1.5.2 -> 1.5.5; arrow 7.0.0 -> 13.0.0; abseil 20220623.1 -> 20230125.3; orc 1.7.2 -> 1.9.0; jemalloc for arrow 5.2.1 -> 5.3.0; xsimd 7.0.0 -> 13.0.0; opentelemetry-proto 0.19.0 -> 1.0.0; opentelemetry 1.8.3 -> 1.10.0. New: c-ares -> 1.19.1; grpc -> 1.54.3	2023-08-26 22:59:10 +08:00
Lightman	30e3c5bbe6	[bugfix](file cache) Fix the init file cache coredump (#23464 ) * [bugfix](file cache) Fix the init file cache coredump * fix compile	2023-08-26 16:50:50 +08:00
Mingyu Chen	40be6a0b05	[fix](hive) do not split compress data file and support lz4/snappy block codec (#23245 ) 1. do not split compress data file Some data files in hive are compressed with gzip, deflate, etc. These kinds of files cannot be split. 2. Support lz4 block codec for hive scan node, use lz4 block codec instead of lz4 frame codec 4. Support snappy block codec For hadoop snappy 5. Optimize the `count()` query of csv file For a query like `select count() from tbl`, only the lines need to be split, not the columns. Need to pick to branch-2.0 after this PR: #22304	2023-08-26 12:59:05 +08:00
Yongqiang YANG	bc020112fc	[enhancement](routineload) add debug conf and set broker.name.ttl = 0 (#23302 ) * set broker.name.ttl = 0 * add debug config for librdkafka	2023-08-26 10:56:35 +08:00
Tiewei Fang	f32efe5758	[Fix](Outfile) Fix that it does not report error when exporting a table to S3 with an incorrect ak/sk/bucket (#23441 ) Problem: It will return a result even though we use a wrong ak/sk/bucket name, such as: ```sql mysql> select * from demo.student -> into outfile "s3://xxxx/exp_" -> format as csv -> properties( -> "s3.endpoint" = "https://cos.ap-beijing.myqcloud.com", -> "s3.region" = "ap-beijing", -> "s3.access_key"= "xxx", -> "s3.secret_key" = "yyyy" -> ); +------------+-----------+----------+----------------------------------------------------------------------------------------------------+ | FileNumber | TotalRows | FileSize | URL | +------------+-----------+----------+----------------------------------------------------------------------------------------------------+ | 1 | 3 | 26 | s3://xxxx/exp_2ae166e2981d4c08-b577290f93aa82ba_ | +------------+-----------+----------+----------------------------------------------------------------------------------------------------+ 1 row in set (0.15 sec) ``` The reason for this is that we did not catch the error returned in the `close()` phase.	2023-08-26 00:19:30 +08:00
slothever	f66f161017	[fix](multi-catalog)fix hive table with cosn location issue (#23409 ) Sometimes, the partitions of a hive table may be on different storage, e.g., some on HDFS, others on object storage (cos, etc). This PR mainly changes: 1. Fix the bug of accessing files via cosn. 2. Add a new field `fs_name` in TFileRangeDesc This is because, when accessing a file, the BE will get a hdfs client from the hdfs client cache, and different files in one query request may have different fs names, e.g., some are `hdfs://`, some are `cosn://`, so we need to specify the fs name for each file, otherwise, it may return an error: `reason: IllegalArgumentException: Wrong FS: cosn://doris-build-1308700295/xxxx, expected: hdfs://172.xxxx:4007`	2023-08-26 00:16:00 +08:00
Qi Chen	8af1e7f27f	[Fix](orc-reader) Fix incorrect result if null partition fields in orc file. (#23369 ) Fix incorrect result if null partition fields in orc file. ### Root Cause Theoretically, the underlying file of the hive partition table should not contain partition fields. But we found that in some user scenarios, the partition fields exist in the underlying orc/parquet file and are null values. As a result, predicates pushed down on the partition fields are evaluated against these null values and filter incorrectly. ### Solution We handle this case by only reading non-partition fields. The parquet reader already handles it this way; this PR handles the orc reader.	2023-08-26 00:13:11 +08:00
Qi Chen	a3a951c71d	[Fix](multi-catalog) Fix load string dict issue for transactional hive tables. (#23306 ) Fix load string dict issue for transactional hive tables. The column name needs to be passed as 'row.column_name'. apache/doris-thirdparty#112	2023-08-26 00:09:12 +08:00
Kaijie Chen	2b6d876280	[feature](move-memtable)[6/7] add options to enable memtable on sink node (#23470 ) Co-authored-by: Siyang Tang <82279870+TangSiyang2001@users.noreply.github.com>	2023-08-25 22:32:22 +08:00
zzzxl	6e6da733c6	[fix](invert index) fix the keyword type index length limit (#23503 )	2023-08-25 21:34:11 +08:00
zhangdong	17e7c1ca53	[fix](fqdn)Fqdn with ipv6 (#22454 ) Currently, `hostname_to_ip` can only resolve `ipv4`, so a method is provided to resolve ipv4 or ipv6 based on a parameter. When `_heartbeat` calls `hostname_to_ip`, whether it resolves to ipv4 or ipv6 is determined by `BackendOptions.is_bind_ipv6`. Additionally, a method is provided that first attempts to resolve the host to ipv4 and then tries ipv6 if that fails.	2023-08-25 21:24:55 +08:00
Qi Chen	29273771f7	[Fix](multi-catalog) Fix hive incorrect result by disabling string dict filter if exprs contain null expr. (#23361 ) Issue Number: close #21960 Fix hive incorrect result by disabling string dict filter if exprs contain null expr.	2023-08-25 21:16:43 +08:00
Jerry Hu	9d1c702b3a	[improvement](function) do not use hyperscan for non-const patterns in like function (#23495 )	2023-08-25 20:40:23 +08:00
Gabriel	49a32c2ee0	[pipelineX](fix) fix two phase execution and add test cases (#23353 )	2023-08-25 17:57:35 +08:00
Jerry Hu	f80b067990	[fix](column) add unimplemented function of ColumnFixedLengthObject (#23468 )	2023-08-25 17:38:01 +08:00
TengJianPing	1312c12236	Revert "[fix](testcase) fix test case failure of insert null value into not null column (#20963 )" (#23462 ) * Revert "[fix](testcase) fix test case failure of insert null value into not null column (#20963)" This reverts commit 55a6649da962fb170ddb40fea8ef26bdc552a51a. Manual revert of "fix in strict mode, return error for insert if datatype convert fails (#20378)" This manually reverts commit 1b94b6368f5e871c9a0fe53dd7c64409079a4c9d * fix case failure	2023-08-25 16:47:14 +08:00
Gabriel	5c37be16fe	[pipelineX](correctness) Fix close problem for local state (#23479 )	2023-08-25 14:19:27 +08:00
Pxl	b96b8f4370	[Bug](jdbc) support get_default on complex type (#23325 ) support get_default on complex type	2023-08-25 14:08:24 +08:00
Kaijie Chen	d8e499cb55	[fix](UT) fix flaky test in LoadStreamMgrTest (#23459 )	2023-08-25 13:53:20 +08:00
Gabriel	59acf61ec5	[pipelineX](pick) pick 2 PR from pipeline engine (#23463 )	2023-08-25 13:26:05 +08:00

5457 Commits