Commit Graph

2110 Commits

Author SHA1 Message Date
f32efe5758 [Fix](Outfile) Fix that it does not report an error when exporting a table to S3 with an incorrect ak/sk/bucket (#23441)
Problem:
It returns a successful result even though we use a wrong ak/sk/bucket name, for example:
```sql
mysql> select * from demo.student
    -> into outfile "s3://xxxx/exp_"
    -> format as csv
    -> properties(
    ->   "s3.endpoint" = "https://cos.ap-beijing.myqcloud.com",
    ->   "s3.region" = "ap-beijing",
    ->   "s3.access_key"= "xxx",
    ->   "s3.secret_key" = "yyyy"
    -> );
+------------+-----------+----------+----------------------------------------------------------------------------------------------------+
| FileNumber | TotalRows | FileSize | URL                                                                                                |
+------------+-----------+----------+----------------------------------------------------------------------------------------------------+
|          1 |         3 |       26 | s3://xxxx/exp_2ae166e2981d4c08-b577290f93aa82ba_ |
+------------+-----------+----------+----------------------------------------------------------------------------------------------------+
1 row in set (0.15 sec)
```

The reason is that we did not catch the error returned in the `close()` phase.
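The fix comes down to propagating the status of the final `close()` call, which is where the S3 upload is actually committed. A minimal sketch of the pattern, with hypothetical types rather than Doris's actual writer API:
```cpp
#include <iostream>
#include <string>
#include <utility>

// Hypothetical, simplified types for illustration only.
struct Status {
    std::string msg;
    bool ok() const { return msg.empty(); }
    static Status OK() { return {}; }
    static Status IOError(std::string m) { return {std::move(m)}; }
};

struct S3FileWriter {
    // Rows are buffered locally, so append() succeeds even with bad credentials.
    Status append(const std::string& /*row*/) { return Status::OK(); }
    // The upload is finalized here; a wrong ak/sk/bucket only fails at this point.
    Status close() {
        return Status::IOError("complete multipart upload failed: access denied");
    }
};

Status export_rows(S3FileWriter& writer) {
    Status st = writer.append("1,alice\n");
    if (!st.ok()) return st;
    // Before the fix, the status of close() was dropped; propagating it surfaces the error.
    return writer.close();
}

int main() {
    S3FileWriter writer;
    Status st = export_rows(writer);
    std::cout << (st.ok() ? std::string("export ok") : "export failed: " + st.msg) << '\n';
}
```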
2023-08-26 00:19:30 +08:00
f66f161017 [fix](multi-catalog)fix hive table with cosn location issue (#23409)
Sometimes the partitions of a hive table may be on different storage, e.g., some on HDFS and others on object storage (cos, etc.).
This PR mainly changes:

1. Fix the bug of accessing files via cosn.
2. Add a new field `fs_name` in TFileRangeDesc.
    This is because, when accessing a file, the BE gets an hdfs client from the hdfs client cache, and different files in one query
request may have different fs names, e.g., some are `hdfs://` and some are `cosn://`. So we need to specify the fs name
for each file; otherwise it may return an error such as:

`reason: IllegalArgumentException: Wrong FS: cosn://doris-build-1308700295/xxxx, expected: hdfs://172.xxxx:4007`
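A minimal sketch of the per-file lookup described above (hypothetical names, not the actual BE code): the client cache is keyed by each file's `fs_name`, so `hdfs://` and `cosn://` files in the same query no longer share a single, mismatched filesystem handle.
```cpp
#include <iostream>
#include <map>
#include <string>

// Hypothetical, simplified types for illustration only.
struct FileRangeDesc {
    std::string fs_name;  // e.g. "hdfs://172.21.0.1:4007" or "cosn://bucket-name"
    std::string path;
};

struct FsClient {
    std::string fs_name;
};

class FsClientCache {
public:
    // Returns (and lazily creates) the client for the file's own filesystem.
    const FsClient& get(const std::string& fs_name) {
        auto it = clients_.find(fs_name);
        if (it == clients_.end()) it = clients_.emplace(fs_name, FsClient{fs_name}).first;
        return it->second;
    }

private:
    std::map<std::string, FsClient> clients_;
};

int main() {
    FsClientCache cache;
    const FileRangeDesc ranges[] = {
        {"hdfs://172.21.0.1:4007", "/warehouse/t/part1/file.orc"},
        {"cosn://doris-build-1308700295", "/warehouse/t/part2/file.orc"},
    };
    for (const auto& r : ranges) {
        std::cout << "open " << r.path << " with client for " << cache.get(r.fs_name).fs_name << '\n';
    }
}
```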
2023-08-26 00:16:00 +08:00
8af1e7f27f [Fix](orc-reader) Fix incorrect results when partition fields in the orc file are null. (#23369)
Fix incorrect results when partition fields in the orc file are null.

### Root Cause
Theoretically, the underlying files of a hive partition table should not contain the partition fields. But we found that in some user scenarios the partition fields do exist in the underlying orc/parquet file and hold null values. As a result, predicates pushed down on those partition fields read the null values from the file and filter incorrectly.

### Solution
We handle this case by reading only the non-partition fields from the file. The parquet reader already works this way; this PR applies the same handling to the orc reader.
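A minimal sketch of the approach (not the actual orc reader code): partition columns are dropped from the file read list, so their values come from the partition spec rather than from the possibly-null values stored in the file.
```cpp
#include <iostream>
#include <set>
#include <string>
#include <vector>

// Keep only the columns that must be materialized from the file itself;
// partition columns are filled from the partition value instead.
std::vector<std::string> columns_to_read(const std::vector<std::string>& required_columns,
                                         const std::set<std::string>& partition_columns) {
    std::vector<std::string> read_columns;
    for (const auto& col : required_columns) {
        if (partition_columns.count(col) == 0) read_columns.push_back(col);
    }
    return read_columns;
}

int main() {
    const std::vector<std::string> required = {"id", "name", "dt"};
    const std::set<std::string> partitions = {"dt"};  // "dt" stays out of the file read list
    for (const auto& col : columns_to_read(required, partitions)) std::cout << col << '\n';
}
```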
2023-08-26 00:13:11 +08:00
a3a951c71d [Fix](multi-catalog) Fix the string dict loading issue for transactional hive tables. (#23306)
Fix the string dict loading issue for transactional hive tables. The column name needs to be passed as 'row.column_name'.

apache/doris-thirdparty#112
2023-08-26 00:09:12 +08:00
29273771f7 [Fix](multi-catalog) Fix incorrect hive results by disabling the string dict filter if exprs contain a null expr. (#23361)
Issue Number: close #21960

Fix incorrect hive results by disabling the string dict filter if exprs contain a null expr.
2023-08-25 21:16:43 +08:00
9d1c702b3a [improvement](function) do not use hyperscan for non-const patterns in like function (#23495) 2023-08-25 20:40:23 +08:00
f80b067990 [fix](column) add unimplemented function of ColumnFixedLengthObject (#23468) 2023-08-25 17:38:01 +08:00
1312c12236 Revert "[fix](testcase) fix test case failure of insert null value into not null column (#20963)" (#23462)
* Revert "[fix](testcase) fix test case failure of insert null value into not null column (#20963)"

This reverts commit 55a6649da962fb170ddb40fea8ef26bdc552a51a.

Manual Revert "fix in strict mode, return error for insert if datatype convert fails (#20378)"

This manually reverts commit 1b94b6368f5e871c9a0fe53dd7c64409079a4c9d

* fix case failure
2023-08-25 16:47:14 +08:00
Pxl
b96b8f4370 [Bug](jdbc) support get_default on complex type (#23325)
support get_default on complex type
2023-08-25 14:08:24 +08:00
84792d0886 fix compile of master (#23467) 2023-08-25 11:47:39 +08:00
8ef6b4d996 [fix](json) fix json int128 overflow (#22917)
* support int128 in jsonb

* fix jsonb int128 write

* fix jsonb to json int128

* fix json functions for int128

* add nereids function jsonb_extract_largeint

* add testcase for json int128

* change docs for json int128

* add nereids function jsonb_extract_largeint

* clang format

* fix check style

* using int128_t = __int128_t for all int128

* use fmt::format_to instead of snprintf digit by digit for int128

* clang format

* delete useless check

* add warn log

* clang format
2023-08-25 11:40:30 +08:00
d331bfc513 [Performance](pipeline) support shared scan segment in mow (#23305) 2023-08-25 10:43:02 +08:00
Pxl
d9db3f5431 [Improvement](scan) Remove redundant predicates on scan node (#23374)
* Remove redundant predicates on scan node

* update

* fix
2023-08-25 10:41:37 +08:00
0a70cbfe99 [feature](move-memtable)[5/7] add olap table sink v2 and writers (#23458)
Co-authored-by: laihui <1353307710@qq.com>
2023-08-25 10:20:06 +08:00
9cacf9535a [Opt](functions) Use preloaded cache to accelerate timezone parsing (#22694)
* opt

* bugfix

* fix ut

* fix stylecheck
2023-08-25 10:00:48 +08:00
7cfb3cc0aa [fix](functions) fix function substitute for datetimeV1/V2 (#23344)
* fix

* function fe
2023-08-25 09:59:38 +08:00
b65023e667 [fix](load) avoid using protobuf set_allocated_id in VDataStreamSender (#23435) 2023-08-25 00:07:09 +08:00
caddcc6215 [Fix](orc-reader) Fix decimal type check for ColumnValueRange issue and use primitive_type. (#23424)
Fix the decimal type check issue for ColumnValueRange and use primitive_type in orc_reader, because in #22842 the `CppType` of `PrimitiveTypeTraits<TYPE_DECIMALXXX>` was changed.
2023-08-24 23:26:41 +08:00
55e572df82 [pipelineX](analytic operator) Support analytic operator (#23444) 2023-08-24 23:05:29 +08:00
Pxl
376d54de41 [Chore](function) add check for comparison input type #23330 2023-08-24 08:28:31 +08:00
9d1f2cd8e0 [Improvement](pipeline) Terminate early for short-circuit join (#23378) 2023-08-23 19:40:17 +08:00
51ac92f65c Revert "[fix](function) to_bitmap parameter parsing failure returns null instead of bitmap_empty (#21236)" (#23368)
This reverts commit 1c3cc77a54938ed948ad8186b8dea8385977d23c.
2023-08-23 18:27:35 +08:00
22e373a799 [feature](vector-search) add 4 distance functions to support vector search (#23129) 2023-08-23 15:51:15 +08:00
2dda44d7b5 [fix](csv-reader)fix bug of multi-char delimiter in csv reader
Fix the bug in how csv_reader parses a line to split it into columns with a multi-char delimiter.
2023-08-23 15:19:13 +08:00
Pxl
ab7b45a9e2 [Bug](map) support replicate for column map and remove some unused code #23356 2023-08-23 15:06:04 +08:00
527293aa41 [refactor](dynamic table) remove dynamic table (#23298) 2023-08-23 14:15:14 +08:00
ba882dea21 [pipelineX](dependency) Build DAG between pipelines (#23355) 2023-08-23 13:21:32 +08:00
14296ee87f [fix](window_function) wrong order by range (#23346) 2023-08-23 11:23:00 +08:00
Pxl
8ed4045df9 [Chore](primitive-type) remove VecPrimitiveTypeTraits (#22842) 2023-08-23 08:37:40 +08:00
Pxl
e6d20f842c [Bug](compile) fix compile failed on function case (#23335) 2023-08-22 22:10:53 +08:00
5c2fae7ce5 [pipeline](exec) Refactor the table sink code and remove useless code (#23223)
Refactor the table sink code and remove useless code
2023-08-22 20:42:14 +08:00
Pxl
1a1f86486d [Improvement](function) opt for case when (#23068)
opt for case when
2023-08-22 18:31:40 +08:00
0b51e6d8e1 [refactor](FunctionArrayIndex) make the code simpler 2023-08-22 17:48:59 +08:00
9d2e23b1aa [fix](parquet) A row of complex type may be stored across more pages (#23277)
A row of a complex type may be stored across two (or more) pages, and the parameter `align_rows` indicates whether the reader should read the remaining values of the last row from the previous page.
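A simplified sketch of the page-crossing case (not the actual `ParquetReader` code): in parquet, a repetition level of 0 starts a new row, so when a page ends in the middle of an array row, the reader must keep consuming values from the next page until it reaches the next row boundary, which is what `align_rows` asks for.
```cpp
#include <iostream>
#include <vector>

// Hypothetical, simplified page layout for illustration only.
struct Page {
    std::vector<int> rep_levels;  // repetition level 0 = new row, >0 = same row continues
    std::vector<int> values;
};

int main() {
    // Row 0 = [1, 2, 3] is split across two pages; row 1 = [4] starts in the second page.
    const std::vector<Page> pages = {{{0, 1}, {1, 2}}, {{1, 0}, {3, 4}}};

    std::vector<std::vector<int>> rows;
    for (const auto& page : pages) {
        for (size_t i = 0; i < page.values.size(); ++i) {
            if (page.rep_levels[i] == 0) rows.emplace_back();  // a new row begins
            rows.back().push_back(page.values[i]);             // otherwise: finish the previous row
        }
    }
    for (const auto& row : rows) std::cout << "row with " << row.size() << " elements\n";
}
```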
2023-08-22 14:47:10 +08:00
5ff7b57fc1 [fix](parquet) parquet reader confuses logical/physical/slot id of columns (#23198)
`ParquetReader` confuses the logical/physical/slot ids of columns. When reading only scalar types there is nothing wrong, but when reading complex types, `RowGroup` and `PageIndex` get wrong statistics. Therefore, if the query contains complex types and pushed-down predicates, the result set is likely to be incorrect.
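A minimal sketch of why those ids diverge (hypothetical naming, not the actual reader structures): a complex column occupies a single slot in the query but maps to several physical leaf columns in the parquet schema, so row-group and page-index statistics must be looked up by the physical leaf index rather than by the slot position.
```cpp
#include <iostream>
#include <string>
#include <vector>

// Hypothetical mapping for illustration only.
struct ReadColumn {
    std::string name;
    int slot_id;                       // position in the query's output tuple
    std::vector<int> physical_leaves;  // leaf column indexes inside the parquet file schema
};

int main() {
    const std::vector<ReadColumn> columns = {
        {"id", 0, {0}},       // scalar: ids happen to coincide
        {"addr", 1, {1, 2}},  // struct<city, zip>: one slot, two physical leaves
        {"age", 2, {3}},      // scalar after the struct: leaf index no longer equals slot id
    };
    for (const auto& c : columns) {
        for (int leaf : c.physical_leaves) {
            std::cout << c.name << ": slot " << c.slot_id << " -> physical leaf " << leaf << '\n';
        }
    }
}
```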
2023-08-22 13:35:29 +08:00
12075f9853 [pipelineX](projection) Support projection and blocking agg (#23256) 2023-08-21 22:23:02 +08:00
dcd6c3c022 [pipelineX](refactor) propose a new pipeline execution model (#22562) 2023-08-21 15:38:45 +08:00
d4694167a8 [Enhancement](chore) Some Status relevant enhancement (#23072) 2023-08-21 14:14:38 +08:00
37b49f60b7 [refactor](conf) add be conf for partition topn partitions threshold (#23220)
add be conf for partition topn partitions threshold
2023-08-21 10:52:41 +08:00
33dfa0c454 [Improve](serde) support text serde for nested type-array/map (#22738)
Currently we do not support nested array/map types,
so this PR aims to:
1. add a format option for converting a string to the defined datatype, to stay consistent with the original from_string
2. support array and map nested inside array and map
2023-08-21 10:32:28 +08:00
0967d7ec04 [improvement](agg) Do not serialize bitmap to string (#23172) 2023-08-21 10:10:15 +08:00
Pxl
a11e0e3bc4 [Bug](agg) fix many problems of QUANTILE_UNION (#23181)
Fix many problems of QUANTILE_UNION
2023-08-21 10:04:27 +08:00
4bf055c818 [fix](parquet) the key column of map type in parquet may be nullable (#23180)
Fix errors when reading a map type with a nullable key column in a parquet file. `ParquetReader` supports reading a nullable key column, but a check was added to prevent reading nullable key columns. Unfortunately, this check's error was not thrown correctly, causing the BE to crash and print meaningless error logs in be.out:
```
...
11# doris::vectorized::ParquetReader::get_columns(std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, doris::TypeDescriptor, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, doris::TypeDescriptor> > >*, std::unordered_set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >*) at /root/doris/be/src/vec/exec/format/parquet/vparquet_reader.cpp:508
12# doris::vectorized::VFileScanner::_get_next_reader() in /root/yun_you_external/output/be/lib/doris_be
13# doris::vectorized::VFileScanner::_get_block_impl(doris::RuntimeState*, doris::vectorized::Block*, bool*) at /root/doris/be/src/vec/exec/scan/vfile_scanner.cpp:241
...
```
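Whatever the check itself should be, the crash comes from the failed check not being surfaced as an error status the caller can report. A minimal sketch of that pattern (hypothetical types, not the actual `vparquet_reader` code):
```cpp
#include <iostream>
#include <string>
#include <utility>

// Hypothetical, simplified types for illustration only.
struct Status {
    std::string msg;
    bool ok() const { return msg.empty(); }
    static Status OK() { return {}; }
    static Status InternalError(std::string m) { return {std::move(m)}; }
};

struct MapField {
    bool key_nullable;
};

// Return an error status instead of crashing the process when the schema is unexpected.
Status check_map_key(const MapField& field) {
    if (field.key_nullable) {
        return Status::InternalError("unsupported parquet schema: nullable map key column");
    }
    return Status::OK();
}

int main() {
    Status st = check_map_key(MapField{true});
    if (!st.ok()) std::cout << "query fails cleanly with: " << st.msg << '\n';
}
```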
2023-08-20 22:59:18 +08:00
433a6103ab [Enhancement](scanner) allocate blocks in scanner_context on demand and free them on close (#23182)
Introduced #19389, removed #20785
2023-08-19 12:13:24 +08:00
0838ff4bf4 [fix](Outfile) fix bug that the fileSize is not correct when outfile is completed (#22951) 2023-08-18 22:31:44 +08:00
419e922a69 [fix](json) Fix the bug that reading json files does not stop (#23062)
* [fix](json) Fix the bug that reading json files does not stop
2023-08-18 18:23:19 +08:00
Pxl
477961dc21 [Chore](agg) refactor of hash map (#22958)
refactor of hash map
2023-08-18 17:59:30 +08:00
3d4ec1ac88 [pipeline](exec) support async writer in jdbc sink in pipeline query engine (#23144)
support async writer in jdbc sink in pipeline query engine
2023-08-18 17:07:57 +08:00
1c3cc77a54 [fix](function) to_bitmap parameter parsing failure returns null instead of bitmap_empty (#21236)
* [fix](function) to_bitmap parameter parsing failure returns null instead of bitmap_empty

* add ut

* fix nereids

* fix regression-test
2023-08-18 14:37:49 +08:00
795006ea3d [fix](multi-catalog) conversion of compatible numerical types (#23113)
Hive supports schema change but doesn't rewrite the parquet files, so the physical type of a parquet file may not equal the logical type of the table schema.
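A minimal sketch of the kind of compatible conversion involved (hypothetical helper, not the actual reader code): after a Hive schema change such as INT to BIGINT, old parquet files still store int32 values physically, so the reader widens what it loads to the table's logical type.
```cpp
#include <cstdint>
#include <iostream>
#include <vector>

// Widen values read with the file's physical type (int32) to the table's logical type (int64).
std::vector<int64_t> widen_int32_to_int64(const std::vector<int32_t>& physical) {
    std::vector<int64_t> logical;
    logical.reserve(physical.size());
    for (int32_t v : physical) logical.push_back(static_cast<int64_t>(v));
    return logical;
}

int main() {
    const std::vector<int32_t> file_column = {1, 2, 3};  // written before the schema change
    const std::vector<int64_t> table_column = widen_int32_to_int64(file_column);
    std::cout << "read " << table_column.size() << " bigint values\n";
}
```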
2023-08-18 14:05:33 +08:00