doris

Author	SHA1	Message	Date
feiniaofeiafei	987f755206	[Fix](nereids) fix rule SimplifyWindowExpression (#34099 ) Co-authored-by: feiniaofeiafei <moailing@selectdb.com>	2024-04-25 15:07:09 +08:00
Mryange	f34fe46bfa	[fix](scan) fix ignore expr exec when _non_predicate_columns is empty (#33934 ) fix ignore expr exec when _non_predicate_columns is empty	2024-04-25 15:06:57 +08:00
yujun	c2e3defe56	[fix](tablet invert index) fix tablet invert index leaky caused by auto partition (#33973 )	2024-04-25 15:06:43 +08:00
walter	789a16ec6b	[fix](fe) Fix SHOW CREATE TABLE with AUTO PARTITION (#34071 ) AUTO PARTITION grammar has changed since #31585, but the output of SHOW CREATE TABLE was left out to change, so the result is not able to be recognized by the FE parser.	2024-04-25 15:05:58 +08:00
wangbo	47b54d4bd5	Fix remote scan pool (#33976 )	2024-04-25 15:04:43 +08:00
wangbo	5f2d0e3d53	[Fix](executor)Fix when Fe send empty wg list to be may cause query failed. (#34074 )	2024-04-25 12:01:44 +08:00
yujun	450f443413	[fix](decommission) fix can't decommission mtmv (#33823 )	2024-04-25 12:01:44 +08:00
seawinde	a15a8e119f	[fix](mtmv) Fix exception when create materialized view with cte (#33988 ) Fix exception when create materialized view with cte, after this fix, can create materialized view with following ``` CREATE MATERIALIZED VIEW mv_with_cte BUILD IMMEDIATE REFRESH AUTO ON MANUAL DISTRIBUTED BY RANDOM BUCKETS 2 PROPERTIES ('replication_num' = '1') AS with `test_with` AS ( select l_partkey, l_suppkey from lineitem union select ps_partkey, ps_suppkey from partsupp) select * from test_with; ``` this is brought from https://github.com/apache/doris/pull/28144	2024-04-25 12:01:44 +08:00
Gabriel	f4deb42a80	[pipeline](fix) Prevent re-cancel pipeline tasks (#34073 )	2024-04-25 12:01:44 +08:00
zhangdong	eaacba644d	[fix](auth)can not grant priv to __internal_schema (#34009 ) mysql> grant SELECT_PRIV on `_internal_schema`.* to 'test'@'%'; ERROR 1102 (42000): errCode = 2, detailMessage = Incorrect database name '_internal_schema'	2024-04-25 12:01:44 +08:00
Mingyu Chen	103987ebd8	[fix](parse) set origin stmt for select stmt generated from show stmt (#34015 ) * [fix](parse) set origin stmt for select stmt generated from show stmt * 2	2024-04-25 12:01:44 +08:00
Mingyu Chen	fbd2c9db2d	[upgrade](hive-shade)(paimon) upgrade hive shade to 2.0.0 and paimon to 0.7 (#34085 ) * Adapt paimon 0.6.0 (#33943) Version 2.0.0 of the shade package eliminates potential jar conflicts, resolves dependency component issues, and significantly reduces package size. Utilize the directly-dependent guava library instead of relying on transitively included libraries. * [chore](dependencies)Upgrade paimon to 0.7.0 (#33987) --------- Co-authored-by: Calvin Kirs <kirs@apache.org>	2024-04-25 12:01:44 +08:00
deardeng	ac038b3d4f	[fix](auto bucket) Fix auto bucket regression case occasional fail (#34069 )	2024-04-25 12:01:44 +08:00
yiguolei	a17524b427	[bugfix](core) close method should check if the pointer is nullptr (#34067 ) Co-authored-by: yiguolei <yiguolei@gmail.com>	2024-04-25 12:01:44 +08:00
yiguolei	7eb35c95f3	[refactor](errormessage) step1: unify the status usage in FE (#34062 ) We should tell the user the correct error message when something goes wrong, but the error messages are currently a mess. I will clean them up. This is the first step: unify the error code usage in FE.	2024-04-25 12:01:44 +08:00
seawinde	a720e03a02	[improvement](mtmv) Optimize the nested materialized view rewrite performance (#34050 ) Optimize the nested materialized view rewrite performance when exists many join This is brought by #33362	2024-04-25 12:01:44 +08:00
Mryange	67b394f2b0	[feature](profile) sort pipelineX task by total time #34053	2024-04-25 12:01:44 +08:00
TengJianPing	2c3e838971	[improvement](spill) improve config of spill thread pool (#33992 )	2024-04-25 12:01:44 +08:00
minghong	0faae45537	[opt](nereids)project sub expression in other condition for nested loop join (#32697 ) 1. project sub expression in other condition for nested loop join 2. fix a bug in ut framework which may generate duplicated ExprId	2024-04-25 12:01:44 +08:00
feiniaofeiafei	ef73533e27	[Feat](nereids) add transform rule SimplifyWindowExpression (#33647 ) rewrite func(para) over (partition by unique_keys) 1. func() is count(non-null) or rank/dense_rank/row_number -> 1 2. func(para) is min/max/sum/avg/first_value/last_value -> para e.g select max(c1) over(partition by pk) from t1; -> select c1 from t1;	2024-04-25 12:01:44 +08:00
feiniaofeiafei	800bb3d4ba	[Feat](nereids) add expression rewrite rule LikeToEqualRewrite (#33803 ) like expressions without fuzzy matching are rewritten into equivalent expressions	2024-04-25 12:01:44 +08:00
feiniaofeiafei	2f996a574f	[Feat](nereids) nereids add alter view (#33970 ) nereids support alter view stmt. e.g. ALTER VIEW example_db.example_view ( c1 COMMENT "column 1", c2 COMMENT "column 2", c3 COMMENT "column 3" ) AS SELECT k1, k2, SUM(v1) FROM example_table GROUP BY k1, k2	2024-04-25 12:01:44 +08:00
zhangdong	edff4137fe	[fix](mtmv) Mv check name (#34016 )	2024-04-25 12:01:44 +08:00
walter	e54ae4519d	[fix](bdb) Write OP_TIMESTAMP operation until it succeeds (#33967 ) For now, it will reset the next journal id and return if the OP_TIMESTAMP operation fails to write. Because BDBJE will replicate the committed txns (only persisted in BDB log, but not replicated to other members) to FOLLOWERs after the connection resumes, directly resetting the next journal id and returning will cause subsequent txns written to the same journal ID not to be replayed by the FOLLOWERs. So for the OP_TIMESTAMP operation, try to write until it succeeds.	2024-04-25 11:59:52 +08:00
zhangstar333	cc3decffa4	[bug](test) fix test case failed with fuzzy fold constant to false (#34052 )	2024-04-24 19:42:08 +08:00
zhangstar333	d5275c55b4	[bug](fold) fix fold date/datetime error as null (#33845 ) the LocalDateTime/LocalDate value may be null, so it needs to be checked first. If it is null, return NullLiteral directly.	2024-04-24 19:41:42 +08:00
Xinyi Zou	f6ec64c6ad	[fix](exception) Fix Block noexcept method not throw exception (#34002 )	2024-04-24 17:13:50 +08:00
TengJianPing	00d773117d	[fix](stream agg) fix coredump when close if open failed (#33978 )	2024-04-24 17:13:50 +08:00
xy720	080c07ad87	[bug](random distribution) fix data loss and incorrect in random distribution table #33962	2024-04-24 17:13:50 +08:00
Mingyu Chen	799c43686c	[fix](jni-connector) avoid core dump if init connector failed (#34007 ) _jni_scanner_cls may be null if connector init failed. So need to check it before delete it.	2024-04-24 17:13:50 +08:00
amory	8d98c71079	[FIX]fix cidr func with const param (#33968 )	2024-04-24 17:13:50 +08:00
Mryange	d5b212f6c5	[test](p0 case) Increase the batch size in test leading cases (#33994 ) Due to the presence of fuzziness, the batch size may be set to 50, and this case runs very slowly locally, making it prone to timeouts.	2024-04-24 17:13:50 +08:00
Dongyang Li	d354c2f1a9	[fix](case) test_alter_colocate_table.groovy (#33981 )	2024-04-24 17:13:50 +08:00
zhangstar333	2f60dcf890	[test](hll) fix unstable case without order by clause (#33947 )	2024-04-24 17:13:50 +08:00
Sun Chenyang	6531e4c540	[improve](regression test)Add test for time series compact empty rowset (#29509 )	2024-04-24 17:13:49 +08:00
Mryange	df96f76f78	[feature](pipelineX) check output type in some node (#33716 )	2024-04-24 17:13:49 +08:00
Tiewei Fang	a11ae2cd51	[Fix](Jdbc-Hive) fix the order of partition keys (#33963 ) The partition key information recorded in PARTITION_KEYS table is sorted according to the INTEGER_IDX field, so we need to add an 'order by' clause to ensure that the obtained partition names are ordered.	2024-04-24 17:13:43 +08:00
Lei Zhang	2a1fbfd72c	[feat](fe) Add `ignore_bdbje_log_checksum_read` for BDBEnvironment (#31247 ) * https://forums.oracle.com/ords/apexds/post/je-log-checksumexception-2812 * When meeting disk damage or other reason described in the oracle forums and fe cannot start due to `com.sleepycat.je.log.ChecksumException`, we add a param `ignore_bdbje_log_checksum_read` to ignore the exception, but there is no guarantee of correctness for bdbje kv data Co-authored-by: Yongqiang YANG <98214048+dataroaring@users.noreply.github.com>	2024-04-22 22:33:24 +08:00
HHoflittlefish777	9bb149b3be	[fix](stream-load) fix query id is zero in stream load log (#33954 )	2024-04-22 22:33:24 +08:00
zfr95	b0524d9d2f	[fix](test)fix auto partition date file not exists (#33827 ) fix auto partition date file not exists	2024-04-22 22:33:24 +08:00
Qi Chen	31e7cc3822	[Enhancement](multi-catalog) Rewrite `S3URI` to remove tricky virtual bucket mechanism and support different uri styles by flags. (#33858 ) Many domestic cloud vendors are compatible with the s3 protocol. However, early versions of s3 client will only generate path style http requests (https://github.com/aws/aws-sdk-java-v2/pull/763) when encountering endpoints that do not start with s3, while some cloud vendors only support virtual host style http requests. Therefore, Doris used `forceVirtualHosted` in `S3URI` to convert it into a virtual hosted path and implemented it through path style. For example: For s3 uri `s3://my-bucket/data/file.txt`, it will eventually be parsed into: - virtualBucket: my-bucket - Bucket: data (bucket must be set, otherwise the s3 client will report an error) This step is particularly tricky because of the limitations of the s3 client. - Key: file.txt The path style mode is used to generate an http request similar to the virtual host by setting the endpoint to virtualBucket + original endpoint, setting the bucket and key. However, the bucket and key here are inconsistent with the original concepts of s3, but the aws client happens to be able to generate an http request similar to the virtual host through the path style mode. However, after #30799 we upgraded the aws sdk version from 2.17.257 to 2.20.131. The current aws s3 client can already generate virtual host style http requests for third-party endpoints by default. So #31111 needed to set the path style option to let the s3 client keep working with doris' virtual bucket mechanism. Finally, the virtual bucket mechanism is too confusing and tricky, and we no longer need it with the new version of s3 client. ### Resolution: Rewrite `S3URI` to remove tricky virtual bucket mechanism and support different uri styles by flags. This class represents a fully qualified location in S3 for input/output operations expressed as a URI. 
#### For AWS S3, URI common styles: - AWS Client Style(Hadoop S3 Style): `s3://my-bucket/path/to/file?versionId=abc123&partNumber=77&partNumber=88` - Virtual Host Style: `https://my-bucket.s3.us-west-1.amazonaws.com/resources/doc.txt?versionId=abc123&partNumber=77&partNumber=88` - Path Style: `https://s3.us-west-1.amazonaws.com/my-bucket/resources/doc.txt?versionId=abc123&partNumber=77&partNumber=88` Regarding the above-mentioned common styles, we can use <code>isPathStyle</code> to control whether to use path style or virtual host style. "Virtual host style" is the currently mainstream and recommended approach to use, so the default value of <code>isPathStyle</code> is false. #### Other Styles: - Virtual Host AWS Client (Hadoop S3) Mixed Style: `s3://my-bucket.s3.us-west-1.amazonaws.com/resources/doc.txt?versionId=abc123&partNumber=77&partNumber=88` - Path AWS Client (Hadoop S3) Mixed Style: `s3://s3.us-west-1.amazonaws.com/my-bucket/resources/doc.txt?versionId=abc123&partNumber=77&partNumber=88` For these two styles, we can use <code>isPathStyle</code> and <code>forceParsingByStandardUri</code> to control whether to use. Virtual Host AWS Client (Hadoop S3) Mixed Style: <code>isPathStyle = false && forceParsingByStandardUri = true</code> Path AWS Client (Hadoop S3) Mixed Style: <code>isPathStyle = true && forceParsingByStandardUri = true</code> When the incoming location is url encoded, the encoded string will be returned. For <code>getKey()</code>, <code>getQueryParams()</code> will return the encoding string	2024-04-22 22:33:24 +08:00
Pxl	5a5063be20	[bug](fix) heap use after free when json parse failed (#33955 )	2024-04-22 22:33:24 +08:00
Gabriel	4d7ac82305	[profile](scanner) Fix wrong metrics (#33965 )	2024-04-22 22:33:24 +08:00
Kang	fbbb7c5b85	improve logstash doris output plugin (#33135 ) 1. support multi thread concurrency for performance 2. support retry count and infinite retry 3. add a config to log doris stream load request header and response 4. add a config to log speed for better observability	2024-04-22 22:33:24 +08:00
wangbo	299d069da9	Fix alter policy failed (#33910 )	2024-04-22 22:33:24 +08:00
deardeng	a050513c91	[Fix](clean trash) Fix clean trash use agent task (#33912 ) (#33972 ) * [Fix](clean trash) Fix clean trash use agent task (#33912) * add .h	2024-04-22 17:14:21 +08:00
Mingyu Chen	f6b6c13fb3	[enhance](auth)Abstract authentication interface (#33668 ) (#33961 ) bp #33668 Co-authored-by: zhangdong <493738387@qq.com>	2024-04-22 16:41:49 +08:00
Mingyu Chen	88b3d61eca	[refactor](Mysql) Refactoring the process of using external components to authenticate in MySQL connections (#32875 ) (#33958 ) bp #32875 Co-authored-by: LompleZ Liu <47652868+LompleZ@users.noreply.github.com>	2024-04-22 16:41:49 +08:00
Mingyu Chen	71314595be	[Enhancement](ranger) Disable some permission operations when Ranger or LDAP are enabled (#32538 ) (#33957 ) bp #32538 Co-authored-by: yongjinhou <109586248+yongjinhou@users.noreply.github.com>	2024-04-22 16:41:49 +08:00
Mingyu Chen	98e90dd47e	[fix](auth)fix missing authentication (#33347 ) (#33956 ) bp #33347 Co-authored-by: zhangdong <493738387@qq.com>	2024-04-22 13:52:36 +08:00
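
The URI styles listed in commit 31e7cc3822 can be illustrated with a small sketch. This is not Doris's `S3URI` class (which is Java); `parse_s3_location` and its parameters `is_path_style` and `force_standard_uri` are hypothetical names that only mirror the `isPathStyle` and `forceParsingByStandardUri` flags described in the commit message, and the branching logic is an assumption based on that description.

```python
from urllib.parse import urlparse, parse_qs


def parse_s3_location(location, is_path_style=False, force_standard_uri=False):
    """Split an S3 location into bucket, key, endpoint and query parameters.

    Hypothetical helper for illustration only; not the Doris implementation.
    """
    parsed = urlparse(location)
    if not parsed.netloc:
        raise ValueError(f"missing authority in {location!r}")
    # The path is kept as written (percent-encoding is not decoded here).
    path = parsed.path.lstrip("/")

    if parsed.scheme == "s3" and not force_standard_uri:
        # AWS client (Hadoop S3) style: the authority is the bucket itself.
        bucket, key, endpoint = parsed.netloc, path, None
    elif is_path_style:
        # Path style: the host is the endpoint, the first path segment is the bucket.
        bucket, _, key = path.partition("/")
        endpoint = parsed.netloc
    else:
        # Virtual host style: the first host label is the bucket.
        bucket, _, endpoint = parsed.netloc.partition(".")
        key = path

    if not bucket:
        raise ValueError(f"cannot determine bucket in {location!r}")
    # Note: parse_qs decodes query values, which may differ from the
    # encoded strings the commit message says getQueryParams() returns.
    return {
        "bucket": bucket,
        "key": key,
        "endpoint": endpoint,
        "query": parse_qs(parsed.query),
    }


if __name__ == "__main__":
    print(parse_s3_location(
        "https://s3.us-west-1.amazonaws.com/my-bucket/resources/doc.txt",
        is_path_style=True,
    ))
```

With both flags left at their defaults, an `s3://` location is treated as AWS client style and an `https://` location as virtual host style, matching the commit's statement that virtual host style is the default.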


18346 Commits
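
Commit 800bb3d4ba describes rewriting `LIKE` expressions that contain no fuzzy matching into equivalent expressions. The idea can be sketched as follows. This is a minimal illustration, not the Nereids rule itself: `like_to_equal` is a hypothetical function, and the backslash escape handling is an assumption rather than something the commit message specifies.

```python
def like_to_equal(pattern, escape="\\"):
    """Return the literal string a LIKE pattern matches, or None.

    None means the pattern contains a wildcard (% or _) or a dangling
    escape character, so it cannot be replaced by an equality comparison.
    Hypothetical sketch; escape semantics are assumed, not taken from Doris.
    """
    literal = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == escape:
            if i + 1 >= len(pattern):
                return None  # dangling escape: leave the LIKE untouched
            literal.append(pattern[i + 1])  # escaped char is taken literally
            i += 2
            continue
        if ch in "%_":
            return None  # unescaped wildcard: real fuzzy matching
        literal.append(ch)
        i += 1
    return "".join(literal)


if __name__ == "__main__":
    for p in ["abc", "a%", "a\\%b"]:
        value = like_to_equal(p)
        if value is None:
            print(f"col LIKE {p!r} stays as LIKE")
        else:
            print(f"col LIKE {p!r} can become col = {value!r}")
```

Under these assumptions, only patterns free of unescaped `%` and `_` are candidates for the equality rewrite; anything else is left as a `LIKE`.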