doris

Author	SHA1	Message	Date
morrySnow	469b6b8466	[enhancement](Nereids) datetime v2 type precision derive (#17079 )	2023-02-26 22:33:55 +08:00
jakevin	710529b060	[enhance](Nereids): refactor LogicalJoin. (#17099 )	2023-02-26 22:28:54 +08:00
Mingyu Chen	e9619368e9	[fix](s3) fix SdkClientException: Multiple HTTP implementations were found on the classpath (#17136 )	2023-02-26 15:32:43 +08:00
Pxl	6bb721d86b	[Chore](build) fix some warning on code generate and webui #17078 [WARNING:gensrc/thrift/parquet.thrift:22] Uncaptured doctext at on line 18. [WARNING:gensrc/thrift/parquet.thrift:23] Uncaptured doctext at on line 22. [WARNING:gensrc/thrift/parquet.thrift:436] Uncaptured doctext at on line 428. WARNING in asset size limit: The following asset(s) exceed the recommended size limit (244 KiB).WARNING in asset size limit: The following asset(s) exceed the recommended size limit (244 KiB). This can impact web performance WARNING in entrypoint size limit: The following entrypoint(s) combined asset size exceeds the recommended limit Warning : Macro "NonTerminator" has been declared but never used.	2023-02-26 13:01:19 +08:00
plat1ko	0251cb8941	[fix](cooldown) Handle re-add replica with cooldowned data #17047 Modify rule of choosing cooldown replica, only alive replica can be cooldown replica. Handle re-add replica with cooldowned data.	2023-02-26 12:36:55 +08:00
奕冷	5018223176	[Enhancement](stmt-forward) better error msg for follower fe #17132 The error log msg for the FE follower's forward to master failure is ambiguous as seen, so we should clarify it.	2023-02-26 12:28:33 +08:00
奕冷	605d840231	[improvement](log)enhance log msg of finding be policy failure (#17134 )	2023-02-26 11:52:25 +08:00
xueweizhang	d3a7cb8bde	[fix](stream_load) can abort 2pc stream load when table dropped #17088 when stream load with 2pc, the table was droped before commit, it will get error commit or abort, trasaction can not finish. if commit or abort ,will get error: { "status": "ANALYSIS_ERROR", "msg": "errCode = 7, detailMessage = unknown table, tableId=52579" } after this pr, i can abort success.	2023-02-26 11:20:41 +08:00
Tiewei Fang	3a9aa03aab	[BugFix](oracle-catalog) Modify the doris data type mapping of oracle `NUMBER(p,s)` type (#17051 ) The data type `NUMBER(p,s)` of oracle has some different of doris decimal type in semantics. For Oracle Number(p,s) type： 1. if s<0 , it means this is an Interger. This `NUMBER(p,s)` has (p+\|s\| ) significant digit, and rounding will be performed at s position. eg: if we insert 1234567 into `NUMBER(5,-2)` type, then the oracle will store 1234500. In this case, Doris will use int type (`TINYINT/SMALLINT/INT/.../LARGEINT`). 2. if s>=0 && s<p , it just like doris Decimal(p,s) behavior. 3. if s>=0 && s>p, it means this is a decimal(like 0.xxxxx). p represents how many digits can be left to the left after the decimal point, the figure after the decimal point s will be rounded. eg: we can not insert 0.0123456 into `NUMBER(5,7)` type, because there must be two zeros on the right side of the decimal point, we can insert 0.0012345 into `NUMBER(5,7)` type. In this case, Doris will use `DECIMAL(s,s)` 4. if we don't specify p and s for `NUMBER(p,s)` like `NUMBER`, the p and s of `NUMBER` are uncertain. In this case, doris can not determine p and s, so doris can not determine data type.	2023-02-26 09:05:41 +08:00
Tiewei Fang	f6ce072297	[Enhencement](csv-reader) Optimize csv_reader `_split_value` and fix json_reader case sensitive (#17093 ) 1. Enhencement: For single-charset column separator，csv_reader use another method of `split value`. 2. BugFix Set `json` file format loading to be sensitive.	2023-02-26 09:03:04 +08:00
Ashin Gau	c43e521d29	[feature](multi-catalog) support map&struct type in parquet&orc reader (#17087 ) Support parsing map&struct type in parquet&orc reader. ## Remaining Problems 1. Doris use array type to build the key and value column of a `map`, but doesn't fill the offsets in value column, so the offsets in value column is wasted. 2. Parquet support reading only key or value column in `map`, this PR hasn't supported yet. 3. Parquet support reading partial columns in `struct`, this PR hasn't supported yet.	2023-02-26 08:55:39 +08:00
Mingyu Chen	50b423e09b	[improvement](mysql) merge connect context and mysql channel and reduce send buffer memory (#17125 )	2023-02-25 21:07:23 +08:00
Mingyu Chen	4093ef9e4b	[fix](auth) fix losing global priv bug and refactor default role name (#16966 ) This PR mainly changes: When upgrading from old version to master, the ADMIN_PRIV for normal user may be lost. This may only happen if: Create a user with ADMIN_PRIV privilege. Upgrade Doris to v1.2.x or master before the meta image which contains the edit log in step 1 is generate. And the ADMIN_PRIV will be lost in Global Privileges This PR will rectify this bug and set ADMIN_PRIV to the right place Refactor the user's implicit role name In [feature](auth)Implementing privilege management with rbac model #16091, we refactor the Doris auth model by introducing RBAC. And each user will have an implicit role, named with prefix default_role_rbac_. But it has wrong format like: default_role_rbac_'default_cluster:user1'@'%' This PR change the role name's format, like: default_role_rbac_user1@% default_role_rbac_user2@[domain] NOTICE: this change may cause incompatible metadata, but since [feature](auth)Implementing privilege management with rbac model #16091 is not released, we should fix it soon. Add a new session variable show_user_default_role When set to true, it will show implicit role of user in the result of show roles stmt. Default is false	2023-02-24 23:36:53 +08:00
谢健	83e5ecdecc	[fix](Nereids) use a threshold to check the equal double values in n-th rank (#17118 ) The cost is inaccurate, so we use a threshold to check the equal double values	2023-02-24 22:12:47 +08:00
Pxl	2db4a981b3	[Feature](Materialized-View) forbiden rename column on materialized view (#17030 ) forbiden rename column on materialized view	2023-02-24 21:28:31 +08:00
YangShaw	c53b6a9532	[fix](Nereids) fix nullable() of lead/lag (#17014 ) fix bug when we use NULL as default value for window function lead() and lag()	2023-02-24 21:27:44 +08:00
pengxiangyu	54e68fe250	[feature](cooldown)add ut for CooldownConfHandler (#17007 ) * add ut for CooldownConfHandler * add ut for CooldownConfHandler * add ut for CooldownConfHandler	2023-02-24 17:06:55 +08:00
Pxl	0691586eb7	[Chore](regression-test) add createMV action && add some mv case from fe ut MaterializedViewFunctionTest (#16825 ) 1. add createMV action 2. add some mv case from fe ut MaterializedViewFunctionTest 3. reduce mv scheduler interval time from 10s to 0.3s	2023-02-24 16:35:37 +08:00
AKIRA	cf5bc9594b	[fix](planner) conjuncts of the outer query block didn't work when it's on the results expr of inline view (#17036 ) Here is a cases: select id, name from (select '123' as id, '1234' as name, age from test_insert ) a where name != '1234';	2023-02-24 15:27:34 +08:00
AlexYue	c39914c0a0	[feature](partition)add default list partition (#15509 ) This pr implements the list default partition referred in related #15507. It's similar as GreenPlum's default's partition which would store all data not satisfying prior partition key's constraints and optimizer wouldn't filter default partition which means default partition would be scanned each time you try to select data from one table with default partition. User could either create one table with default partition or alter add one default partition. ```sql PARTITION LIST(key) { PARTITION p1 values in (xx,xx), PARTITION DEFAULT } ALTER TABLE XXX ADD PARTITION DEFAULT ``` We don't support automatically migrate data inside default partition which meets newly added partition key's constraint to newly add partition when alter add new partition. User should select default partition using new constraints as predicate and insert them to new partition. ```sql insert into tbl select * from tbl partition default where partition_key=xx; ```	2023-02-24 15:24:59 +08:00
starocean999	479d57df88	[fix](planner) the project expr should be calculated in join node in some case (#17035 ) Consider the sql bellow: select sum(cc.qlnm) as qlnm FROM outerjoin_A left join (SELECT outerjoin_B.b, coalesce(outerjoin_C.c, 0) AS qlnm FROM outerjoin_B inner JOIN outerjoin_C ON outerjoin_B.b = outerjoin_C.c ) cc on outerjoin_A.a = cc.b group by outerjoin_A.a; The coalesce(outerjoin_C.c, 0) was calculated in the agg node, which is wrong. This pr correct this, and the expr is calculated in the inner join node now.	2023-02-24 15:20:05 +08:00
Zhengguo Yang	d562428b1d	[enhancement](memory) reduce memory usage for failed broker loads (#16974 ) Reduce more memory usage for failed broker load msg in fe after pr #15895	2023-02-24 12:07:02 +08:00
yongjinhou	c3538ca804	[Enhancement](HttpServer) Add http interface authentication (#16571 ) 1. Organize http documents 2. Add http interface authentication for FE 3. Support https interface for FE 4. Provide authentication interface 5. Add http interface authentication for BE 6. Support https interface for BE	2023-02-24 10:59:33 +08:00
YueW	a12b3c3f0c	[fix](alter inverted index) fix incorrect CreateTime of 'show alter' query result after fe restart (#17043 ) For add or drop inverted index, when replay the logModifyTableAddOrDropInvertedIndices will new a schema change job, that has a new CreateTime, here should new a schema change job when not replay log.	2023-02-24 10:25:48 +08:00
amory	7229751bd9	[Improve](map-type) Add contains_null for map (#16948 ) Add contains_null for map type.	2023-02-23 20:47:26 +08:00
qiye	92ecd16573	(feature)[DOE]Support array for Doris on ES (#16941 ) * (feature)[DOE]Support array for Doris on ES	2023-02-23 19:31:18 +08:00
谢健	48fd528a2b	[feature](Nereids) Add hint NTH_OPTIMIZED_PLAN to let the optimzier select n-th optimized plan (#16992 ) Add hint NTH_OPTIMIZED_PLAN to let the optimzier can select n-th optimized plan. For example, you could use, select /+SET_VAR("nth_optimized_plan"=2) / * from table; to select the second-best plan in the optimizer.	2023-02-23 18:56:51 +08:00
zhannngchen	edead494cb	[Enhancement](storage) add a new hidden column __DORIS_VERSION_COL__ for unique key table (#16509 )	2023-02-23 15:47:17 +08:00
Ashin Gau	3ea6478ba8	[feature](multi-catalog) parquet reader support nested array column (#16961 ) Support to decode nested array column in parquet reader: 1. FE should generate the right nested column type. FE doesn't check the nesting depth and legality, like map\<array\<int\>, int\>. 2. `ParquetColumnReader` has removed the filtering of page index to support nested array type. It's too difficult to skip values in nested complex types. Maybe we should support the filtering of page index and lazy read in later PR. 3. `ExternalFileScanNode` has a bug in creating default value expression. 4. Maybe it's slow to read repetition levels in a while loop. I'll optimize this in next PR. 5. Array column has temporary `SchemaElement` in its thrift definition, we have removed them and keep its parent in former implementation. The remaining parent should inherit the repetition and definition level of its child.	2023-02-23 14:54:58 +08:00
Tiewei Fang	c2cc75d741	[BugFix](Jdbc Catalog) Fix null pointer exception in JdbcExecutor (#16958 ) This pr do two things: 1. fix: It use `column[0]` to judge class type in JdbcExecutor, but column[0] may be null ! 2. Enhencement In the original logic, all fields in jdbc catalog table will be set Nullable. However, it is inefficient for nullable fields. Actually, we can know if the fields in data source table is nullable through jdbc. So we can set the corresponding fields in Doris jdbc catalog to nullable or not.	2023-02-23 14:04:54 +08:00
slothever	51bbae27b8	[feature-wip](iceberg) add dlf and glue catalog impl for iceberg catalog (#16602 ) iceberg catalog supports DLF on Alibaba Cloud and AWS Glue Catalog	2023-02-23 14:02:41 +08:00
Jibing-Li	bc619ce5be	[Fix](load)Pass hidden column to load columns (#17004 ) The LoadScanProvider doesn't get Hidden Columns from stream load parameter. This may cause stream load delete operation fail. This pr is to pass the hidden columns to LoadScanProvider.	2023-02-23 13:54:36 +08:00
minghong	a9fb47a80a	[fix](planner) create view init bug (#16890 ) the body of create view stmt is parsed twice. in the second parse, we get sql string from CreateViewStmt.viewDefStmt.toSql() function, which missed selectlist.	2023-02-22 20:40:08 +08:00
mch_ucchi	df2f248712	[feature](planner) add dayofweek for FEFunctions to support fold constant (#16993 ) add dayofweek for FEFunctions to support fold constant. use Zellar algorithm	2023-02-22 20:27:49 +08:00
starocean999	7aa063c1f3	[fix](planner) bucket shuffle join is not recognized if the first table is a subquery (#16985 ) consider sql select * from (select * from test_1) a inner join (select * from test_2) b on a.id = b.id inner join (select * from test_3) c on a.id = c.id Because a.id is from a subquery, to find its source table, need use function getSrcSlotRef().	2023-02-22 20:23:00 +08:00
catpineapple	4c92730c3a	[fix](planner)fix multi partition support datetime column #16759	2023-02-22 19:38:42 +08:00
zhangstar333	dc3dab5a23	[vectorized](jdbc) fix jdbc connect sql server error (#16929 )	2023-02-22 19:36:27 +08:00
Mingyu Chen	12b6786522	[fix](hive) fix unable to specify user to access hdfs (#16999 ) In version 1.2.1, user can set `"hadoop.username" = "xxx"` to specify a remote user to access hdfs when creating hive catalog. But in version 1.2.2, we upgrade the hadoop version from 2.8 to 3.3, some behavior changed and the user specified remote user is useless. This PR try to fix this by using `UserGroupInformation` to delegate.	2023-02-22 19:35:40 +08:00
morrySnow	7956800df7	[refactor](Nereids) let type coercion same with legacy planner (#16844 ) - change for Nereids 1. add a variable length parameter to the ctor of Count for a good error reporting of Count(a, b) 2. refactor StringRegexPredicate, let it inherit from ScalarFunction 3. remove useless class TypeCollection 4. use catalog.Type.Collection to check expression arguments type 5. change type coercion for TimestampArithmetic, divide, integral divide, comparison predicate, case when and in predicate. Let them same as legacy planner. - change for legacy planner 1. change the common type of floating and Decimal from Decimal to Double	2023-02-22 17:29:37 +08:00
plat1ko	66ceab540a	[fix](replica) Fix inconsistent replica id between BE and FE in corner case of tablet rebalance (#16889 )	2023-02-22 16:21:11 +08:00
YueW	76ef4af29d	[fix](alter inverted index) fix write edit log in replaymodifyTableAddOrDropInvertedIndices function (#16977 ) Actually, when modifyTableAddOrDropInvertedIndices, no need write logAlterJob edit log, because write logModifyTableAddOrDropInvertedIndices is enough	2023-02-21 22:36:56 +08:00
starocean999	0de8f90a83	[enhancement](nereids) add a session variable to control join reorder algorithm (#16783 ) 1. disable join reorder in nereids if session variable disable_join_reorder is true. 2. add a session variable max_table_count_use_cascades_join_reorder to control join reorder algorithm in nereids. if dp hyper is used only when enable_dphyp_optimizer is true and the joined table count more than max_table_count_use_cascades_join_reorder, which default value is 10.	2023-02-21 21:08:39 +08:00
jakevin	54bf40b6e7	[feature](Nereids): Eliminate duplicate join condition. (#16910 )	2023-02-21 19:40:44 +08:00
AKIRA	a95f47ac0a	[ehancement](planner) Support filter the output of set operation node (#16666 )	2023-02-21 19:22:09 +08:00
TengJianPing	ed05f3b480	[regression-test](fuzzy) fuzzy session variable batch_size (#16384 )	2023-02-21 17:53:19 +08:00
YangShaw	cc839aead7	[fix](Nereids) fix signatures of some window functions (#16871 ) change signatures of lead(), lag(), first_value(), last_value() to be equal with legacy optimizer; these four functions only support Type.trivialTypes as returnType and input column type	2023-02-21 15:55:29 +08:00
Jibing-Li	44fed0e99b	[Fix](multi catalog)(nereids)Enable runtime filter for external table (#16855 ) Enable runtime filter for external table.	2023-02-21 10:35:58 +08:00
Mingyu Chen	57519fcf50	[fix](information_schema) catch and skip exception when getting schema from FE catalog (#16647 ) When querying information_schema database, BE will call FE RPC to get schema info such as db name list, table name list, etc. But some external catalog when failed to get these info because of wrong connection info. We should catch these kind of exception and skip it, so that it can continue to get schema info of other catalogs. Otherwise, the whole query on information_schema will fail, even if user just want to get info of internal catalog. And set jdbc connection timeout to 5s, to avoid thrift rpc timeout from BE to FE(default is 30s)	2023-02-21 08:43:09 +08:00
Pxl	ce3afe7f13	[Enchancement](Materialized-View) forbiden some case in create mv with group by and fix select fail on g… (#16820 ) 1. forbiden some case in create mv with group by select k1+1,sum(abs(k2+2)+k3+3) from d_table group by k1; 2. fix select fail on grouping column have diffrent expr with select list create materialized view k1p2ap3psg as select k1+1,sum(abs(k2+2)+k3+3) from d_table group by k1+1; mysql [test]>explain select k1+1,sum(abs(k2+2)+k3+3) from d_table group by k1; ERROR 1105 (HY000): errCode = 2, detailMessage = select list expression not produced by aggregation output (missing from GROUP BY clause?): `k1` + 1	2023-02-20 13:04:50 +08:00
jakevin	1011422e6d	[feature](Nereids): infer isNotNull from Inner/Semi/Anti Join (#16821 )	2023-02-20 12:14:15 +08:00

... 37 38 39 40 41 ...

5755 Commits