doris

Author	SHA1	Message	Date
Jibing-Li	11a5875283	[fix](regression)Remove useless case which may cause preHeat npe. (#35582 ) (#35685 ) backport https://github.com/apache/doris/pull/35582	2024-05-31 10:20:21 +08:00
meiyi	d6757e03de	[fix](group commit) Group commit http stream should not begin txn (#35494 ) (#35672 ) Pick https://github.com/apache/doris/pull/35494	2024-05-30 20:57:59 +08:00
minghong	64023c54bc	[fix](nereids) do not generate runtime filter on schema-scan (#35655 ) schema-scan node does not support runtime filter	2024-05-30 20:25:23 +08:00
yiguolei	d458f1af34	fix compile for jdk17	2024-05-30 20:17:54 +08:00
seawinde	34d27ddd6d	[fix](mtmv) Fix get mv statistics plan wrong and optimize code usage (#35623 ) 1. When getting mv statistics, the relation id should be obtained from the scan plan; fix it. 2. Optimize the `isGroupEquals` method: add a `MaterializationContext` param which may be used to control the group-by equality decision logic. 3. Remove the `set enable_nereids_timeout = false;` setting in the mv rewrite regression test.	2024-05-30 19:59:37 +08:00
minghong	8f264a7206	[opt](nereids) compare str literal as date literal to compute selectivity (#35610 ) This PR improves #34542 for the case where the real data type is date-like. Some users define date (datetime) columns as Varchar. When estimating the selectivity of a predicate like A>'2020-01-01', if Nereids treats A and '2020-01-01' as date type, the selectivity is more accurate than when they are treated as strings.	2024-05-30 19:59:37 +08:00
Jibing-Li	3efab570df	[fix](stats) Add synchronize for some analysis maps to avoid ConcurrentModificationException (#35591 ) Add synchronized for analysis-related maps to avoid ConcurrentModificationException. For example, modifying a map while the image is being written throws ConcurrentModificationException.	2024-05-30 19:59:37 +08:00
Mingyu Chen	cb88334e34	[fix](meta) fix catalog replay error (#35532 ) ``` Caused by: java.lang.NullPointerException at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:936) ~[?:1.8.0_301] at org.apache.doris.catalog.ResourceMgr.getResource(ResourceMgr.java:166) ~[doris-fe.jar:1.2-SNAPSHOT] at org.apache.doris.datasource.CatalogProperty.catalogResource(CatalogProperty.java:67) ~[doris-fe.jar:1.2-SNAPSHOT] at org.apache.doris.datasource.CatalogProperty.getOrDefault(CatalogProperty.java:77) ~[doris-fe.jar:1.2-SNAPSHOT] at org.apache.doris.datasource.ExternalCatalog.setDefaultPropsIfMissing(ExternalCatalog.java:173) ~[doris-fe.jar:1.2-SNAPSHOT] at org.apache.doris.datasource.hive.HMSExternalCatalog.setDefaultPropsIfMissing(HMSExternalCatalog.java:238) ~[doris-fe.jar:1.2-SNAPSHOT] at org.apache.doris.datasource.ExternalCatalog.gsonPostProcess(ExternalCatalog.java:687) ~[doris-fe.jar:1.2-SNAPSHOT] at org.apache.doris.persist.gson.GsonUtils$PostProcessTypeAdapterFactory$1.read(GsonUtils.java:640) ~[doris-fe.jar:1.2-SNAPSHOT] at com.google.gson.TypeAdapter.fromJsonTree(TypeAdapter.java:299) ~[gson-2.10.1.jar:?] at org.apache.doris.persist.gson.RuntimeTypeAdapterFactory$1.read(RuntimeTypeAdapterFactory.java:289) ~[doris-fe.jar:1.2-SNAPSHOT] ``` Introduced by #33610. When reading the meta image, the `resource` may be null; we should ignore it.	2024-05-30 19:59:37 +08:00
Mingyu Chen	14de0974a5	[opt](audit) add timeout for audit log load and modify the label format (#35535 ) 1. Add a new global variable, `audit_plugin_load_timeout`, with a default of 600 sec. This avoids using the stream load default, which is 4 hours and too long for audit log load. 2. Modify the label of audit log load: add milliseconds to avoid identical labels from different FEs.	2024-05-30 19:59:37 +08:00
Mingyu Chen	0ed7dc1081	[fix](resource-tag) missing resource tag after forwarding to master (#35618 ) All DDL and DML will be forwarded to the Master FE, and we forgot to set the resource tag in ConnectionContext after forwarding.	2024-05-30 19:59:37 +08:00
walter	900850fb16	[fix](backup) save finished state for local repo backup (#35491 ) The backup finished state in the local repo is not logged. So if there exist multiple local repo backups, and FE was restarted, then these backup records will be reset to the UPLOAD_INFO state.	2024-05-30 19:59:37 +08:00
zy-kkk	b0e2461181	[branch-2.1][improvement](JdbcScan) Change the mysql function that does not support pushdown in JdbcScan to Config (#35631 ) pk #35196	2024-05-30 15:40:08 +08:00
morrySnow	f6e8c5324d	[fix](Nereids) remove getTableInMinidumpCache temporary (#35571 ) Bug introduced by #18747: getTableInMinidumpCache compares the table's qualified name in the wrong way. We remove it temporarily since it is no longer used in production environments.	2024-05-29 20:31:13 +08:00
morrySnow	aafd0a7868	[fix](Nereids) adjust nullable for set operation may cause IndexOutOfBound (#35588 ) After we refactored the set operation to give it projection ability, its output size may differ from its child's output size. So we should use its own output size as the nullable flag list size, to ensure the list has the same size as the set operation's output.	2024-05-29 20:30:43 +08:00
seawinde	a536227dea	[fix](mtmv) Fix getting related partition table wrongly when multi base partition table exists (#34781 ) Fix getting related partition table wrongly when multi base partition table exists such as base table def is as following: CREATE TABLE `test1` ( `pre_batch_no` VARCHAR(100) NULL COMMENT 'pre_batch_no', `batch_no` VARCHAR(100) NULL COMMENT 'batch_no', `vin_type1` VARCHAR(50) NULL COMMENT 'vin', `upgrade_day` date COMMENT 'upgrade_day' ) ENGINE=OLAP unique KEY(`pre_batch_no`,`batch_no`, `vin_type1`, `upgrade_day`) COMMENT 'OLAP' PARTITION BY RANGE(`upgrade_day`) ( FROM ("2024-03-20") TO ("2024-03-31") INTERVAL 1 DAY ) DISTRIBUTED BY HASH(`vin_type1`) BUCKETS 10 PROPERTIES ( "replication_num" = "1" ); CREATE TABLE `test2` ( `batch_no` VARCHAR(100) NULL COMMENT 'batch_no', `vin_type2` VARCHAR(50) NULL COMMENT 'vin', `status` VARCHAR(50) COMMENT 'status', `upgrade_day` date not null COMMENT 'upgrade_day' ) ENGINE=OLAP Duplicate KEY(`batch_no`,`vin_type2`) COMMENT 'OLAP' PARTITION BY RANGE(`upgrade_day`) ( FROM ("2024-01-01") TO ("2024-01-10") INTERVAL 1 DAY ) DISTRIBUTED BY HASH(`vin_type2`) BUCKETS 10 PROPERTIES ( "replication_num" = "1" ); if you create partition mv which partition by ` t1.upgrade_day` as following it will be successful select t1.upgrade_day, t1.batch_no, t1.vin_type1 from ( SELECT batch_no, vin_type1, upgrade_day FROM test1 where batch_no like 'c%' group by batch_no, vin_type1, upgrade_day ) t1 left join ( select batch_no, vin_type2, status from test2 group by batch_no, vin_type2, status ) t2 on t1.vin_type1 = t2.vin_type2;	2024-05-29 20:30:23 +08:00
seawinde	eefea4c7e6	[fix](mtmv) Fix partition mv rewrite result wrong (#35236 ) This was introduced by https://github.com/apache/doris/pull/33800. If the mv is a partitioned materialized view, the data will be wrong when using the hit materialized view after partitions in the related base partition table are deleted, created, and so on. This fixes the problem. If SET enable_materialized_view_union_rewrite=true; the materialized view is used and the data is guaranteed to be correct. If SET enable_materialized_view_union_rewrite=false; the base table is queried directly to make sure the data is right.	2024-05-29 20:30:23 +08:00
morrySnow	1f5edae090	[fix](Nereids) prune not required window expressions on window operator (#35593 ) pick from master #35504 if window expression is not required by its parent, we should prune this column. If all window expressions of window operator are pruned, we remove this window operator directly.	2024-05-29 15:27:27 +08:00
zhangdong	8c0c05b9c6	[fix](auth) Fix no auth,but can select count() (#35465 ) When selecting count(), cols is empty, so the table priv should be checked.	2024-05-29 15:07:09 +08:00
morrySnow	f076fe8624	[fix](Nereids) aggregate combinator should be case-insensitive (#35540 )	2024-05-29 15:06:58 +08:00
wuwenchi	aaa89ec768	[bugfix](iceberg)support null values as partition (#35503 ) #31442 test in #34929 When null value is used as the partition value, BE will return the "null" string, so this string needs to be processed specially.	2024-05-29 15:03:16 +08:00
walter	646d8eaa73	[fix](restore) Fix restore table name when lower_case_table_names enabled (#35508 )	2024-05-29 15:02:08 +08:00
TengJianPing	b06794d619	[opt](spill) add session variable of 'enable_force_spill' (#34664 ) (#35561 ) pick #34664	2024-05-29 09:57:31 +08:00
wuwenchi	9fae08254d	[bugfix](hive)Partition fields in wrong order for 2.1 (#35554 ) bp #35322	2024-05-28 22:43:46 +08:00
苏小刚	72a27a0938	[fix](paimon)fix paimon cache bug (#35309 ) Issue Number: close #35024 This bug occurs because the FE incorrectly sets the update time of the paimon catalog, so the BE cannot update paimon's schema in time. ```java private void initTable() { PaimonTableCacheKey key = new PaimonTableCacheKey(ctlId, dbId, tblId, paimonOptionParams, dbName, tblName); TableExt tableExt = PaimonTableCache.getTable(key); if (tableExt.getCreateTime() < lastUpdateTime) { LOG.warn("invalidate cache table:{}, localTime:{}, remoteTime:{}", key, tableExt.getCreateTime(), lastUpdateTime); PaimonTableCache.invalidateTableCache(key); tableExt = PaimonTableCache.getTable(key); } this.table = tableExt.getTable(); paimonAllFieldNames = PaimonScannerUtils.fieldNames(this.table.rowType()); if (LOG.isDebugEnabled()) { LOG.debug("paimonAllFieldNames:{}", paimonAllFieldNames); } } ```	2024-05-28 18:52:51 +08:00
walter	efdce7e9b3	[fix](binlog) Fix add partition record sql (#35461 ) 1. support adding a temporary partition 2. remove extra parentheses in the list partition value set 3. support unpartitioned partition item	2024-05-28 18:50:05 +08:00
minghong	50e81d9db7	[feat](nereids) add more rules to eliminate empty relation (#34997 ) -branch-2.1 (#35534 ) eliminate empty relations for following patterns: topn->empty sort->empty distribute->empty project->empty (cherry picked from commit 8340f23946c0c8e40510ce937acd3342cb2e28b7)	2024-05-28 18:12:42 +08:00
caiconghui	27cf5a667f	[enhancement](export) filter empty partition before export table to remote storage (#35389 ) (#35542 ) Linked PR : #35389	2024-05-28 18:11:12 +08:00
morrySnow	b78dae040a	Revert "[fix](nereids) push filter through window, using slot equal-set (#35361 )" (#35541 ) This reverts commit d2df392994e8dc00dfb5f8e49cca83fca97cb565. This PR should not be picked to branch-2.1, because the infrastructure it relies on is not in branch-2.1.	2024-05-28 17:54:13 +08:00
Jibing-Li	69da39b43d	[improvement](statistics)Use defaultSessionVariable instead of clone a new one. (#34672 ) (#35531 ) backport https://github.com/apache/doris/pull/34672	2024-05-28 17:19:38 +08:00
Jibing-Li	aa4fd3fd79	[fix](statistics)Improve analyze timeout. (#33836 ) (#35530 ) backport https://github.com/apache/doris/pull/33836	2024-05-28 17:12:53 +08:00
morrySnow	63e63e114d	[fix](Nereids) could not push down filter through cte producer sometimes (#35507 ) pick from master #35463 commit id 0632309209cc3f9b6523ef7054eb1abdb9d0e7d8 When the consumer side eliminates some consumers from the plan, the recorded size of consumers is wrong, so we cannot push down some filters on the producer side. This PR fixes the problem by updating the consumer set after rewriting the outer side.	2024-05-28 16:53:51 +08:00
Jibing-Li	9d04d18c94	[improvement](statistics)Write audit log while doing drop stats. (#34433 ) (#35526 ) backport https://github.com/apache/doris/pull/34433	2024-05-28 16:46:27 +08:00
Pxl	87c90094a7	[Bug](materialized-view) fix unmatch mv coz table name (#35444 )	2024-05-28 13:17:33 +08:00
seawinde	8599e8ee64	[improvement](mtmv) Add id to statistics map in statement context for cost estimation later (#35436 ) This improves the probability of using a materialized view when querying a single table with aggregates and many filters.	2024-05-28 13:17:05 +08:00
minghong	d2df392994	[fix](nereids) push filter through window, using slot equal-set (#35361 ) example: filter (y=1) +-- window( ... partition by x) +-- project( A as x, A as y) filter(y=1) is equivalent to filter(x=1), because x and y are in the same equal-set in window#logicalProperties. And hence we could push filter(y=1) through window operator	2024-05-28 13:16:53 +08:00
minghong	dfcabf8d47	[fix](nereids) set mark join reference for bitmap-in-apply (#35435 ) The bitmap filter was implemented before mark-join. When mark-join support was added, we forgot to update the bitmap-filter branch. When converting a bitmap-apply-in to a join, we should set markjoinReference on the join if there are markJoinReferences.	2024-05-28 13:13:41 +08:00
feiniaofeiafei	ac49576229	[Fix](nereids) fix merge aggregate setting top projection bug (#35348 ) introduced by #31811 sql like this: select col1, col2 from (select a as col1, a as col2 from mal_test1 group by a) t group by col1, col2 ; Transformation Description: In the process of optimizing the query, an agg-project-agg pattern is transformed into a project-agg pattern: Before Transformation: LogicalAggregate +-- LogicalProject +-- LogicalAggregate After Transformation: LogicalProject +-- LogicalAggregate Before the transformation, the projection in the LogicalProject was a AS col1, a AS col2, and the outer aggregate group by keys were col1, col2. After the transformation, the aggregate group by keys became a, a, and the projection remained a AS col1, a AS col2. Problem: When building the project projections, the group by key a, a needed to be transformed to a AS col1, a AS col2. The old code had a bug where it used the slot as the map key and the alias in the projections as the map value. This approach did not account for the situation where aliases might have the same slot. Solution: The new code fixes this issue by using the original outer aggregate group by expression's exprId. It searches within the original project projections to find the NamedExpression that has the same exprId. These expressions are then placed into the new projections. This method ensures that the correct aliases are maintained, resolving the bug.	2024-05-28 13:13:31 +08:00
xy720	c38c939b52	[bug](Fe) fix potential deadlock in show proc statement (#34988 )	2024-05-28 13:12:03 +08:00
wangbo	f0e883c968	[Fix](executor)Fix backend_active_tasks only scan one be (#35490 ) Fix ```select * from backend_active_tasks``` returning info for only one random BE.	2024-05-28 11:48:42 +08:00
morrySnow	8c4f5af708	[opt](Nereids) auto fallback when insert unsupport catalog (#33353 ) (#35453 ) pick from master #33353	2024-05-27 16:58:35 +08:00
zhangdong	1a52e4f7db	[chore](mtmv)Optimize mtmv logs and exception information (#34957 ) (#35446 ) pick from master #34957 1. Change some logs to debug. 2. Error prompt changed from MTMV to async materialized view	2024-05-27 16:35:13 +08:00
zhangdong	a32db25070	[enhance](mtmv) allow add index for MTMV (#34225 ) (#35443 ) Previously, whether an operation could be performed on a materialized view was determined by `opType`. Now, an `allowOpMTMV()` method is implemented by the various `clauses`, because some operations share the same `opType` yet some are allowed and some are not. For example, the `opType` for both `add column` and `create index` is `SCHEMA-CHANGE`, but `add column` is not allowed and `create index` is allowed.	2024-05-27 16:22:16 +08:00
Lightman	d71e9d34fe	[Bugfix] Fix mv column type is not changed when do schema change (#34598 )	2024-05-27 15:28:12 +08:00
wangbo	c44affb43f	Add downgrade scan thread num by column num (#35351 )	2024-05-27 15:27:12 +08:00
Pxl	82ff29faea	[Chore](materialized-view) forbid create mv on row store table (#35360 )	2024-05-27 15:25:16 +08:00
wuwenchi	f98ed4e4c5	[bugfix](hive)Misspelling of class names (#34981 )	2024-05-27 15:24:38 +08:00
morrySnow	a82c6e869e	[fix](Nereids) LogicalEmptyRelation type is wrong (#35382 )	2024-05-27 15:23:46 +08:00
zy-kkk	2e20e38523	[improvement](jdbc catalog) remove useless jdbc catalog code (#34986 ) (#35418 )	2024-05-27 14:25:26 +08:00
谢健	af986c370b	[feat](Nereids): Put the Child with Least Row Count in the First Position of Intersect (#34290 ) (#35339 ) In this pull request, we optimize the ordering of children in the Intersect operator to improve query performance. The proposed change is to place the child with the least row count in the first position of the Intersect operator. The rationale behind this optimization is that the Intersect operator works by first evaluating the leftmost child and then iterating through the results of the other children to find matching rows. By placing the child with the least row count first, we can minimize the number of iterations required to find the matching rows, thereby reducing the overall execution time of the query.	2024-05-27 11:52:35 +08:00
walter	952875b437	[chore](restore) Add logs about the restore table state (#35363 )	2024-05-25 17:47:38 +08:00

7013 Commits