doris

Author	SHA1	Message	Date
minghong	d7cb5cf3db	[feature](nereids) add session var: dump_nereids_memo (#17666 ) * dump_nereids_memo * print groupexpr id	2023-03-11 13:40:15 +08:00
minghong	3231fab8c2	[feature](nereids) add unique id for groupExpression and plan node (#17628 ) * add unique id for groupExpression and plan node * fix ut	2023-03-11 13:23:41 +08:00
jakevin	db9692a114	[feature](Nereids): convert CrossJoin to InnerJoin. (#17681 )	2023-03-11 13:23:28 +08:00
谢健	3745e6c18a	[fix](Nereids): order of project's logical properties is different with that of project expression (#17648 )	2023-03-11 00:26:54 +08:00
jakevin	051ab7a9c6	[refactor](Nereids): refactor Join-Dependent Predicate Duplication. (#17653 )	2023-03-10 22:19:45 +08:00
Weijie Guo	566d133610	[enhancement](Nereids) Refactor EliminateLimitTest and EliminateFilterTest by match-pattern (#17631 )	2023-03-10 21:24:36 +08:00
yongjinhou	9cfa61b402	[Enhancement](HttpServer) Provide authentication interface for BE (#17073 ) Add an authentication interface in FE for BE	2023-03-10 16:34:47 +08:00
minghong	9ae5ec4dc5	[fix](nereids) PushdownExpressionsInHashCondition contains duplicate column and WindowExpression miss column stats (#17624 ) tpcds: q47 and q57 1. PushdownExpressionsInHashCondition: project contains duplicate column 2. WindowExpression stats calculate: miss column stats	2023-03-10 16:08:43 +08:00
xueweizhang	739e043c8d	[fix](publish) add retry publish when succeed replica num less than quorum and transaction not VISIBLE (#17453 ) If, for some reason, the number of replicas on which a transaction publish succeeded is less than the quorum, the transaction's status cannot become VISIBLE. The publish task for the affected replica of the tablet on that backend needs to be retried until it succeeds, so that the transaction becomes VISIBLE after the last publish failed. Signed-off-by: nextdreamblue <zxw520blue1@163.com>	2023-03-10 12:02:15 +08:00
Pxl	1a549edac2	[Chore](third-party) upgrade thrift from 0.13 to 0.16 (#17202 ) upgrade thrift from 0.13 to 0.16 There is thrift's release notes https://github.com/apache/thrift/blob/master/CHANGES.md	2023-03-10 11:33:16 +08:00
Yulei-Yang	f84b8b7c8b	[fix](priv) fix extract real user name when do privilege check (#17488 ) fix extract real user name of root/admin	2023-03-10 10:22:13 +08:00
Mingyu Chen	fe6361f4b5	[regression-test](p0) fix some unstable p0 cases (#17518 ) drop database before create remove some large, unused debug log	2023-03-10 10:21:39 +08:00
Mingyu Chen	c7aa3f9717	[fix](backup) backup throw NPE when no partition in table (#17546 ) If table has no partition, backup will report error: 2023-03-06 17:35:32,971 ERROR (backupHandler\|24) [Daemon.run():118] daemon thread got exception. name: backupHandler java.util.NoSuchElementException: No value present at java.util.Optional.get(Optional.java:135) ~[?:1.8.0_152] at org.apache.doris.catalog.OlapTable.selectiveCopy(OlapTable.java:1259) ~[doris-fe.jar:1.0-SNAPSHOT] at org.apache.doris.backup.BackupJob.prepareBackupMeta(BackupJob.java:505) ~[doris-fe.jar:1.0-SNAPSHOT] at org.apache.doris.backup.BackupJob.prepareAndSendSnapshotTask(BackupJob.java:398) ~[doris-fe.jar:1.0-SNAPSHOT] at org.apache.doris.backup.BackupJob.run(BackupJob.java:301) ~[doris-fe.jar:1.0-SNAPSHOT] at org.apache.doris.backup.BackupHandler.runAfterCatalogReady(BackupHandler.java:188) ~[doris-fe.jar:1.0-SNAPSHOT] at org.apache.doris.common.util.MasterDaemon.runOneCycle(MasterDaemon.java:58) ~[doris-fe.jar:1.0-SNAPSHOT] at org.apache.doris.common.util.Daemon.run(Daemon.java:116) ~[doris-fe.jar:1.0-SNAPSHOT]	2023-03-10 10:19:37 +08:00
huangzhaowei	4ba93efc98	[Enhance](DOE)Support parse default es iso datetime string (#17412 ) * support parse default es iso datetime string	2023-03-10 09:59:20 +08:00
morrySnow	006f7a91ac	[fix](planner) should not turn on push agg op when olapscan has conjuncts on it (#17598 ) we should not set PushAggOp to any type, if olap scan already has conjunct on it.	2023-03-10 09:33:08 +08:00
luozenglin	c3c7bc4340	[fix](profile) fix profile sort child list exception (#17613 )	2023-03-10 08:44:32 +08:00
Xinyi Zou	f9baf9c556	[improvement](scan) Support pushdown execute expr ctx (#15917 ) In the past, only simple predicates (slot=const), and, like, or (only bitmap index) could be pushed down to the storage layer. scan process: Read part of the column first, and calculate the row ids with a simple push-down predicate. Use row ids to read the remaining columns and pass them to the scanner, and the scanner filters the remaining predicates. This pr will also push-down the remaining predicates (functions, nested predicates...) in the scanner to the storage layer for filtering. scan process: Read part of the column first, and use the push-down simple predicate to calculate the row ids, (same as above) Use row ids to read the columns needed for the remaining predicates, and use the pushed-down remaining predicates to reduce the number of row ids again. Use row ids to read the remaining columns and pass them to the scanner.	2023-03-10 08:35:32 +08:00
huangzhaowei	4ddd303cfc	[Feature-wip](MySQL Load)Support cancel query for mysql load (#17233 ) Notice some changes: 1. Support cancel query for mysql load 2. Change the thread pool for mysql load manager. 3. Fix secure path check logic 4. Fix some doc error	2023-03-09 22:08:26 +08:00
YueW	4a0361914b	[fix](alter inverted index) add or drop inverted index also need change table state to SCHEMA_CHANGE (#17471 ) Before this PR, adding or dropping an inverted index did not change the table state, so multiple alter jobs could be executed at the same time, which may lead to unexpected problems.	2023-03-09 16:33:46 +08:00
Adonis Ling	310bdb60f4	[chore](maven) Prefer protoc in thirdparty to the one in maven artifacts (#17596 ) The prebuilt protoc-gen-grpc-java binary uses glibc on Linux and the version of glibc which Centos 6 uses is too old.	2023-03-09 16:21:38 +08:00
morrySnow	6c894be007	[enhancement](Nereids) support decimalv3 and precision derive (#17393 )	2023-03-09 14:12:10 +08:00
谢健	aaedcf34cf	[enhancement](Nereids) refactor costModel framework (#17339 ) refactor cost-model framework: 1. Use Cost class to encapsulate double cost 2. Use the `addChildCost` function to calculate the cost with children rather than add directly Note we use the `Cost` class because we hope to customize the operator of adding a child cost. Therefore, only when the cost would add the child Cost or be added by the parent we use `Cost`. Otherwise, we use double such as `upperbound`	2023-03-09 13:58:44 +08:00
谢健	e1ea2e1f2c	[fix](Nereids) store offset of Limit in exchangeNode (#17548 ) When the limit has offset, we should add an exchangeNode and store the offset in it	2023-03-09 13:43:12 +08:00
xueweizhang	2d027282f3	[fix](profile) modify load profile some bugs and docs (#17533 ) 1. 'insert into' profile has 'insert' type, can not query by 'load' type 2. 'insert into' profile does not have job_id, can not query by job_id. so put all profiles key with query_id 3. 'broker load' profile does not have some infos, npe	2023-03-09 11:58:40 +08:00
zhangstar333	4ef46159ae	[vectorized](udaf) support array type for java-udaf (#17351 )	2023-03-09 11:30:07 +08:00
luozenglin	00727e8c11	[fix](in-bitmap) fix result may be wrong if the left side of the in bitmap predicate is a constant (#17570 )	2023-03-09 10:59:05 +08:00
Xinyi Zou	397cc011c4	[fix](function) fix AES/SM3/SM4 encrypt/decrypt algorithm initialization vector bug (#17420 ) With the ECB algorithm, block_encryption_mode does not take effect; it only takes effect when an init vector is provided. Solved: 192/256 supports calculation without init vector. For other algorithms, an error should be reported when there is no init vector. Initialization Vector: The default value for the block_encryption_mode system variable is aes-128-ecb, or ECB mode, which does not require an initialization vector. The alternative permitted block encryption modes CBC, CFB1, CFB8, CFB128, and OFB all require an initialization vector. Reference: https://dev.mysql.com/doc/refman/8.0/en/encryption-functions.html#function_aes-decrypt Note: This fix does not support smooth upgrades. During the upgrade process, queries may report the error: function not found	2023-03-09 09:51:41 +08:00
Calvin Kirs	b6128f9b65	[dependency](fe) Replace jackson-mapper-asl with fasterxml-jackson (#17303 )	2023-03-09 09:35:58 +08:00
starocean999	2b6d971c2f	[fix](nereids)fix first_value/lead/lag window function bug in nereids (#17315 ) * [fix](nereids)fix first_value/lead/lag window function bug in nereids * add more test * add order by to fix test case * fix test cases	2023-03-09 09:35:27 +08:00
minghong	4822b9811a	[feature](nereids)support bitmap runtime filter on nereids (#16927 ) * A in(B) -> bitmap_contains(bitmap_union(B), A) support bitmap runtime filter on nereids * GroupPlan -> Plan * fmt * fix target cast problem remove test code	2023-03-09 09:30:24 +08:00
yinzhijian	ebda7ba5c6	[Fix](FQDN) fix slow when ip changed (#17455 )	2023-03-09 09:07:16 +08:00
ElvinWei	bd5ed2b0c2	[enhancement](histogram) optimize the histogram bucketing strategy, etc (#17264 ) * optimize the histogram bucketing strategy, etc * fix p0 regression of histogram	2023-03-08 20:12:05 +08:00
Yulei-Yang	75e4f86c2d	[fix](meta) fix catalog parameter when checking privilege of show_create_table stmt (#17445 ) the ctl parameter of show_create_table stmt is not set in checkTblPriv, this is not correct for multicatalog	2023-03-08 19:50:31 +08:00
Tiewei Fang	05b04e4c39	[BugFix](PG catalog) fix that pg catalog can not get all schemas that a pg user can access. (#17517 ) Describe your changes. In the past, the pg catalog used the sql SELECT schema_name FROM information_schema.schemata where schema_owner='<UserName>'; to select the schemas of a user. However, this sql can not find all schemas that a user can access, because: A user may not be the owner of a schema, but may have read permission on the schema. A user may inherit the permissions of its user group and thus have read permissions on a schema. For these reasons, we replace the sql statement with select nspname from pg_namespace where has_schema_privilege('<UserName>', nspname, 'USAGE');	2023-03-08 19:12:47 +08:00
morrySnow	678f34cad3	[fix](planner) insert default value should not change return type of function object in function set (#17536 ) function now's return type changed to datetimev2 by mistake. It can be reproduced in the following way CREATE TABLE `testdt` ( `c1` int(11) NULL, `c2` datetimev2 NULL DEFAULT CURRENT_TIMESTAMP ) ENGINE=OLAP DUPLICATE KEY(`c1`, `c2`) COMMENT 'OLAP' DISTRIBUTED BY HASH(`c1`) BUCKETS 10 PROPERTIES ( "replication_allocation" = "tag.location.default: 1", "in_memory" = "false", "storage_format" = "V2", "light_schema_change" = "true", "disable_auto_compaction" = "false" ); insert into testdt2(c1) values(1); select now();	2023-03-08 17:08:28 +08:00
amory	b1ca87eb9b	[FIX](complex-type) fix Is null predicate for map/struct (#17497 ) Fix is null predicate is not supported in select statement for map and struct column	2023-03-08 17:03:06 +08:00
Gabriel	feacb15e71	[Improvement](datev2) push down datev2 predicates with date literal (#17522 )	2023-03-08 16:54:54 +08:00
AKIRA	36b6cea462	[feature-wip](nereids) Support Q-Error to measure the accuracy of derived statistics (#17185 ) Collect each estimated output rows and exact output rows for each plan node, and use this to measure the accuracy of derived statistics. The estimated result is managed by ProfileManager. We would get this estimated result in the http request by query id later.	2023-03-08 16:26:24 +08:00
Calvin Kirs	d908d5fe01	[dependency](fe)Dependency Upgrade (#17377 ) * Upgrade log4j to 2.X - binding log4j version to 2.18.0 - used log4j-1.2-api complete smooth upgrade * Upgrade fileupload to 1.5 * Upgrade commons-io to 2.7 * Upgrade commons-compress to 1.22 * Upgrade gson to 2.8.9 * Upgrade guava to 30.0-jre * Binding jackson version to 2.14.2 * Upgrade netty-all to 4.1.89.final * Upgrade protobuf to 3.21.12 * Upgrade kafka-clients to 3.4.0 * Upgrade calcite version to 1.33.0 * Upgrade aws-java-sdk to 1.12.302 * Upgrade hadoop to 3.3.4 * Upgrade zookeeper to 3.4.14 * Binding tomcat-embed-core to 8.5.86 * Upgrade apache parent pom to 25 * Use hive-exec-core as a hive dependency, add the missing jar-hive-serde separately * Basic public dependencies are extracted to parent dependencies * Use jackson uniformly as the basic json tool * Remove springloaded, spring-boot-devtools has the same functionality * Modify the spark-related dependency scope to provide, which should be provided at runtime	2023-03-08 14:28:40 +08:00
zhengshiJ	aab14922af	[Feature](Nereids) support MarkJoin (#16616 ) # Proposed changes 1. The new optimizer supports the combination of subquery and disjunction. In the way of MarkJoin, it behaves the same as the old optimizer. For design details see: https://emmymiao87.github.io/jekyll/update/2021/07/25/Mark-Join.html. 2. Implicit type conversion is performed when conjuncts are generated after subquery parsing 3. Convert the unnesting of scalarSubquery in filter from filter+join to join + Conjuncts.	2023-03-08 14:26:24 +08:00
Kang	626fbc34f9	[bugfix](jsonb) Fix create mv using jsonb key cause be crash (#17430 )	2023-03-08 14:18:26 +08:00
bobhan1	4ea0d6c5fa	[feature](array_function) add support for array_popfront (#17416 )	2023-03-08 13:57:38 +08:00
gitccl	b1d65f855d	[Feature](array-function) Support array_concat function (#17436 )	2023-03-08 13:57:16 +08:00
jakevin	2b6133f4d0	[feature](Nereids): pushdown complex project through inner/outer Join. (#17365 )	2023-03-08 12:00:56 +08:00
Kang	4b743061b4	[feature](function) support type template in SQL function (#17344 ) A new way just like c++ template is proposed in this PR. The previous functions can be defined much simpler using template function. # map element extract template function [['element_at', '%element_extract%'], 'E', ['ARRAY<E>', 'BIGINT'], 'ALWAYS_NULLABLE', ['E']], # map element extract template function [['element_at', '%element_extract%'], 'V', ['MAP<K, V>', 'K'], 'ALWAYS_NULLABLE', ['K', 'V']], BTW, the plain type function is not affected and the legacy ARRAY_X MAP_K_V is still supported for compatibility.	2023-03-08 10:51:31 +08:00
yiguolei	c97422bd3d	[enhancement](regression-test) add sleep 3s for schema change and rollup (#17484 ) Co-authored-by: yiguolei <yiguolei@gmail.com>	2023-03-08 10:43:05 +08:00
qiye	a767472c56	[fix](DOE)Fix es p0 case error (#17502 ) Fix es array parse error, introduced by #16806	2023-03-08 08:06:30 +08:00
LiBinfeng	6b88df2bdd	[enhancement](planner) support case transition of timestamp datatype when create table (#17305 )	2023-03-07 21:03:25 +08:00
minghong	fd8adb492d	[fix](nereids) fix bugs in nereids window function (#17284 ) fix two problems: 1. push agg-fun in windowExpression down to AggregateNode for example, sql: select sum(sum(a)) over (order by b) Plan: windowExpression( sum(y) over (order by b)) +--- Agg(sum(a) as y, b) 2. push other expr to upper proj for example, sql: select sum(a+1) over () Plan: windowExpression(sum(y) over ()) +--- Project(a + 1 as y,...) +--- Agg(a,...)	2023-03-07 16:35:37 +08:00
liujinhui	fca567068e	[Enhancement](spark load)Support for RM HA (#15000 ) Adding RM HA configuration to the spark load. Spark can accept HA parameters via config, we just need to accept it in the DDL CREATE EXTERNAL RESOURCE spark_resource_sinan_node_manager_ha PROPERTIES ( "type" = "spark", "spark.master" = "yarn", "spark.submit.deployMode" = "cluster", "spark.executor.memory" = "10g", "spark.yarn.queue" = "XXXX", "spark.hadoop.yarn.resourcemanager.address" = "XXXX:8032", "spark.hadoop.yarn.resourcemanager.ha.enabled" = "true", "spark.hadoop.yarn.resourcemanager.ha.rm-ids" = "rm1,rm2", "spark.hadoop.yarn.resourcemanager.hostname.rm1" = "XXXX", "spark.hadoop.yarn.resourcemanager.hostname.rm2" = "XXXX", "spark.hadoop.fs.defaultFS" = "hdfs://XXXX", "spark.hadoop.dfs.nameservices" = "hacluster", "spark.hadoop.dfs.ha.namenodes.hacluster" = "mynamenode1,mynamenode2", "spark.hadoop.dfs.namenode.rpc-address.hacluster.mynamenode1" = "XXX:8020", "spark.hadoop.dfs.namenode.rpc-address.hacluster.mynamenode2" = "XXXX:8020", "spark.hadoop.dfs.client.failover.proxy.provider" = "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider", "working_dir" = "hdfs://XXXX/doris_prd_data/sinan/spark_load/", "broker" = "broker_personas", "broker.username" = "hdfs", "broker.password" = "", "broker.dfs.nameservices" = "XXX", "broker.dfs.ha.namenodes.XXX" = "mynamenode1, mynamenode2", "broker.dfs.namenode.rpc-address.XXXX.mynamenode1" = "XXXX:8020", "broker.dfs.namenode.rpc-address.XXXX.mynamenode2" = "XXXX:8020", "broker.dfs.client.failover.proxy.provider" = "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider" ); Co-authored-by: liujh <liujh@t3go.cn>	2023-03-07 15:46:14 +08:00


8289 Commits