doris

Author	SHA1	Message	Date
luozenglin	00727e8c11	[fix](in-bitmap) fix result may be wrong if the left side of the in bitmap predicate is a constant (#17570 )	2023-03-09 10:59:05 +08:00
Xinyi Zou	397cc011c4	[fix](function) fix AES/SM3/SM4 encrypt/decrypt algorithm initialization vector bug (#17420 ) In the ECB algorithm, block_encryption_mode does not take effect; it only takes effect when an init vector is provided. Solved: 192/256 supports calculation without an init vector. For other algorithms, an error should be reported when there is no init vector. Initialization Vector: The default value for the block_encryption_mode system variable is aes-128-ecb, or ECB mode, which does not require an initialization vector. The alternative permitted block encryption modes CBC, CFB1, CFB8, CFB128, and OFB all require an initialization vector. Reference: https://dev.mysql.com/doc/refman/8.0/en/encryption-functions.html#function_aes-decrypt Note: This fix does not support smooth upgrades. During the upgrade process, queries may report the error: function not found	2023-03-09 09:51:41 +08:00
Calvin Kirs	b6128f9b65	[dependency](fe) Replace jackson-mapper-asl with fasterxml-jackson (#17303 )	2023-03-09 09:35:58 +08:00
starocean999	2b6d971c2f	[fix](nereids)fix first_value/lead/lag window function bug in nereids (#17315 ) * [fix](nereids)fix first_value/lead/lag window function bug in nereids * add more test * add order by to fix test case * fix test cases	2023-03-09 09:35:27 +08:00
minghong	4822b9811a	[feature](nereids)support bitmap runtime filter on nereids (#16927 ) * A in(B) -> bitmap_contains(bitmap_union(B), A) support bitmap runtime filter on nereids * GroupPlan -> Plan * fmt * fix target cast problem remove test code	2023-03-09 09:30:24 +08:00
yinzhijian	ebda7ba5c6	[Fix](FQDN) fix slow when ip changed (#17455 )	2023-03-09 09:07:16 +08:00
ElvinWei	bd5ed2b0c2	[enhancement](histogram) optimize the histogram bucketing strategy, etc (#17264 ) * optimize the histogram bucketing strategy, etc * fix p0 regression of histogram	2023-03-08 20:12:05 +08:00
Yulei-Yang	75e4f86c2d	[fix](meta) fix catalog parameter when checking privilege of show_create_table stmt (#17445 ) The ctl parameter of the show_create_table stmt is not set in checkTblPriv, which is incorrect for multi-catalog	2023-03-08 19:50:31 +08:00
Tiewei Fang	05b04e4c39	[BugFix](PG catalog) fix that pg catalog can not get all schemas that a pg user can access. (#17517 ) In the past, the pg catalog used the SQL SELECT schema_name FROM information_schema.schemata where schema_owner='<UserName>'; to select the schemas of a user. However, this SQL cannot find all schemas that a user can access, because: A user may not be the owner of a schema, but may have read permission on the schema. A user may inherit the permissions of its user group and thus have read permissions on a schema. For these reasons, we replace the SQL statement with select nspname from pg_namespace where has_schema_privilege('<UserName>', nspname, 'USAGE');	2023-03-08 19:12:47 +08:00
morrySnow	678f34cad3	[fix](planner) insert default value should not change return type of function object in function set (#17536 ) function now's return type changed to datetimev2 by mistake. It can be reproduced in the following way CREATE TABLE `testdt` ( `c1` int(11) NULL, `c2` datetimev2 NULL DEFAULT CURRENT_TIMESTAMP ) ENGINE=OLAP DUPLICATE KEY(`c1`, `c2`) COMMENT 'OLAP' DISTRIBUTED BY HASH(`c1`) BUCKETS 10 PROPERTIES ( "replication_allocation" = "tag.location.default: 1", "in_memory" = "false", "storage_format" = "V2", "light_schema_change" = "true", "disable_auto_compaction" = "false" ); insert into testdt2(c1) values(1); select now();	2023-03-08 17:08:28 +08:00
amory	b1ca87eb9b	[FIX](complex-type) fix is null predicate for map/struct (#17497 ) Fix: the is null predicate is not supported in select statements for map and struct columns	2023-03-08 17:03:06 +08:00
Gabriel	feacb15e71	[Improvement](datev2) push down datev2 predicates with date literal (#17522 )	2023-03-08 16:54:54 +08:00
AKIRA	36b6cea462	[feature-wip](nereids) Support Q-Error to measure the accuracy of derived statistics (#17185 ) Collect each estimated output rows and exact output rows for each plan node, and use this to measure the accuracy of derived statistics. The estimated result is managed by ProfileManager. We would get this estimated result in the http request by query id later.	2023-03-08 16:26:24 +08:00
Calvin Kirs	d908d5fe01	[dependency](fe)Dependency Upgrade (#17377 ) * Upgrade log4j to 2.X - binding log4j version to 2.18.0 - used log4j-1.2-api to complete a smooth upgrade * Upgrade fileupload to 1.5 * Upgrade commons-io to 2.7 * Upgrade commons-compress to 1.22 * Upgrade gson to 2.8.9 * Upgrade guava to 30.0-jre * Binding jackson version to 2.14.2 * Upgrade netty-all to 4.1.89.final * Upgrade protobuf to 3.21.12 * Upgrade kafka-clients to 3.4.0 * Upgrade calcite version to 1.33.0 * Upgrade aws-java-sdk to 1.12.302 * Upgrade hadoop to 3.3.4 * Upgrade zookeeper to 3.4.14 * Binding tomcat-embed-core to 8.5.86 * Upgrade apache parent pom to 25 * Use hive-exec-core as a hive dependency, add the missing jar-hive-serde separately * Basic public dependencies are extracted to parent dependencies * Use jackson uniformly as the basic json tool * Remove springloaded, spring-boot-devtools has the same functionality * Modify the spark-related dependency scope to provided, which should be provided at runtime	2023-03-08 14:28:40 +08:00
zhengshiJ	aab14922af	[Feature](Nereids) support MarkJoin (#16616 ) 1. The new optimizer supports the combination of subquery and disjunction. With MarkJoin, it behaves the same as the old optimizer. For design details see: https://emmymiao87.github.io/jekyll/update/2021/07/25/Mark-Join.html. 2. Implicit type conversion is performed when conjuncts are generated after subquery parsing. 3. Convert the unnesting of scalarSubquery in filter from filter + join to join + Conjuncts.	2023-03-08 14:26:24 +08:00
Kang	626fbc34f9	[bugfix](jsonb) Fix create mv using jsonb key cause be crash (#17430 )	2023-03-08 14:18:26 +08:00
bobhan1	4ea0d6c5fa	[feature](array_function) add support for array_popfront (#17416 )	2023-03-08 13:57:38 +08:00
gitccl	b1d65f855d	[Feature](array-function) Support array_concat function (#17436 )	2023-03-08 13:57:16 +08:00
jakevin	2b6133f4d0	[feature](Nereids): pushdown complex project through inner/outer Join. (#17365 )	2023-03-08 12:00:56 +08:00
Kang	4b743061b4	[feature](function) support type template in SQL function (#17344 ) A new way, similar to C++ templates, is proposed in this PR. The previous functions can be defined much more simply using template functions. # map element extract template function [['element_at', '%element_extract%'], 'E', ['ARRAY<E>', 'BIGINT'], 'ALWAYS_NULLABLE', ['E']], # map element extract template function [['element_at', '%element_extract%'], 'V', ['MAP<K, V>', 'K'], 'ALWAYS_NULLABLE', ['K', 'V']], BTW, the plain type function is not affected and the legacy ARRAY_X MAP_K_V is still supported for compatibility.	2023-03-08 10:51:31 +08:00
yiguolei	c97422bd3d	[enhancement](regression-test) add sleep 3s for schema change and rollup (#17484 ) Co-authored-by: yiguolei <yiguolei@gmail.com>	2023-03-08 10:43:05 +08:00
qiye	a767472c56	[fix](DOE)Fix es p0 case error (#17502 ) Fix es array parse error, introduced by #16806	2023-03-08 08:06:30 +08:00
LiBinfeng	6b88df2bdd	[enhancement](planner) support case transition of timestamp datatype when create table (#17305 )	2023-03-07 21:03:25 +08:00
minghong	fd8adb492d	[fix](nereids) fix bugs in nereids window function (#17284 ) fix two problems: 1. push agg-fun in windowExpression down to AggregateNode for example, sql: select sum(sum(a)) over (order by b) Plan: windowExpression( sum(y) over (order by b)) +--- Agg(sum(a) as y, b) 2. push other expr to upper proj for example, sql: select sum(a+1) over () Plan: windowExpression(sum(y) over ()) +--- Project(a + 1 as y,...) +--- Agg(a,...)	2023-03-07 16:35:37 +08:00
liujinhui	fca567068e	[Enhancement](spark load)Support for RM HA (#15000 ) Adding RM HA configuration to the spark load. Spark can accept HA parameters via config, we just need to accept it in the DDL CREATE EXTERNAL RESOURCE spark_resource_sinan_node_manager_ha PROPERTIES ( "type" = "spark", "spark.master" = "yarn", "spark.submit.deployMode" = "cluster", "spark.executor.memory" = "10g", "spark.yarn.queue" = "XXXX", "spark.hadoop.yarn.resourcemanager.address" = "XXXX:8032", "spark.hadoop.yarn.resourcemanager.ha.enabled" = "true", "spark.hadoop.yarn.resourcemanager.ha.rm-ids" = "rm1,rm2", "spark.hadoop.yarn.resourcemanager.hostname.rm1" = "XXXX", "spark.hadoop.yarn.resourcemanager.hostname.rm2" = "XXXX", "spark.hadoop.fs.defaultFS" = "hdfs://XXXX", "spark.hadoop.dfs.nameservices" = "hacluster", "spark.hadoop.dfs.ha.namenodes.hacluster" = "mynamenode1,mynamenode2", "spark.hadoop.dfs.namenode.rpc-address.hacluster.mynamenode1" = "XXX:8020", "spark.hadoop.dfs.namenode.rpc-address.hacluster.mynamenode2" = "XXXX:8020", "spark.hadoop.dfs.client.failover.proxy.provider" = "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider", "working_dir" = "hdfs://XXXX/doris_prd_data/sinan/spark_load/", "broker" = "broker_personas", "broker.username" = "hdfs", "broker.password" = "", "broker.dfs.nameservices" = "XXX", "broker.dfs.ha.namenodes.XXX" = "mynamenode1, mynamenode2", "broker.dfs.namenode.rpc-address.XXXX.mynamenode1" = "XXXX:8020", "broker.dfs.namenode.rpc-address.XXXX.mynamenode2" = "XXXX:8020", "broker.dfs.client.failover.proxy.provider" = "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider" ); Co-authored-by: liujh <liujh@t3go.cn>	2023-03-07 15:46:14 +08:00
谢健	704faaed84	[feature](Nereids) add rule split limit into two phases (#16797 ) 1. Add a rule to split limit, like Limit(Origin) ==> Limit(Global) -> Gather -> Limit(Local) 2. Add a rule: limit-> sort ==> topN 3. fix a bug about topN 4. make the type of limit,offset long in topN And because this rule is always beneficial, we add a rule in the rewrite phase	2023-03-07 15:34:12 +08:00
谢健	05c5ab5490	[fix](planner) only table name should convert to lowercase when create table (#17373 ) We met the error: Unknown column '{}DORIS_DELETE_SIGN{}' in 'default_cluster:db.table'. That is because when we use the alias as the tableName to construct a Table, all parts of the name will be lowercased if lowerCaseTableNames = 1. To avoid this, we should extract tableName from the alias and lowercase only tableName	2023-03-07 14:41:35 +08:00
mch_ucchi	b9bb28f22c	[Enhancement](Planner)fix unclear exception msg when create table. #17473	2023-03-07 13:38:20 +08:00
jakevin	357d8c1746	[enhance](Nereids): remove rule flag in LogicalJoin (#17452 )	2023-03-07 13:18:50 +08:00
jakevin	b8c9875adb	[refactor](Nereids): refactor PushdownLimit (#17355 )	2023-03-07 12:04:20 +08:00
jakevin	b0e3156f51	[enhance](Nereids): refactor code in Project (#17450 )	2023-03-07 11:15:33 +08:00
pengxiangyu	f79b066790	[fix](resource)Add s3 checker for alter resource (#17467 ) * add s3 validity checker for alter resource.	2023-03-07 11:07:15 +08:00
zhangdong	7e96b06e6c	[Enhance](auth)Users support multiple roles (#17236 ) 1. Support GRANT role [, role] TO user_identity 2. Support REVOKE role [, role] FROM user_identity 3. 'Show grants': add a column to display the roles owned by users 4. 'alter user': prohibit deleting a user's role 5. Repair the logic that roleName cannot start with RoleManager.DEFAULT_ROLE	2023-03-07 10:28:56 +08:00
xueweizhang	bada731390	[fix](restore) fix bug when replay restore and reserve dynamic partition (#17326 ) When replaying the restore of a table with reserve_dynamic_partition_enable=true, registerOrRemoveDynamicPartitionTable must be called with isReplay=true; otherwise the OBSERVER may fail to replay the restore auditlog.	2023-03-07 10:13:08 +08:00
AKIRA	f85f89f240	[fix](planner) Fix inconsistency between groupby expression and output of aggregation node (#17438 )	2023-03-07 09:38:20 +08:00
Yulei-Yang	50bf02024a	[Improvement](meta) support returning total statistics of all databases for command show proc '/jobs' (#17342 ) Currently, the show proc jobs command can only be used on a specific database. If a user wants to see overall data for the whole cluster, they have to look into every database and sum the results up, which is troublesome. Now this can be achieved simply by giving -1 as the dbId. mysql> show proc '/jobs/-1'; +---------------+---------+---------+----------+-----------+-------+ \| JobType \| Pending \| Running \| Finished \| Cancelled \| Total \| +---------------+---------+---------+----------+-----------+-------+ \| load \| 0 \| 0 \| 0 \| 2 \| 2 \| \| delete \| 0 \| 0 \| 0 \| 0 \| 0 \| \| rollup \| 0 \| 0 \| 1 \| 0 \| 1 \| \| schema_change \| 0 \| 0 \| 2 \| 0 \| 2 \| \| export \| 0 \| 0 \| 0 \| 3 \| 3 \| +---------------+---------+---------+----------+-----------+-------+ mysql> show proc '/jobs/-1/rollup'; +----------+------------------+---------------------+---------------------+------------------+-----------------+----------+---------------+----------+------+----------+---------+ \| JobId \| TableName \| CreateTime \| FinishTime \| BaseIndexName \| RollupIndexName \| RollupId \| TransactionId \| State \| Msg \| Progress \| Timeout \| +----------+------------------+---------------------+---------------------+------------------+-----------------+----------+---------------+----------+------+----------+---------+ \| 17826065 \| order_detail \| 2023-02-23 04:21:01 \| 2023-02-23 04:21:22 \| order_detail \| rp1 \| 17826066 \| 6009 \| FINISHED \| \| NULL \| 2592000 \| +----------+------------------+---------------------+---------------------+------------------+-----------------+----------+---------------+----------+------+----------+---------+ 1 row in set (0.01 sec)	2023-03-07 08:57:55 +08:00
ZhangYu0123	440cf526c8	[fix](type compatibility) fix unsigned int type compatibility problem (#17427 ) Fix the unsigned int type compatibility value scope problem. When defining columns, map UNSIGNED INT to BIGINT for compatibility. The problems are as follows: The behavior is not consistent with the doc. We support the unsigned int type for compatibility with MySQL types, but unsigned int is created as int at definition time, which causes numerical overflow.	2023-03-07 08:55:38 +08:00
Yulei-Yang	b68001aee5	[fix](priv) fix duplicated priv check when check column priv (#17446 ) When executing a select stmt, the column privilege check is invoked multiple times (once per column in the select stmt)	2023-03-07 08:51:55 +08:00
Tiewei Fang	48c2d806d7	[enhancement](jdbc catalog) Use Druid instead of HikariCP in JdbcClient (#17395 ) This PR does three things: 1. Use Druid instead of HikariCP in JdbcClient 2. When downloading a udf jar, add the name of the jar package after the local file name. 3. Refactor some jdbcResource code	2023-03-07 08:51:10 +08:00
AKIRA	aedbc5fcb1	[fix](planner) Slots in the conjuncts of table function node didn't get materialized #17460	2023-03-07 08:50:33 +08:00
Pxl	28c55f15c9	[Enhancement](Materialized-View) add more error information for select materialized view fail (#17262 ) add more error information for select materialized view fail	2023-03-06 18:59:46 +08:00
Ashin Gau	dca16796ad	[fix](ParquetReader) definition level of repeated parent is wrong (#17337 ) Fix three bugs: 1. `repeated_parent_def_level ` should be the definition of its repeated parent. 2. Failed to parse schema like `decimal(p, s)` 3. Fill wrong offsets for array type	2023-03-06 18:15:57 +08:00
caiconghui	0ad638f9fe	[enhancement](transaction) Reduce hold writeLock time for DatabaseTransactionMgr to clear transaction (#17414 ) * [enhancement](transaction) Reduce hold writeLock time for DatabaseTransactionMgr to clear transaction * fix ut * remove unnecessary field for remove txn bdbje log --------- Co-authored-by: caiconghui1 <caiconghui1@jd.com>	2023-03-06 11:32:21 +08:00
Yulei-Yang	56a3ead2d7	[Improvement](restore) make timeout of restore job's dispatching task progress configurable (#17434 ) When a restore job has plenty of replicas, it may fail due to timeout. The error message is: [RestoreJob.checkAndPrepareMeta():782] begin to send create replica tasks to BE for restore. total 381344 tasks. timeout: 600000 Currently, the max value of the timeout is fixed, which is not suitable for such cases.	2023-03-06 10:05:31 +08:00
WenYao	a8f20eb4ac	[Enhancement](schema_scanner) Optimize the performance of reading information schema tables (#17371 ) Batch fill block; batch call rpc from FE to get table desc. For 340,000 columns, SELECT COUNT( * ) FROM information_schema.columns; time: 10.3s --> 0.4s	2023-03-06 09:53:01 +08:00
Yulei-Yang	d8a231f340	[Improvement](auth)(step-2) add ranger authorizer for hms catalog (#17424 )	2023-03-05 21:50:44 +08:00
奕冷	afb5def385	[enhancement](timeout) replace query timeout with exec timeout (#17360 )	2023-03-05 11:03:59 +08:00
yinzhijian	627b5ee302	[enhancement](k8s) Support fqdn mode for fe in k8s environment (#17329 )	2023-03-05 10:18:56 +08:00
yiguolei	b9b028099d	[enhancement](stream load pipe) using queryid or load id to identify stream load pipe instead of fragment instance id (#17362 ) * [enhancement](stream load pipe) using queryid or load id to identify stream load pipe instead of fragment instance id NewLoadStreamMgr already has the pipe and other info, so there is no need to save the pipe into fragment state, and FragmentState should be clearer. But this PR will change the behaviour of BE. I will pick the PR to doris 1.2.3 and add the load id to FE support. The user could upgrade from 1.2.3 to 2.x Co-authored-by: yiguolei <yiguolei@gmail.com>	2023-03-04 16:19:36 +08:00
abmdocrt	82df2ae9d8	[feature](mysql) Support secure MySQL connection to FE (#17138 ) Background: Doris currently does not support SSL connection from MySQL clients, it's not secure enough in some cases, especially access Doris via the public internet. Solution: - Use TLS1.2 protocol to encrypt information. - Implementation details * server <--- connect <--- client * if enable SSL: { * server <--- SSL connection request packet <--- client * server <--- SSL Exchange ---> client } (we will add this `if` logic part in this PR) * server ---> handshake request packet ---> client * server <--- encrypted data ---> client (this part will be realized in this PR) - reference1 https://dev.mysql.com/doc/dev/mysql-server/latest/page_protocol_connection_phase.html#sect_protocol_connection_phase_initial_handshake_ssl_handshake - reference2 https://www.rfc-editor.org/rfc/rfc5246 close #16313 Signed-off-by: Yukang Lian <yukang.lian2022@gmail.com> Co-authored-by: Gavin Chou <gavineaglechou@gmail.com> Co-authored-by: morningman <morningman@163.com>	2023-03-04 12:14:48 +08:00

3964 Commits