doris

Author	SHA1	Message	Date
slothever	ede68e075d	[fix](iceberg-v2) fix fe iceberg split, add regression case (#15299 )	2022-12-23 19:33:00 +08:00
Zhengguo Yang	a98636a970	[bugfix](from_unixtime) fix timezone not work for from_unixtime (#15298 ) * [bugfix](from_unixtime) fix timezone not work for from_unixtime	2022-12-23 19:05:09 +08:00
jakevin	bfaaa2bd7c	[feature](Nereids) support digital_masking function (#15252 )	2022-12-23 18:59:08 +08:00
Tiewei Fang	e7a077a81f	[fix](jdbc catalog) fix bugs of jdbc catalog and table valued function (#15216 ) * fix bugs * add `desc function` test * add test * fix	2022-12-23 16:46:39 +08:00
starocean999	b935fd0e7d	[fix](fe)fix bug of the bucket shuffle join is not recognized (#15255 ) * [fix](fe)fix bug of the bucket shuffle join is not recognized * use broadcast join for empty table	2022-12-23 16:44:44 +08:00
luozenglin	8a810cd554	[fix](bitmapfilter) fix core dump caused by bitmap filter (#15296 ) Do not push down the bitmap filter to a non-integer column	2022-12-23 16:42:45 +08:00
AlexYue	fe562bc3e7	[Bug](Agg) fix crash when encountering not supported agg function like last_value(bitmap) (#15257 ) The former logic inside aggregate_function_window.cpp would shutdown BE once encountering agg function with complex type like BITMAP. This pr makes it don't crash and would return one more concrete error message which tells the unsupported function signature to user.	2022-12-23 14:23:21 +08:00
Gabriel	cb295de981	[Bug](decimalv3) Fix wrong precision of DECIMALV3 (#15302 ) * [Bug](decimalv3) Fix wrong precision of DECIMALV3 * update	2022-12-23 14:11:08 +08:00
starocean999	82fbfab77f	[fix](union)the union node should not pass through children in some case (#15286 ) the union node will make children pass through in wrong condition. If the children's materialized slots are different from union node, children can't be passed through.	2022-12-23 10:27:49 +08:00
chenlinzhong	020c47f528	[load](config) update max timeout (#15280 )	2022-12-23 10:15:26 +08:00
starocean999	09a22813e4	[feature](Nereids) support syntax SELECT DISTINCT (#15197 ) Add a new rule 'ProjectWithDistinctToAggregate' to support "select distinct xx from table". This rule check's the logicalProject node's isDisinct property and replace the logicalProject node with a LogicalAggregate node. So any rule before this, if createing a new logicalProject node, should make sure isDisinct property is correctly passed around. please see rule BindSlotReference or BindFunction for example.	2022-12-22 23:54:08 +08:00
HaveAnOrangeCat	df5969ab58	[Feature] Support function roundBankers (#15154 )	2022-12-22 22:53:09 +08:00
Kang	e331e0420b	[improvement](topn)add per scanner limit check for new scanner (#15231 ) Optimize for key topn query like `SELECT * FROM store_sales ORDER BY ss_sold_date_sk, ss_sold_time_sk LIMIT 100` (ss_sold_date_sk, ss_sold_time_sk is prefix of table sort key). Check per scanner limit and set eof true to reduce the data need to be read.	2022-12-22 22:39:31 +08:00
Gabriel	d38461616c	[Pipeline](error msg) format error message (#15247 )	2022-12-22 20:55:06 +08:00
luozenglin	1fdd4172bd	[fix](Inbitmap) fix in bitmap result error when left expr is constant (#15271 ) * [fix](Inbitmap) fix in bitmap result error when left expr is constant 1. When left expr of the in predicate is a constant, instead of generating a bitmap filter, rewrite sql to use `bitmap_contains`. For example,"select k1, k2 from (select 2 k1, 11 k2) t where k1 in (select bitmap_col from bitmap_tbl)" => "select k1, k2 from (select 2 k1, 11 k2) t left semi join bitmap_tbl b on bitmap_contains(b.bitmap_col, t.k1)" * add regression test	2022-12-22 19:25:09 +08:00
Xinyi Zou	77c15729d4	[fix](memory) Fix too many repeat cause OOM (#15217 )	2022-12-22 17:16:18 +08:00
zhengshiJ	a87f905a2d	[Feature](Nereids) unnest subquery in 'not in' predicate into NULL AWARE ANTI JOIN (#15230 ) when we process not in subquery. if the subquery return column is nullable, we need a NULL AWARE ANTI JOIN instead of ANTI JOIN. Doris already support NULL AWARE ANTI JOIN in PR #13871 Nereids need to do that so.	2022-12-22 14:13:47 +08:00
chenlinzhong	87756f5441	[regresstion](query) query with limit 0 regresstion test (#15245 )	2022-12-22 14:06:44 +08:00
Ashin Gau	1520a4af6d	[refactor](resource) use resource to create external catalog (#14978 ) Use resource to create external catalog. -- HMS mysql> create resource hms_resource properties( -> "type"="hms", -> 'hive.metastore.uris' = 'thrift://172.21.0.44:7004', -> 'dfs.nameservices'='HANN', -> 'dfs.ha.namenodes.HANN'='nn1,nn2', -> 'dfs.namenode.rpc-address.HANN.nn1'='172.21.0.32:4007', -> 'dfs.namenode.rpc-address.HANN.nn2'='172.21.0.44:4007', -> 'dfs.client.failover.proxy.provider.HANN'='org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider' -> ); -- MYSQL mysql> create resource mysql_resource properties ( -> "type"="jdbc", -> "user"="root", -> "password"="123456", -> "jdbc_url" = "jdbc:mysql://127.0.0.1:3316/doris_test?useSSL=false", -> "driver_url" = "https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/jdbc_driver/mysql-connector-java-8.0.25.jar", -> "driver_class" = "com.mysql.cj.jdbc.Driver"); -- ES mysql> create resource es_resource properties ( -> "type"="es", -> "hosts"="http://127.0.0.1:29200", -> "nodes_discovery"="false", -> "enable_keyword_sniff"="true");	2022-12-22 13:45:55 +08:00
slothever	2bb4ea5dea	[regresion-test](icebergv2) add icebergv2 test case (#15187 )	2022-12-22 13:45:07 +08:00
xueweizhang	1ca1417824	[feature](multi-catalog) support show tables/table status from catalog.db (#15180 ) support 'show tables from catalog.db' and 'show table status from catalog.db'	2022-12-22 09:22:40 +08:00
minghong	56f7ba19c0	[opt](planner) add session var: COMPACT_EQUAL_TO_IN_PREDICATE_THRESHOLD (#15225 ) in previous pr(#14876) we compact equals like "a=1 or a=2 or a = 3 " in to "a in (1, 2, 3)" this pr set a lower bound for the number of equals COMPACT_EQUAL_TO_IN_PREDICATE_THRESHOLD (default is 2) for performance reason, we create a hashSet to collect literals, like {1,2,3}. and hence, the literals in "in-predicates" are in random order. for regression test, if we need stable output of explain string, set COMPACT_EQUAL_TO_IN_PREDICATE_THRESHOLD to a large number to avoid compact rule.	2022-12-21 21:10:47 +08:00
Shuo Wang	c0b39de61c	[Feature](Nereids) Support join hint (#13601 ) Support join hint for nereids planner. Hints for broadcast and shuffle are supported by this PR.	2022-12-21 21:09:13 +08:00
Kikyou1997	649bbc1e58	[fix](nereids) Fix case-when (#15150 )	2022-12-21 21:03:50 +08:00
luozenglin	e65b577f90	[fix](InBitmap) Check whether the in bitmap contains correlated subqueries (#15184 )	2022-12-21 16:52:27 +08:00
TengJianPing	a447121fc3	[fix](scanner scheduler) fix coredump of ScannerScheduler::_scanner_scan (#15199 ) * [fix](scanner scheduler) fix coredump of ScannerScheduler::_scanner_scan * fix	2022-12-21 15:44:47 +08:00
mch_ucchi	90349f0e61	[Feature](Nereids) support mask function (#15120 ) support function for nereids: mask, mask_first_n, mask_last_n	2022-12-21 10:25:11 +08:00
Ashin Gau	d0d7a6d8ad	[fix](multi-catalog) can't show databases when creating a new user in external catalog (#15204 ) Fix bug: A new user with grants to access external catalog can't show databases.	2022-12-21 08:58:06 +08:00
xueweizhang	8969c19cd4	[fix](jdbc) fix create table like table of jdbc error (#15179 ) when create table like table of jdbc, it will get error like 'errCode = 2, detailMessage = Failed to execute CREATE TABLE LIKE baseall_mysql. Reason: errCode = 2, detailMessage = property table_type must be set' this pr fix it.	2022-12-21 08:56:43 +08:00
Gabriel	732417258c	[Bug](pipeline) Fix bugs to pass TPCDS cases (#15194 )	2022-12-20 22:29:55 +08:00
morrySnow	5cf21fa7d1	[feature](planner) mark join to support subquery in disjunction (#14579 ) Co-authored-by: Gabriel <gabrielleebuaa@gmail.com>	2022-12-20 15:22:43 +08:00
zhengshiJ	d9550c311e	[feature](Nereids) implement setOperation (#15020 ) The pr implements the SetOperation. - Adapt to the EliminateUnnecessaryProject rule to ensure that the project under SetOperation is not deleted. - Add predicate pushdown of SetOperation - Optimization: Merge multiple SetOperations with the same type and the same qualifier - Optimization: merge oneRowRelation and union	2022-12-20 15:14:29 +08:00
minghong	fdb54a346d	[feature] (nereids) support aggregate function group_bit_and/or/xor (#15003 ) support group_bit_and group_bit_xor group_bit_or	2022-12-20 14:11:07 +08:00
starocean999	6712f1fc1d	[fix](Nereids) encryption function with 4 params should auto-complate last param with config (#15038 )	2022-12-20 13:55:54 +08:00
Gabriel	5c5ccd8d45	[regression](pipeline) add TPCH cases (#15177 )	2022-12-20 11:41:10 +08:00
starocean999	4979ad09c8	[fix](join)the policy to choose colocate join is not correct (#15140 ) * [hotfix](dev-1.0.1) fix colocate join bug in vec engine after introducing output tuple (#10651) to support vectorized outer join, we introduced a out tuple for hash join node, but it breaks the checking for colocate join. To solve this problem, we need map the output slot id to the children's slot id of hash join node, and the colocate join can be checked correctly. * fix colocate join bug * fix non vec colocate join issue Co-authored-by: lichi <lichi@rateup.com.cn> * add test cases Co-authored-by: lichi <lichi@rateup.com.cn>	2022-12-20 09:44:47 +08:00
minghong	320b264c9d	[feature](planner) compact multi-euqals to in-predicate #14876	2022-12-20 09:43:34 +08:00
Mingyu Chen	f5823a90ff	[fix](broker-load) fix broker load with hdfs failed to get right file type (#15138 )	2022-12-19 16:00:58 +08:00
Jibing-Li	6be5670ce9	[Feature](multi catalog)Remove enable_multi_catalog config item, open this function to public. (#15130 ) The multi-catalog feature is ready to use, remove enable_multi_catalog switch in FE config, open it to public.	2022-12-19 14:29:13 +08:00
xueweizhang	1597afcd67	[fix](mutil-catalog) fix get many same name db/table when show where (#15076 ) when show databases/tables/table status where xxx, it will change a selectStmt to select result from information_schema, it need catalog info to scan schema table, otherwise may get many database or table info from multi catalog. for example mysql> show databases where schema_name='test'; +----------+ \| Database \| +----------+ \| test \| \| test \| +----------+ MySQL [internal.test]> show tables from test where table_name='test_dc'; +----------------+ \| Tables_in_test \| +----------------+ \| test_dc \| \| test_dc \| +----------------+	2022-12-19 14:27:48 +08:00
Jibing-Li	3506b568ff	[Regression](multi catalog)P2 regression case for external hms catalog on emr. #15156	2022-12-19 09:21:48 +08:00
924060929	af4d9b636a	[refactor](Nerieds) Refactor aggregate function/plan/rules and support related cbo rules (#14827 ) # Proposed changes ## refactor - add AggregateExpression to shield the difference of AggregateFunction before disassemble and after - request `GATHER` physicalProperties for query, because query always collect result to the coordinator, use `GATHER` maybe select a better plan - refactor `NormalizeAggregate` - remove some physical fields for the `LogicalAggregate`, like `AggPhase`, `isDisassemble` - remove `AggregateDisassemble` and `DistinctAggregateDisassemble`, and use `AggregateStrategies` to generate various of PhysicalHashAggregate, like `two phases aggregate`, `three phases aggregate`, and cascades can auto select the lowest cost alternative. - move `PushAggregateToOlapScan` to `AggregateStrategies` - separate the traverse and visit method in FoldConstantRuleOnFE - if some expression not implement the visit method, the traverse method can handle and rewrite the children by default - if some expression implement the visit, the user defined traverse(invoke accept/visit method) will quickly return because the default visit method will not forward to the children, and the pre-process in traverse method will not be skipped. ## new feature - support `disable_nereids_rules` to skip some rules. example: 1. create 1 bucket table `n` ```sql CREATE TABLE `n` ( `id` bigint(20) NOT NULL ) ENGINE=OLAP DUPLICATE KEY(`id`) COMMENT 'OLAP' DISTRIBUTED BY HASH(`id`) BUCKETS 1 PROPERTIES ( "replication_allocation" = "tag.location.default: 1", "in_memory" = "false", "storage_format" = "V2", "disable_auto_compaction" = "false" ); ``` 2. insert some rows into `n` ```sql insert into n select * from numbers('number'='20000000') ``` 3. query table `n` ```sql SET enable_nereids_planner=true; SET enable_vectorized_engine=true; SET enable_fallback_to_original_planner=false; explain plan select id from n group by id; ``` the result show that we use the one stage aggregate ``` \| PhysicalHashAggregate ( aggPhase=LOCAL, aggMode=INPUT_TO_RESULT, groupByExpr=[id#0], outputExpr=[id#0], partitionExpr=Optional.empty, requestProperties=[GATHER], stats=(rows=1, width=1, penalty=2.0E7) ) \| \| +--PhysicalProject ( projects=[id#0], stats=(rows=20000000, width=1, penalty=0.0) ) \| \| +--PhysicalOlapScan ( qualified=default_cluster:test.n, output=[id#0, name#1], stats=(rows=20000000, width=1, penalty=0.0) ) \| ``` 4. disable one stage aggregate ```sql explain plan select /+SET_VAR(disable_nereids_rules=DISASSEMBLE_ONE_PHASE_AGGREGATE_WITHOUT_DISTINCT)/ id from n group by id ``` the result is two stage aggregate ``` \| PhysicalHashAggregate ( aggPhase=GLOBAL, aggMode=BUFFER_TO_RESULT, groupByExpr=[id#0], outputExpr=[id#0], partitionExpr=Optional[[id#0]], requestProperties=[GATHER], stats=(rows=1, width=1, penalty=2.0E7) ) \| \| +--PhysicalHashAggregate ( aggPhase=LOCAL, aggMode=INPUT_TO_BUFFER, groupByExpr=[id#0], outputExpr=[id#0], partitionExpr=Optional[[id#0]], requestProperties=[ANY], stats=(rows=1, width=1, penalty=2.0E7) ) \| \| +--PhysicalProject ( projects=[id#0], stats=(rows=20000000, width=1, penalty=0.0) ) \| \| +--PhysicalOlapScan ( qualified=default_cluster:test.n, output=[id#0, name#1], stats=(rows=20000000, width=1, penalty=0.0) ) \| ```	2022-12-18 21:49:29 +08:00
xueweizhang	6aba948df0	[fix](multi-catalog) hidden password for show create jdbc catalog (#15145 ) when show create catalog of jdbc, it will show 'jdbc.password' plain text. fix it like other code that hidden password.	2022-12-17 17:20:17 +08:00
starocean999	6d5251af78	[fix](subquery)fix bug of using constexpr as subquery's output (#15119 )	2022-12-16 21:58:58 +08:00
Hong Liu	33abe11dea	[regression-test](query) Add regression case of error could not be changed to nullabl when exe… (#15123 ) * Add regression case of error could not be changed to nullabl when exeing sql * add out file Co-authored-by: smallhibiscus <844981280>	2022-12-16 21:57:36 +08:00
Gabriel	4dbe30d37b	[regression](vectorized) delete vectorized config in regression tests (#15126 )	2022-12-16 17:08:29 +08:00
Mingyu Chen	0e1e5a802b	[config](load) enable new load scan node by default (#14808 ) Set FE `enable_new_load_scan_node` to true by default. So that all load tasks(broker load, stream load, routine load, insert into) will use FileScanNode instead of BrokerScanNode to read data 1. Support loading parquet file in stream load with new load scan node. 2. Fix bug that new parquet reader can not read column without logical or converted type. 3. Change jsonb parser function to "jsonb_parse_error_to_null" So that if the input string is not a valid json string, it will return null for jsonb column in load task.	2022-12-16 09:41:43 +08:00
starocean999	5e0d44ff25	[fix](nereids) fix bug of expr rewrite and column prune rule of group by exprs (#15097 )	2022-12-16 03:22:36 +08:00
mch_ucchi	8f914aa864	[feature](Nereids) support 'timestamp' type constructor (#15095 ) sql like: select timestamp '2022-01-01 01:00:00' + interval '2' hours;	2022-12-16 03:20:56 +08:00
Gabriel	5ef4c42a80	[Bug](datev2) Fix wrong result when use datev2 as partition key (#15094 )	2022-12-15 21:27:05 +08:00

1 2 3 4 5 ...

785 Commits