doris

Author	SHA1	Message	Date
morrySnow	5c00caa259	[refactor](Nereids) refactor BindSlotReference for easy merge all bind process in one rule (#16156 )	2023-01-30 10:57:39 +08:00
jakevin	bd1b7e190c	[fix](Nereids): fix field(). (#16214 )	2023-01-30 10:55:02 +08:00
Zhengguo Yang	ec4a56922f	[enhancement](memory) reduce memory usage for failed broker loads (#15895 ) * [enhancement](memory) reduce memory usage for failed broker loads	2023-01-30 10:22:31 +08:00
HappenLee	7d437d5706	[Bug](function) running_difference function coredump in regression test (#16215 )	2023-01-30 09:58:27 +08:00
Xiangyu Wang	b8a7297109	[Enhancement](profile) fill user field for profile. (#16212 ) Co-authored-by: wangxiangyu@360shuke.com <wangxiangyu@360shuke.com>	2023-01-30 09:15:02 +08:00
Adonis Ling	d56043ab5a	[feature-wip](MTMV) Support setting variables in query statement (#16060 ) ## Use case ```shell mysql> CREATE TABLE t_user ( -> event_day DATE, -> id bigint, -> username varchar(20) -> ) -> DISTRIBUTED BY HASH(id) BUCKETS 10 -> PROPERTIES ('replication_num' = '1'); Query OK, 0 rows affected (0.07 sec) mysql> CREATE TABLE t_user_pv( -> event_day DATE, -> id bigint, -> pv bigint -> ) -> DISTRIBUTED BY HASH(id) BUCKETS 10 -> PROPERTIES ('replication_num' = '1'); Query OK, 0 rows affected (0.09 sec) mysql> CREATE MATERIALIZED VIEW mv -> BUILD IMMEDIATE REFRESH COMPLETE -> KEY (username) -> DISTRIBUTED BY HASH(username) BUCKETS 10 -> PROPERTIES ('replication_num' = '1') -> AS SELECT /*+ SET_VAR(exec_mem_limit=1048576, query_timeout=3600) */ t1.username ,t2.pv FROM t_user t1 LEFT JOIN t_user_pv t2 on t1.id = t2.id; Query OK, 0 rows affected (0.10 sec) ```	2023-01-30 01:05:41 +08:00
谢健	98649ec9f8	[fix](Nereids): Fix some functions error (#16197 ) * fix bugs in regexp_extract_all * fix rpad * fix weekofday * fix cryptor * fix timestamp * fix st_ function	2023-01-30 00:41:31 +08:00
jakevin	7d648a94d0	[fix](Nereids): fix scalar_function A-F. (#16209 ) * [fix](Nereids): fix scalar_function A-F. * [Fix](regression-test)fix regression test framework cannot compare double value nan and inf. * revert dround()	2023-01-30 00:37:34 +08:00
Gabriel	217db3e4c8	[refactor](built-in function) remove symbols for vectorized function (#16189 ) * [refactor](built-in function) remove symbols for vectorized function * update * update	2023-01-29 21:30:09 +08:00
mch_ucchi	1db7882bb5	[Fix](Nereids): fix error of X-Z function for nereids (#16171 )	2023-01-29 20:42:30 +08:00
starocean999	1ec88cbff6	[fix](nereids) AggregationNode process null as key column in wrong way (#16125 ) in AggregationNode, _merge_with_serialized_key_helper method should convert the key column to full column if the key column is null literal.	2023-01-29 20:12:07 +08:00
morrySnow	1ad6ef939b	[refactor](Nereids) use immutable collections as far as possible (#16193 )	2023-01-29 16:48:58 +08:00
jakevin	04ed83cb36	[fix](Nereids): remove `DataV2Type` in ConvertTz SIGNATURES (#16170 ) * [fix](Nereids): remove `DataV2Type` in ConvertTz SIGNATURES * remove it in doris_builtins_functions.py	2023-01-29 16:11:17 +08:00
jakevin	abc50c6fe5	[enhance](Nereids): remove duplicated alias Function. (#16187 )	2023-01-29 14:56:20 +08:00
huangzhaowei	c6bc0a03a4	[feature](Load)Suppot MySQL Load Data (#15511 ) Main subtask of [DSIP-28](https://cwiki.apache.org/confluence/display/DORIS/DSIP-028%3A+Suppot+MySQL+Load+Data) ## Problem summary Support MySQL load syntax as below: ```sql LOAD DATA [LOCAL] INFILE 'file_name' INTO TABLE tbl_name [PARTITION (partition_name [, partition_name] ...)] [COLUMNS TERMINATED BY 'string'] [LINES TERMINATED BY 'string'] [IGNORE number {LINES | ROWS}] [(col_name_or_user_var [, col_name_or_user_var] ...)] [SET (col_name={expr | DEFAULT} [, col_name={expr | DEFAULT}] ...)] [PROPERTIES (key1 = value1 [, key2=value2]) ] ``` For example, ```sql LOAD DATA LOCAL INFILE 'local_test.file' INTO TABLE db1.table1 PARTITION (partition_a, partition_b, partition_c, partition_d) COLUMNS TERMINATED BY '\t' (k1, k2, v2, v10, v11) set (c1=k1,c2=k2,c3=v10,c4=v11) PROPERTIES ("auth" = "root:", "strict_mode"="true") ``` Note that in this PR the property named `auth` must be set, since stream load needs auth. I will optimize it later.	2023-01-29 14:44:59 +08:00
abmdocrt	eb7da1c0ee	[fix](datatype) fix some bugs about data type array datetimev2 and decimalv3 (#16132 )	2023-01-29 14:26:08 +08:00
lihangyu	578a855b3e	[Bug](topn-opt) filter condition for analytic info for two phase read opt (#16173 ) two phase read optimization should not be enabled when query has analytic info	2023-01-29 12:06:18 +08:00
jakevin	ce487e2b11	[fix](Nereids): fix dceil() dfloor() (#16174 )	2023-01-29 11:59:23 +08:00
Jibing-Li	35398ad8d9	[fix](multi-catalog)Use -1 for column_statistics internal table idx_id default value instead of null, for external catalog (#16177 ) The internal statistics table column_statistics has a non-null field idx_id, but the insert SQL for Hive tables set its default value to NULL, which caused the insert to fail. Change it to -1.	2023-01-29 11:29:25 +08:00
Pxl	2b5f95f08a	[Bug](function) remove datev2 signature of hour_ceil/hour_floor #16168	2023-01-29 11:27:56 +08:00
jakevin	3151d94e9e	[fix](Nereids): fix Ceiling. (#16164 )	2023-01-28 20:26:20 +08:00
jiafeng.zhang	da28d2faee	[deps](http)Upgrade springboot version to 2.7.8 (#16158 ) * Upgrade springboot version to 2.7.8 * fix	2023-01-28 20:13:50 +08:00
pengxiangyu	c506b4a1e3	[bug](cooldown)add config for Cooldown Job	2023-01-28 19:58:50 +08:00
Gabriel	26fc7c8196	[Bug](decimalv3) fix BE crash for function `if` (#16152 )	2023-01-28 19:37:50 +08:00
jakevin	7e7fd5d049	[cleanup](fe) cleanup useless code. (#16129 ) * [cleanup](Nereids): cleanup useless code. * revert ErrorCode.java	2023-01-28 18:44:43 +08:00
yiguolei	49395390be	[bugfix](metareader) meta reader could not load image (#16148 ) This bug is introduced by PR #16009. Co-authored-by: yiguolei <yiguolei@gmail.com>	2023-01-28 14:22:18 +08:00
AKIRA	b919cbe487	[ehancement](nereids) Enhancement for limit clause (#16114 ) Support limit offset without order by. The legacy planner supports this feature in PR #15218	2023-01-28 11:04:03 +08:00
Jibing-Li	1589d453a3	[fix](multi catalog)Support parquet and orc upper case column name (#16111 ) External hms catalog table column names in Doris are all in lower case, while an iceberg table or a hive table created by spark-sql may contain upper case column names, which causes an empty query result. This PR fixes this bug. 1. For parquet files, convert all column names to lower case while parsing parquet metadata. 2. For orc files, store the original column names and the lower case column names in two vectors, and use the suitable names in different cases. 3. On the FE side, change the column name back to the original column name in iceberg while doing convertToIcebergExpr.	2023-01-27 23:52:11 +08:00
yiguolei	a3cd0ddbdc	[refactor](remove broker scan node) it is not useful any more (#16128 ) remove broker scannode remove broker table remove broker scanner remove json scanner remove orc scanner remove hive external table remove hudi external table remove broker external table, user could use broker table value function instead Co-authored-by: yiguolei <yiguolei@gmail.com>	2023-01-23 19:37:38 +08:00
Xiangyu Wang	ab04a458aa	[Enhancement](export) cancel all running coordinators when execute cancel-export statement. (#15801 )	2023-01-22 23:11:32 +08:00
zhangstar333	253445ca46	[vectorzied](jdbc) fix jdbc executor for get result by batch and memo… (#15843 ) 1. result set should be fetched by batch size 2. fix memory leak	2023-01-21 08:22:22 +08:00
Xiangyu Wang	87c7f2fcc1	[Feature](profile) set sql and defaultDb fields in show-load-profile. (#15875 ) When executing show load profile '/', the values of the SQL and DefaultDb columns are all 'N/A', but we can fill these fields. The result of this PR is as follows: Execute show load profile '/'\G: MySQL [test_d]> show load profile '/'\G ************************* 1. row ************************* QueryId: 652326 User: N/A DefaultDb: default_cluster:test_d SQL: LOAD LABEL `default_cluster:test_d`.`xxx` (APPEND DATA INFILE ('hdfs://xxx/user/hive/warehouse/xxx.db/xxx/*') INTO TABLE xxx FORMAT AS 'ORC' (c1, c2, c3) SET (`c1` = `c1`, `c2` = `c2`, `c3` = `c3`)) WITH BROKER broker_xxx (xxx) PROPERTIES ("max_filter_ratio" = "0", "timeout" = "30000") QueryType: Load StartTime: 2023-01-12 18:33:34 EndTime: 2023-01-12 18:33:46 TotalTime: 11s613ms QueryState: N/A 1 row in set (0.01 sec)	2023-01-21 08:10:15 +08:00
Stalary	8b40791718	[Feature](ES): catalog support mapping es _id #15943	2023-01-21 08:08:32 +08:00
Gabriel	01c001e2ac	[refactor](javaudf) simplify UdfExecutor and UdafExecutor (#16050 ) * [refactor](javaudf) simplify UdfExecutor and UdafExecutor * update * update	2023-01-21 08:07:28 +08:00
AKIRA	2daa5f3fef	[fix](statistics) Fix statistics related threads continuously spawn as doing checkpoint #16088	2023-01-21 07:58:33 +08:00
Tiewei Fang	7814d2b651	[Fix](Oracle External Table) fix that oracle external table can not insert batch values (#16117 ) Issue Number: close #xxx This PR fixes two bugs: _jdbc_scanner may be nullptr in vjdbc_connector.cpp, so we use another method to count jdbc statistics. close [Enhencement](jdbc scanner) add profile for jdbc scanner #15914 In the batch insertion scenario, the Oracle database does not support the syntax insert into tables values (...),(...); what it supports is: insert all into table(col1,col2) values(c1v1, c2v1) into table(col1,col2) values(c1v2, c2v2) SELECT 1 FROM DUAL;	2023-01-21 07:57:12 +08:00
caiconghui	5514b1c1b7	[enhancement](tablet_report) accelerate deleteFromBackend function to avoid tablet report task blocked (#16115 )	2023-01-20 20:11:58 +08:00
zhangdong	0305aad097	[fix](privilege)fix grant resource bug (#16045 ) After GRANT USAGE_PRIV ON RESOURCE * TO user; the user could see all databases. Set a PrivPredicate for show resources and remove USAGE under PrivPredicate in SHOW_ PRIV	2023-01-20 19:00:44 +08:00
caiconghui	a4265fae70	[enhancement](query) Make query scan nodes more evenly distributed (#16037 ) Take replicaNumPerHost into consideration when scheduling scan nodes to hosts, so that the final query scan nodes are more evenly distributed in the cluster	2023-01-20 16:24:49 +08:00
morrySnow	419f433d21	[fix](Nereids) topn arg check is not compatible with legacy planner (#16105 )	2023-01-20 15:08:10 +08:00
Mingyu Chen	72df283344	[fix](planner) extract common factor rule should consider not only where predicate (#16110 ) PR #14381 limited the `ExtractCommonFactorsRule` to handle only the `WHERE` predicate, but the predicate in the `ON` clause should also be considered. Such as: ``` CREATE TABLE `nation` ( `n_nationkey` int(11) NOT NULL, `n_name` varchar(25) NOT NULL, `n_regionkey` int(11) NOT NULL, `n_comment` varchar(152) NULL ) DUPLICATE KEY(`n_nationkey`) DISTRIBUTED BY HASH(`n_nationkey`) BUCKETS 1 PROPERTIES ( "replication_allocation" = "tag.location.default: 1" ); select * from nation n1 join nation n2 on (n1.n_name = 'FRANCE' and n2.n_name = 'GERMANY') or (n1.n_name = 'GERMANY' and n2.n_name = 'FRANCE') ``` There should be predicates: ``` PREDICATES: `n1`.`n_name` IN ('FRANCE', 'GERMANY') PREDICATES: `n2`.`n_name` IN ('FRANCE', 'GERMANY') ``` on each scan node. This PR fixes the issue by removing the limit on `ExtractCommonFactorsRule`	2023-01-20 14:53:48 +08:00
Tiewei Fang	1638936e3f	[fix](oracle catalog) oracle catalog support `TIMESTAMP` dateType of oracle (#16113 ) `TIMESTAMP` dateType of Oracle will map to `DateTime` dateType of Doris	2023-01-20 14:47:58 +08:00
Mingyu Chen	726427b795	[refactor](fe) refactor and upgrade dependency tree of FE and support AWS glue catalog (#16046 ) 1. Spark dpp Move `DppResult` and `EtlJobConfig` to the sparkdpp package in the `fe-common` module, so that `fe-core` no longer depends on the `spark-dpp` module and `spark-dpp.jar` is not moved into `fe/lib`, which reduces the size of the FE output. 2. Modify start_fe.sh Modify the CLASSPATH to make sure that doris-fe.jar is at the front, so that when loading classes with the same qualified name, they are taken from doris-fe.jar first. 3. Upgrade hadoop and hive version hadoop: 2.10.2 -> 3.3.3 hive: 2.3.7 -> 3.1.3 4. Override the IHiveMetastoreClient implementations from dependency `ProxyMetaStoreClient.java` for Aliyun DLF. `HiveMetaStoreClient.java` for the original Apache Hive metastore. Because I need to modify some of their methods to make them compatible with different versions of Hive. 5. Exclude some unused dependencies to reduce the size of the FE output Now it is only 370MB (previously 600MB) 6. Upgrade aws-java-sdk version to 1.12.31 7. Support AWS Glue Data Catalog 8. Remove HudiScanNode (no longer supported)	2023-01-20 14:42:16 +08:00
mch_ucchi	3652cb3fe9	[test](Nereids)add test aboule dateType and dateTimeType (#16098 )	2023-01-20 14:15:54 +08:00
谢健	101bc568d7	[fix](Nereids) fix bugs about date function (#16112 ) 1. when casting a constant, check whether the value is in the range of targetType 2. change the scale of dateTimeV2 to 6	2023-01-20 14:11:17 +08:00
starocean999	cbb203efd2	[fix](nereids) fix test_join regression test for nereids (#16094 ) 1. add TypeCoercion for (string, decimal) and (date, decimal) 2. The equality of LogicalProject node should consider children in some cases 3. don't push down join conditions like "t1 join t2 on true/false" 4. add PUSH_DOWN_FILTERS after FindHashConditionForJoin 5. nestloop join should support all kinds of join 6. the intermediate tuple should contain slots from both children of the nest loop join.	2023-01-20 14:02:29 +08:00
lihangyu	116e17428b	[Enhancement](point query optimize) improve performace of point query on primary keys (#15491 ) 1. support row format using codec of jsonb 2. short path optimize for point query 3. support prepared statement for point query 4. support mysql binary format	2023-01-20 13:33:01 +08:00
Jibing-Li	3ebc98228d	[feature wip](multi catalog)Support iceberg schema evolution. (#15836 ) Support iceberg schema evolution for the parquet file format. Iceberg uses a unique id for each column to support schema evolution. To support this feature in Doris, the FE side needs to get the current column id for each column and send the ids to the BE side. The BE reads the column ids from the parquet key_value_metadata and sets the changed column names in the Block to match the names in the parquet file before reading data, then sets the names back after reading data.	2023-01-20 12:57:36 +08:00
Qi Chen	ab4127d0b2	[Fix][regression-test] Fix test_hdfs_tvf.groovy by update HDFS conf URI to uri and better error msg handling. (#16029 ) Fix test_hdfs_tvf.groovy by updating the HDFS conf URI to uri and improving error message handling. test_hdfs_tvf.groovy did not pass before this change.	2023-01-20 12:40:25 +08:00
Tiewei Fang	ba71516eba	[feature](jdbc catalog) support SQLServer jdbc catalog (#16093 )	2023-01-20 12:37:38 +08:00

3636 Commits