doris

Author	SHA1	Message	Date
yiguolei	49395390be	[bugfix](metareader) meta reader could not load image (#16148 ) This bug is introduced by PR #16009. Co-authored-by: yiguolei <yiguolei@gmail.com>	2023-01-28 14:22:18 +08:00
AlexYue	7f2ff83480	[enhancement](FE)shut down fast throw JVM might do when throwing exception #16146 As discussed in 16107 Sometimes jvm would try to reduce the whole stack to just one line, it's kind of confusing for debugging. Issue Number: close #xxx	2023-01-28 14:18:25 +08:00
yiguolei	e49766483e	[refactor](remove unused code) remove many xxxVal structure (#16143 ) remove many xxxVal structure remove BetaRowsetWriter::_add_row remove anyval_util.cpp remove non-vectorized geo functions remove non-vectorized like predicate Co-authored-by: yiguolei <yiguolei@gmail.com>	2023-01-28 14:17:43 +08:00
zhannngchen	4e64ff6329	[enhancement](load) avoid schema copy to reduce cpu usage (#16034 )	2023-01-28 11:13:57 +08:00
AKIRA	b919cbe487	[ehancement](nereids) Enhancement for limit clause (#16114 ) support limit offset without order by. the legacy planner supoort this feature in PR #15218	2023-01-28 11:04:03 +08:00
caiconghui	0148b39de0	[fix](metric) fix be down when enable_system_metrics is false (#16140 ) if we set enable_system_metrics to false, we will see be down with following message "enable metric calculator failed, maybe you set enable_system_metrics to false ", so fix it Co-authored-by: caiconghui1 <caiconghui1@jd.com>	2023-01-28 00:10:39 +08:00
Qi Chen	fa14b7ea9c	[Enhancement](icebergv2) Optimize the position delete file filtering mechanism in iceberg v2 parquet reader (#16024 ) close #16023	2023-01-28 00:04:27 +08:00
Jibing-Li	1589d453a3	[fix](multi catalog)Support parquet and orc upper case column name (#16111 ) External hms catalog table column names in doris are all in lower case, while iceberg table or spark-sql created hive table may contain upper case column name, which will cause empty query result. This pr is to fix this bug. 1. For parquet file, transfer all column names to lower case while parse parquet metadata. 2. For orc file, store the origin column names and lower case column names in two vectors, use the suitable names in different cases. 3. FE side, change the column name back to the origin column name in iceberg while doing convertToIcebergExpr.	2023-01-27 23:52:11 +08:00
yiguolei	adb758dcac	[refactor](remove non vec code) remove json functions string functions match functions and some code (#16141 ) remove json functions code remove string functions code remove math functions code move MatchPredicate to olap since it is only used in storage predicate process remove some code in tuple, Tuple structure should be removed in the future. remove many code in collection value structure, they are useless	2023-01-26 16:21:12 +08:00
yiguolei	615a5e7b51	[refactor](remove non vec code) remove non vec functions and AggregateInfo (#16138 ) Co-authored-by: yiguolei <yiguolei@gmail.com>	2023-01-25 12:53:05 +08:00
catpineapple	aa87e31b6f	[doc](cold hot separation)cold hot separation document adjustment (#15811 )	2023-01-24 23:24:28 +08:00
yiguolei	6e8eedc521	[refactor](remove unused code) remove storage buffer and orc reader (#16137 ) remove olap storage byte buffer remove orc reader remove time operator remove read_write_util remove aggregate funcs remove compress.h and cpp remove bhp_lib Co-authored-by: yiguolei <yiguolei@gmail.com>	2023-01-24 22:29:32 +08:00
yiguolei	79ad74637d	[refactor](remove expr) remove non vectorized Expr and ExprContext related codes (#16136 )	2023-01-24 10:45:35 +08:00
Mingyu Chen	23edb3de5a	[fix](icebergv2) fix bug that delete file reader is not opened (#16133 ) This pr #15836 change the way to use parquet reader by first open() then init_reader(). But we forgot to call open() for iceberg delete file, which cause coredump.	2023-01-24 10:19:46 +08:00
yiguolei	a3cd0ddbdc	[refactor](remove broker scan node) it is not useful any more (#16128 ) remove broker scannode remove broker table remove broker scanner remove json scanner remove orc scanner remove hive external table remove hudi external table remove broker external table, user could use broker table value function instead Co-authored-by: yiguolei <yiguolei@gmail.com>	2023-01-23 19:37:38 +08:00
zhangstar333	61fccc88d7	[vectorized](analytic) fix analytic node of window function get wrong… (#16074 ) [Bug] 基础函数rank()开窗排序结果错误 #15951	2023-01-23 16:09:46 +08:00
Xiangyu Wang	ab04a458aa	[Enhancement](export) cancel all running coordinators when execute cancel-export statement. (#15801 )	2023-01-22 23:11:32 +08:00
ZhaoChangle	199d7d3be8	[Refactor]Merged string_value into string_ref (#15925 )	2023-01-22 16:39:23 +08:00
Mingyu Chen	b9872ceb98	[deps](libhdfs3) update to 2.3.6 to fix kms aes 256 bug (#16127 ) update libhdfs3 to 2.3.6 to fix kms aes 256 bug. And update the licences and changelog	2023-01-22 07:18:35 +08:00
yiguolei	8920295534	[refactor](remoe non vec code) remove non vectorized conjunctx from scanner (#16121 ) 1. remove arrow group filter 2. remove non vectorized conjunctx from scanner Co-authored-by: yiguolei <yiguolei@gmail.com>	2023-01-21 19:23:17 +08:00
zhangstar333	253445ca46	[vectorzied](jdbc) fix jdbc executor for get result by batch and memo… (#15843 ) result set should be get by batch size2. fix memory leak3.	2023-01-21 08:22:22 +08:00
Xiangyu Wang	87c7f2fcc1	[Feature](profile) set sql and defaultDb fields in show-load-profile. (#15875 ) When execute show load profile '/', the value of SQL and DefaultDb columns are all 'N/A', but we can fill these fields，the result of this pr is as follows: Execute show load profile '/'\G: MySQL [test_d]> show load profile '/'\G ************************* 1. row ************************* QueryId: 652326 User: N/A DefaultDb: default_cluster:test_d SQL: LOAD LABEL `default_cluster:test_d`.`xxx` (APPEND DATA INFILE ('hdfs://xxx/user/hive/warehouse/xxx.db/xxx/*') INTO TABLE xxx FORMAT AS 'ORC' (c1, c2, c3) SET (`c1` = `c1`, `c2` = `c2`, `c3` = `c3`)) WITH BROKER broker_xxx (xxx) PROPERTIES ("max_filter_ratio" = "0", "timeout" = "30000") QueryType: Load StartTime: 2023-01-12 18:33:34 EndTime: 2023-01-12 18:33:46 TotalTime: 11s613ms QueryState: N/A 1 row in set (0.01 sec)	2023-01-21 08:10:15 +08:00
Stalary	8b40791718	[Feature](ES): catalog support mapping es _id #15943	2023-01-21 08:08:32 +08:00
Gabriel	01c001e2ac	[refactor](javaudf) simplify UdfExecutor and UdafExecutor (#16050 ) * [refactor](javaudf) simplify UdfExecutor and UdafExecutor * update * update	2023-01-21 08:07:28 +08:00
gnehil	25046fabec	[regression-test](sub query) add regression test for subquery with limit (#16051 ) * [regression-test](sub query) add regression test for subquery with limit * add lisence header	2023-01-21 08:06:49 +08:00
Ashin Gau	de12957057	[debug](ParquetReader) print file path if failed to read parquet file (#16118 )	2023-01-21 08:05:17 +08:00
AKIRA	2daa5f3fef	[fix](statistics) Fix statistics related threads continuously spawn as doing checkpoint #16088	2023-01-21 07:58:33 +08:00
chunping	8d02961216	[test](pipline)Remove P1 regression required check in .asf.yaml (#16119 )	2023-01-21 07:57:52 +08:00
Tiewei Fang	7814d2b651	[Fix](Oracle External Table) fix that oracle external table can not insert batch values (#16117 ) Issue Number: close #xxx This pr fix two bugs: _jdbc_scanner may be nullptr in vjdbc_connector.cpp, so we use another method to count jdbc statistic. close [Enhencement](jdbc scanner) add profile for jdbc scanner #15914 In the batch insertion scenario, oracle database does not support syntax insert into tables values (...),(...); , what it supports is: insert all into table(col1,col2) values(c1v1, c2v1) into table(col1,col2) values(c1v2, c2v2) SELECT 1 FROM DUAL;	2023-01-21 07:57:12 +08:00
Hu Yanjun	d318d644ff	[docs](en) update en docs (#16124 )	2023-01-20 23:05:39 +08:00
abmdocrt	9ffd109b35	[fix](datetimev2) Fix BE datetimev2 type returning wrong result (#15885 )	2023-01-20 22:25:20 +08:00
mch_ucchi	6b110aeba6	[test](Nereids) add regression cases for all functions (#15907 )	2023-01-20 22:17:27 +08:00
caiconghui	5514b1c1b7	[enhancement](tablet_report) accelerate deleteFromBackend function to avoid tablet report task blocked (#16115 )	2023-01-20 20:11:58 +08:00
zhangdong	0305aad097	[fix](privilege)fix grant resource bug (#16045 ) GRANT USAGE_PRIV ON RESOURCE * TO user; user will see all database Describe your changes. Set a PrivPredicate for show resources and remove USAGE under PrivPredicate in SHOW_ PRIV	2023-01-20 19:00:44 +08:00
谢健	3b08a22e61	[test](Nereids) add p0 regression test for Nereids (#15888 )	2023-01-20 18:50:23 +08:00
lsy3993	956070e17f	fix english number of tpch (#16116 )	2023-01-20 17:27:10 +08:00
yixiutt	171404228f	[improvement](vertical compaction) cache segment in vertical compaction (#16101 ) 1.In vertical compaction, segments will be loaded for every column group, so we should cache segment ptr to avoid too many repeated io. 2.fix vertical compaction data size bug	2023-01-20 16:38:23 +08:00
caiconghui	a4265fae70	[enhancement](query) Make query scan nodes more evenly distributed (#16037 ) Add replicaNumPerHost into consideration while schedule scan node to host to make final query scan nodes more evenly distributed in cluster	2023-01-20 16:24:49 +08:00
caoliang-web	13d93cb2b4	[typo](doc)nvl add 1.2 label (#15856 )	2023-01-20 15:11:58 +08:00
morrySnow	419f433d21	[fix](Nereids) topn arg check is not compatible with legacy planner (#16105 )	2023-01-20 15:08:10 +08:00
Mingyu Chen	72df283344	[fix](planner) extract common factor rule should consider not only where predicate (#16110 ) This PR #14381 limit the `ExtractCommonFactorsRule` to handle only `WHERE` predicate, but the predicate in `ON` clause should also be considered. Such as: ``` CREATE TABLE `nation` ( `n_nationkey` int(11) NOT NULL, `n_name` varchar(25) NOT NULL, `n_regionkey` int(11) NOT NULL, `n_comment` varchar(152) NULL ) DUPLICATE KEY(`n_nationkey`) DISTRIBUTED BY HASH(`n_nationkey`) BUCKETS 1 PROPERTIES ( "replication_allocation" = "tag.location.default: 1" ); select * from nation n1 join nation n2 on (n1.n_name = 'FRANCE' and n2.n_name = 'GERMANY') or (n1.n_name = 'GERMANY' and n2.n_name = 'FRANCE') ``` There should be predicates: ``` PREDICATES: `n1`.`n_name` IN ('FRANCE', 'GERMANY') PREDICATES: `n2`.`n_name` IN ('FRANCE', 'GERMANY') ``` On each scan node. This PR fix this issue by removing the limit of `ExtractCommonFactorsRule`	2023-01-20 14:53:48 +08:00
Tiewei Fang	1638936e3f	[fix](oracle catalog) oracle catalog support `TIMESTAMP` dateType of oracle (#16113 ) `TIMESTAMP` dateType of Oracle will map to `DateTime` dateType of Doris	2023-01-20 14:47:58 +08:00
Mingyu Chen	726427b795	[refactor](fe) refactor and upgrade dependency tree of FE and support AWS glue catalog (#16046 ) 1. Spark dpp Move `DppResult` and `EtlJobConfig` to sparkdpp package in `fe-common` module. So taht `fe-core` is longer depends on `spark-dpp` module, so that the `spark-dpp.jar` will not be moved into `fe/lib`, which reduce the size of FE output. 2. Modify start_fe.sh Modify the CLASSPATH to make sure that doris-fe.jar is at front, so that when loading classes with same qualified name, it will be got from doris-fe.jar firstly. 3. Upgrade hadoop and hive version hadoop: 2.10.2 -> 3.3.3 hive: 2.3.7 -> 3.1.3 4. Override the IHiveMetastoreClient implementations from dependency `ProxyMetaStoreClient.java` for Aliyun DLF. `HiveMetaStoreClient.java` for origin Apache Hive metastore. Because I need to modified some of their method to make them compatible with different version of Hive. 5. Exclude some unused dependencies to reduce the size of FE output Now it is only 370MB (Before is 600MB) 6. Upgrade aws-java-sdk version to 1.12.31 7. Support AWS Glue Data Catalog 8. Remove HudiScanNode(no longer support)	2023-01-20 14:42:16 +08:00
mch_ucchi	3652cb3fe9	[test](Nereids)add test aboule dateType and dateTimeType (#16098 )	2023-01-20 14:15:54 +08:00
谢健	101bc568d7	[fix](Nereids) fix bugs about date function (#16112 ) 1. when casting constant, check the value is whether in the range of targetType 2. change the scale of dateTimeV2 to 6	2023-01-20 14:11:17 +08:00
starocean999	cbb203efd2	[fix](nereids) fix test_join regression test for nereids (#16094 ) 1. add TypeCoercion for (string, decimal) and (date, decimal) 2. The equality of LogicalProject node should consider children in some case 3. don't push down join condition like "t1 join t2 on true/false" 4. add PUSH_DOWN_FILTERS after FindHashConditionForJoin 5. nestloop join should support all kind of join 6. the intermediate tuple should contains slots from both children of nest loop join.	2023-01-20 14:02:29 +08:00
lihangyu	116e17428b	[Enhancement](point query optimize) improve performace of point query on primary keys (#15491 ) 1. support row format using codec of jsonb 2. short path optimize for point query 3. support prepared statement for point query 4. support mysql binary format	2023-01-20 13:33:01 +08:00
Jibing-Li	3ebc98228d	[feature wip](multi catalog)Support iceberg schema evolution. (#15836 ) Support iceberg schema evolution for parquet file format. Iceberg use unique id for each column to support schema evolution. To support this feature in Doris, FE side need to get the current column id for each column and send the ids to be side. Be read column id from parquet key_value_metadata, set the changed column name in Block to match the name in parquet file before reading data. And set the name back after reading data.	2023-01-20 12:57:36 +08:00
Qi Chen	ab4127d0b2	[Fix][regression-test] Fix test_hdfs_tvf.groovy by update HDFS conf URI to uri and better error msg handling. (#16029 ) Fix test_hdfs_tvf.groovy by update HDFS conf URI to uri and better error msg handling. test_hdfs_tvf.groovy didn't passed.	2023-01-20 12:40:25 +08:00
Tiewei Fang	ba71516eba	[feature](jdbc catalog) support SQLServer jdbc catalog (#16093 )	2023-01-20 12:37:38 +08:00

1 2 3 4 5 ...

8353 Commits