doris

Author	SHA1	Message	Date
zhangstar333	54dbb4af67	[vectorized](jdbc) refactor jdbc table read array type (#18187 ) When reading an array type over JDBC, the result from Doris is a string, from PG a java.sql.Array, and from CK a java.lang.Object, which makes the code difficult to maintain and read. This change converts every database's array result to a string, then adds a cast function from string to the Doris array type.	2023-04-04 11:57:04 +08:00
yongkang.zhong	6231ca80f7	[improve](clickhouse catalog) Add `"` wrap select column for the sql query clickhouse jdbc (#18352 )	2023-04-04 10:19:24 +08:00
minghong	3e7a9424e4	[feature](nereids) explain shape plan (#18296 ) `explain shape plan select ...` prints only plan-shape information, including: - node name - join type, join condition - filter condition - agg phase. Maintaining regression cases with explain is painful because the output contains a lot of mutable information, such as slot ids. With this PR, regression cases can use explain shape plan. For example, this is TPC-H Q2: +-----------------------------------------------------------------------------------------------------------+ \| Explain String \| +-----------------------------------------------------------------------------------------------------------+ \| PhysicalTopN \| \| --PhysicalDistribute \| \| ----PhysicalTopN \| \| ------PhysicalProject \| \| --------filter((cast(ps_supplycost as DECIMAL(27, 9)) = min(ps_supplycost) OVER(PARTITION BY p_partkey))) \| \| ----------PhysicalWindow \| \| ------------PhysicalQuickSort \| \| --------------PhysicalProject \| \| ----------------hashJoin[INNER_JOIN](supplier.s_suppkey = partsupp.ps_suppkey) \| \| ------------------PhysicalProject \| \| --------------------hashJoin[INNER_JOIN](part.p_partkey = partsupp.ps_partkey) \| \| ----------------------PhysicalProject \| \| ------------------------PhysicalOlapScan[partsupp] \| \| ----------------------PhysicalProject \| \| ------------------------filter((part.p_size = 15)(p_type like '%BRASS')) \| \| --------------------------PhysicalOlapScan[part] \| \| ------------------PhysicalDistribute \| \| --------------------hashJoin[INNER_JOIN](supplier.s_nationkey = nation.n_nationkey) \| \| ----------------------PhysicalOlapScan[supplier] \| \| ----------------------PhysicalDistribute \| \| ------------------------hashJoin[INNER_JOIN](nation.n_regionkey = region.r_regionkey) \| \| --------------------------PhysicalProject \| \| ----------------------------PhysicalOlapScan[nation] \| \| --------------------------PhysicalDistribute \| \| ----------------------------PhysicalProject \| \| ------------------------------filter((region.r_name = 'EUROPE')) \| \| --------------------------------PhysicalOlapScan[region] \| +-----------------------------------------------------------------------------------------------------------+	2023-04-04 09:44:15 +08:00
xueweizhang	798d2e5160	[fix](catalog) all properties should be checked when creating an unpartitioned table (#18149 ) All properties should be checked when creating an unpartitioned table, as is done for a partitioned table. Signed-off-by: nextdreamblue <zxw520blue1@163.com>	2023-04-04 08:53:45 +08:00
ZhangYu0123	8b85c55117	[vectorized](function) Support array_shuffle and shuffle function. (#18116 ) --------- Co-authored-by: zhangyu209 <zhangyu209@meituan.com>	2023-04-04 08:53:13 +08:00
starocean999	88c5e64c4a	[fix](nereids) fix bug of SelectMaterializedIndexWithAggregate rule (#18265 ) 1. create a project node to adjust the output column position when an mv is selected in the olap scan node 2. pass SlotReference's column info when calling Alias's toSlot() method 3. compare the plans' logical properties when comparing two plans after rewrite	2023-04-03 22:32:43 +08:00
yongkang.zhong	fe9d2b00fc	[test](jdbc catalog) add clickhouse jdbc catalog base type test (#18007 )	2023-04-03 20:18:36 +08:00
Gabriel	96a64dc9e8	[Improvement](pipeline) Use bloom runtime filter by default for pipeline engine (#18177 )	2023-04-03 15:31:48 +08:00
yongjinhou	aff260c06f	[Enhancement](HttpServer) Support https interface (#16834 ) 1. Organize http documents 2. Add http interface authentication for FE 3. Support https interface for FE 4. Provide authentication interface 5. Add http interface authentication for BE 6. Support https interface for BE	2023-04-03 14:18:17 +08:00
Mingyu Chen	ecd3fd07f6	[feature](colocate) support cross database colocate join (#18152 )	2023-04-03 14:03:42 +08:00
Jibing-Li	e260dca7a1	[Improvement](multi catalog)Change hive metastore cache split value type to Doris defined Split. Fix split file length -1 bug (#18319 ) The HiveMetastoreCache type for file splits was Hadoop InputSplit. This PR changes it to the Doris-defined Split, which avoids converting it every time. It also fixes the explain verbose result returning -1 for the split file length.	2023-04-03 13:54:28 +08:00
Liqf	961f5d1bb7	[feature](function)Add St_Angle/St_Azimuth function (#18293 ) Add the St_Angle and St_Azimuth functions. St_Angle: Takes three points, which represent two intersecting lines, and returns the angle between these lines. Point 2 and point 1 represent the first line, and point 2 and point 3 represent the second line. The angle between these lines is in radians, in the range [0, 2pi). The angle is measured clockwise from the first line to the second line. ` mysql> SELECT ST_Angle(ST_Point(1, 0),ST_Point(0, 0),ST_Point(0, 1)); +----------------------------------------------------------------------+ \| st_angle(st_point(1.0, 0.0), st_point(0.0, 0.0), st_point(0.0, 1.0)) \| +----------------------------------------------------------------------+ \| 4.71238898038469 \| +----------------------------------------------------------------------+ 1 row in set (0.04 sec) ` St_Azimuth: Takes two points and returns the azimuth of the line segment formed by points 1 and 2. The azimuth is the angle in radians measured between the line from point 1 facing true North to the line segment from point 1 to point 2. ` mysql> SELECT st_azimuth(ST_Point(0, 0),ST_Point(1, 0)); +----------------------------------------------------+ \| st_azimuth(st_point(0.0, 0.0), st_point(1.0, 0.0)) \| +----------------------------------------------------+ \| 1.5707963267948966 \| +----------------------------------------------------+ 1 row in set (0.04 sec) `	2023-04-03 13:01:59 +08:00
Pxl	e77833bfa1	[Bug](materialized-view) fix where clause persistence replay incorrect (#18228 ) fix where clause persistence replay incorrect	2023-04-03 12:49:01 +08:00
AKIRA	ce4dc681be	[test](stats) Test framework for stats estimation on TPCH-1G dataset (#18267 ) Implement a test framework for stats estimation on TPCH-1G dataset to ensure accuracy	2023-04-03 11:01:57 +08:00
WenYao	2bce4db81a	[Enhancement](mysql-compatible) add regression test for mysqldump #18208 Add a regression test for commands like this: mysqldump -h127.0.0.1 -P9030 -uroot --no-tablespaces --databases > /backup/mysqldump/test.db To prevent the error "Unknown table 'column_statistics' in information_schema (1109)", the table information_schema.column_statistics was added.	2023-04-03 09:49:07 +08:00
minghong	b9381570d6	[feature](nereids) semi and anti join estimation (#18129 ) This PR adds a new algorithm to estimate the semi/anti join row count. The original algorithm reduces the row count from the cross join, which usually gives a poor estimate. For example, for `L left semi join R on L.a=R.a`, suppose L is larger than R and ndv(L.a) < ndv(R.a); the estimated row count is rowcount(R) * rowcount(L) / ndv(R.a), which in most cases is larger than rowcount(L). The new algorithm uses ndv(R.a)/originalNdv(R.a) to estimate the result row count. The basic idea is as follows: 1. Suppose ndv(R.a) is reduced from m to n. 2. Assume that the value space of L.a is the same as that of R.a when R.a is not filtered (this assumption also holds in the original algorithm). Regard `L left join R` as a filter applied on L: if L.a is in R.a, the tuple stays in the result. R.a shrinks to n/m of its original size, so L.a also shrinks to n/m.	2023-04-03 09:11:10 +08:00
Mingyu Chen	7131c60e05	[fix](audit-log) fix slow query missing in audit log (#18317 ) #17738 changed the column name in the audit log, so "slow_query" was no longer recorded in fe.audit.log	2023-04-03 08:52:14 +08:00
mch_ucchi	4fcd93ac00	[Enhancement](Nereids)add datelikev2 type support for fold constant. #18275 Add datelikev2 type support for fold constant: date_add / years_add / months_add / days_add / hours_add / minutes_add / seconds_add and xxx_sub.	2023-04-03 08:47:47 +08:00
Jack Drogon	7d49d9cf99	[improvement](dynamic partition) Fix dynamic partition no bucket (#18300 ) Signed-off-by: Jack Drogon <jack.xsuperman@gmail.com>	2023-04-02 15:51:21 +08:00
slothever	97aab138aa	[fix](parquet-reader) reset value idx in bool rle decoder and support iceberg datetime(3) (#18245 ) 1. Fix value idx in bool rle decoder 2. Iceberg tables support datetimev2(3). In the previous version, hive timestamp was converted to datetimev2(0) by default.	2023-04-01 21:00:01 +08:00
jakevin	9e087622ab	[fix](Nereids): fix JoinReorderContext in withXXX() of LogicalJoin. (#18299 )	2023-04-01 16:51:27 +08:00
abmdocrt	365867a867	[feature](SSL) default enable SSL MySQL connection to FE (#18285 )	2023-03-31 21:31:23 +08:00
Mingyu Chen	7e61a85331	[refactor](libhdfs) introduce hadoop libhdfs (#18204 ) 1. Introduce hadoop libhdfs 2. For the Linux-X86 platform, use the hadoop libhdfs 3. For other platforms, use libhdfs3, because currently we don't have a hadoop libhdfs binary for other platforms Co-authored-by: adonis0147 <adonis0147@gmail.com>	2023-03-31 18:41:39 +08:00
mch_ucchi	3ea98b65df	[Fix](Nereids) fix nereids failed to parse set operation with query in parenthesis (#18062 ) SQL in the following format (where q1, q2, and q3 are queries): ``` sql (q1) UNION ALL (q2) UNION ALL (q3) ORDER BY keys ``` cannot be parsed by nereids, because `order` is recognized as an alias of the query. This PR adds queryOrganization to avoid it.	2023-03-31 15:55:52 +08:00
morrySnow	1a56c56e90	[fix](planner) lateral view do not support lower case table name config (#18165 ) TableFunctionNode lower_case_table_names set to 1 and 2	2023-03-31 13:42:24 +08:00
yongkang.zhong	1c2f95b887	[improve](clickhouse jdbc) support clickhouse jdbc 4.x version (#18258 ) In clickhouse's 4.x version of jdbc, some UInt types use special Java types, so I adapted Doris's ClickHouse JDBC External ``` com.clickhouse.data.value.UnsignedByte; com.clickhouse.data.value.UnsignedInteger; com.clickhouse.data.value.UnsignedLong; com.clickhouse.data.value.UnsignedShort; ```	2023-03-31 13:40:10 +08:00
gitccl	20b3bdb000	[vectorized](function) support array_first_index function (#18175 ) mysql> select array_first_index(x->x+1>3, [2, 3, 4]); +-------------------------------------------------------------------+ \| array_first_index(array_map([x] -> x(0) + 1 > 3, ARRAY(2, 3, 4))) \| +-------------------------------------------------------------------+ \| 2 \| +-------------------------------------------------------------------+ mysql> select array_first_index(x -> x is null, [null, 1, 2]); +----------------------------------------------------------------------+ \| array_first_index(array_map([x] -> x(0) IS NULL, ARRAY(NULL, 1, 2))) \| +----------------------------------------------------------------------+ \| 1 \| +----------------------------------------------------------------------+ mysql> select array_first_index(x->power(x,2)>10, [1, 2, 3, 4]); +---------------------------------------------------------------------------------+ \| array_first_index(array_map([x] -> power(x(0), 2.0) > 10.0, ARRAY(1, 2, 3, 4))) \| +---------------------------------------------------------------------------------+ \| 4 \| +---------------------------------------------------------------------------------+	2023-03-31 12:51:29 +08:00
Pxl	307170030c	[Bug](materialized-view) fix core dump when create mv have case different with base table (#18206 ) fix core dump when create mv have case different with base table	2023-03-31 12:32:09 +08:00
zhangstar333	1b2aaab2f2	[vectorized](bug) fix some case in enable fold constant (#17997 ) fix some case in enable fold constant	2023-03-31 11:41:31 +08:00
Pxl	e7bcd970f5	[Bug](materialized-view) fix isDisableTuplesMVRewriter return true when expr is literal (#18246 ) fix isDisableTuplesMVRewriter return true when expr is literal	2023-03-31 11:30:47 +08:00
jakevin	8e15388074	[fix](Nereids): use CBO rule instead of using rewrite rule. (#18256 )	2023-03-31 11:23:26 +08:00
Kang	4e1e0ce06d	[bugfix](topn) fix topn optimization wrong result for NULL values (#18121 ) 1. add PassNullPredicate to fix topn wrong result for NULL values 2. refactor RuntimePredicate to avoid using TCondition 3. refactor using ordering_exprs in fe and vsort_node	2023-03-31 10:01:34 +08:00
minghong	1abb19d0fd	filter estimation refactor (#18170 )	2023-03-31 08:49:38 +08:00
abmdocrt	a88e80f8ee	[fix](ssl)refactor some SSL info logs to debug logs (#18234 )	2023-03-31 08:41:02 +08:00
AKIRA	b5ea299697	[fix](planner) Fix agg on inline view with constant slot (#18201 ) Since a slot that references a constant is also marked as a constant expr, add a condition check to make sure such a slot is not eliminated as a constant from the group exprs	2023-03-30 23:54:37 +08:00
jakevin	28793b6441	[fix](Nereids): fix copyIn() in Memo when useless project with groupplan (#18223 )	2023-03-30 23:49:21 +08:00
Ashin Gau	d6b0fe9072	[feature](jni) jni table scanner framework (#17960 ) A framework that reads data from a JNI scanner, which can support data sources from the Java ecosystem (Java API). ## Java Interface A Java scanner should extend `org.apache.doris.jni.JniScanner` and implement the following methods: ``` // Initialize JniScanner public abstract void open() throws IOException; // Close JniScanner and release resources public abstract void close() throws IOException; // Scan data and save as vector table public abstract int getNext() throws IOException; ``` See demo usage in `org.apache.doris.jni.MockJniScanner` ## C++ interface A C++ reader should use `doris::JniConnector` to get data from `org.apache.doris.jni.JniScanner`. See demo usage in `doris::MockJniReader`. ## Pushed-down predicates A Java scanner can get pushed-down predicates through `org.apache.doris.jni.vec.ScanPredicate`. ## Remaining work: 1. Implement complex nested types. 2. Read hudi MOR table as the end-to-end demo usage.	2023-03-30 23:47:45 +08:00
Xiangyu Wang	0c2ff09fcf	[Fix](multi-catalog) fix hms automatic update error. (#18252 ) Co-authored-by: wangxiangyu@360shuke.com <wangxiangyu@360shuke.com>	2023-03-30 23:09:07 +08:00
jakevin	3d2c70f75d	[fix](Nereids): fix merge_group(). (#18250 )	2023-03-30 20:34:47 +08:00
mch_ucchi	fefc0d6814	[Fix](planner)fix create view ignore order by info bug. (#18197 )	2023-03-30 20:17:46 +08:00
liujinhui	ce79ff947a	[Enhancement](spark load)Support spark time out config (#17108 )	2023-03-30 20:12:46 +08:00
morrySnow	99bd5ec022	[fix](Nereids) fix some bugs in Subquery to window rule (#18233 ) This rule was introduced by PR #17968, but some corner cases are not processed correctly. This PR fixes these bugs: 1. fix the window function generation method by replacing the inner slot with the equivalent outer slot 2. forbid the following scenarios: a. inner has a mapping project b. inner has an unexpected filter c. outer has a mapping project d. outer has an unexpected filter e. outer has an additional table f. outer has the same table g. outer and inner have different join conditions h. outer and inner have the same table with different join conditions	2023-03-30 16:09:16 +08:00
amory	ea41d94582	[Improve](complex-type) Support Count(complexType) (#17868 ) Support count function for ARRAY/MAP/STRUCT type	2023-03-30 15:43:32 +08:00
Gabriel	b7af110f61	[Bug](bloomfilter) Fix bloom filter for date type (#18205 )	2023-03-30 14:15:06 +08:00
Pxl	cec983b7ef	[Chore](materialized-view) forbid creating mv with where clause containing aggregate column (#18168 ) Forbid creating an mv whose where clause contains an aggregate column. create table a_table( k1 int null, k2 int not null, k3 bigint null, k4 bigint sum null, k5 bitmap bitmap_union null, k6 hll hll_union null ) aggregate key (k1,k2,k3) distributed BY hash(k1) buckets 3 properties("replication_num" = "1"); create materialized view where_1 as select k1,k4 from a_table where k4 =1; // invalid, mv on agg table need group by create materialized view where_2 as select k1,sum(k4) from a_table where k4 =1 group by k1; // invalid, k4 is agg column create materialized view where_2 as select k1,sum(k4) from a_table where k1+k4 =1 group by k1; // invalid, k4 is agg column	2023-03-30 13:03:03 +08:00
Pxl	c8ad62a3cd	[Enhancement](materialized-view) enhance materialized view where clause match (#18179 ) enhance materialized view where clause match	2023-03-30 13:02:21 +08:00
mch_ucchi	c8ea5bff1d	[Fix](planner) fix nested udf bind arguments exception (#18188 ) A nested alias function causes a bind argument exception, for SQL like: ``` sql CREATE ALIAS FUNCTION f1(DATETIMEV2(3), INT) with PARAMETER (datetime1, int1) as date_trunc(days_sub(datetime1, int1), 'day') CREATE ALIAS FUNCTION f2(DATETIMEV2(3), int) with PARAMETER (datetime1, int1) as DATE_FORMAT(HOURS_ADD( date_trunc(datetime1, 'day'), add(multiply(floor(divide(HOUR(datetime1), divide(24,int1))), 1), 1) ), '%Y%m%d:%H') select f2(f1(now(3), 2), 3) ``` The bug is in FunctionCallExpr#rewriteExpr(): the retExpr is replaced with originExpr to change the alias function to a builtin function, but retExpr.fn is not null, so on return to the outer scope the fn is overwritten. For example: ``` f1(f1()) -> date_trunc(days_sub(date_trunc(days_sub()))) is correct and f1(f1()) -> date_trunc(days_sub(days_sub())) is bug. ``` This PR fixes it.	2023-03-30 11:39:02 +08:00
starocean999	b3657959c9	[fix](planner) need to add LateralViewRef's id into TableRef's allTableRefIds (#18220 ) 1. add LateralViewRef's id into TableRef's allTableRefIds, so the caller won't miss LateralViewRef when trying to get all the tableref ids. 2. TableFunctionNode should use the child node's output tuple id as the input tuple id	2023-03-30 11:32:18 +08:00
lihangyu	9c1aad06ea	[Improve](query) improve column match performance by introducing a column name map in `MaterializedIndexMeta` (#18203 ) improve column match performance by introducing a column name map in `MaterializedIndexMeta` `getColumnByName` is slow due to the linear search process, using a map to speed up search.	2023-03-30 11:24:51 +08:00
zhangstar333	525f15dddf	[vectorized](function) support array_sortby function (#18071 )	2023-03-30 11:07:49 +08:00


4242 Commits