doris

Author	SHA1	Message	Date
Lightman	313e14d220	[Bugfix] (ROLLUP) fix the coredump when add rollup by link schema change (#15654 ) Because the rollup has the same keys as the base table, in the same order, BE performs a linked schema change: the base tablet's segments are linked to the new rollup tablet. However, column unique ids start from 0 in both the base tablet and the rollup tablet, so the same id can refer to different columns. In this case, unique id 4 is column 'city' in the base table but 'cost' in the rollup tablet, so BE decodes a varchar page as a bigint page and core dumps. A rollup that would be added by linked schema change is redundant: it brings no additional benefit and wastes storage space, so it needs to be rejected.	2023-01-14 10:20:07 +08:00
Tiewei Fang	2580c88c1b	[feature](multi-catalog) support oracle jdbc catalog (#15862 )	2023-01-14 00:01:33 +08:00
caoliang-web	a788623ee2	doris largeint type execute where query, the result is incorrect (#15034 )	2023-01-13 23:12:02 +08:00
AlexYue	514de605b6	[Bug](predicate) add double predicate creator (#15762 ) Add a double predicate creator, analogous to the integer predicate creator.	2023-01-13 18:34:09 +08:00
AlexYue	049f8ad2f9	[Bug](sort)fix merge sorter might div zero when block bytes less than block rows (#15859 ) If a block's byte size is smaller than its row count, avg_size_per_row would be zero, which would end up dividing by zero in the following logic.	2023-01-13 18:33:40 +08:00
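
The failure mode described in #15859 can be sketched in a few lines. This is an illustrative Python sketch, not the Doris C++ code: the function names, the `batch_bytes` parameter, and the clamping guard are assumptions made for the example, and the actual fix in the PR may differ.

```python
def avg_size_per_row_unsafe(block_bytes: int, block_rows: int) -> int:
    # Integer division truncates, so fewer bytes than rows yields 0.
    return block_bytes // block_rows


def rows_per_batch(block_bytes: int, block_rows: int, batch_bytes: int) -> int:
    # Hypothetical guard: clamp the average to at least 1 byte per row
    # so the division below can never divide by zero.
    avg = max(1, block_bytes // max(1, block_rows))
    return batch_bytes // avg


print(avg_size_per_row_unsafe(3, 10))  # the zero that later becomes a divisor
print(rows_per_batch(3, 10, 4096))     # guarded version still returns a value
```

Clamping is only one possible guard; skipping the estimate entirely when the average is zero would work as well.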
minghong	67378a2dc3	[fix](nereids) fix bug in SequenceFunction legality check (#15812 ) 1. fix bug in sequence_match function 2. do type promotion instead of explicit cast for - varcharLiteral -> stringLiteral - charLiteral->stringLiteral	2023-01-13 12:09:53 +08:00
luozenglin	b1fb1277dd	[fix](bitmap) fix bitmap iterator comparison error (#15779 ) Fix the bug that bitmap.begin() == bitmap.end() is always true when the bitmap contains a single value.	2023-01-13 11:37:07 +08:00
HappenLee	9468711f9f	[Bug](join) fix bug null aware left anti join not correct result (#15841 )	2023-01-13 10:18:05 +08:00
Adonis Ling	14e3879c4b	[regression-test](MTMV) Make the case test_create_mtmv more robust (#15866 ) ## Proposed changes 1. Check the state of MTMV task as the loop condition. 2. Check the data in materialized view. ## Problem summary There are some minor issues with #15546. 1. The case used a retry strategy as the loop condition, it may not be stable while the host machine is busy. 2. The case didn't check the final data in materialized view.	2023-01-13 00:13:24 +08:00
abmdocrt	7441b4dc96	[Feature](function) Support width_bucket function (#14396 )	2023-01-12 13:59:21 +08:00
谢健	39697bb83e	[fix](Nereids) make the type of the first parameter in window_funnel integerLike (#15810 )	2023-01-12 11:53:28 +08:00
starocean999	cfb110c905	[fix](nereids) fix some nereids bugs (#15714 ) 1. remove forcing nullable for slot on EmptySetNode. 2. order by xxx desc should use nulls last as the default order. 3. don't create runtime filters if runtime filter mode is OFF. 4. group by constant value needs to check that the corresponding expr doesn't contain any aggregate functions. 5. fix reorder bug for two left outer joins (A left join B left join C). 6. fix reorder bug for semi join and left outer join (A left join B semi join C). 7. fix group by NULL bug. 8. change ceil and floor functions to the correct signature. 9. add literal comparison for string and date types. 10. fix the getOnClauseUsedSlots method, which may not return a valid value. 11. the tightest common type of string and date should be date. 12. the nullability of set operation node's result exprs is not set correctly. 13. Sort node should remove redundant ordering exprs.	2023-01-11 17:18:44 +08:00
starocean999	006b3bd61a	[fix](nereids) orthogonal_bitmap_intersect's return type should be bitmap (#15784 )	2023-01-11 12:53:37 +08:00
mch_ucchi	7f2c433e08	[feature](Nereids) add relation id to unboundTVFRelation to avoid incorrect group expression comparison (#15740 )	2023-01-11 12:49:14 +08:00
Pxl	2587095811	[Bug](mv) fix mv selector check group expr && forbid create dup mv with bitmap/hll && add some case (#15738 )	2023-01-11 11:38:56 +08:00
mch_ucchi	bc34a44f06	[Fix](Nereids) fix type coercion for binary arithmetic (#15185 ) support sql like: select true + 1 + '2.0' and prevent select true + 1 + 'x';	2023-01-11 02:55:44 +08:00
minghong	280603b253	[fix](nereids) bind sort key priority problem (#15646 ) `a.b.c` should only bind on `a.b.c`, not on `b.c` or `c`	2023-01-11 02:03:09 +08:00
starocean999	fec89ad58c	[fix](nereids) week should be able to recognized as function name in function call context (#15735 )	2023-01-10 19:54:59 +08:00
AKIRA	7767931aca	[enhancement](nereids) let parser support utf8 identifier (#15721 ) After this PR, the following SQL can also be parsed: - SELECT k1 AS 测试 FROM test; - SELECT k1 AS テスト FROM test;	2023-01-10 19:43:04 +08:00
camby	bb28144c76	[fix](schema change) bugfix for light schema change while with rollup (#15681 ) This problem comes from PR #11494: after adding a column to a rollup index, the column UniqueId inside the base index is also changed.	2023-01-10 19:03:06 +08:00
caoliang-web	672d11522b	[regression](flink)add flink doris connector case (#15676 )	2023-01-10 17:25:06 +08:00
camby	47097a3db8	[fix](having) revert 15143 and fix having clause with multi-conditions (#15745 ) First, the having clause rules of MySQL are very complex and hard to follow completely, so we revert pr15143 to keep the logic the same as before. Second, the original implementation has a problem when the having clause has multiple conditions. For example: case1: here v2 inside the having clause refers to the table column test_having_alias_tb.v2 SELECT id, v1-2 as v, sum(v2) v2 FROM test_having_alias_tb GROUP BY id,v having(v2>1); ERROR 1105 (HY000): errCode = 2, detailMessage = HAVING clause not produced by aggregation output (missing from GROUP BY clause?): (`v2` > 1) case2: here v2 inside the having clause refers to the alias v2 = sum(test_having_alias_tb.v2); adding another condition changes how v2 is resolved. SELECT id, v1-2 as v, sum(v2) v2 FROM test_having_alias_tb GROUP BY id,v having(v>0 AND v2>1) ORDER BY id,v; +------+------+------+ \| id \| v \| v2 \| +------+------+------+ \| 2 \| 1 \| 3 \| +------+------+------+ So here we try to make the having clause rules simple: Rule1: if an alias name inside the having clause is the same as a column name, we use the column, not the alias; Rule2: if an alias name inside the having clause does not match any column name, we use the alias. Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>	2023-01-10 15:57:29 +08:00
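
Rule1 and Rule2 from #15745 amount to a two-step name lookup. The following is a minimal Python sketch of that lookup, not the planner's Java code: the function `resolve_having_name` and its return strings are hypothetical, and the column set is inferred from the queries in the commit message.

```python
from typing import Dict, Set


def resolve_having_name(name: str, columns: Set[str], aliases: Dict[str, str]) -> str:
    """Resolve a name used in HAVING following Rule1 and Rule2."""
    # Rule1: a name that matches a table column refers to the column,
    # even when a select-list alias with the same name exists.
    if name in columns:
        return "column " + name
    # Rule2: a name that matches no column refers to the alias expression.
    if name in aliases:
        return "alias for " + aliases[name]
    raise KeyError("unknown name in HAVING: " + name)


# Names taken from the test_having_alias_tb example (column list assumed).
columns = {"id", "v1", "v2"}
aliases = {"v": "v1-2", "v2": "sum(v2)"}

print(resolve_having_name("v2", columns, aliases))  # Rule1 applies
print(resolve_having_name("v", columns, aliases))   # Rule2 applies
```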
luozenglin	05f6e4c48a	[fix](predicate) fix be core dump caused by pushing down the double column predicate (#15693 )	2023-01-09 19:31:04 +08:00
AKIRA	5ceb5441f4	[feature](nereids) let set operation syntax compatible with legacy planner (#15664 ) Although this syntax is not supported by many other systems, since the order by clause here is almost always redundant and useless, we have to stay consistent with the legacy Doris syntax. Here is an example: SELECT * FROM (SELECT k1, k3 FROM tbl1 ORDER BY k3 UNION ALL SELECT k1, k5 FROM tbl2) t;	2023-01-09 15:31:29 +08:00
AKIRA	7543d677fa	[fix](nereids) Fix the bugs of data distribution calculation on OlapScan (#15699 ) When a scan needs to read more than one olap table partition, and the table is not a colocate table or its colocate group is unstable, the scan must be treated as any distribution even if its distribution type is Hash.	2023-01-09 15:25:54 +08:00
Gabriel	e2492cf7fc	[Bug](DECIMALV3) Fix binary predicate between decimalv3 and float (#15696 )	2023-01-09 15:16:59 +08:00
谢健	4c50c4906b	[fix](Nereids) add implicit casting for arithmetic expression (#15630 ) Add implicit casting for arithmetic expression to support select "1" + "2"	2023-01-09 15:10:35 +08:00
Gabriel	699bf972e2	[Bug](bitmap) Fix bitmap_from_string for null constant (#15698 )	2023-01-09 10:21:08 +08:00
Mingyu Chen	211cc66d02	[fix](multi-catalog) fix image loading failure when create catalog with resource (#15692 ) Bug fix: 1. Fix image loading failure when creating a catalog with a resource. When creating a jdbc catalog with a resource, the metadata image fails to load, because loading the jdbc catalog image tries to get the resource from ResourceMgr before ResourceMgr has been loaded, so an NPE is thrown. This PR fixes the bug and refactors some logic about catalog and resource: loading the jdbc catalog image no longer gets the resource from ResourceMgr. Users can now create a catalog with both a resource and properties, like: create catalog jdbc_catalog with resource jdbc_resource properties("user" = "user1"); The properties in the "properties" clause overwrite the properties in "jdbc_resource". 2. Force adding tinyInt1isBit=false to the jdbc url. The default value of tinyInt1isBit is true, which causes tinyint in mysql to be read as bit type. Forcing tinyInt1isBit=false in the jdbc url makes tinyint in mysql map to tinyint in Doris. 3. Avoid calculating the checksum of the jdbc driver jar multiple times. Refactor: Refactor the notification logic when updating properties in a resource. Previously, updating properties in a resource notified the corresponding catalog to update its own properties. This PR changes this logic: after updating properties in a resource, it only uninitializes the catalog's internal objects such as "jdbc client" or "hms client", and these objects are re-initialized lazily. All properties are read from the Resource at runtime, so the latest properties are always used. Regression test cases: Because we add tinyInt1isBit=false to the jdbc url, some of the cases need to be changed.	2023-01-09 09:56:26 +08:00
Pxl	1514b5ab5c	[Feature](Materialized-View) support advanced Materialized-View (#15212 )	2023-01-09 09:53:11 +08:00
Mingyu Chen	500c7fb702	[improvement](multi-catalog) support unsupported column type (#15660 ) When creating an external catalog, Doris automatically syncs the table schemas from the external catalog. But some column types are not yet supported by Doris, such as struct, map, etc. Previously, when encountering an unsupported column, Doris threw an exception and the corresponding table could not be synced, even though the user may just want to query the other, supported columns. This PR adds a new column type: UNSUPPORTED. For now it is only used for external table schema sync: an unsupported column is synced as a column with UNSUPPORTED type. When querying such a table, there are several situations: select * from table: throw error Unsupported type 'UNSUPPORTED_TYPE' xxx select k1 from table: k1 is with supported type. query OK. select * except(k2): k2 is with unsupported type. query OK	2023-01-08 10:07:10 +08:00
ElvinWei	76ad599fd7	[enhancement](histogram) optimise aggregate function histogram (#15317 ) This PR mainly optimizes the histogram (👉🏻 https://github.com/apache/doris/pull/14910) aggregate function, including the following: 1. Support input parameters `sample_rate` and `max_bucket_num` 2. Add UT and regression test 3. Add documentation 4. Optimize function implementation logic Parameter description： - `sample_rate`：Optional. The proportion of sample data used to generate the histogram. The default is 0.2. - `max_bucket_num`：Optional. Limit the number of histogram buckets. The default value is 128. --- Example： ``` MySQL [test]> SELECT histogram(c_float) FROM histogram_test; +-------------------------------------------------------------------------------------------------------------------------------------+ \| histogram(`c_float`) \| +-------------------------------------------------------------------------------------------------------------------------------------+ \| {"sample_rate":0.2,"max_bucket_num":128,"bucket_num":3,"buckets":[{"lower":"0.1","upper":"0.1","count":1,"pre_sum":0,"ndv":1},...]} \| +-------------------------------------------------------------------------------------------------------------------------------------+ MySQL [test]> SELECT histogram(c_string, 0.5, 2) FROM histogram_test; +-------------------------------------------------------------------------------------------------------------------------------------+ \| histogram(`c_string`) \| +-------------------------------------------------------------------------------------------------------------------------------------+ \| {"sample_rate":0.5,"max_bucket_num":2,"bucket_num":2,"buckets":[{"lower":"str1","upper":"str7","count":4,"pre_sum":0,"ndv":3},...]} \| +-------------------------------------------------------------------------------------------------------------------------------------+ ``` Query result description： ``` { "sample_rate": 0.2, "max_bucket_num": 128, "bucket_num": 3, 
"buckets": [ { "lower": "0.1", "upper": "0.2", "count": 2, "pre_sum": 0, "ndv": 2 }, { "lower": "0.8", "upper": "0.9", "count": 2, "pre_sum": 2, "ndv": 2 }, { "lower": "1.0", "upper": "1.0", "count": 2, "pre_sum": 4, "ndv": 1 } ] } ``` Field description： - sample_rate：Rate of sampling - max_bucket_num：Limit the maximum number of buckets - bucket_num：The actual number of buckets - buckets：All buckets - lower：Lower bound of the bucket - upper：Upper bound of the bucket - count：The number of elements contained in the bucket - pre_sum：The total number of elements in the preceding buckets - ndv：The number of distinct values in the bucket > Total number of histogram elements = number of elements in the last bucket(count) + total number of elements in the previous buckets(pre_sum).	2023-01-07 00:50:32 +08:00
mch_ucchi	08d439cde7	[feature](Nereids) add keyword rlike (#15647 )	2023-01-07 00:28:21 +08:00
yongkang.zhong	cad47dd9d9	[test](Nereids) add two regression test cases for Nereids (#15598 ) 1. test predicates infer could work well with push down predicates through join 2. test count with subquery containing constant literal	2023-01-06 16:29:50 +08:00
luozenglin	53559e2bdc	[fix](decimalv2) fix loss of precision when cast to decimalv2 literal (#15629 )	2023-01-06 16:02:46 +08:00
AKIRA	7f84db310a	[fix](nereids) Convert to datetime when binary expr's left is date and right is int type (#15615 ) In the case below, the expression `date > 20200101` should implicitly cast both sides to datetime instead of bigint ```sql CREATE TABLE `part_by_date` ( `date` date NOT NULL COMMENT '', `id` int(11) NOT NULL COMMENT '' ) ENGINE=OLAP UNIQUE KEY(`date`, `id`) PARTITION BY RANGE(`date`) (PARTITION p201912 VALUES [('0000-01-01'), ('2020-01-01')), PARTITION p202001 VALUES [('2020-01-01'), ('2020-02-01'))) DISTRIBUTED BY HASH(`id`) BUCKETS 3 PROPERTIES ( "replication_allocation" = "tag.location.default: 1" ); INSERT INTO part_by_date VALUES('0001-02-01', 1),('2020-01-15', 2); SELECT id FROM part_by_date WHERE date > 20200101; ```	2023-01-06 14:08:05 +08:00
Tiewei Fang	df2da89b89	[feature](multi-catalog) support postgresql jdbc catalog (#15570 )	2023-01-06 11:00:59 +08:00
luozenglin	05d72e8919	[fix](join) fix anti join incorrectly outputs null values (#15567 )	2023-01-06 09:55:48 +08:00
zhengshiJ	5460c873e8	[Feature] (Nereids) support un equals conjuncts in un scalar sub query (#15591 ) Support non-equality conjuncts in non-scalar subqueries. [fix] wrong result in correlated subquery	2023-01-05 16:56:14 +08:00
camby	59f34be41f	[fix](having-clause) having clause do not works correct with same alias name (#15143 )	2023-01-05 10:15:15 +08:00
Gabriel	5ff5b8fc98	[feature](mark join) Support mark join for hash join node (#15569 )	2023-01-05 09:32:26 +08:00
AKIRA	f2f06c1acc	[feature](nereids) Support select temp partition (#15579 ) Support the following syntax: select * from t_p temporary partition(tp1); select * from t_p temporary partitions(tp1); select * from t_p temporary partition tp1;	2023-01-04 11:04:36 +08:00
Gabriel	eef1f432dd	[Bug](datetimev2/decimalv3) Fix wrong predicate infer rule (#15574 )	2023-01-04 10:03:43 +08:00
starocean999	a97f582b93	[fix](nereids) use DAYS as default unit for DATE_ADD and DATE_SUB function (#15559 )	2023-01-04 01:55:15 +08:00
Shuo Wang	18bc354c06	[fix](Nereids) use correct column unique id when read data from non-base index (#15534 ) When light schema change is enabled by default, a column in OLAP scan is retrieved by column unique id instead of the column name. Columns with the same name would use different unique IDs among materialized indexes. This PR ensures that the column in the OLAP scan node could use the correct column unique id.	2023-01-04 01:41:25 +08:00
minghong	8d0c06c897	[fix](nereids) binding priority in agg-sort, having, group_by_key (#15240 ) This PR defines order_key and having_key binding priority. 1. order key priority ``` select col1 * -1 as col1 # inner_col1 * -1 as alias_col1 from t order by col1; # order by order_col1 ``` to bind `order_col1`, `alias_col1` has higher priority than `inner_col1` 2. having key priority ``` select (a-1) as a # inner_a - 1 as alias_a from bind_priority_tbl group by a having a=1; ``` to bind having key, `inner_a` has higher priority than `alias_a` 3. group by key binding priority ``` SELECT date_format(b.k10, '%Y%m%d') AS k10 FROM test a LEFT JOIN (SELECT k10 FROM baseall) b ON a.k10 = b.k10 GROUP BY k10; ``` group_by_key (k10) binding priority: - agg.child.output - agg.output If binding with agg.child.output fails (the slot is not found, or more than one candidate slot is found in agg.child.output), Nereids tries to bind group_by_key with agg.output. In the above example, Nereids finds 2 candidate slots (a.k10, b.k10) in agg.child.output for group_by_key (k10), so binding with agg.child.output fails. Nereids then tries to bind group_by_key with agg.output, that is `date_format(b.k10, '%Y%m%d') AS k10`. Finally, group_by_key is bound with `alias k10`	2023-01-03 22:09:28 +08:00
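
The group_by_key fallback described in #15240 can be sketched as a two-stage lookup. This is a simplified Python illustration, not the Nereids implementation: the function `bind_group_by_key`, the string representation of slots, and the matching on the last name part are all assumptions made for the example.

```python
from typing import List, Tuple


def bind_group_by_key(key: str, child_output: List[str],
                      agg_output: List[Tuple[str, str]]) -> str:
    # First choice: slots of agg.child.output whose column name equals the key.
    candidates = [slot for slot in child_output if slot.split(".")[-1] == key]
    if len(candidates) == 1:
        return candidates[0]
    # Not found or ambiguous: fall back to the aliases in agg.output.
    aliased = [expr for alias, expr in agg_output if alias == key]
    if len(aliased) == 1:
        return aliased[0] + " AS " + key
    raise ValueError("cannot bind group by key: " + key)


# Example 3 from the commit message: a.k10 and b.k10 are both candidates,
# so the key falls back to the select-list alias.
bound = bind_group_by_key(
    "k10",
    ["a.k10", "b.k10"],
    [("k10", "date_format(b.k10, '%Y%m%d')")],
)
print(bound)
```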
starocean999	55dc541c90	[Fix](Nereids) aggregate function except COUNT should nullable without group by expr (#15547 ) Co-authored-by: mch_ucchi	2023-01-03 21:28:07 +08:00
Pxl	85fe9d2496	[Bug](filter) fix not in(null) return true (#15466 )	2023-01-03 21:14:50 +08:00
zhengshiJ	1dabcb0111	[Fix](Nereids) fix except and intersect error for statsCalculator (#15557 ) When calculating the statsCalculator of except and intersect, the slotId of the corresponding column was not replaced with the slotId of output, resulting in NPE.	2023-01-03 17:06:57 +08:00
zhangstar333	b50448d5c4	[vectorized](udaf) fix udaf result is null when has multiple aggs (#15554 )	2023-01-03 16:03:43 +08:00

673 Commits