doris

Author	SHA1	Message	Date
Jibing-Li	86d7a8be44	[improvement](statistics nereids)Nereids support select mv. (#30267 )	2024-01-25 13:24:09 +08:00
jakevin	83ea486b15	[fix](Nereids): Except just can merge with left deep shape (#30270 )	2024-01-25 13:24:09 +08:00
Pxl	7e60369ba2	[Feature](materialized-view) support create mv with count() (#30313 )	2024-01-25 13:24:09 +08:00
Guangdong Liu	101b2593fc	[regression test](schema change) add case for tinyint/smallint/int/bigint/float/double type in agg (#30193 )	2024-01-25 13:24:09 +08:00
Guangdong Liu	df504df475	[regression test](schema change) add case for partition (#30195 )	2024-01-25 13:24:09 +08:00
koarz	ca5a314765	[fix](function) make STRLEFT and STRRIGHT and SUBSTR function DEPEND_ON_ARGUMENT (#28352 ) make STRLEFT and STRRIGHT function DEPEND_ON_ARGUMENT	2024-01-25 13:23:59 +08:00
seawinde	2f68aac885	[Improvement](Nereids) Support to query rewrite by materialized view when join input has aggregate (#30230 ) Support query rewrite by materialized view when the join input has an aggregate; the aggregate should be simple. For example, given the following materialized view definition: > select > l_linenumber, > count(distinct l_orderkey), > sum(case when l_orderkey in (1,2,3) then l_suppkey * l_linenumber else 0 end), > max(case when l_orderkey in (4, 5) then (l_quantity * 2 + part_supp_a.qty_max) * 0.88 else 100 end), > avg(case when l_partkey in (2, 3, 4) then l_discount + o_totalprice + part_supp_a.qty_sum else 50 end) > from lineitem > left join orders on l_orderkey = o_orderkey > left join > (select ps_partkey, ps_suppkey, sum(ps_availqty) qty_sum, max(ps_availqty) qty_max, > min(ps_availqty) qty_min, > avg(ps_supplycost) cost_avg > from partsupp > group by ps_partkey,ps_suppkey) part_supp_a > on l_partkey = part_supp_a.ps_partkey > and l_suppkey = part_supp_a.ps_suppkey > group by l_linenumber; When the query is like the following, it can be rewritten by the mv above: > select > l_linenumber, > sum(case when l_orderkey in (1,2,3) then l_suppkey * l_linenumber else 0 end), > avg(case when l_partkey in (2, 3, 4) then l_discount + o_totalprice + part_supp_a.qty_sum else 50 end) > from lineitem > left join orders on l_orderkey = o_orderkey > left join > (select ps_partkey, ps_suppkey, sum(ps_availqty) qty_sum, max(ps_availqty) qty_max, > min(ps_availqty) qty_min, > avg(ps_supplycost) cost_avg > from partsupp > group by ps_partkey,ps_suppkey) part_supp_a > on l_partkey = part_supp_a.ps_partkey > and l_suppkey = part_supp_a.ps_suppkey > group by l_linenumber;	2024-01-25 13:23:59 +08:00
Nitin-Kashyap	f85b04c2c6	[fix](datatype) fixed decimal type implicit cast handling in BinaryPredicate (#30181 )	2024-01-25 13:23:12 +08:00
nanfeng	c7360fd014	[feature](function) support ip function named ipv4_cidr_to_range(addr, cidr) (#29819 ) * support ip function ipv4_cidr_to_range * fix ipv4_cidr_to_range function only support ipv4 type	2024-01-24 10:02:03 +08:00
jakevin	1b9f1f6483	[feature](Planner): Push down TopNDistinct through Join (#30216 ) Push down TopNDistinct through Outer/Cross Join	2024-01-24 09:59:13 +08:00
wuwenchi	8308bc96b9	[fix](paimon)set timestamp's scale for parquet which has no logical type (#30119 )	2024-01-23 13:22:14 +08:00
Pxl	1e74ad3f3b	[Feature](materialized-view) support predicate appear both on key and value mv column (#30215 )	2024-01-23 13:22:14 +08:00
morrySnow	ce47354d59	[fix](Nereids) result nullable of sum distinct in scalar agg is wrong (#30221 )	2024-01-23 10:09:54 +08:00
yangshijie	d5d0e5e611	[feature](function) support ip functions named to_ipv4[or_default, or_null](string) and to_ipv6[or_default, or_null](string) (#29838 )	2024-01-23 10:09:54 +08:00
starocean999	24c0900b41	[fix](planner) should return outputTupleDesc's id instead of tupleIds if outputTupleDesc is set in Plan Node (#30150 )	2024-01-23 10:09:54 +08:00
zhiqiang	e5dea910bf	[feature](bitwise function) bit_count/bit_shift_left/bit_shift_right implementation (#30046 )	2024-01-23 10:09:54 +08:00
seawinde	9a58cacf0f	[Improvement](nereids) Make sure to catch and record exception for every materialization context (#29953 ) 1. Make sure instance when change params of StructInfo,Predicates. 2. Catch and record the exception for every materialization context; this ensures that an exception thrown while rewriting one materialization context does not affect the others. 3. Support mv rewrite when the query has a count function in an aggregate without group by	2024-01-23 10:09:54 +08:00
Chester	dfde10d4c8	[improvement](function) switch inet(6)_aton alias origin function (#30196 )	2024-01-23 10:09:54 +08:00
lihangyu	4480f751e6	[Improve](Variant) support implicit cast to numeric and string type (#30029 )	2024-01-23 10:09:54 +08:00
zzzxl	e5f1d8d7ec	[fix](phrase_prefix) fix match_phrase_prefix query incorrect result (#29946 )	2024-01-23 10:09:54 +08:00
minghong	332b9cb619	[opt](nereids) do not change RuntimeFilter Type from IN-OR_BLOOM to BLOOM on broadcast join (#30148 ) 1. do not change RuntimeFilter Type from IN-OR_BLOOM to BLOOM on broadcast join; on tpcds1T, q48 improved from 4.x sec to 1.x sec. 2. skip some redundant runtime filters. Example: A join B on A.a1=B.b and A.a1 = A.a2 gives RF B.b->(A.a1, A.a2); however, RF(B.b->A.a2) is implied by RF(B.b->A.a1) and A.a1=A.a2, so we skip RF(B.b->A.a2).	2024-01-23 10:07:51 +08:00
Chester	ead3b4ac1d	[feature](function) support ip function is_ipv4_compat, is_ipv4_mapped (#29954 )	2024-01-23 10:07:51 +08:00
minghong	ddeed079d4	[opt](Nereids)make orToIn rule applicable to in-pred (#29990 )	2024-01-19 15:48:56 +08:00
yangshijie	97b2a3b993	[improvement](ip function) refactor some ip functions and remove dirty codes (#30080 )	2024-01-19 15:48:56 +08:00
谢健	e560f31692	[fix](Nereids): fix eliminate join test for pk-fk constraint (#30094 )	2024-01-19 15:48:56 +08:00
qiye	fac0580eae	[opt](docker)optimize ES docker compose (#30068 ) 1. add volume for es logs 2. optimize health check, waiting for es status to be green 3. fix es6 volume path error 4. optimize disk watermark to avoid es disk watermark error 5. fix es6 create index error 6. add custom elasticsearch.yml for es6 7. add log4j2.properties for es6, es7, es8	2024-01-19 15:48:56 +08:00
jakevin	097641b543	[fix](Nereids): fix AssertNumRows StatsCalculator (#30053 )	2024-01-19 15:48:15 +08:00
Pxl	2ccb69dbed	[Feature](materialized-view) support some case unmatched to materialized-view (#30036 ) 1. same column appears in key and value, like select id,count(id) group by id; 2. complex expr in sum, like select sum(if(xxx));	2024-01-18 12:03:07 +08:00
zy-kkk	0ccd706a30	[Enhancement](Jdbc Catalog) Map Jdbc Catalog JSON Type to String for Improved Performance and Compatibility (#30035 ) This PR proposes mapping external catalog JSON types to String instead of JsonB in Apache Doris. This change is motivated by the realization that JDBC retrieves JSON data as a JSON string, regardless of its storage format (Json(String) or Json(Binary)). Mapping to String streamlines data retrieval, simplifies write-backs, and ensures compatibility with all JSON(String) and JSON(Binary) functions, despite potentially misleading displays of JSON data as Strings in Doris. This approach avoids the performance overhead and complexity of converting each row of data from JsonB to String, making the process more efficient and elegant. About Upgrade: To ensure query compatibility with existing Catalogs in the upgraded version, we currently still retain the capability to query external JSON types as JSONB. However, once you upgrade to the new version and either refresh the Catalog or create a new one, all external JSON types will be treated as Strings. To ensure consistent behavior, and given the possible future removal of support for querying JSON as JSONB, it is highly recommended that you manually refresh your Catalog as soon as possible after upgrading to the new version.	2024-01-18 12:03:07 +08:00
wuwenchi	44ba9e102c	[feature](statistics)support statistics for iceberg/paimon/hudi table (#29868 )	2024-01-18 12:03:07 +08:00
amory	ade720470d	[Improve](config)delete confused config for nested complex type (#29988 )	2024-01-18 12:03:07 +08:00
zhangstar333	e894911cda	[function](char) change char function behaviour same with mysql (#30034 ) select char(0) = '\0'; should return true;	2024-01-18 10:04:21 +08:00
Pxl	b0c49024cb	[Feature](materialized-view) support match function with alias in materialized-view (#30025 )	2024-01-18 10:04:21 +08:00
jakevin	3deee14680	[fix](Nereids): find hash condition after infer predicate (#30026 )	2024-01-18 10:03:01 +08:00
wuwenchi	74991c4af2	[bugfix](paimon)support native and jni to read paimon for minio/cos #29933	2024-01-16 18:49:01 +08:00
谢健	4bf4239d7a	[feature](Nereids): optimize logical group expression in dphyp (#30000 )	2024-01-16 18:48:20 +08:00
zy-kkk	f53d2c28cb	[improvement](catalog) fix jdbc mysql catalog to_date fun pushdown (#29900 )	2024-01-16 18:46:19 +08:00
minghong	22978726e3	[opt](nereids) if column stats are unknown, 10-20 table-join optimization use cascading instead of dphyp (#29902 ) * if column stats are unknown, do not use dphyp. tpcds query64 is optimized in the no-stats case: on sf500, query64 improved from 15sec to 7sec on hdfs, and from 4sec to 3.85sec on olaptable	2024-01-16 18:46:19 +08:00
morrySnow	07de535c4c	[fix](Nereids) should not fold constant when do ordinal group by (#29976 )	2024-01-16 18:46:19 +08:00
yangshijie	66513d57f9	[feature](function) support ip function named ipv6_cidr_to_range(addr, cidr) (#29812 )	2024-01-16 18:42:09 +08:00
amory	d5dcdf3e07	[Improve](array) support array_enumerate_uniq and array_shuffle for nereids (#29936 )	2024-01-16 18:40:32 +08:00
zy-kkk	f6dc6ea13b	[improvement](catalog) Escape characters for columns in recovery predicate pushdown in SQL (#29854 ) In the previous logic, when we restored the Column in the predicate pushdown based on the logical syntax tree for JdbcScanNode, we added escape characters to it in order to avoid query errors caused by keywords such as `key`. However, previously only binary predicates were processed, which is incomplete. We should add escape characters to all columns that appear in the predicate to avoid errors with keywords or illegal characters.	2024-01-16 18:39:00 +08:00
yujun	8ca807578f	[fix](migrate disk) fix migrate disk lost data during publish version (#29887 ) Co-authored-by: Yongqiang YANG <98214048+dataroaring@users.noreply.github.com>	2024-01-16 18:37:06 +08:00
minghong	a69ce49b07	[fix](Nereids) adjust min/max stats for cast function if types are comparable (#28166 ) estimate column stats for "cast(col, XXXType)". cast-est: query4 41169 40335 40267 40267; query58 463 361 401 361; Total cold run time: 41632 ms; Total hot run time: 40628 ms. master: query4 40624 40180 40299 40180; query58 487 389 420 389; Total cold run time: 41111 ms; Total hot run time: 40569 ms.	2024-01-16 18:31:59 +08:00
seawinde	0b16938b7f	[Fix](Nereids) Fix datatype length wrong when string contains chinese (#29885 ) When a varchar literal contains Chinese, the length of the varchar should not be the character count; it should be the actual length in bytes. Chinese is represented by unicode, and a Chinese char occupies at most 4 bytes. So if we meet Chinese in a varchar literal, we set the length to 4 * length. For example: > CREATE MATERIALIZED VIEW test_varchar_literal_mv > BUILD IMMEDIATE REFRESH AUTO ON MANUAL > DISTRIBUTED BY RANDOM BUCKETS 2 > PROPERTIES ('replication_num' = '1') > AS > select case when l_orderkey > 1 then "一二三四" else "五六七八" end as field_1 from lineitem; The definition of the materialized view is as follows: mysql> desc test_varchar_literal_mv; +---------+-------------+------+-------+---------+-------+ | Field | Type | Null | Key | Default | Extra | +---------+-------------+------+-------+---------+-------+ | field_1 | VARCHAR(16) | No | false | NULL | NONE | +---------+-------------+------+-------+---------+-------+	2024-01-16 18:31:59 +08:00
zhangstar333	115815739c	[bugfix](fe) add check for lag/lead function params (#29617 )	2024-01-16 18:31:59 +08:00
seawinde	d47adbb81f	[Fix](nereids) Fix cte rewrite by mv failure and predicates compensation by mistake (#29820 ) Fix wrong cte rewrite by mv when the query has a scalar aggregate but the view does not. For example, the following query should not be rewritten by the materialized view: // materialized view definition def mv20_1 = """ select l_shipmode, l_shipinstruct, sum(l_extendedprice), count() from lineitem left join orders on lineitem.L_ORDERKEY = orders.O_ORDERKEY group by l_shipmode, l_shipinstruct; """ // query sql def query20_1 = """ select sum(l_extendedprice), count() from lineitem left join orders on lineitem.L_ORDERKEY = orders.O_ORDERKEY """ Fix mistaken predicates compensation. For example, the following now returns the right result, but was wrong earlier: // materialized view definition def mv7_1 = """ select l_shipdate, o_orderdate, l_partkey, l_suppkey from lineitem left join orders on lineitem.l_orderkey = orders.o_orderkey where l_shipdate = '2023-12-08' and o_orderdate = '2023-12-08'; """ // query sql def query7_1 = """ select l_shipdate, o_orderdate, l_partkey, l_suppkey from (select * from lineitem where l_shipdate = '2023-10-17' ) t1 left join orders on t1.l_orderkey = orders.o_orderkey; """ Also optimize some code usage and add more comments for methods.	2024-01-16 18:31:27 +08:00
zhangstar333	e417128fb9	[bug](bitmap) should return error status when execute failed (#29841 )	2024-01-16 18:30:23 +08:00
yangshijie	1998735432	[Improvement](function) enable ipv6_num_to_string function to support handling of IPv6 type (#29886 ) Enable ipv6_num_to_string function to handle IPv6 type normally in addition to handling 16 byte string types	2024-01-16 18:30:23 +08:00
xzj7019	ee66f1563e	[fix](Nereids) fix rf push down union (#29847 ) Currently, union rf push down only supports rf from the parent join, not from an ancestor join. This pr fixes the problem in the rf push-down check on project/distribute nodes.	2024-01-16 18:30:22 +08:00

2431 Commits