doris

Author	SHA1	Message	Date
Chester	dfde10d4c8	[improvement](function) switch inet(6)_aton alias origin function (#30196 )	2024-01-23 10:09:54 +08:00
lihangyu	4480f751e6	[Improve](Variant) support implicit cast to numeric and string type (#30029 )	2024-01-23 10:09:54 +08:00
zzzxl	e5f1d8d7ec	[fix](phrase_prefix) fix match_phrase_prefix query incorrect result (#29946 )	2024-01-23 10:09:54 +08:00
minghong	332b9cb619	[opt](nereids) do not change RuntimeFilter Type from IN-OR_BLOOM to BLOOM on broadcast join (#30148 ) 1. do not change RuntimeFilter Type from IN-OR_BLOOM to BLOOM on broadcast join tpcds1T, q48 improved from 4.x sec to 1.x sec 2. skip some redunant runtime filter example: A join B on A.a1=B.b and A.a1 = A.a2 RF B.b->(A.a1, A.a2) however, RF(B.b->A.a2) is implied by RF(B.a->A.a1) and A.a1=A.a2 we skip RF(B.b->A.a2) Issue Number: close #xxx	2024-01-23 10:07:51 +08:00
Chester	ead3b4ac1d	[feature](function) support ip function is_ipv4_compat, is_ipv4_mapped (#29954 )	2024-01-23 10:07:51 +08:00
minghong	ddeed079d4	[opt](Nereids)make orToIn rule appliable to in-pred (#29990 ) make orToIn rule appliable to in-pred	2024-01-19 15:48:56 +08:00
yangshijie	97b2a3b993	[improvement](ip function) refactor some ip functions and remove dirty codes (#30080 )	2024-01-19 15:48:56 +08:00
谢健	e560f31692	[fix](Nereids): fix eliminate join test for pk-fk constraint (#30094 )	2024-01-19 15:48:56 +08:00
qiye	fac0580eae	[opt](docker)optimize ES docker compose (#30068 ) 1. add volume for es logs 2. optimize health check, waiting for es status to be green 3. fix es6 valume path error 4. optimize disk watermark to avoid es disk watermark error 5. fix es6 create index error 6. add custom elasticsearch.yml for es6 7. add log4j2.properties for es6, es7, es8	2024-01-19 15:48:56 +08:00
jakevin	097641b543	[fix](Nereids): fix AssertNumRows StatsCalculator (#30053 )	2024-01-19 15:48:15 +08:00
Pxl	2ccb69dbed	[Feature](materialized-view) support some case unmached to materialized-view (#30036 ) same column appears in key and value like select id,count(id) group by id; complex expr in sum select sum(if(xxx));	2024-01-18 12:03:07 +08:00
zy-kkk	0ccd706a30	[Enhancement](Jdbc Catalog) Map Jdbc Catalog JSON Type to String for Improved Performance and Compatibility (#30035 ) This PR proposes mapping external catalog JSON types to String instead of JsonB in Apache Doris. This change is motivated by the realization that JDBC retrieves JSON data as a String JSON string, regardless of its storage format (Json(String) or Json(Binary)). Mapping to String streamlines data retrieval, simplifies write-backs, and ensures compatibility with all JSON(String) and JSON(Binary) functions, despite potentially misleading displays of JSON data as Strings in Doris. This approach avoids the performance overhead and complexity of converting each row of data from JsonB to String, making the process more efficient and elegant. About Upgrade To ensure query compatibility with existing Catalogs in the upgraded version,we currently still retain the capability to query external JSON types as JSONB. However, once you upgrade to the new version and either refresh the Catalog or create a new one, all external JSON types will be treated as Strings. To ensure consistent behavior,and possible future removal of support for JSON as JSONB query code, it is highly recommended that you manually refresh your Catalog as soon as possible after upgrading to the new version.	2024-01-18 12:03:07 +08:00
wuwenchi	44ba9e102c	[feature](statistics)support statistics for iceberg/paimon/hudi table (#29868 )	2024-01-18 12:03:07 +08:00
amory	ade720470d	[Improve](config)delete confused config for nested complex type (#29988 )	2024-01-18 12:03:07 +08:00
zhangstar333	e894911cda	[function](char) change char function behaviour same with mysql (#30034 ) select char(0) = '\0'; should return true;	2024-01-18 10:04:21 +08:00
Pxl	b0c49024cb	[Feature](materialized-view) support match function with alias in materialized-view (#30025 ) support match function with alias in materialized-view	2024-01-18 10:04:21 +08:00
jakevin	3deee14680	[fix](Nereids): find hash condition after infer predicate (#30026 )	2024-01-18 10:03:01 +08:00
wuwenchi	74991c4af2	[bugfix](paimon)support native and jni to read paimon for minio/cos #29933	2024-01-16 18:49:01 +08:00
谢健	4bf4239d7a	[feature](Nereids): optimize logical group expression in dphyp (#30000 )	2024-01-16 18:48:20 +08:00
zy-kkk	f53d2c28cb	[improvement](catalog) fix jdbc mysql catalog to_date fun pushdown (#29900 )	2024-01-16 18:46:19 +08:00
minghong	22978726e3	[opt](nereids) if column stats are unknown, 10-20 table-join optimization use cascading instead of dphyp (#29902 ) * if column stats are unknown, do not use dphyp tpcds query64 is optimized in case of no stats sf500, query64 improved from 15sec to 7sec on hdfs, and from 4sec to 3.85sec on olaptable	2024-01-16 18:46:19 +08:00
morrySnow	07de535c4c	[fix](Nereids) should not fold constant when do ordinal group by (#29976 )	2024-01-16 18:46:19 +08:00
yangshijie	66513d57f9	[feature](function) support ip function named ipv6_cidr_to_range(addr, cidr) (#29812 )	2024-01-16 18:42:09 +08:00
amory	d5dcdf3e07	[Improve](array) support array_enumerate_uniq and array_suffle for nereids (#29936 )	2024-01-16 18:40:32 +08:00
zy-kkk	f6dc6ea13b	[improvement](catalog) Escape characters for columns in recovery predicate pushdown in SQL (#29854 ) In the previous logic, when we restored the Column in the predicate pushdown based on the logical syntax tree for JdbcScanNode, in order to avoid query errors caused by keywords such as `key`, we added escape characters for it, but before we only Binary predicates are processed, which is imperfect. We should add escape characters to all columns that appear in the predicate to avoid errors with keywords or illegal characters.	2024-01-16 18:39:00 +08:00
yujun	8ca807578f	[fix](migrate disk) fix migrate disk lost data during publish version (#29887 ) Co-authored-by: Yongqiang YANG <98214048+dataroaring@users.noreply.github.com>	2024-01-16 18:37:06 +08:00
minghong	a69ce49b07	[fix](Nereids) adjust min/max stats for cast function if types are comparable (#28166 ) estimate column stats for "cast(col, XXXType)" -----cast-est------ query4 41169 40335 40267 40267 query58 463 361 401 361 Total cold run time: 41632 ms Total hot run time: 40628 ms ----master------ query4 40624 40180 40299 40180 query58 487 389 420 389 Total cold run time: 41111 ms Total hot run time: 40569 ms	2024-01-16 18:31:59 +08:00
seawinde	0b16938b7f	[Fix](Nereids) Fix datatype length wrong when string contains chinese (#29885 ) When varchar literal contains chinese, the length of varchar should not be the length of the varchar, it should be the actual length of the using byte. Chinese is represented by unicode, a chinese char occypy 4 byte at mostly. So if meet chinese in varchar literal, we set the length is 4* length. for example as following: > CREATE MATERIALIZED VIEW test_varchar_literal_mv > BUILD IMMEDIATE REFRESH AUTO ON MANUAL > DISTRIBUTED BY RANDOM BUCKETS 2 > PROPERTIES ('replication_num' = '1') > AS > select case when l_orderkey > 1 then "一二三四" else "五六七八" end as field_1 from lineitem; mysql> desc test_varchar_literal_mv; the def of materialized view is as following: +---------+-------------+------+-------+---------+-------+ \| Field \| Type \| Null \| Key \| Default \| Extra \| +---------+-------------+------+-------+---------+-------+ \| field_1 \| VARCHAR(16) \| No \| false \| NULL \| NONE \| +---------+-------------+------+-------+---------+-------+	2024-01-16 18:31:59 +08:00
zhangstar333	115815739c	[bugfix](fe) add check for leg/lead function params (#29617 )	2024-01-16 18:31:59 +08:00
seawinde	d47adbb81f	[Fix](nereids) Fix cte rewrite by mv failure and predicates compensation by mistake (#29820 ) Fix cte rewrite by mv wrongly when query has scalar aggregate but view no For example as following, it should not be rewritten by materialized view successfully // materialzied view define def mv20_1 = """ select l_shipmode, l_shipinstruct, sum(l_extendedprice), count() from lineitem left join orders on lineitem.L_ORDERKEY = orders.O_ORDERKEY group by l_shipmode, l_shipinstruct; """ // query sql def query20_1 = """ select sum(l_extendedprice), count() from lineitem left join orders on lineitem.L_ORDERKEY = orders.O_ORDERKEY """ Fix predicates compensation by mistake For example as following, it can return right result, but it's wrong earlier. // materialzied view define def mv7_1 = """ select l_shipdate, o_orderdate, l_partkey, l_suppkey from lineitem left join orders on lineitem.l_orderkey = orders.o_orderkey where l_shipdate = '2023-12-08' and o_orderdate = '2023-12-08'; """ // query sql def query7_1 = """ select l_shipdate, o_orderdate, l_partkey, l_suppkey from (select * from lineitem where l_shipdate = '2023-10-17' ) t1 left join orders on t1.l_orderkey = orders.o_orderkey; """ and optimize some code usage and add more comment for method	2024-01-16 18:31:27 +08:00
zhangstar333	e417128fb9	[bug](bitmap) should return error status when execute failed (#29841 )	2024-01-16 18:30:23 +08:00
yangshijie	1998735432	[Improvement](function) enable ipv6_num_to_string function to support handling of IPv6 type (#29886 ) Enable ipv6_num_to_string function to handle IPv6 type normally in addition to handling 16 byte string types	2024-01-16 18:30:23 +08:00
xzj7019	ee66f1563e	[fix](Nereids) fix rf push down union (#29847 ) Current union rf push down only support rf from parent join, but not support ancestor join. The pr fixes this problem on project/distribute node's rf pushing down checking.	2024-01-16 18:30:22 +08:00
Mingyu Chen	ebfbe0c8dd	[opt](information_schema) support information_schema in external catalog (#28919 ) Add `information_schema` database for all catalog. This is useful when using BI tools to connect to Doris, the tools can get meta info from `information_schema`. This PR mainly changes: 1. There will be a `information_schema` db in each catalog. 2. Each `information_schema` db only store the meta info of the catalog it belongs to. 3. For `information_schema`, the `TABLE_SCHEMA` column's value is the database name. 4. There is a new global variable `show_full_dbname_in_info_schema_db`, default is false, if set to true, The `TABLE_SCHEMA` column's value is the like `ctl.db`, because: When connect to Doris, the `database` info in connection url will be: `xxx?db=ctl.db`. And then some BI will try to query `information_schema` with sql like: `select * from information_schema.columns where TABLE_SCHEMA = "ctl.db"` So it has to be format as `ctl.db` eg, the `information_schema.columns` table in external catalog `doris` is like: ``` mysql> select * from information_schema.columns limit 1\G ************************* 1. row ************************* TABLE_CATALOG: doris TABLE_SCHEMA: doris.__internal_schema TABLE_NAME: column_statistics COLUMN_NAME: id ORDINAL_POSITION: 1 COLUMN_DEFAULT: NULL IS_NULLABLE: NO DATA_TYPE: varchar CHARACTER_MAXIMUM_LENGTH: 4096 CHARACTER_OCTET_LENGTH: 16384 NUMERIC_PRECISION: NULL NUMERIC_SCALE: NULL DATETIME_PRECISION: NULL CHARACTER_SET_NAME: NULL COLLATION_NAME: NULL COLUMN_TYPE: varchar(4096) COLUMN_KEY: EXTRA: PRIVILEGES: COLUMN_COMMENT: COLUMN_SIZE: 4096 DECIMAL_DIGITS: NULL GENERATION_EXPRESSION: NULL SRS_ID: NULL ``` 6. Modify the behavior of - show tables - shwo databases - show columns - show table status The above statements may query the `information_schema` db if there is `where` predicate after them	2024-01-12 13:58:19 +08:00
minghong	f67a00ffbb	[opt](nereids) prune runtime redundant filters (#29828 ) 1. expand_runtime_filter_by_inner_join will create some redundant rfs，e.g., tpch q5 and q9, we need to remove one 2. hive: prune rf if target only used as probe	2024-01-12 13:58:19 +08:00
zclllyybb	4d97f8ea75	[enhance](function) support two special format for str_to_date (#29823 )	2024-01-12 12:00:32 +08:00
谢健	885d8b28ba	[fix](Nerids): fix function deps when check unique and not null #29797	2024-01-12 11:59:52 +08:00
abmdocrt	c9a949130b	[Case](wal) Add wal group commit sink case with low disk space fault injection (#29731 )	2024-01-12 11:59:52 +08:00
morrySnow	e93a16ac6e	[fix](Nereids) support complex literal cast in fe (#29599 )	2024-01-12 11:59:52 +08:00
minghong	17a2b89945	[runtimeFilter](nereids) expand runtime filter by join condition by default (#29633 ) 1. expand rf by join condition 2. fix ignore_shape_nodes bug	2024-01-12 11:59:27 +08:00
starocean999	e17809a684	[fix](nereids)logicalhaving is in wrong place after logicalagg and logicalwindow (#29463 )	2024-01-12 11:48:39 +08:00
starocean999	2c44951543	[fix](planner)only allow null safe equal when both children are nullable (#29470 )	2024-01-12 11:46:29 +08:00
Pxl	7738eca6da	[Bug](stream-load) fix stream load failed on table with rollup (#29665 ) fix stream load failed on table with rollup	2024-01-12 11:46:29 +08:00
morrySnow	9cbb55d49b	[fix](Nereids) create double literal when create decimal literal failed (#28959 ) FIX 1. remove float and double literal toString and getStringValue introduced by PR #23504 and PR #23271 These functions lead to wrong cast result of double and float literal 2. fix compute signature for datetimev2 always produce scale 6 3. fix stats calculator failed when generate node stats with two same column 4. constant fold on fe failed when cast double to integral TODO after fix the first problem, some mv matching not work well, fix them later - test_dup_mv_div - test_dup_mv_json - test_tcu	2024-01-12 11:46:29 +08:00
seawinde	fda001b6d3	[Improvement](nereids) Support join derivation when mv rewrite (#29609 ) materialized view def is as following: > select l_linenumber, o_custkey > from orders > left join lineitem on lineitem.L_ORDERKEY = orders.O_ORDERKEY > where o_custkey = 1; when query is as following, it can be rewritten by mv above it requires that query has reject null filters on the join right input, current supported filter are "=", "<", "<=", ">", ">=", "<=>" > select IFNULL(orders.O_CUSTKEY, 0) as custkey_not_null, > case when l_linenumber in (1,2,3) then l_linenumber else o_custkey end as case_when > from orders > inner join lineitem on orders.O_ORDERKEY = lineitem.L_ORDERKEY > where o_custkey = 1 and l_linenumber > 0;	2024-01-12 11:44:21 +08:00
谢健	34fe5ee38b	[feat](Nereids) support show constraint command (#29667 ) show constraints from t1; +------+-------------+-----------------------------------------+ \| Name \| Type \| Definition \| +------+-------------+-----------------------------------------+ \| fk \| FOREIGN KEY \| FOREIGN KEY (id) REFERENCES cir.t1 (id) \| \| uk \| UNIQUE \| UNIQUE (id) \| \| pk \| PRIMARY KEY \| PRIMARY KEY (id) \| +------+-------------+-----------------------------------------+	2024-01-12 11:44:21 +08:00
yangshijie	be56bf06cf	[feature](function) support ip function named is_ip_address_in_range(addr, cidr) (#29681 )	2024-01-12 11:44:21 +08:00
jakevin	028e59efab	[refactor](Nereids): unify all `replaceNamedExpressions` (#28228 ) Use a unified function `replaceNamedExpressions ` instead of implementing it yourself repeatedly.	2024-01-12 11:44:21 +08:00
seawinde	d50c8b6d3a	[Improvement](nereids) Query rewrite by mv support bitmap_union and bitmap_union_count roll up (#29418 ) Query rewrite by mv support bitmap_union and bitmap_union_count roll up, aggregate functions which supports roll up is listed as following: \| 查询中函数 \| 物化视图中函数 \| 函数上卷后 \| \|------------------\|--------------\|--------------------\| \| max \| max \| max \| \| min \| min \| min \| \| sum \| sum \| sum \| \| count \| count \| sum \| \| count(distinct ) \| bitmap_union \| bitmap_union_count \| \| bitmap_union \| bitmap_union \| bitmap_union\| \| bitmap_union_count \| bitmap_union \| bitmap_union_count \| this depends on https://github.com/apache/doris/pull/29256	2024-01-12 11:44:21 +08:00
airborne12	87023d3b7a	[Fix](inverted index) fix memory leak in inverted index when encountering fault (#29676 )	2024-01-12 11:44:21 +08:00

1 2 3 4 5 ...

2414 Commits