doris

Author	SHA1	Message	Date
slothever	c794ea18c8	[fix](multi-catalog)put java udf to custom lib (#35984 ) bp #34990	2024-06-06 22:54:24 +08:00
zhangdong	9efc7b63ec	[fix](mtmv)Mtmv support row column (#35860 ) (#35956 ) pick from master: #35860	2024-06-06 22:53:08 +08:00
amory	5966354165	[FIX](cases)fix cases for test_ip_in_inverted_index (#35971 ) bp #35881	2024-06-06 21:52:53 +08:00
amory	b5a35b9cef	[FIX] Pick array inverted index bugfix (#35837 ) here with some array with inverted index bugfix: see also: https://github.com/apache/doris/pull/34766 https://github.com/apache/doris/pull/35086 https://github.com/apache/doris/pull/34683 https://github.com/apache/doris/pull/34076	2024-06-06 09:54:14 +08:00
feiniaofeiafei	4b5163c905	[Feat](nereids) add transform rule MergePercentileToArray (#35809 ) cherry-pick #34313 to branch-2.1 MergePercentileToArray is to perform a transformation in this case: select ss_item_sk, percentile(ss_quantity,0.9), percentile(ss_quantity,0.6), percentile(ss_quantity,0.3) from store_sales group by ss_item_sk; ==> select ss_item_sk, percentile_array(ss_quantity,[0.3,0.6,0.9]) from store_sales group by ss_item_sk;	2024-06-04 17:50:36 +08:00
starocean999	c23ab25474	[fix](nereids)keep equal predicate as join conjunct even if it can be fold to null literal (#35842 ) pick from master https://github.com/apache/doris/pull/35811	2024-06-04 14:46:58 +08:00
amory	fe1a4c4136	[Feature](IP) support ipv4/ipv6 with inverted index and conjuncts for query (#35734 ) Supports the ipv4/ipv6 data types with an inverted index, so that conjunct expressions on IP columns using >, <, >=, <=, and in/not in can be sped up by the inverted index.	2024-06-03 23:24:03 +08:00
airborne12	f67bd4b03b	[Fix](inverted index) fix fast execute condition for vexpr (#35673 )	2024-06-01 11:24:54 +08:00
wuwenchi	cb96a79d07	[bugfix](iceberg)fix datetime conversion error and data path error (#35708 ) Issue #31442 1. The seventh parameter of `ZonedDateTime.of` is in nanoseconds, so the microsecond value must be multiplied by 1000. 2. When writing to a non-partitioned iceberg table, the data path has an extra slash.	2024-06-01 00:42:48 +08:00
daidai	bc062a2595	[fix](orc)fix orc reader missing column. (#35735 ) bp #35583	2024-05-31 22:51:44 +08:00
lw112	48d4601ee3	[regression-test](load) add something like $.tag.[a.b] key's json case (#35134 )	2024-05-31 22:45:09 +08:00
zhangdong	4414edd66d	[enhance](mtmv)Mv refresh on commit (#35702 ) pick from master #34548 The modification involving CloudGlobalTransactionMgr was not picked up to 2.1 because the 2.1 branch does not yet have the Thunderbolt CloudGlobalTransactionMgr	2024-05-31 13:57:57 +08:00
谢健	885df89c5e	[Nereids](Nereids): fix shape change in nereids regression test (#35488 ) This PR fixes some failing regression tests that check plan shape.	2024-05-31 10:53:12 +08:00
Kaijie Chen	c2fc485327	[fix](auto-partition) fix auto partition load lost data in multi sender (#35287 ) (#35630 ) Change the `use_cnt` mechanism for incremental (auto partition) channels and streams; it is now dynamically counted. Use `close_wait()` of regular partitions as a synchronization point to make sure all sinks are in the close phase before closing any incremental (auto partition) channels and streams. Add a dummy (fake) partition and tablet if there is no regular partition in the auto partition table. Backport #35287 Co-authored-by: zhaochangle <zhaochangle@selectdb.com>	2024-05-31 10:27:03 +08:00
zhangdong	373d9ab988	[enhance](mtmv)add truncate table case (#35599 ) After truncate table or truncate partition, mtmv should be able to detect the data change.	2024-05-30 19:59:37 +08:00
Jerry Hu	fb9363f042	[fix](set) incorrect result of set operator (#35607 ) If there are duplicated expressions in the select list, the result will be incorrect. Issue Number: close #28438	2024-05-30 19:59:37 +08:00
lihangyu	3cd7b88868	[Fix](Variant) fix variant with empty key (#35671 ) In some scenarios an empty key causes a crash like ``` * tablet * SIGSEGV unknown detail explain (@0x0) received by PID 1527747 ( TID 1544788 OR 0x7f3302988700) from PID 0; stack trace: *** 0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t* , void) at /mnt/disk2/lihangyu/doris/be/src/common/signal_handler.h:429 1# 0x00007F4880A12B50 in /lib64/libc.so.6 2# doris::vectorized::PathInDataBuilder::append(std::basic_string_view<char, std::char_traits<char> >, bool) at /mnt/disk2/lihangyu/doris/be/src/vec/json/path_in_data.cpp:193 3# doris::vectorized::JSONDataParser<doris::vectorized::SimdJSONParser, false>::traverseObject(doris::vectorized::SimdJSONParser::Object const&, doris::vectorized::JSONDataParser<doris::vectorized::SimdJSONParser, false>::ParseContext&) at /mnt/disk2/lihangyu/doris/be/src/vec/json/json_parser.cpp:121 4# doris::vectorized::JSONDataParser<doris::vectorized::SimdJSONParser, false>::traverse(doris::vectorized::SimdJSONParser::Element const&, doris::vectorized::JSONDataParser<doris::vectorized::SimdJSONParser, false>::ParseContext&) at /mnt/disk2/lihangyu/doris/be/src/vec/json/json_parser.cpp:95 5# doris::vectorized::JSONDataParser<doris::vectorized::SimdJSONParser, false>::parse(char const, unsigned long) at /mnt/disk2/lihangyu/doris/be/src/vec/json/json_parser.cpp:81 ```	2024-05-30 19:55:25 +08:00
lw112	4d16856536	[fix](short circurt) fix return default value issue (#34186 )	2024-05-29 20:31:07 +08:00
zzzxl	bef931de9d	[fix](inverted index) add in list to fast execute logic after hit index (#35344 ) resolve the issue where the case fails when enable_common_expr_pushdown is set to false	2024-05-29 20:30:43 +08:00
seawinde	eefea4c7e6	[fix](mtmv) Fix partition mv rewrite result wrong (#35236 ) This was introduced by https://github.com/apache/doris/pull/33800 If the mv is a partitioned materialized view, the data is wrong when the hit materialized view is used after partitions in the related base partitioned table have been deleted, created, and so on. This fixes the problem. With SET enable_materialized_view_union_rewrite=true; the materialized view is used and the data is kept correct. With SET enable_materialized_view_union_rewrite=false; the base table is queried directly to make sure the data is right.	2024-05-29 20:30:23 +08:00
Qi Chen	b91d2caab8	[Feature](iceberg-writer) Implements iceberg sink basic functionality for inserting into table. (#35587 ) backport #34929	2024-05-29 16:40:54 +08:00
camby	746c6207fc	[fix](index) bitmap and bloomfilter index should not do light index change (#35225 )	2024-05-29 10:09:31 +08:00
TengJianPing	b06794d619	[opt](spill) add session variable of 'enable_force_spill' (#34664 ) (#35561 ) pick #34664	2024-05-29 09:57:31 +08:00
minghong	50e81d9db7	[feat](nereids) add more rules to eliminate empty relation (#34997 ) -branch-2.1 (#35534 ) Eliminate empty relations for the following patterns: topn->empty, sort->empty, distribute->empty, project->empty (cherry picked from commit 8340f23946c0c8e40510ce937acd3342cb2e28b7)	2024-05-28 18:12:42 +08:00
Qi Chen	84e9a14063	[Fix](hive-writer) Fix partition column orders issue when the partition fields inserted into the target table are inconsistent with the field order of the query source table and the schema field order of the query source table. (#35543 ) backport #35347	2024-05-28 18:11:55 +08:00
morrySnow	b78dae040a	Revert "[fix](nereids) push filter through window, using slot equal-set (#35361 )" (#35541 ) This reverts commit d2df392994e8dc00dfb5f8e49cca83fca97cb565. This PR should not be picked to branch-2.1, because the infra it relies on is not in branch-2.1.	2024-05-28 17:54:13 +08:00
Pxl	87c90094a7	[Bug](materialized-view) fix unmatch mv coz table name (#35444 )	2024-05-28 13:17:33 +08:00
minghong	d2df392994	[fix](nereids) push filter through window, using slot equal-set (#35361 ) example: filter (y=1) +-- window( ... partition by x) +-- project( A as x, A as y) filter(y=1) is equivalent to filter(x=1), because x and y are in the same equal-set in window#logicalProperties. And hence we could push filter(y=1) through window operator	2024-05-28 13:16:53 +08:00
minghong	dfcabf8d47	[fix](nereids) set mark join reference for bitmap-in-apply (#35435 ) Bitmap filter was implemented before mark-join. When mark-join support was added, the bitmap-filter branch was not updated. When converting a bitmap-apply-in to a join, markJoinReference should be set on the join if there are markJoinReferences.	2024-05-28 13:13:41 +08:00
feiniaofeiafei	ac49576229	[Fix](nereids) fix merge aggregate setting top projection bug (#35348 ) introduced by #31811 sql like this: select col1, col2 from (select a as col1, a as col2 from mal_test1 group by a) t group by col1, col2 ; Transformation Description: In the process of optimizing the query, an agg-project-agg pattern is transformed into a project-agg pattern: Before Transformation: LogicalAggregate +-- LogicalProject +-- LogicalAggregate After Transformation: LogicalProject +-- LogicalAggregate Before the transformation, the projection in the LogicalProject was a AS col1, a AS col2, and the outer aggregate group by keys were col1, col2. After the transformation, the aggregate group by keys became a, a, and the projection remained a AS col1, a AS col2. Problem: When building the project projections, the group by key a, a needed to be transformed to a AS col1, a AS col2. The old code had a bug where it used the slot as the map key and the alias in the projections as the map value. This approach did not account for the situation where several aliases refer to the same slot. Solution: The new code fixes this issue by using the original outer aggregate group by expression's exprId. It searches within the original project projections to find the NamedExpression that has the same exprId. These expressions are then placed into the new projections. This method ensures that the correct aliases are maintained, resolving the bug.	2024-05-28 13:13:31 +08:00
Lightman	7c808fcecf	[bugfix] Fix the case is unstable because Table[tbl_scalar_types_dup]'s state(ROLLUP) is not NORMAL (#35460 )	2024-05-28 13:12:27 +08:00
TengJianPing	d8eefd0be8	[fix] fix wrong result of spill agg with limit (#35403 )	2024-05-28 13:12:03 +08:00
airborne12	8ff95a00f3	[Fix](test) fix test case output for inverted_index_p0.test_tokenize (#35464 )	2024-05-27 19:19:24 +08:00
Lightman	d71e9d34fe	[Bugfix] Fix mv column type is not changed when do schema change (#34598 )	2024-05-27 15:28:12 +08:00
Qi Chen	68eda58a8c	[Fix](multi-catalog) Fix string dict filtering when use null related function in parquet and orc reader. (#35335 ) For SQL like the following, where the dictionary column is wrapped in a null-related function, the results will be incorrect. ``` select * from ( select IF(o_orderpriority IS NULL, 'null', o_orderpriority) AS o_orderpriority from test_string_dict_filter_orc ) as A where o_orderpriority = 'null'; ``` ``` select * from ( select IFNULL(o_orderpriority, 'null') AS o_orderpriority from test_string_dict_filter_parquet ) as A where o_orderpriority = 'null' ``` ``` select * from ( select COALESCE(o_orderpriority, 'null') AS o_orderpriority from test_string_dict_filter_parquet ) as A where o_orderpriority = 'null'; ```	2024-05-27 15:25:29 +08:00
wuwenchi	f98ed4e4c5	[bugfix](hive)Misspelling of class names (#34981 )	2024-05-27 15:24:38 +08:00
wuwenchi	b1795d44ec	[bugfix](hive)fix testcase for test_hive_write_different_path (#35209 ) Hive's test environment uses docker, so when using 127.0.0.1, BE will write the file to the docker of its own machine. But if FE and BE are not on the same machine, FE cannot read this file because it can only read docker on its own machine. Therefore, the address 127.0.0.1 cannot be used in the test environment.	2024-05-27 15:24:30 +08:00
airborne12	2422439e45	[Update](regression) add case for inverted index (#35305 ) Co-authored-by: Kang <kxiao.tiger@gmail.com>	2024-05-27 15:24:09 +08:00
谢健	af986c370b	[feat](Nereids): Put the Child with Least Row Count in the First Position of Intersect (#34290 ) (#35339 ) In this pull request, we optimize the ordering of children in the Intersect operator to improve query performance. The proposed change is to place the child with the least row count in the first position of the Intersect operator. The rationale behind this optimization is that the Intersect operator works by first evaluating the leftmost child and then iterating through the results of the other children to find matching rows. By placing the child with the least row count first, we can minimize the number of iterations required to find the matching rows, thereby reducing the overall execution time of the query.	2024-05-27 11:52:35 +08:00
seawinde	62998719df	[opt](mtmv) Add threshold for relation mapping num when query rewrite (#34694 ) (#35378 ) If the query and mv def are as follows: def mv1_1 = """ select t1.L_LINENUMBER,t2.l_extendedprice, t2.L_ORDERKEY from lineitem t1 inner join lineitem t2 on t1.L_ORDERKEY = t2.L_ORDERKEY; """ def query1_1 = """ select t1.L_LINENUMBER, t2.L_ORDERKEY from lineitem t1 inner join lineitem t2 on t1.L_ORDERKEY = t2.L_ORDERKEY; """ the relation mappings are generated as a Cartesian product. If there are too many self joins, this causes a performance problem, so we add the `materialized_view_relation_mapping_max_count` session variable, default 8. If the actual number is greater than this value, the excess relation mappings are discarded.	2024-05-24 20:36:29 +08:00
TengJianPing	639c7ee7fb	[fix](decimalv2) fix scale of decimalv2 to string (#35222 ) (#35359 )	2024-05-24 17:20:43 +08:00
feiniaofeiafei	1e07971a98	[Feat](nereids)when dealing insert into stmt with empty table source, fe returns directly (#35333 ) * [Feat](nereids) when dealing insert into stmt with empty table source, fe returns directly (#34418) When a LogicalOlapScan has no partitions, transform it to a LogicalEmptyRelation. When dealing insert into stmt with empty table source, fe returns directly. * [Fix](nereids) fix when insert into select empty table --------- Co-authored-by: feiniaofeiafei <moailing@selectdb.com>	2024-05-24 16:25:00 +08:00
Tiewei Fang	f6beeb1ddd	[Enhencement](tvf) select tvf supports using resource (#35139 ) Create an S3/HDFS resource that a TVF can use directly to access the data source.	2024-05-24 16:23:58 +08:00
seawinde	d6e8fb7d77	[feature](mtmv) Support agg state roll up and optimize the roll up code (#35026 ) agg_state is the intermediate aggregation state; for details see the state combinator: https://doris.apache.org/zh-CN/docs/dev/sql-manual/sql-functions/combinators/state This supports rolling up agg functions as follows: +---------------------+---------------------------------------------+---------------------+ \| query \| materialized view \| roll up \| \| ------------------- \| ------------------------------------------- \| ------------------- \| \| agg_function() \| agg_function_union() or agg_function_state() \| agg_function_merge() \| \| agg_function_union() \| agg_function_union() or agg_function_state() \| agg_function_union() \| \| agg_function_merge() \| agg_function_union() or agg_function_state() \| agg_function_merge() \| +---------------------+---------------------------------------------+---------------------+ For example, the following query can be rewritten by the mv successfully. MV definition is ``` select o_orderstatus, l_partkey, l_suppkey, sum_union(sum_state(o_shippriority)), group_concat_union(group_concat_state(l_shipinstruct)), avg_union(avg_state(l_linenumber)), max_by_union(max_by_state(l_shipmode, l_suppkey)), count_union(count_state(l_orderkey)), multi_distinct_count_union(multi_distinct_count_state(l_shipmode)) from lineitem left join orders on lineitem.l_orderkey = o_orderkey and l_shipdate = o_orderdate group by o_orderstatus, l_partkey, l_suppkey; ``` Query is ``` select o_orderstatus, l_suppkey, sum(o_shippriority), group_concat(l_shipinstruct), avg(l_linenumber), max_by(l_shipmode,l_suppkey), count(l_orderkey), multi_distinct_count(l_shipmode) from lineitem left join orders on l_orderkey = o_orderkey and l_shipdate = o_orderdate group by o_orderstatus, l_suppkey; ```	2024-05-24 16:23:58 +08:00
Xujian Duan	dd567fa774	[fix](function) support return JsonType for If function (#35199 ) add a FunctionSignature for If to support return Type is JsonType.	2024-05-24 16:23:58 +08:00
morrySnow	98b2bda660	[opt](Nereids) remove restrict for count(*) in window (#35220 ) support count(*) used for window function CREATE TABLE `t1` ( `id` INT NULL, `dt` TEXT NULL ) DISTRIBUTED BY HASH(`id`) BUCKETS 10 PROPERTIES ( "replication_allocation" = "tag.location.default: 1" ); select *, count(*) over() from t1;	2024-05-24 16:23:58 +08:00
yiguolei	b3f6668464	fix case: test_create_table_without_distribution	2024-05-23 19:03:30 +08:00
seawinde	4075408b84	[feature](mtmv)Support single table mv rewrite (#34185 ) (#35242 ) Support single table query rewrite without group by. This is useful for complex filters or expressions. The following mv def and query can be rewritten. mv def: ``` select * from lineitem where l_comment like '%xx%' ``` query: ``` select l_linenumber, l_receiptdate from lineitem where l_comment like '%xx%' ``` Co-authored-by: zfr9527 <qhu15zhang3294197@163.com>	2024-05-23 19:00:36 +08:00
Mingyu Chen	adc364a6fd	[feature](Paimon) support deletion vector for Paimon naive reader (#34743 ) (#35241 ) bp #34743 Co-authored-by: 苏小刚 <suxiaogang223@icloud.com>	2024-05-23 00:01:30 +08:00
zy-kkk	24990383ff	[refactor](jdbc catalog) split clickhouse jdbc executor (#34794 ) (#35174 ) pick master #34794	2024-05-22 19:09:05 +08:00
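
The Intersect reordering described in af986c370b (#34290) can be sketched roughly as below. This is only an illustration of the idea stated in that commit message, not the Nereids rule itself: children are plain Python lists standing in for operator outputs, `len()` stands in for the optimizer's row-count estimate, and the names `reorder_children` and `intersect` are hypothetical.

```python
from typing import Hashable, List, Set


def reorder_children(children: List[List[Hashable]]) -> List[List[Hashable]]:
    """Move the child with the fewest rows to position 0; keep the others in order.

    Assumption: len(child) stands in for the estimated row count.
    """
    if not children:
        return []
    smallest = min(range(len(children)), key=lambda i: len(children[i]))
    rest = [c for i, c in enumerate(children) if i != smallest]
    return [children[smallest]] + rest


def intersect(children: List[List[Hashable]]) -> Set[Hashable]:
    """Evaluate the first child, then probe it with rows from the other children."""
    ordered = reorder_children(children)
    if not ordered:
        return set()
    # The candidate set is built from the smallest child, so it can never
    # hold more rows than that child produced.
    candidates = set(ordered[0])
    for child in ordered[1:]:
        # Keep only the rows of this child that are still candidates.
        candidates = {row for row in child if row in candidates}
    return candidates


if __name__ == "__main__":
    inputs = [[1, 2, 3, 4], [2, 3], [3, 2, 9]]
    print(reorder_children(inputs))
    print(intersect(inputs))
```

Because the candidate set starts from the smallest child, the set kept in memory stays small regardless of how large the remaining children are.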


2965 Commits