doris

Author	SHA1	Message	Date
Kang	ffadaa4935	[improvement](inverted index) skip write index on load and generate index on compaction (#20325 )	2023-06-03 16:03:21 +08:00
YueW	b62c5a70c7	[fix](match query) fix array column match query failed without inverted index (#20344 )	2023-06-02 21:10:12 +08:00
YueW	adc3acb283	[fix](match) fix match query with compound predicates return -6003 (#20361 )	2023-06-02 18:25:37 +08:00
zy-kkk	a20a6d2bea	[refactor](jdbc catalog) Refactor the JdbcClient code (#20109 ) This PR does the following: 1. It substantially refactors the JDBC client architecture. The previous monolithic JDBC client has been split into an abstract base class `JdbcClient` and a set of database-specific subclasses (e.g., `JdbcMySQLClient`, `JdbcOracleClient`, etc.), and the configuration that JdbcClient requires has been abstracted into an object. This allows for improved modularity, easier addition of support for new databases, and cleaner, more maintainable code. This change is backward-compatible and does not affect existing functionality. 2. As a result of the client refactoring, OceanBaseClient can automatically recognize whether it is operating in MySQL or Oracle mode, so the oceanbase_mode property is removed from the Jdbc Catalog. Because the property is removed, when creating a single OceanBase Jdbc Table the table type needs to be filled in as oceanbase (mysql mode) or oceanbase_oracle (oracle mode). Please note that this changes usage behavior. 3. For the PostgreSQL Jdbc Catalog, two things were done: 1. Added adaptation to MATERIALIZED VIEW and FOREIGN TABLE 2. Fixed reading jsonb, which had been incorrectly changed to json in a previous PR 4. Fix some jdbc catalog test cases 5. Modify the oceanbase jdbc doc. Thanks to @wolfboys for the guidance	2023-06-02 17:58:10 +08:00
amory	d68f3f3b3d	[Feature](array-functions)improve array functions for array_last_index (#20294 ) Currently only array_first_index is supported for lambda input, but not array_last_index	2023-06-02 13:54:03 +08:00
Jerry Hu	8ff8705b3f	[fix](olap) deletion statement with space conditions did not take effect (#20349 ) Deletion statement like this: delete from tb where k1 = ' '; The rows whose k1's value is ' ' will not be deleted.	2023-06-02 13:52:57 +08:00
starocean999	a8a4da9b9e	[fix](nereids)dphyper join reorder may cache wrong project list for project node (#20209 )	2023-06-02 09:35:28 +08:00
xueweizhang	ecdc5124be	[feature-wip](duplicate-no-keys) schema change support for duplicate no keys (#19326 )	2023-06-02 09:22:41 +08:00
HappenLee	608d2a3eca	[Bug](exec) push down no group by agg min cause error result (#20289 ) sql """ CREATE TABLE t1_int ( num int(11) NULL, dgs_jkrq bigint(20) NULL ) ENGINE=OLAP DUPLICATE KEY(num) COMMENT 'OLAP' DISTRIBUTED BY HASH(num) BUCKETS 1 PROPERTIES ( "replication_allocation" = "tag.location.default: 1", "storage_format" = "V2", "light_schema_change" = "true", "disable_auto_compaction" = "false", "enable_single_replica_compaction" = "false" ); """ sql """insert into t1_int values(1,1),(1,2),(1,3),(1,4),(1,null);""" qt_sql """ select min(dgs_jkrq) from t1_int; """ Before this change the query returned the wrong result: 4. After this change it returns the correct result: 1.	2023-06-01 17:29:46 +08:00
Gabriel	a8b273ae31	[P2](test) Fix P2 output (#20311 )	2023-06-01 15:11:12 +08:00
Mryange	519f01133a	[feature](decimal)support cast rounding half up and div precision increment in decimalv3. (#19811 )	2023-06-01 13:09:58 +08:00
Jibing-Li	1b968c4ade	[fix](multi catalog)Fix nereids planner text format include extra column index bug (#20260 ) The Nereids planner includes all column indexes in TFileScanRangeParams, which may make the column projection incorrect for text format tables, because the csv reader uses the column index position to split a line. Extra column indexes cause wrong split results. This PR resets the column index after Projection and removes the useless column indexes.	2023-06-01 12:17:47 +08:00
mch_ucchi	cc41cb0e7e	[Fix](Nereids) fix some insert into select bugs (#20052 ) fix 3 bugs: 1. failed to insert into a table with mv. ```sql create table t ( id int, c1 int, c2 int, c3 int ) duplicate key(id) distributed by hash(id) buckets 4 create materialized view k12s3m as select id, sum(c1), max(c3) from t group by id; insert into t select -4, -4, -4, 'd'; ``` The insert raises an exception because the mv column is not handled. Now we add a target column and value as defineExpr. 2. failed to insert into a table when not all the columns are specified. ```sql insert into t(c1, c2) select c1, c2 from t ``` With t(id ukey, c1, c2, c3), this inserts too much data; we fix it by changing the output partitions. 3. failed to insert into a table with a complex select. The select statement has a join or agg; the fix is similar to the one for the 2nd bug.	2023-06-01 12:15:19 +08:00
starocean999	68e593fbf1	[fix](nereids)(planner) case when should return NullLiteral when all case result is NullLiteral (#20280 )	2023-06-01 11:11:41 +08:00
lihangyu	9e21318834	[refactor](dynamic table) Make segment_writer unaware of dynamic schema, and ensure parsing is exception-safe. (#19594 ) 1. make ColumnObject exception safe 2. introduce FlushContext and construct schema at memtable flush stage to make segment independent from dynamic schema 3. add more test cases	2023-06-01 10:25:04 +08:00
LiBinfeng	65a75abecb	[Fix](Nereids) bitmap type should not be used in comparison predicate (#19807 ) When using nereids, if a comparison operator is applied to a bitmap type, an analyze exception needs to be thrown. For example: select id from (select BITMAP_EMPTY() as c0 from expr_test) as ref0 where c0 = 1 order by id Here c0 in the subquery is a bitmap type; this scenario is not supported right now.	2023-05-31 23:09:36 +08:00
YueW	6adb3fdf11	[fix](match_phrase) Fix the inconsistent query result for 'match_phrase' after creating index without support_phrase property (#20258 ) If an inverted index is created without the support_phrase property, keep the match_phrase condition so that rows are filtered by the match function.	2023-05-31 18:09:50 +08:00
AKIRA	d93ff5d1ab	[fix](pipeline) Enable pipeline explicitly in the plan shape check cases. (#20221 ) enable pipeline explicitly in tpcds plan shape check	2023-05-31 14:40:24 +08:00
starocean999	1f22aa6961	[fix](nereids) like function's nullable property should be PropagateNullable (#20237 )	2023-05-31 12:13:38 +08:00
Jibing-Li	3f91127854	[fix](regression)Update external Brown test case out file. #20232 Update external Brown test case out file to match the new precision.	2023-05-31 09:21:04 +08:00
Gabriel	ff05217a1e	[regression](p0) fix test for `array_enumerate_uniq` (#20231 )	2023-05-30 22:14:19 +08:00
Ashin Gau	b7a69fbf4b	[test](regression) add regression test from materialized slot bug (#20207 ) The test query includes the conversion of string types to other types, and the processing of materialized columns for nested subqueries, which is the regression test for bug fix(#18783)	2023-05-30 21:23:05 +08:00
Chenyang Sun	accaff1026	[Feature](compaction) wip: single replica compaction (#19237 ) Currently, compaction is executed separately for each backend, and the reconstruction of the index during compaction leads to high CPU usage. To address this, we are introducing single replica compaction, where a specific primary replica is selected to perform compaction, and the remaining replicas fetch the compaction results from the primary replica. The Backend (BE) requests replica information for all peers corresponding to a tablet from the Frontend (FE). This information includes the host where the replica is located and the replica_id. By calculating hash(replica_id), the replica with the smallest hash value is responsible for executing compaction, while the remaining replicas are responsible for fetching the compaction results from this replica. The compaction task producer thread, before submitting a compaction task, checks whether the local replica should fetch from its peer. If it should, the task is then submitted to the single replica compaction thread pool. When performing single replica compaction, the process begins by requesting rowset versions from the target replica. These rowset_versions are then compared with the local rowset versions. The first version that can be fetched is selected.	2023-05-30 21:12:48 +08:00
Chengpeng Yan	a855253543	[fix](Nereids) filter should not push through union to OneRowRelation (#20132 ) ## Problem summary When pushing a filter through a union, we should check whether the union's children are `OneRowRelation`. If some of them are, we shouldn't push the filter down to those children. Before this PR ``` mysql> select * from (select 1 as a, 2 as b union all select 3, 3) t where a = 1; +------+------+ \| a \| b \| +------+------+ \| 1 \| 2 \| \| 3 \| 3 \| +------+------+ 2 rows in set (0.01 sec) ``` After this PR ``` mysql> select * from (select 1 as a, 2 as b union all select 3, 3) t where a = 1; +------+------+ \| a \| b \| +------+------+ \| 1 \| 2 \| +------+------+ 1 row in set (0.38 sec) ```	2023-05-30 17:06:52 +08:00
Mingyu Chen	0c98355fff	[fix](catalog) fix create catalog with resource replay issue and kerberos auth issue (#20137 ) 1. Fix create catalog with resource replay bug. If a user creates a catalog using `create catalog hive with resource xxx`, there is a bug when replaying the edit log: the resource may be dropped, causing an NPE, and FE will fail to start. This PR adds a new FE config `disallow_create_catalog_with_resource`, default is true, so that `with resource` will not be allowed, and it will be deprecated later. It also fixes the replay bug to avoid the NPE. 2. Fix issue when creating 2 hive catalogs to connect with and without kerberos authentication. When a user creates 2 hive catalogs, one using simple auth and the other using kerberos auth, the query may fail with an error like: `Server asks us to fall back to SIMPLE auth, but this client is configured to only allow secure connections.` So this PR adds a default property for hive catalog: `"ipc.client.fallback-to-simple-auth-allowed" = "true"`. This property will be added automatically when a user creates a hive catalog, to avoid such problems. 3. Fix calling `hdfsExists()` issue. When `hdfsExists()` returns a non-zero code, we should check whether it encountered an error or the file was not found. 4. Some code refactor. Avoid importing `org.apache.parquet.Strings`	2023-05-30 16:57:39 +08:00
Mingyu Chen	49ce4e6fda	[fix](test) fix p2 broker load (#20196 )	2023-05-30 16:26:00 +08:00
Gabriel	631494e05d	[regression](decimalv3) Fix output for P1 regression (#20213 )	2023-05-30 15:21:29 +08:00
bobhan1	bb12a1cb49	[Enhance](array function) add support for DecimalV3 for array_enumerate_uniq() (#17724 )	2023-05-30 13:09:19 +08:00
Mryange	94e1072d14	Revert "[fix](DECIMALV3) Fix the error in DECIMALV3 when explicitly casting. (#19926 )" (#20204 ) This reverts commit 8ca4f9306763b5a18ffda27a07ab03cc77351e35.	2023-05-30 10:35:33 +08:00
AKIRA	72cfe5865a	[feat](optimizer) Support CTE reuse (#19934 ) Before this PR, the new optimizer would inline CTEs directly. However, in many scenarios a CTE is referenced many times, such as in TPC-DS tests; in these cases, materializing the CTE's result set and reusing it significantly improves performance. In our tests on TPC-DS related SQL, it improved performance by up to almost 4 times. We introduce the following plan nodes in the optimizer: 1. CTEConsumer: holds a reference to a CTEProducer 2. CTEProducer: the plan defined by the CTE stmt 3. CTEAnchor: the parent node of a CTEProducer; a CTEProducer can only be referenced from the corresponding CTEAnchor's right child. A CTEConsumer is converted to an inlined plan if the corresponding CTE is referenced less than or equal to `inline_cte_referenced_threshold` times (a session variable, 1 by default). For SQL: ```sql EXPLAIN REWRITTEN PLAN WITH cte AS (SELECT col2 FROM t1) SELECT * FROM t1 WHERE (col3 IN (SELECT c1.col2 FROM cte c1)) UNION ALL SELECT * FROM t1 WHERE (col3 IN (SELECT c1.col2 FROM cte c1)); ``` Rewritten plan before this PR: ``` +------------------------------------------------------------------------------------------------------------------------------------------------------+ \| Explain String \| +------------------------------------------------------------------------------------------------------------------------------------------------------+ \| LogicalUnion ( qualifier=ALL, outputs=[col1#14, col2#15, col3#16], hasPushedFilter=false ) \| \| \|--LogicalJoin[559] ( type=LEFT_SEMI_JOIN, markJoinSlotReference=Optional.empty, hashJoinConjuncts=[(col3#6 = col2#8)], otherJoinConjuncts=[] ) \| \| \| \|--LogicalProject[551] ( distinct=false, projects=[col1#4, col2#5, col3#6], excepts=[], canEliminate=true ) \| \| \| \| +--LogicalFilter[549] ( predicates=(__DORIS_DELETE_SIGN__#7 = 0) ) \| \| \| \| +--LogicalOlapScan ( qualified=default_cluster:test.t1, indexName=t1, selectedIndexId=42723, preAgg=ON 
) \| \| \| +--LogicalProject[555] ( distinct=false, projects=[col2#20 AS `col2`#8], excepts=[], canEliminate=true ) \| \| \| +--LogicalFilter[553] ( predicates=(__DORIS_DELETE_SIGN__#22 = 0) ) \| \| \| +--LogicalOlapScan ( qualified=default_cluster:test.t1, indexName=t1, selectedIndexId=42723, preAgg=ON ) \| \| +--LogicalProject[575] ( distinct=false, projects=[col1#9, col2#10, col3#11], excepts=[], canEliminate=false ) \| \| +--LogicalJoin[573] ( type=LEFT_SEMI_JOIN, markJoinSlotReference=Optional.empty, hashJoinConjuncts=[(col3#11 = col2#13)], otherJoinConjuncts=[] ) \| \| \|--LogicalProject[565] ( distinct=false, projects=[col1#9, col2#10, col3#11], excepts=[], canEliminate=true ) \| \| \| +--LogicalFilter[563] ( predicates=(__DORIS_DELETE_SIGN__#12 = 0) ) \| \| \| +--LogicalOlapScan ( qualified=default_cluster:test.t1, indexName=t1, selectedIndexId=42723, preAgg=ON ) \| \| +--LogicalProject[569] ( distinct=false, projects=[col2#24 AS `col2`#13], excepts=[], canEliminate=true ) \| \| +--LogicalFilter[567] ( predicates=(__DORIS_DELETE_SIGN__#26 = 0) ) \| \| +--LogicalOlapScan ( qualified=default_cluster:test.t1, indexName=t1, selectedIndexId=42723, preAgg=ON ) \| +------------------------------------------------------------------------------------------------------------------------------------------------------+ ``` After this PR ``` +------------------------------------------------------------------------------------------------------------------------------------------------------+ \| Explain String \| +------------------------------------------------------------------------------------------------------------------------------------------------------+ \| LogicalUnion ( qualifier=ALL, outputs=[col1#14, col2#15, col3#16], hasPushedFilter=false ) \| \| \|--LOGICAL_CTE_ANCHOR#-1164890733 \| \| \| \|--LOGICAL_CTE_PRODUCER#-1164890733 \| \| \| \| +--LogicalProject[427] ( distinct=false, projects=[col2#1], excepts=[], canEliminate=true ) \| \| \| \| 
+--LogicalFilter[425] ( predicates=(__DORIS_DELETE_SIGN__#3 = 0) ) \| \| \| \| +--LogicalOlapScan ( qualified=default_cluster:test.t1, indexName=t1, selectedIndexId=42723, preAgg=ON ) \| \| \| +--LogicalJoin[373] ( type=LEFT_SEMI_JOIN, markJoinSlotReference=Optional.empty, hashJoinConjuncts=[(col3#6 = col2#8)], otherJoinConjuncts=[] ) \| \| \| \|--LogicalProject[370] ( distinct=false, projects=[col1#4, col2#5, col3#6], excepts=[], canEliminate=true ) \| \| \| \| +--LogicalFilter[368] ( predicates=(__DORIS_DELETE_SIGN__#7 = 0) ) \| \| \| \| +--LogicalOlapScan ( qualified=default_cluster:test.t1, indexName=t1, selectedIndexId=42723, preAgg=ON ) \| \| \| +--LOGICAL_CTE_CONSUMER#-1164890733#1038782805 \| \| +--LogicalProject[384] ( distinct=false, projects=[col1#9, col2#10, col3#11], excepts=[], canEliminate=false ) \| \| +--LogicalJoin[382] ( type=LEFT_SEMI_JOIN, markJoinSlotReference=Optional.empty, hashJoinConjuncts=[(col3#11 = col2#13)], otherJoinConjuncts=[] ) \| \| \|--LogicalProject[379] ( distinct=false, projects=[col1#9, col2#10, col3#11], excepts=[], canEliminate=true ) \| \| \| +--LogicalFilter[377] ( predicates=(__DORIS_DELETE_SIGN__#12 = 0) ) \| \| \| +--LogicalOlapScan ( qualified=default_cluster:test.t1, indexName=t1, selectedIndexId=42723, preAgg=ON ) \| \| +--LOGICAL_CTE_CONSUMER#-1164890733#858618008 \| +------------------------------------------------------------------------------------------------------------------------------------------------------+ ```	2023-05-30 10:18:59 +08:00
Jibing-Li	6f31ee9492	[fix](p0 regression)Update hive docker test case result data (#20176 ) Doris updated array type output format, using double quote for Strings. Before, it was using single quote. So we need to update the case out file using double quote.	2023-05-30 00:17:30 +08:00
airborne12	90b4e127e3	[Feature](inverted index) add parser_mode properties for inverted index parser (#20116 ) We add parser mode for inverted index, usage like this: ``` CREATE TABLE `inverted` ( `FIELD0` text NULL, `FIELD1` text NULL, `FIELD2` text NULL, `FIELD3` text NULL, INDEX idx_name1 (`FIELD0`) USING INVERTED PROPERTIES("parser" = "chinese", "parser_mode" = "fine_grained") COMMENT '', INDEX idx_name2 (`FIELD1`) USING INVERTED PROPERTIES("parser" = "chinese", "parser_mode" = "coarse_grained") COMMENT '' ) ENGINE=OLAP ); ```	2023-05-29 23:21:52 +08:00
Mryange	8ca4f93067	[fix](DECIMALV3) Fix the error in DECIMALV3 when explicitly casting. (#19926 ) before mysql [test]>select cast(1 as DECIMALV3(16, 2)) / cast(3 as DECIMALV3(16, 2)); +-----------------------------------------------------------+ \| CAST(1 AS DECIMALV3(16, 2)) / CAST(3 AS DECIMALV3(16, 2)) \| +-----------------------------------------------------------+ \| 0.00 \| +-----------------------------------------------------------+ mysql [test]>select * from divtest; +------+------+ \| id \| val \| +------+------+ \| 3 \| 5.00 \| \| 2 \| 4.00 \| \| 1 \| 3.00 \| +------+------+ mysql [test]>select cast(1 as decimalv3(16,2)) / val from divtest; +-------------------------------------+ \| CAST(1 AS DECIMALV3(16, 2)) / `val` \| +-------------------------------------+ \| 0 \| \| 0 \| \| 0 \| +-------------------------------------+ after mysql [test]>select cast(1 as DECIMALV3(16, 2)) / cast(3 as DECIMALV3(16, 2)); +-----------------------------------------------------------+ \| CAST(1 AS DECIMALV3(16, 2)) / CAST(3 AS DECIMALV3(16, 2)) \| +-----------------------------------------------------------+ \| 0.33 \| +-----------------------------------------------------------+ mysql [test]>select cast(1 as decimalv3(16,2)) / val from divtest; +-------------------------------------+ \| CAST(1 AS DECIMALV3(16, 2)) / `val` \| +-------------------------------------+ \| 0.250000 \| \| 0.200000 \| \| 0.333333 \| +-------------------------------------+ This is because in the previous code, the constant 1.000 would be transformed into 1. remove "ReduceType	2023-05-29 19:51:12 +08:00
Pxl	5788214416	[Bug](function) fix equals implementation not comparing the order by elements of function call expr (#20083 ) #19296	2023-05-29 19:03:05 +08:00
Gabriel	55ccddb62c	[Conf](decimalv3) enable decimalv3 by default	2023-05-29 15:38:31 +08:00
Mryange	a86134cb39	[fix](executor) Fixed an error with cast as time. #20144 before mysql [(none)]>select cast("10:10:10" as time); +-------------------------------+ \| CAST('10:10:10' AS TIMEV2(0)) \| +-------------------------------+ \| 00:00:00 \| +-------------------------------+ after mysql [(none)]>select cast("10:10:10" as time); +-------------------------------+ \| CAST('10:10:10' AS TIMEV2(0)) \| +-------------------------------+ \| 10:10:10 \| +-------------------------------+ In the past, we supported this syntax. mysql [(none)]>select cast("2023:05:01 13:14:15" as time); +------------------------------------------+ \| CAST('2023:05:01 13:14:15' AS TIMEV2(0)) \| +------------------------------------------+ \| 13:14:15 \| +------------------------------------------+ However, "10:10:10" is also a valid datetime. mysql [(none)]>select cast("10:10:10" as datetime); +-----------------------------------+ \| CAST('10:10:10' AS DATETIMEV2(0)) \| +-----------------------------------+ \| 2010-10-10 00:00:00 \| +-----------------------------------+ So here, the order of parsing has been adjusted.	2023-05-29 12:17:21 +08:00
zhengshiJ	970efdc1cb	[Feature](Nereids) support advanced materialized view (#19650 ) Extend the functionality of advanced materialized views. This feature is already supported by the legacy planner with PR #19650; this PR implements it in Nereids. This PR implements the following features: 1. Support multiple columns in aggregate functions, e.g. select sum(c1 + c2) from t1; 2. Support complex expressions, e.g. select abs(c1), sum(abc(c1+1) + 1) from t1; TODO: 1. Support adding where in materialized views	2023-05-29 10:37:44 +08:00
Kang	859b03dfdf	[Improvement](topn) prevent memory usage of key topn from growing without limit (#19978 )	2023-05-29 10:16:15 +08:00
YueW	ae352997b4	[Enhancement](alter inverted index) Improve alter inverted index performance with light weight add or drop inverted index (#19063 )	2023-05-28 11:23:07 +08:00
starocean999	4cbb6ece10	[fix](fe)ordering exprs should be substituted in the same way as select part (#20091 )	2023-05-27 21:00:57 +08:00
Yanko	f54a068d82	[feature](function) add json->operator convert to json_extract (#19899 )	2023-05-27 12:45:45 +08:00
lihangyu	f3d8af330a	[Bug](point query) check point query before check two phase read (#20055 ) * [Bug](point query) checkAndSetPointQuery before checkEnableTwoPhaseRead 1. checkEnableTwoPhaseRead relies on the short circuit flag 2. add more metrics to display lookup profile * fix rebase	2023-05-27 12:38:58 +08:00
HappenLee	9539bbf8ae	Revert "[test](executor)add crud regression test for resource group (#19659 )" (#20121 ) This reverts commit 8b9813663d87afa7b359b31782f3864dc54881df.	2023-05-27 08:25:00 +08:00
lihangyu	23c95d15da	[regression-test](sort) Fix unstable sorting (#20125 )	2023-05-26 23:42:05 +08:00
Qi Chen	860e28a3a3	[Fix](multi-catalog) Fix db name is not lower case when jdbc catalog configuration `lower_case_table_names` is `true`. (#20021 ) Fix regression-test test_oracle_jdbc_catalog.	2023-05-26 21:35:38 +08:00
amory	ce45d6119d	[FIX](regress-test) fix struct_export out data (#20111 )	2023-05-26 19:57:51 +08:00
lihangyu	317338913c	[Bug](topn) Fix topn fetch set real default value (#20074 ) 1. Before this PR, if a rowset does not contain a column that should be read for the related SlotDescriptor, `insert_default` is called on the column, but this is not the real default value. The information about the real default value should be provided by the frontend. 2. Support fetch when light schema change is not enabled, but disable it for the AGG or UNIQUE MOR model	2023-05-26 16:06:55 +08:00
TengJianPing	488c9ba7c2	[improvement](exchange) test: data stream sender stop sending data to receiver if it returns eos early (#20081 )	2023-05-26 16:05:38 +08:00
Pxl	43aa062fb1	[Chore](hash-join) remove useless conditions and add some case (#20050 )	2023-05-26 14:45:24 +08:00
TengJianPing	315b30c23d	[testcase](union) add test case for union of decimal (#20080 )	2023-05-26 14:12:14 +08:00

1240 Commits