doris

Author	SHA1	Message	Date
YueW	77855fcd43	[fix](inverted index) fix transaction id changed when light index change (#20302 )	2023-06-03 16:05:02 +08:00
Kang	ffadaa4935	[improvement](inverted index) skip write index on load and generate index on compaction (#20325 )	2023-06-03 16:03:21 +08:00
caiconghui	6958a8f92f	[fix](dynamic_partition) fix dead lock when modify dynamic partition property for olap table (#20390 ) Co-authored-by: caiconghui1 <caiconghui1@jd.com>	2023-06-03 08:25:20 +08:00
morrySnow	299c3dc396	[fix](Nereids) should not inherit child's limit and offset when generate exchange node (#20373 ) in legacy planner, when we new exchange, it inherit its child's limit and offset. but in Nereids, we should not do this. because if we need set limit or offset, we will set it manually. In this PR, we use a new ctor of ExchangeNode to ensure not set limit or offset unexpected.	2023-06-02 19:55:33 +08:00
luozenglin	a8e0841ef1	[fix](workload-group) fix incorrect memoryLimitPercent value (#20377 )	2023-06-02 18:57:57 +08:00
zy-kkk	a20a6d2bea	[refactor](jdbc catalog) Refactor the JdbcClient code (#20109 ) This PR does the following: 1. This PR is a substantial refactor of the JDBC client architecture. The previous monolithic JDBC client has been refactored into an abstract base class `JdbcClient`, and a set of database-specific subclasses (e.g., `JdbcMySQLClient`, `JdbcOracleClient`, etc.), and the JdbcClient required config, abstract into an object. This allows for improved modularity, easier addition of support for new databases, and cleaner, more maintainable code. This change is backward-compatible and does not affect existing functionality. 2. As a result of client refactoring, OceanBaseClient can automatically recognize the mode of operation as MySQL or Oracle, so we cancel the oceanbase_mode property in the Jdbc Catalog, but due to the cancellation of the property, When creating a single OceanBase Jdbc Table, the table type needs to be filled in as oceanbase(mysql mode) or oceanbase_oracle(oracle_mode). The above work is a change in the usage behavior, please note. 3. For the PostgreSQL Jdbc Catalog, I did two things: 1. The adaptation to MATERIALIZED VIEW and FOREIGN TABLE is added 2. Fixed reading jsonb, which had been incorrectly changed to json in a previous PR 4. fix some jdbc catalog test case 5. modify oceanbase jdbc doc And,Thanks @wolfboys for the guidance	2023-06-02 17:58:10 +08:00
yongjinhou	4395fb70c4	[Enhancement](tvf) Backends tvf supports authentication (#20333 ) Add authentication for backends tvf.	2023-06-02 17:53:44 +08:00
minghong	386a4a0b43	[fix](nereids) add fragment id on all PhysicalRelation (#20371 ) fix "cannot find fragment id for scan" exception	2023-06-02 17:13:09 +08:00
morrySnow	422fcd6377	[fix](Nereids) forbid unexpected expression on filter and fix two more bugs (#20331 ) Fixes the following bugs: 1. the filter's expressions were not checked; aggregate functions, grouping scalar functions and window expressions should not appear in a filter 2. should not change the nullable property of an aggregate function when it is the window function in a window expression 3. bitmap and other metric types should not appear in the order by or partition by of a window expression	2023-06-02 16:19:50 +08:00
Yongqiang YANG	b1e6c6ffe5	[enhancement](txn) print commit backends when commit fails (#20367 ) Print commit backends when a commit fails.	2023-06-02 15:10:38 +08:00
amory	d68f3f3b3d	[Feature](array-functions)improve array functions for array_last_index (#20294 ) Previously only array_first_index was supported for lambda input, but not array_last_index	2023-06-02 13:54:03 +08:00
AKIRA	e32eba8fdf	[refactor](stats) Persist status of analyze task to FE meta data (#20264 ) 1. Previously, a BE table named `analysis_jobs` persisted the status of analyze jobs/tasks. This had flaws: for example, if a BE crashed, the analyze job/task would fail but its status could not be updated. 2. Support `DROP ANALYZE JOB [job_id]` to delete an analyze job 3. Support `SHOW ANALYZE TASK STATUS [job_id]` to get the task status of a specific job 4. Restrict when auto analyze can execute: an auto analyze job can run again only if its last execution finished a while ago 5. Support analyzing a whole DB	2023-06-02 12:33:31 +08:00
mch_ucchi	9d8043e4c1	[Fix](Nereids) should not gather data when sink (#20330 )	2023-06-02 10:33:11 +08:00
xy720	5a3b97bbf2	[enhancement](struct-type)support comment for struct field (#20200 ) support comment for struct field	2023-06-02 10:29:56 +08:00
Gabriel	937f04033f	[Bug](runtime filter) fix NPE if runtime filter has no target (#20338 )	2023-06-02 09:54:37 +08:00
starocean999	a8a4da9b9e	[fix](nereids)dphyper join reorder may cache wrong project list for project node (#20209 ) * [fix](nereids)dphyper join reorder may cache wrong project list for project node	2023-06-02 09:35:28 +08:00
xueweizhang	ecdc5124be	[feature-wip](duplicate-no-keys) schema change support for duplicate no keys (#19326 )	2023-06-02 09:22:41 +08:00
wangbo	0df073699d	[fix](planner)Fix missing kw for workload #20319 1. Add usage document for Workload Group query queue; 2. Fix missing KW for workload, which may cause creating a workload group to fail.	2023-06-02 09:04:22 +08:00
Yongqiang YANG	363e78f08f	[enhancement](publish) print detailed info for failed publish (#20309 )	2023-06-01 22:24:16 +08:00
zhangstar333	34c1cda14a	[bug](udaf) fix java-udaf test case failed with decimal (#20315 ) Some java-udaf test cases with decimal fail in P0 because the scale of the decimal is not set correctly	2023-06-01 20:14:54 +08:00
lihangyu	f0513a861d	[Improve](Scan) add a session variable to make scan run serial (#20220 ) Parallel scanning can result in some read amplification, for example, select * from xx where limit 1 actually requires only one row of data. However, due to parallel scanning of multiple tablets, read amplification occurs, leading to performance bottlenecks in high-concurrency scenarios. This PR adds a SessionVariable to enforce serial scanning, which can help mitigate this issue.	2023-06-01 15:06:35 +08:00
jakevin	0ff3073fc4	[improvement](Nereids): limit Memo groupExpression size. (#20272 )	2023-06-01 13:30:19 +08:00
Mryange	519f01133a	[feature](decimal)support cast rounding half up and div precision increment in decimalv3. (#19811 )	2023-06-01 13:09:58 +08:00
Jibing-Li	1b968c4ade	[fix](multi catalog)Fix nereids planner text format include extra column index bug (#20260 ) The Nereids planner includes all column indexes in TFileScanRangeParams, which may make the column projection incorrect for text format tables, because the csv reader uses the column index position to split a line, and extra column indexes produce wrong split results. This PR resets the column index after Projection and removes the useless column indexes.	2023-06-01 12:17:47 +08:00
mch_ucchi	cc41cb0e7e	[Fix](Nereids) fix some insert into select bugs (#20052 ) fix 3 bugs: 1. failed to insert into a table with mv. ```sql create table t ( id int, c1 int, c2 int, c3 int ) duplicate key(id) distributed by hash(id) buckets 4 create materialized view k12s3m as select id, sum(c1), max(c3) from t group by id; insert into t select -4, -4, -4, 'd'; ``` The insert raised an exception because the mv column was not handled. Now we add a target column with its defineExpr as the value. 2. failed to insert into a table with not all the columns. ```sql insert into t(c1, c2) select c1, c2 from t ``` With t(id ukey, c1, c2, c3), this inserted too much data; we fix it by changing the output partitions. 3. failed to insert into a table with complex select. When the select statement has a join or agg, the bug is fixed in a way similar to the 2nd bug.	2023-06-01 12:15:19 +08:00
yiguolei	6befa53caa	fix fe meta upgrade error (#20291 ) Co-authored-by: yiguolei <yiguolei@gmail.com>	2023-06-01 12:09:08 +08:00
Gabriel	4387f47fb5	[pipeline](load) support pipeline load (#20217 )	2023-06-01 11:42:43 +08:00
zhangstar333	e748b43d3d	[bug](parse) fix can't create aggregate column with agg_state (#20235 ) fix can't create aggregate column with agg_state	2023-06-01 11:18:40 +08:00
starocean999	68e593fbf1	[fix](nereids)(planner) case when should return NullLiteral when all case result is NullLiteral (#20280 )	2023-06-01 11:11:41 +08:00
lvshaokang	90cd791789	[fix](tvf) s3 tvf specify region and s3.region params failed (#19921 )	2023-06-01 10:00:49 +08:00
LiBinfeng	65a75abecb	[Fix](Nereids) bitmap type should not be used in comparison predicate (#19807 ) When using Nereids, applying a comparison operator to a bitmap type should throw an analysis exception. For example: select id from (select BITMAP_EMPTY() as c0 from expr_test) as ref0 where c0 = 1 order by id Here c0 in the subquery is a bitmap type, and this scenario is not supported right now.	2023-05-31 23:09:36 +08:00
minghong	5f591a6d12	[opt](nereids) generate in-bloom filter if target is local for pipeline mode (#20112 ) update in-filter usage in pipeline mode: 1. if the target is local, we use in-bloom filter. Let BE choose in or bloom according to actual distinctive number 2. set default runtime_filter_max_in_num to 1024	2023-05-31 17:24:38 +08:00
mch_ucchi	b53c42636e	[Fix](Nereids) fold constant result is wrong on functions relative to timezone (#19863 )	2023-05-31 15:52:40 +08:00
luozenglin	a1e3f49fb5	[enhancement](ldap) Support refresh ldap cache (#20183 ) Support refreshing ldap cache: refresh ldap all; refresh ldap; refresh ldap for user1; Support for caching non-existent ldap users. When logging in with a doris user that does not exist in the Ldap service after ldap is enabled, avoid accessing the ldap service every time in scenarios such as show databases; that require a lot of authentication.	2023-05-31 15:38:12 +08:00
Lijia Liu	f9dfcb923d	[Enhancement] Change Create Resource Group Grammar (#20249 )	2023-05-31 15:23:24 +08:00
mch_ucchi	c39943f699	[Fix](Planner)fix incorrect pattern when format pattern contains %x%v (#19994 )	2023-05-31 14:55:33 +08:00
AKIRA	d93ff5d1ab	[fix](pipeline) Enable pipeline explicitly in the plan shape check cases. (#20221 ) enable pipeline explicitly in tpcds plan shape check	2023-05-31 14:40:24 +08:00
Xiangyu Wang	6d75d56e7b	[Fix](dynamic-partition) Try to avoid setting a zero-bucket-size partition. (#20177 ) A fallback to avoid BE crash problem when partition's bucket size is 0, but not resolved.	2023-05-31 13:09:03 +08:00
starocean999	1f22aa6961	[fix](nereids) like function's nullable property should be PropagateNullable (#20237 )	2023-05-31 12:13:38 +08:00
Gabriel	6a8fdb45c6	[Bug](runtimefilter) Fix waiting for runtime filter (#20155 )	2023-05-31 10:25:18 +08:00
luozenglin	8a54be3318	[feature-wip](workload-group) Support setting user default workload group (#20180 ) SET PROPERTY 'default_workload_group' = 'group_name';	2023-05-31 09:18:25 +08:00
zy-kkk	56fa38de1d	[Enhancement](JDBC Catalog) refactor jdbc catalog insert logic (#19950 ) This PR refactors the old way of writing data to JDBC External Table & JDBC Catalog, mainly including the following tasks 1. Continuing the work of @BePPPower's PR #18594: replace the logic that splices INSERT SQL with operating on off-heap memory and writing data via preparedStatement.set 2. Add write support for the largeint type, mainly by adapting to java.math.BigInteger, which uses binary operations 3. Delete the SQL-splicing logic in the write path of JDBC External Table & JDBC Catalog ToDo: Binary types, like bit, binary, blob... Finally, special thanks to @BePPPower, @AshinGau for their work Co-authored-by: Tiewei Fang <43782773+BePPPower@users.noreply.github.com>	2023-05-30 22:03:39 +08:00
Chengpeng Yan	ccfc4978c1	[feature](nereids) support the rewrite rule for push-down filter through sort (#20161 ) Support the rewrite rule for push-down filter through sort. We can directly push-down the filter through sort without any conditions check. Before this PR: ``` mysql> explain select * from (select * from t1 order by a) t2 where t2.b > 2; +-------------------------------------------------------------+ \| Explain String \| +-------------------------------------------------------------+ \| PLAN FRAGMENT 0 \| \| OUTPUT EXPRS: \| \| a[#2] \| \| b[#3] \| \| PARTITION: UNPARTITIONED \| \| \| \| VRESULT SINK \| \| \| \| 3:VSELECT \| \| \| predicates: b[#3] > 2 \| \| \| \| \| 2:VMERGING-EXCHANGE \| \| offset: 0 \| \| \| \| PLAN FRAGMENT 1 \| \| \| \| PARTITION: HASH_PARTITIONED: a[#0] \| \| \| \| STREAM DATA SINK \| \| EXCHANGE ID: 02 \| \| UNPARTITIONED \| \| \| \| 1:VTOP-N \| \| \| order by: a[#2] ASC \| \| \| offset: 0 \| \| \| \| \| 0:VOlapScanNode \| \| TABLE: default_cluster:test.t1(t1), PREAGGREGATION: ON \| \| partitions=0/1, tablets=0/0, tabletList= \| \| cardinality=1, avgRowSize=0.0, numNodes=1 \| +-------------------------------------------------------------+ 30 rows in set (0.06 sec) ``` After this PR: ``` mysql> explain select * from (select * from t1 order by a) t2 where t2.b > 2; +-------------------------------------------------------------+ \| Explain String \| +-------------------------------------------------------------+ \| PLAN FRAGMENT 0 \| \| OUTPUT EXPRS: \| \| a[#2] \| \| b[#3] \| \| PARTITION: UNPARTITIONED \| \| \| \| VRESULT SINK \| \| \| \| 2:VMERGING-EXCHANGE \| \| offset: 0 \| \| \| \| PLAN FRAGMENT 1 \| \| \| \| PARTITION: HASH_PARTITIONED: a[#0] \| \| \| \| STREAM DATA SINK \| \| EXCHANGE ID: 02 \| \| UNPARTITIONED \| \| \| \| 1:VTOP-N \| \| \| order by: a[#2] ASC \| \| \| offset: 0 \| \| \| \| \| 0:VOlapScanNode \| \| TABLE: default_cluster:test.t1(t1), PREAGGREGATION: ON \| \| PREDICATES: b[#1] > 2 \| \| partitions=0/1, 
tablets=0/0, tabletList= \| \| cardinality=1, avgRowSize=0.0, numNodes=1 \| +-------------------------------------------------------------+ 28 rows in set (0.40 sec) ```	2023-05-30 21:38:16 +08:00
Jibing-Li	5c8e801761	[Fix](multi catalog, nereids)Fix text file required slot bug (#20214 ) required_slots in TFileScanRangeParams params for external hive table may be updated after FileQueryScanNode finalize. For text file, we need to use the origin required_slots in params so that the list could be updated later. Otherwise, query text file may get the following error: [INTERNAL_ERROR]Unknown source slot descriptor, slot_id=3	2023-05-30 21:29:33 +08:00
Chenyang Sun	accaff1026	[Feature](compaction) wip: single replica compaction (#19237 ) Currently, compaction is executed separately for each backend, and the reconstruction of the index during compaction leads to high CPU usage. To address this, we are introducing single replica compaction, where a specific primary replica is selected to perform compaction, and the remaining replicas fetch the compaction results from the primary replica. The Backend (BE) requests replica information for all peers corresponding to a tablet from the Frontend (FE). This information includes the host where the replica is located and the replica_id. By calculating hash(replica_id), the replica with the smallest hash value is responsible for executing compaction, while the remaining replicas are responsible for fetching the compaction results from this replica. The compaction task producer thread, before submitting a compaction task, checks whether the local replica should fetch from its peer. If it should, the task is then submitted to the single replica compaction thread pool. When performing single replica compaction, the process begins by requesting rowset versions from the target replica. These rowset_versions are then compared with the local rowset versions. The first version that can be fetched is selected.	2023-05-30 21:12:48 +08:00
wangbo	6f68ec9de0	support query queue (#20048 )	2023-05-30 19:52:27 +08:00
Chengpeng Yan	f505eed253	[opt](Nereids) refactor the PartitionTopN (#20102 ) Do some small refactoring for the `PartitionTopN` and also address the left comment in #18784	2023-05-30 17:34:47 +08:00
Chengpeng Yan	a855253543	[fix](Nereids) filter should not push through union to OneRowRelation (#20132 ) ## Problem summary When we want to push the filter through the union. We should check whether the union's children are `OneRowRelation` or not. If there are some `OneRowRelation`, we shouldn't push down the filter to that part Before this PR ``` mysql> select * from (select 1 as a, 2 as b union all select 3, 3) t where a = 1; +------+------+ \| a \| b \| +------+------+ \| 1 \| 2 \| \| 3 \| 3 \| +------+------+ 2 rows in set (0.01 sec) ``` After this PR ``` mysql> select * from (select 1 as a, 2 as b union all select 3, 3) t where a = 1; +------+------+ \| a \| b \| +------+------+ \| 1 \| 2 \| +------+------+ 1 row in set (0.38 sec) ```	2023-05-30 17:06:52 +08:00
Mingyu Chen	0c98355fff	[fix](catalog) fix create catalog with resource replay issue and kerberos auth issue (#20137 ) 1. Fix create catalog with resource replay bug. If a user creates a catalog using `create catalog hive with resource xxx`, there is a bug when replaying the edit log: the resource may be dropped, causing an NPE, and the FE will fail to start. This PR adds a new FE config `disallow_create_catalog_with_resource`, which defaults to true, so that `with resource` is not allowed; it will be deprecated later. It also fixes the replay bug to avoid the NPE. 2. Fix issue when creating 2 hive catalogs to connect with and without kerberos authentication. When a user creates 2 hive catalogs, one using simple auth and the other using kerberos auth, the query may fail with an error like: `Server asks us to fall back to SIMPLE auth, but this client is configured to only allow secure connections.` This PR adds a default property for hive catalogs: `"ipc.client.fallback-to-simple-auth-allowed" = "true"`, which is added automatically when a user creates a hive catalog, to avoid this problem. 3. Fix calling `hdfsExists()` issue. When `hdfsExists()` returns a non-zero code, check whether it encountered an error or the file was not found. 4. Some code refactoring: avoid importing `org.apache.parquet.Strings`	2023-05-30 16:57:39 +08:00
Mingyu Chen	3735c21ef9	[fix](session-variable) fix set global var on non-master FE return error (#20179 )	2023-05-30 16:26:28 +08:00
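
The single replica compaction commit (accaff1026) describes choosing the compacting replica by computing hash(replica_id) and taking the smallest value, with the other replicas fetching the result from it. A minimal Python sketch of that selection rule follows. It is illustrative only and not the Doris implementation: the `Replica` type, the function names, and the use of SHA-256 as the hash are assumptions, since the commit message does not say which hash function is used.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    # Hypothetical stand-in for the replica info a BE requests from the FE.
    replica_id: int
    host: str


def replica_hash(replica_id: int) -> int:
    # Assumption: any deterministic hash works for the sketch; SHA-256 is
    # used here only because it is stable across processes and machines.
    digest = hashlib.sha256(str(replica_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def pick_compaction_replica(replicas):
    # The replica with the smallest hash value executes the compaction.
    # replica_id is a tie-breaker so every peer reaches the same answer.
    if not replicas:
        raise ValueError("no replicas for this tablet")
    return min(replicas, key=lambda r: (replica_hash(r.replica_id), r.replica_id))


def should_fetch_from_peer(local_replica_id: int, replicas) -> bool:
    # All replicas other than the chosen one fetch the compaction result.
    return pick_compaction_replica(replicas).replica_id != local_replica_id


if __name__ == "__main__":
    peers = [Replica(10001, "be1"), Replica(10002, "be2"), Replica(10003, "be3")]
    chosen = pick_compaction_replica(peers)
    for r in peers:
        role = "compacts" if r == chosen else "fetches from " + chosen.host
        print(r.host, role)
```

Because every backend evaluates the same deterministic function over the same peer list, all replicas agree on the choice without extra coordination, which is the property the commit message relies on.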


4810 Commits