Commit Graph

12635 Commits

Author SHA1 Message Date
f5da9f4ccc [fix](muti-catalog)convert to s3 path when use aws endpoint (#22784)
Convert to an s3 path when an aws endpoint is used.
For compatibility, the s3 client can also be used to access other clouds by setting the s3 endpoint properties.
2023-08-17 14:28:00 +08:00
6e51632ca9 [docs](kerberos)add FAQ cases and enable krb5 debug (#22821) 2023-08-17 14:25:09 +08:00
8b51da0523 [Fix](load) fix partiotion Null pointer exception (#22965) 2023-08-17 14:09:47 +08:00
41bce29ae3 [docs](docs)Rename Title and URL of Bitwise Functions (#22722) 2023-08-17 11:18:02 +08:00
92c8f842f7 [fix](nereids) dphyper join reorder use wrong method to get hash and other conjuncts (#22966)
Use getHashJoinConjuncts() and getOtherJoinConjuncts() to get the hash and other conjuncts of a hash join node, instead of categorizing them by checking whether each one is an 'EqualTo' expression.
2023-08-17 11:03:45 +08:00
a288377118 [fix](regresstion) Fix sql server external case (#23031) 2023-08-17 10:54:54 +08:00
d71b99b88a [fix](dbt) fix dbt doris user non-root user permission for show frintends sql (#22815) 2023-08-17 09:40:53 +08:00
343a6dc29d [improvement](hash join) Return result early if probe side has no data (#23044) 2023-08-17 09:17:09 +08:00
a77e9fbc99 (chores)(ui) download profile filename add profile_id (#23065) 2023-08-17 09:11:01 +08:00
7a9ff47528 [Improve](CI)Modify Deadline-check trigger mode, and add maven cache for Sonarcheck (#23069)
There are many existing dead links, so the check will be re-enabled after a full repair…
2023-08-16 22:31:50 +08:00
814acbf331 [pipeline](exec) disable pipeline load in master code (#23061)
disable pipeline load in master code
2023-08-16 21:53:58 +08:00
390c52f73a [Improve](complex-type) update for array/map element_at with nested complex type with local tvf (#22927) 2023-08-16 20:47:36 +08:00
a5c73c7a39 [fix](partial update) set io_ctx.reader_type when reading columns for partial update (#22630) 2023-08-16 19:34:39 +08:00
0aa57d159e [Fix](Partial update) Fix wrong position using in segment writer (#22782) 2023-08-16 19:31:06 +08:00
0594acfcf1 [fix](Nereids) scan should output all invisiable column (#23003) 2023-08-16 18:07:59 +08:00
b815cf327a [enhancement](merge-on-write) Add more log info when delete bitmap correctness check failed (#22984) 2023-08-16 17:25:11 +08:00
f1880d32d9 [fix](nereids)bind slot failed because of "default_cluster" #23008
Slot binding fails for the following queries:
select tpch.lineitem.* from lineitem
select tpch.lineitem.l_partkey from lineitem

The unbound slot is tpch.lineitem.l_partkey, but the bound slot is default_cluster:tpch.lineitem.l_partkey, so they do not match.
We need to ignore the default_cluster: prefix when comparing dbName.
2023-08-16 17:22:44 +08:00
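The name comparison described in f1880d32d9 can be sketched as follows. This is a minimal Python illustration, not the Nereids code: the function names are hypothetical, and it assumes three-part names of the form db.table.column where the cluster prefix can only appear on the db part.

```python
# Hypothetical sketch; names are illustrative and not taken from Doris.
CLUSTER_PREFIX = "default_cluster:"


def strip_cluster_prefix(db_name):
    """Drop a leading 'default_cluster:' from a database name, if present."""
    if db_name.startswith(CLUSTER_PREFIX):
        return db_name[len(CLUSTER_PREFIX):]
    return db_name


def slot_matches(unbound, bound):
    """Compare dotted names, ignoring the cluster prefix on the db part."""
    unbound_parts = unbound.split(".")
    bound_parts = bound.split(".")
    if len(unbound_parts) != len(bound_parts):
        return False
    # Assumption: the first part is the database name.
    unbound_parts[0] = strip_cluster_prefix(unbound_parts[0])
    bound_parts[0] = strip_cluster_prefix(bound_parts[0])
    return unbound_parts == bound_parts
```

With this helper, slot_matches("tpch.lineitem.l_partkey", "default_cluster:tpch.lineitem.l_partkey") treats the two names from the commit message as the same slot.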
92f443b3b8 [enhancement](Nereids): count(1) to count(*) #22999
add a rule to transform count(1) to count(*)
2023-08-16 17:19:23 +08:00
2dbca7a688 [Fix](Planner) fix multi phase analysis failed in multi instance environment substitution (#22840)
Problem:
When group_concat with an inner order by is executed inside a view, the column cannot be found during analysis.

Example:
create view if not exists test_view as select group_concat(c1,',' order by c1 asc) from table_group_concat;
select * from test_view;
This returns an error like: "can not find c1 in table_list"

Reason:
When this sql is executed in a multi-instance environment, the Planner tries to create a plan with multi-phase
aggregation. Because test_view is analyzed independently of the tables outside the view, we cannot get
the table information inside the view.

Solution:
Substitute the order by expressions of the merge aggregation expressions.
2023-08-16 16:46:26 +08:00
7adb2be360 [Fix](Nereids) fix insert into return npe from follower node. (#22734)
When an insert into table command runs on a follower node, it is forwarded to the master node. There the parsed statement is not set in the cascades context but in executor::parsedStmt, so we use the latter to get the user info.
2023-08-16 16:37:17 +08:00
6cf1efc997 [refactor](load) use smart pointers to manage writers in memtable memory limiter (#23019) 2023-08-16 16:34:57 +08:00
4512569a3a [docs](releasenote)Update en release note 2.0.0 (#23041) 2023-08-16 15:13:09 +08:00
5148bc6fa7 [fix](partial update)allow delete sign column in partial update in planForPipeline (#23034) 2023-08-16 14:20:39 +08:00
4510e16845 [improvement](delete) support delete predicate on value column for merge-on-write unique table (#21933)
Previously, delete statements with conditions on value columns were only supported on duplicate tables. After the delete sign mechanism was introduced for batch deletes, a delete statement with conditions on value columns on a unique table is transformed into the corresponding insert into ..., __DELETE_SIGN__ select ... statement. However, for unique tables with merge-on-write enabled, the overhead of inserting this data can be eliminated. So this PR adds the ability to allow delete predicates on value columns for merge-on-write unique tables.
2023-08-16 12:18:05 +08:00
3efa06e63e [Fix](View)varchar type conversion error (#22987) 2023-08-16 11:49:04 +08:00
c41179b8e9 [fix](regression) Improve the robustness when close target connection (#23012) 2023-08-16 11:42:58 +08:00
221e7bdd17 [test](jdbc external) fix mysql and pg external regression test (#22998) 2023-08-16 10:44:47 +08:00
a2095b7d9e [fix](docs) add enable_single_replica_load on be config doc (#22948) 2023-08-16 10:31:01 +08:00
d5df3bae25 [Bug](exchange) fix dcheck fail when VDataStreamRecvr input empty block (#22992)
fix dcheck fail when VDataStreamRecvr input empty block
2023-08-16 10:21:19 +08:00
3b8981bee7 [chore](third-party) Speed the download up for aws-crt-cpp (#22997)
The package aws-sdk-cpp was upgraded in #20252. We can speed the download up for aws-crt-cpp.
2023-08-16 09:47:18 +08:00
da097629ea [chore](build) Fix the build with MySQL support (#23020) 2023-08-16 09:28:56 +08:00
cb6678adb9 [fix](case) Update repositoryAffinityList1.sql (#22941) 2023-08-16 09:23:46 +08:00
c8c46e042d [Improve](regress-test)add regress test for map_agg with nested type and insert to doris inner table #23006 2023-08-16 09:21:02 +08:00
d3dddeea8a [fix](load) remove incorrect DCHECK in BetaRowsetWriter dtor (#23016)
The DCHECK does not always hold in the case of vertical compaction.
Remove it so that DEBUG builds can run.

Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>
2023-08-15 23:55:02 +08:00
423002b20a [fix](nereids) partitionTopN & Window estimation (#22953)
* partitionTopN & winExpr estimation

* tpcds 44/47/57
2023-08-15 20:19:03 +08:00
fe08db191f [typo](docs) Optimize the release note 2.0.0 (#22926) 2023-08-15 20:09:56 +08:00
61d2f37bdc [fix](jdbc catalog) fix string type insert into odbc table (#22961) 2023-08-15 20:09:38 +08:00
f191736bfe [bug](shuffle) Fix DCHECK failure if exchange node has limit (#22993) 2023-08-15 19:14:37 +08:00
41a52d45d3 [pipeline](branch-2.0) pr to branch-2.0 also run checks (#23004) 2023-08-15 19:13:13 +08:00
80566f7fed [stats](nereids)support partition stats (#22606) 2023-08-15 17:52:25 +08:00
9b2323b7fd [Pipeline](exec) support async writer in pipelien query engine (#22901) 2023-08-15 17:32:53 +08:00
50f66b1246 [fix](pipeline) fix bug of datastream sender when doing BUCKET_SHFFULE_HASH_PARTITIONED shuffle (#22988)
This issue was introduced by #22765. If #22765 is picked to 2.0, this PR needs to be picked as well.

When the shuffle type is BUCKET_SHFFULE_HASH_PARTITIONED, data from multiple buckets may be sent to the same channel, so sending eos too early may cause data loss.
2023-08-15 17:30:27 +08:00
d7a5c37672 [improvement](tablet clone) update the capacity coeficient for calculating backend load score (#22857)
Update the capacity coefficient for calculating the backend load score:
1. Add fe config entry `backend_load_capacity_coeficient` to allow setting the capacity coefficient manually;
2. Adjust the calculation of the capacity coefficient as below.

We emphasize disk usage when calculating the load score.
If a BE has a high used capacity percent, we should increase its load score.
So we increase the capacity coefficient with a BE's used capacity percent.

But this is not enough, for example when the tablets differ greatly in data size.
Then for the two BEs below, the load scores may be the same:
BE A:  disk usage = 60%,  replica number = 2000  (it contains the big tablets)
BE B:  disk usage = 30%,  replica number = 4000  (it contains the small tablets)

But what we want is: first move some big tablets from A to B; after their disk usages are close,
move some small tablets from B to A, so that finally both their disk usages and replica numbers
are close.

To achieve this, when the max difference between all BEs' disk usages is >= 30%, we set the capacity coefficient to 1.0 to remove the effect of the replica number. After the disk usage difference decreases, the capacity coefficient is decreased so that the replica number takes effect again.
2023-08-15 17:27:31 +08:00
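The effect described in d7a5c37672 can be illustrated with a small Python sketch. Only the 30% threshold, the coefficient of 1.0 above it, and the existence of a manual override come from the commit message. The score formula (a weighted sum of disk usage and replica count relative to their averages), the linear ramp below the threshold, and all function and parameter names are assumptions made for illustration, not the Doris implementation.

```python
def capacity_coefficient(used_percent, max_usage_diff, manual=None):
    """Pick the weight given to disk usage in the load score.

    manual stands in for the fe config override; representing "not set"
    as None is an assumption of this sketch.
    """
    if manual is not None:
        return manual
    # From the commit message: a spread of 30% or more forces 1.0.
    if max_usage_diff >= 0.30:
        return 1.0
    # Hypothetical ramp: the coefficient grows with the used percent.
    if used_percent < 0.5:
        return 0.5
    if used_percent > 0.75:
        return 1.0
    return 2 * used_percent - 0.5


def load_score(used_percent, replica_num, avg_used_percent, avg_replica_num, coef):
    """Assumed score: weighted sum of relative disk usage and relative replica count."""
    capacity_part = used_percent / avg_used_percent
    replica_part = replica_num / avg_replica_num
    return coef * capacity_part + (1 - coef) * replica_part


# The two BEs from the commit message.
avg_used, avg_replicas = 0.45, 3000
# With a fixed coefficient of 0.5 the two scores coincide.
score_a_even = load_score(0.60, 2000, avg_used, avg_replicas, 0.5)
score_b_even = load_score(0.30, 4000, avg_used, avg_replicas, 0.5)
# With the 30% spread rule the coefficient becomes 1.0 and A scores higher.
coef = capacity_coefficient(0.60, 0.30)
score_a = load_score(0.60, 2000, avg_used, avg_replicas, coef)
score_b = load_score(0.30, 4000, avg_used, avg_replicas, coef)
```

Under these assumptions, an even weighting gives BE A and BE B the same score, while a coefficient of 1.0 ranks A as more loaded, so big tablets would move from A to B first.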
7de362f646 [fix](Nereids): expand other join which has or condition (#22809) 2023-08-15 16:49:19 +08:00
dd09e42ca9 [enhancement](Nereids): expression unify constructor by using List (#22985) 2023-08-15 16:47:58 +08:00
140ab60a74 [Enhancement](multi-catalog) add a BE selection strategy for hdfs short-circuit-read. (#22697)
BEs are sometimes deployed on the same node as a DataNode, so we can use a more suitable BE selection policy to use hdfs short-circuit-read as much as possible.
2023-08-15 15:34:39 +08:00
a2e00727d6 [feature](auth)support Col auth (#22629)
support GRANT privilege[(col1,col2...)] [, privilege] ON db.tbl TO user_identity [ROLE 'role'];
2023-08-15 15:32:51 +08:00
f1864d9fcf [fix](function) fix str_to_date with specific format #22981 2023-08-15 15:30:48 +08:00
9b42093742 [feature](agg) Make 'map_agg' support array type as value (#22945) 2023-08-15 14:44:50 +08:00
1d825f57bc [fix](load) expose error root cause msg for load (#22968)
Currently, we only return an ambiguous "INTERNAL ERROR" to the user when a
load fails. This commit no longer hides the root cause.

Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>
2023-08-15 13:22:45 +08:00