Commit Graph

12628 Commits

Author SHA1 Message Date
343a6dc29d [improvement](hash join) Return result early if probe side has no data (#23044) 2023-08-17 09:17:09 +08:00
a77e9fbc99 (chores)(ui) download profile filename add profile_id (#23065) 2023-08-17 09:11:01 +08:00
7a9ff47528 [Improve](CI)Modify Deadline-check trigger mode, and add maven cache for Sonarcheck (#23069)
There are a lot of dead links at present; we will reopen it after a full repair…
2023-08-16 22:31:50 +08:00
814acbf331 [pipeline](exec) disable pipeline load in master code (#23061)
disable pipeline load in master code
2023-08-16 21:53:58 +08:00
390c52f73a [Improve](complex-type) update for array/map element_at with nested complex type with local tvf (#22927) 2023-08-16 20:47:36 +08:00
a5c73c7a39 [fix](partial update) set io_ctx.reader_type when reading columns for partial update (#22630) 2023-08-16 19:34:39 +08:00
0aa57d159e [Fix](Partial update) Fix wrong position using in segment writer (#22782) 2023-08-16 19:31:06 +08:00
0594acfcf1 [fix](Nereids) scan should output all invisible columns (#23003) 2023-08-16 18:07:59 +08:00
b815cf327a [enhancement](merge-on-write) Add more log info when delete bitmap correctness check failed (#22984) 2023-08-16 17:25:11 +08:00
f1880d32d9 [fix](nereids)bind slot failed because of "default_cluster" #23008
slot binding failed for the following queries:
select tpch.lineitem.* from lineitem
select tpch.lineitem.l_partkey from lineitem

The unbound slot is tpch.lineitem.l_partkey, but the bound slot is default_cluster:tpch.lineitem.l_partkey, so they do not match.
We need to ignore the "default_cluster:" prefix when comparing dbName.
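The comparison described above can be sketched as follows (illustrative Python; `normalize_db_name` and `db_names_match` are hypothetical names, not Doris's actual code):

```python
# Hypothetical sketch of the fix: strip the legacy "default_cluster:"
# prefix from the catalog-side dbName before comparing qualifiers,
# so "tpch" matches "default_cluster:tpch".

DEFAULT_CLUSTER_PREFIX = "default_cluster:"

def normalize_db_name(db_name: str) -> str:
    """Strip the legacy cluster prefix so both sides compare equal."""
    if db_name.startswith(DEFAULT_CLUSTER_PREFIX):
        return db_name[len(DEFAULT_CLUSTER_PREFIX):]
    return db_name

def db_names_match(unbound_db: str, bound_db: str) -> bool:
    return normalize_db_name(unbound_db) == normalize_db_name(bound_db)
```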
2023-08-16 17:22:44 +08:00
92f443b3b8 [enhancement](Nereids): count(1) to count(*) #22999
add a rule to transform count(1) to count(*)
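The transformation is valid because count over a non-null literal counts every row, exactly like count(*). A minimal textual sketch (illustrative Python; Nereids implements this as an expression-tree rewrite rule, not string rewriting):

```python
import re

def rewrite_count_literal(sql: str) -> str:
    """Rewrite count(<integer literal>) into count(*) (illustrative only)."""
    return re.sub(r"count\(\s*\d+\s*\)", "count(*)", sql, flags=re.IGNORECASE)
```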
2023-08-16 17:19:23 +08:00
2dbca7a688 [Fix](Planner) fix multi phase analysis failed in multi instance environment substitution (#22840)
Problem:
When executing group_concat with an order by clause inside a view, the column cannot be found during analysis.

Example:
create view if not exists test_view as select group_concat(c1,',' order by c1 asc) from table_group_concat;
select * from test_view;
It returns an error like: "can not find c1 in table_list"

Reason:
When executing this SQL in a multi-instance environment, the Planner tries to create a plan with multi-phase
aggregation. Because test_view is analyzed independently of the tables outside the view, the table
information inside the view is not available.

Solution:
Substitute the order by expressions of the merge aggregation expressions.
2023-08-16 16:46:26 +08:00
7adb2be360 [Fix](Nereids) fix insert into return npe from follower node. (#22734)
When an insert into table command runs on a follower node, it is forwarded to the master node; the parsed statement is not set on the cascades context but on executor::parsedStmt, so we use the latter to get the user info.
2023-08-16 16:37:17 +08:00
6cf1efc997 [refactor](load) use smart pointers to manage writers in memtable memory limiter (#23019) 2023-08-16 16:34:57 +08:00
4512569a3a [docs](releasenote)Update en release note 2.0.0 (#23041) 2023-08-16 15:13:09 +08:00
5148bc6fa7 [fix](partial update)allow delete sign column in partial update in planForPipeline (#23034) 2023-08-16 14:20:39 +08:00
4510e16845 [improvement](delete) support delete predicate on value column for merge-on-write unique table (#21933)
Previously, delete statements with conditions on value columns were only supported on duplicate tables. After the delete sign mechanism was introduced for batch delete, a delete statement with conditions on value columns on a unique table is transformed into the corresponding insert into ..., __DELETE_SIGN__ select ... statement. However, for unique tables with merge-on-write enabled, the overhead of inserting this data can be eliminated, so this PR adds the ability to use delete predicates on value columns for merge-on-write unique tables.
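The batch-delete rewrite mentioned above can be sketched as follows (illustrative Python; the statement shape follows the description in this commit, and the helper name and column layout are hypothetical):

```python
# Hedged sketch: on a unique table without merge-on-write, a DELETE with
# value-column predicates becomes an INSERT that sets the hidden delete
# sign column for every matching row.

def rewrite_delete_as_insert(table: str, key_cols: list, condition: str) -> str:
    """Build the insert-select form of a conditional delete (illustrative)."""
    cols = ", ".join(key_cols)
    return (
        f"INSERT INTO {table} ({cols}, __DELETE_SIGN__) "
        f"SELECT {cols}, 1 FROM {table} WHERE {condition}"
    )
```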
2023-08-16 12:18:05 +08:00
3efa06e63e [Fix](View)varchar type conversion error (#22987) 2023-08-16 11:49:04 +08:00
c41179b8e9 [fix](regression) Improve the robustness when close target connection (#23012) 2023-08-16 11:42:58 +08:00
221e7bdd17 [test](jdbc external) fix mysql and pg external regression test (#22998) 2023-08-16 10:44:47 +08:00
a2095b7d9e [fix](docs) add enable_single_replica_load on be config doc (#22948) 2023-08-16 10:31:01 +08:00
Pxl
d5df3bae25 [Bug](exchange) fix dcheck fail when VDataStreamRecvr input empty block (#22992)
fix dcheck fail when VDataStreamRecvr input empty block
2023-08-16 10:21:19 +08:00
3b8981bee7 [chore](third-party) Speed the download up for aws-crt-cpp (#22997)
The package aws-sdk-cpp was upgraded in #20252. We can speed the download up for aws-crt-cpp.
2023-08-16 09:47:18 +08:00
da097629ea [chore](build) Fix the build with MySQL support (#23020) 2023-08-16 09:28:56 +08:00
cb6678adb9 [fix](case) Update repositoryAffinityList1.sql (#22941) 2023-08-16 09:23:46 +08:00
c8c46e042d [Improve](regress-test)add regress test for map_agg with nested type and insert to doris inner table #23006 2023-08-16 09:21:02 +08:00
d3dddeea8a [fix](load) remove incorrect DCHECK in BetaRowsetWriter dtor (#23016)
The DCHECK may not always hold in the case of vertical compaction.
Remove it so DEBUG builds can run.

Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>
2023-08-15 23:55:02 +08:00
423002b20a [fix](nereids) partitionTopN & Window estimation (#22953)
* partitionTopN & winExpr estimation

* tpcds 44/47/57
2023-08-15 20:19:03 +08:00
fe08db191f [typo](docs) Optimize the release note 2.0.0 (#22926) 2023-08-15 20:09:56 +08:00
61d2f37bdc [fix](jdbc catalog) fix string type insert into odbc table (#22961) 2023-08-15 20:09:38 +08:00
f191736bfe [bug](shuffle) Fix DCHECK failure if exchange node has limit (#22993) 2023-08-15 19:14:37 +08:00
41a52d45d3 [pipeline](branch-2.0) pr to branch-2.0 also run checks (#23004) 2023-08-15 19:13:13 +08:00
80566f7fed [stats](nereids)support partition stats (#22606) 2023-08-15 17:52:25 +08:00
9b2323b7fd [Pipeline](exec) support async writer in pipeline query engine (#22901) 2023-08-15 17:32:53 +08:00
50f66b1246 [fix](pipeline) fix bug of datastream sender when doing BUCKET_SHFFULE_HASH_PARTITIONED shuffle (#22988)
This issue was introduced by #22765; if #22765 is picked to 2.0, this PR also needs to be picked.

When the shuffle type is BUCKET_SHFFULE_HASH_PARTITIONED, data from multiple buckets may be sent to the same channel, so sending eos too early may cause data loss.
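The eos-ordering fix can be sketched as follows (illustrative Python; the class and method names are hypothetical, not the actual C++ `VDataStreamSender` code):

```python
# Hedged sketch: defer a channel's eos until every bucket that maps to it
# has finished, since several buckets may feed the same channel.

class Channel:
    """One downstream channel fed by several buckets (hypothetical names)."""

    def __init__(self, feeding_buckets: int):
        self.pending_buckets = feeding_buckets
        self.eos_sent = False

    def bucket_finished(self) -> None:
        """Called when one bucket has flushed all of its data to this channel."""
        self.pending_buckets -= 1
        if self.pending_buckets == 0:
            # Only now is it safe to send eos: every bucket mapped to this
            # channel has finished, so no data can be lost.
            self.eos_sent = True
```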
2023-08-15 17:30:27 +08:00
d7a5c37672 [improvement](tablet clone) update the capacity coefficient for calculating backend load score (#22857)
Update the capacity coefficient for calculating the backend load score:
1. Add fe config entry `backend_load_capacity_coeficient` to allow setting the capacity coefficient manually;
2. Adjust the capacity coefficient calculation as below.

We emphasize disk usage when calculating the load score:
if a BE has a high used-capacity percentage, its load score should increase.
So we grow the capacity coefficient with a BE's used-capacity percentage.

But this is not enough. For example, if the tablets differ greatly in data size,
the two BEs below may end up with the same load score:
BE A:  disk usage = 60%,  replica number = 2000  (it contains the big tablets)
BE B:  disk usage = 30%,  replica number = 4000  (it contains the small tablets)

But what we want is: first move some big tablets from A to B; once their disk usages are close,
then move some small tablets from B to A, so that finally both their disk usages and replica
numbers are close.

To achieve this, when the max difference between all BEs' disk usages is >= 30%, we set the capacity coefficient to 1.0 to remove the effect of replica number. As the disk usage difference decreases, the capacity coefficient decreases, making replica number effective again.
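The heuristic above can be sketched as follows (illustrative Python; the constants, the interpolation below the 30% threshold, and the score blend are assumptions for illustration, not the exact Doris formula):

```python
# Hedged sketch of the balancing heuristic described in this commit.

def capacity_coefficient(be_usage: float, all_usages: list) -> float:
    """Return the weight given to disk usage in the load score."""
    if max(all_usages) - min(all_usages) >= 0.30:
        # Usage is badly skewed: let disk usage fully dominate the score
        # so replica count cannot mask the imbalance.
        return 1.0
    # Otherwise grow the coefficient with this backend's own usage,
    # capped at 1.0 (the 0.5 base weight is an illustrative choice).
    return min(1.0, 0.5 + be_usage)

def load_score(be_usage: float, replica_ratio: float, all_usages: list) -> float:
    """Blend disk usage and replica count into one score (hypothetical blend)."""
    c = capacity_coefficient(be_usage, all_usages)
    return c * be_usage + (1.0 - c) * replica_ratio
```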
2023-08-15 17:27:31 +08:00
7de362f646 [fix](Nereids): expand other join which has or condition (#22809) 2023-08-15 16:49:19 +08:00
dd09e42ca9 [enhancement](Nereids): expression unify constructor by using List (#22985) 2023-08-15 16:47:58 +08:00
140ab60a74 [Enhancement](multi-catalog) add a BE selection strategy for hdfs short-circuit-read. (#22697)
Sometimes BEs are deployed on the same nodes as DataNodes, so we can use a more reasonable BE selection policy to use HDFS short-circuit read as much as possible.
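The selection policy can be sketched as follows (illustrative Python; the function and its inputs are hypothetical, not Doris's actual scheduler API):

```python
# Hedged sketch: when a block replica lives on the same host as a BE,
# prefer that BE so the read can short-circuit to local disk instead of
# going through the DataNode socket.

from typing import List, Optional

def pick_backend(replica_hosts: List[str], backend_hosts: List[str]) -> Optional[str]:
    """Pick a BE co-located with a block replica, if any (illustrative)."""
    local = set(replica_hosts) & set(backend_hosts)
    if local:
        return sorted(local)[0]  # deterministic choice among co-located BEs
    return None  # fall back to the default BE selection
```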
2023-08-15 15:34:39 +08:00
a2e00727d6 [feature](auth)support Col auth (#22629)
support GRANT privilege[(col1,col2...)] [, privilege] ON db.tbl TO user_identity [ROLE 'role'];
2023-08-15 15:32:51 +08:00
f1864d9fcf [fix](function) fix str_to_date with specific format #22981 2023-08-15 15:30:48 +08:00
9b42093742 [feature](agg) Make 'map_agg' support array type as value (#22945) 2023-08-15 14:44:50 +08:00
1d825f57bc [fix](load) expose error root cause msg for load (#22968)
Currently, we only return an ambiguous "INTERNAL ERROR" to the user when a
load fails. This commit no longer hides the root cause.
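The principle can be sketched as follows (illustrative Python; the exception types and names are hypothetical, not Doris's actual C++ error-handling code):

```python
# Hedged sketch: wrap a low-level failure without discarding its message,
# instead of surfacing a bare "INTERNAL ERROR" to the user.

class LoadError(Exception):
    """User-facing load failure (hypothetical type)."""

def do_load(write_fn):
    try:
        write_fn()
    except Exception as e:
        # Keep the root cause visible in the user-facing message.
        raise LoadError(f"load failed: {e}") from e
```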

Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>
2023-08-15 13:22:45 +08:00
c2ff940947 [refactor](parquet)change decimal type export as fixed-len-byte on parquet write (#22792)
Previously, the Parquet writer exported decimals as byte-array,
but those fields could not be imported into Hive.
Now decimals are exported as fixed-len-byte-array so Hive can import them directly.
2023-08-15 13:17:50 +08:00
94bf8fb3c5 [performance](executor) optimize time_round function only one arg (#22855) 2023-08-15 13:16:42 +08:00
f6ca16e273 [fix](analysis) fix error msg #22950 2023-08-15 13:15:13 +08:00
1eab93b368 [chore](Nereids): remove useless code (#22960) 2023-08-15 13:14:20 +08:00
707a527775 [FIX](map)insert into doris table with array/map type by local tvf (#22955) 2023-08-15 13:11:23 +08:00
Pxl
34399e2965 [Bug](exchange) init _instance_to_rpc_ctx on register_sink (#22976)
init _instance_to_rpc_ctx on register_sink
2023-08-15 13:02:28 +08:00
13d24297a7 [fix](Nereids) type check could not work when root node is table or file sink (#22902)
Type check could not work because there were no expressions in the plan:
sink and scan nodes have no expressions at all, so types could not be checked.
This PR adds expressions on the logical sink so that type check works.
2023-08-15 11:45:16 +08:00