Commit Graph

13248 Commits

Author SHA1 Message Date
03757d0672 [bug](explode) fix table node not implement alloc_resource function (#24031)
Fix the table node not implementing the alloc_resource function.
2023-09-09 08:25:28 +08:00
698fe55662 remove unused configs in be and broker (#24021) 2023-09-09 08:24:50 +08:00
153c7982f3 [Optimize](inverted index) Optimize multiple terms conjunction query (#23871) 2023-09-09 01:52:58 +08:00
0f408d1192 [improvement](executor)Add name for task scheduler #23983 2023-09-09 00:56:39 +08:00
b5e1e36750 [fix](pipeline)add logs for unstable cases #24073
Issue Number: close #xxx

ShowTableStmtTest.testNoDb and DropDbStmtTest.testNoPriv are unstable cases; the error message is:

java.lang.Exception: Unexpected exception, expected<org.apache.doris.common.AnalysisException> but was<mockit.internal.expectations.invocation.MissingInvocation>
We cannot tell what is missing, and this issue cannot be reproduced locally, so add some logging.
2023-09-09 00:49:40 +08:00
7abd23cad1 [fix](tablet clone) fix be load rebalancer choose candidate tablets #23915
When the BE load rebalancer chooses candidate tablets, it tries to move tablets from high-load backends to low-load backends. If the highest-load HIGH BE has no available slots, it should try the next HIGH BE.
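A minimal sketch of that selection loop, using a hypothetical `BackendCandidate` type rather than the real `BeLoadRebalancer` API:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a backend with its free clone slots.
record BackendCandidate(long beId, int availableSlots) {}

public class HighLoadPicker {
    // Walk the high-load backends (assumed sorted from highest to lowest load);
    // a backend with no free slot is skipped rather than aborting the whole
    // selection, so the next HIGH backend still gets a chance to offer tablets.
    static List<Long> pickSourceBackends(List<BackendCandidate> highLoadBes) {
        List<Long> sources = new ArrayList<>();
        for (BackendCandidate be : highLoadBes) {
            if (be.availableSlots() <= 0) {
                continue; // no slots: try the next HIGH BE instead of giving up
            }
            sources.add(be.beId());
        }
        return sources;
    }
}
```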
2023-09-09 00:48:27 +08:00
2fb4c818da [fix](tablet clone) delete tablet check other catchup #24038
Sometimes the FE replica's version is unreliable: the version recorded on the FE may be larger than the BE's real version. We also need to check whether the BE is missing versions (last failed version > 0).
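A hedged sketch of that check with illustrative field names (not the real FE replica class): a replica only counts as caught up when it reaches the expected version and the BE is not known to be missing versions.

```java
// Hypothetical view of FE-side replica metadata.
record ReplicaMeta(long version, long lastFailedVersion) {}

public class CatchupCheck {
    // The version recorded on the FE alone is unreliable; also require that the
    // BE is not missing versions (lastFailedVersion > 0 means it is).
    static boolean isCatchup(ReplicaMeta replica, long expectedVersion) {
        return replica.version() >= expectedVersion && replica.lastFailedVersion() <= 0;
    }
}
```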
2023-09-09 00:42:32 +08:00
aad3eb257f update gopkg.in/yaml.v3 v3.0.0-20210107192922-496545a6307b to 3.0.0 (#24056)
There is 1 security vulnerability found in gopkg.in/yaml.v3 v3.0.0-20210107192922-496545a6307b:

CVE-2022-28948
What did I do?
Upgrade gopkg.in/yaml.v3 from v3.0.0-20210107192922-496545a6307b to 3.0.0 for vulnerability fix

What did you expect to happen?
Ideally, no insecure libs should be used.

How can we automate the detection of these types of issues?
By using the GitHub Actions configurations provided by murphysec, we can conduct automatic code security checks in our CI pipeline.

The specification of the pull request
PR Specification from OSCS
2023-09-09 00:37:39 +08:00
3e7f531d2b [fix](sec)upgrade org.yaml:snakeyaml to 2.0 #24057 2023-09-09 00:37:07 +08:00
0f0ffa3482 [Fix](Parquet Reader) fix parquet read issue (#24092) 2023-09-09 00:35:18 +08:00
0143ae8266 [fix]Add logging before _builtin_unreachable() (#24101)
Co-authored-by: 宋光璠 <songguangfan@sf.com>
2023-09-09 00:30:11 +08:00
8336bf0b06 [regression](pipelineX) disable runtime filter for pipelineX test cases (#24119) 2023-09-08 23:31:26 +08:00
69cc6fee97 [fix](explain) fix explain physical plan with external table issue (#23845)
`SelectedPartitions` cannot be null, otherwise it throws an NPE and falls back to the original planner.
2023-09-08 21:11:48 +08:00
894aa48743 [fix](Nereids) remove PARTITIONS from non-reserved list (#24110)
PR #24053 removed PARTITIONS from the non-reserved keyword list in the
legacy planner's parser. For consistency, remove it from Nereids' parser
too.
2023-09-08 20:47:07 +08:00
929a9ad143 [Fix](RoutineLoad) Delete duplicate attribute in job property #24037 2023-09-08 20:42:28 +08:00
5c2f9eb92e [Improvement] (pipeline) Cancel related query if backend restarts or dead (#23863) 2023-09-08 20:30:52 +08:00
e140938d81 [Performance][export] Opt the export of CSV transformer (#24003) 2023-09-08 20:26:54 +08:00
0b24bd6a42 [Bug](pipelineX) init runtime filter profile at first (#24106) 2023-09-08 20:01:02 +08:00
2638ad0550 [fix](compaction) rowid_conversion should ignore deleted row on normal compaction (#24005) 2023-09-08 19:44:24 +08:00
f8fd8a3d17 [fix](trash) fix clean trash not working (#23936)
When `admin clean trash` is executed while the backend daemon clean thread is already cleaning trash, the SQL command returns immediately. But the daemon thread does not clean all the trash; it cleans only the expired trash.
Also, if there is a lot of trash, the daemon clean thread stays busy handling it for a long time.
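The distinction between the two sweeps, sketched in Java for illustration (the actual BE code is C++ and the names here are hypothetical):

```java
import java.time.Instant;
import java.util.List;

// Hypothetical trash entry with its expiration time.
record TrashEntry(String path, Instant expireAt) {}

public class TrashSweeper {
    // The periodic daemon sweep removes only expired entries; an explicit
    // `admin clean trash` is expected to sweep everything, instead of returning
    // immediately because the daemon happens to be running.
    static List<TrashEntry> selectToDelete(List<TrashEntry> trash, boolean adminForced) {
        Instant now = Instant.now();
        return trash.stream()
                .filter(e -> adminForced || e.expireAt().isBefore(now))
                .toList();
    }
}
```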
2023-09-08 18:13:22 +08:00
76ca57cf21 [bug](join) fix outer join not add tuple is null column when build rows is 0 (#23974)
Fix outer join not adding the tuple-is-null column when the build side has 0 rows.
2023-09-08 17:55:03 +08:00
Pxl
69868f18d6 [Bug](join) fix nested loop join some problems (#24034) 2023-09-08 17:40:41 +08:00
01ea024497 [fix](nereids) runtimefilter not generated after postprocessor (#23948)
Fix bug: runtime filter not generated after the post-processor.
2023-09-08 17:37:04 +08:00
1abf5e779d [pipelineX](refactor) refactor debug string (#24083) 2023-09-08 16:58:53 +08:00
161520feb4 [feature](Nereids): enable convert CASE WHEN to IF (#24050)
enable rule to convert CASE WHEN to IF.
2023-09-08 16:58:33 +08:00
c0a41dc0f8 [fix](nereids) external scan use STORAGE_ANY instead of ANY as distribution (#24039) 2023-09-08 16:25:35 +08:00
d8bdd6c137 [fix](nereids) avoid throw analysis exception for unsupported type to make ext table goes nereids (#24089)
Avoid throwing an analysis exception for unsupported types so that external tables can still go through Nereids.
This improves Nereids' availability for external tables when an unsupported type appears in the table schema but is not referenced in the actual SQL.

tested in external table env.

Consider the following case:
select pu.pk_ct_pu as id
  from fms_rd_nc65_zb.NC65P.CT_PU pu
  left join fms_rd_nc65_zb.NC65P.PUB_WF_INSTANCE pwi
    on pu.pk_ct_pu = pwi.billid 
    and pu.vtrantypecode=pwi.billtype
 left join fms_rd_nc65_zb.NC65P.SM_USER su
    on pu.creator = su.cuserid
 where pu.pk_ct_pu='1001A110000000K8XPVN'; 

The PUB_WF_INSTANCE table has a BLOB column, so the query currently throws an analysis exception and falls back to the old optimizer even though this column is not referenced in the SQL. The old optimizer lacks the outer join -> inner join rule, the predicate "pu.pk_ct_pu='1001A110000000K8XPVN'" is not pushed down, and performance drops. After this PR, the column is recorded as an unsupported type instead of throwing an exception immediately; when such a column turns out to be unused, the query can continue through Nereids and use all of its advanced optimizations.
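A hedged sketch of the idea: record the column as an unsupported placeholder during schema conversion and fail only if such a column is actually referenced (names are illustrative, not the real Nereids API):

```java
import java.util.Map;
import java.util.Set;

public class UnsupportedTypeCheck {
    // Illustrative placeholder for a column type the optimizer cannot handle.
    static final String UNSUPPORTED = "UNSUPPORTED";

    // Fail only when an unsupported-typed column is referenced by the query, so
    // tables that merely contain such a column can still go through Nereids.
    static void checkReferencedColumns(Map<String, String> columnTypes, Set<String> referencedColumns) {
        for (String col : referencedColumns) {
            if (UNSUPPORTED.equals(columnTypes.get(col))) {
                throw new IllegalStateException("column " + col + " has an unsupported type");
            }
        }
    }
}
```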
2023-09-08 15:56:29 +08:00
82dc970916 [feature](insert) Support group commit insert (#22829) 2023-09-08 15:51:03 +08:00
741665d37f [Fix](regression) Fix test_partial_update_schema_change (#23960) 2023-09-08 15:40:07 +08:00
84c1f5692e [Fix](autobucket) use single replica partition size to calc bucket number #24045 2023-09-08 14:54:02 +08:00
576855acb2 [fix](Nereids): fix regression-test (#24065) 2023-09-08 14:14:48 +08:00
2965b9b3b4 fix update delete bitmap when rowset is blank (#24075)
If the rowset (derived from a clone) does not have a segment, there is no need to update the delete bitmap.
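The guard amounts to an early skip when the cloned rowset carries no segments; a minimal Java illustration (the actual BE code is C++, names hypothetical):

```java
public class DeleteBitmapUpdate {
    // Hypothetical, minimal view of a rowset.
    record Rowset(long id, int numSegments) {}

    // A rowset cloned without segments contains no rows, so there is nothing to
    // mark in the delete bitmap and the update can be skipped entirely.
    static boolean needUpdateDeleteBitmap(Rowset rowset) {
        return rowset.numSegments() > 0;
    }
}
```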
2023-09-08 12:43:42 +08:00
c68e6a9ca8 [Fix](catalog) Doris datetime type conversion failed (#23906)
1. The catalog is connected to an old version of Doris, and an error is reported when using the datetime field type on the Doris table.
2. Error message on the FE:
  Caused by: java.lang.NumberFormatException: For input string: "DATETIM"
  at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) ~[?:1.8.0_291]
  at java.lang.Integer.parseInt(Integer.java:580) ~[?:1.8.0_291]
  at java.lang.Integer.parseInt(Integer.java:615) ~[?:1.8.0_291]
  at org.apache.doris.datasource.jdbc.client.JdbcMySQLClient.dorisTypeToDoris(JdbcMySQLClient.java:401) ~[doris-fe.jar:1.2-SNAPSHOT]
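The failure comes from feeding part of the type name to `Integer.parseInt`; a hedged sketch of a more defensive precision parse (hypothetical helper, not the real `JdbcMySQLClient.dorisTypeToDoris`):

```java
public class DorisTypeParser {
    // Split a remote type string such as "DATETIME" or "datetime(6)" into its
    // base name and optional precision, instead of assuming "(precision)" is
    // always present and parsing arbitrary characters as an integer.
    static int parsePrecision(String type, int defaultPrecision) {
        int open = type.indexOf('(');
        int close = type.indexOf(')');
        if (open < 0 || close <= open + 1) {
            return defaultPrecision; // e.g. an old-version Doris reports plain "DATETIME"
        }
        try {
            return Integer.parseInt(type.substring(open + 1, close).trim());
        } catch (NumberFormatException e) {
            return defaultPrecision;
        }
    }
}
```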
2023-09-08 10:49:35 +08:00
cb29d1a395 fix compile error with gcc12 (#24049) 2023-09-08 10:36:30 +08:00
b73f345479 [fix](intersect) fix wrong result of intersect node (#24044)
Issue Number: close #24046
2023-09-08 10:27:37 +08:00
3927ceac95 [Bug](runtime filter) Fix runtime filter initialization (#24063)
In be.WARNING, lots of logs like 'runtime filter params meet error' are printed. This is a misleading message.
2023-09-08 10:27:20 +08:00
a27349c83a [fix](Export) Concatenation the outfile sql for Export (#23635)
In the original logic, the `Export` statement generates a `SelectStmt` for execution. But there is no way to make the `SelectStmt` use the new optimizer.

Now, we change the `Export` statement to generate an outfile SQL statement, and then use the new optimizer to parse that SQL so that the outfile can be planned by the new optimizer.
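An illustrative sketch of the concatenation step, assuming a simple `SELECT * ... INTO OUTFILE` shape; the exact statement an Export job builds is defined by the job itself:

```java
public class ExportOutfileSql {
    // Concatenate an outfile statement from the export job's pieces so the
    // resulting SQL text can be handed to the new optimizer for parsing.
    static String buildOutfileSql(String tableRef, String whereClause, String outfilePath, String format) {
        StringBuilder sb = new StringBuilder("SELECT * FROM ").append(tableRef);
        if (whereClause != null && !whereClause.isEmpty()) {
            sb.append(" WHERE ").append(whereClause);
        }
        sb.append(" INTO OUTFILE \"").append(outfilePath).append("\"")
          .append(" FORMAT AS ").append(format);
        return sb.toString();
    }
}
```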
2023-09-08 10:20:18 +08:00
cdb1b341c7 [pipelineX](runtime filter) Support runtime filter (#24054) 2023-09-08 10:17:22 +08:00
Pxl
ac6028a731 [Bug](partition) fix cannot delete from partitions (#24053) 2023-09-08 10:11:30 +08:00
b6b8ef3a18 [chore](script) avoid failure while building on non-git repository (#23982)
Co-authored-by: yiguolei <676222867@qq.com>
2023-09-08 10:08:00 +08:00
0bdd078b41 [fix](jdbc catalog) fixed the sqlserver jdbc url param concatenation error (#23841) 2023-09-08 09:58:20 +08:00
fb5a77b726 [Fix](statistics)Handle external table in statistics cleaner. (#23843)
Before, the statistics cleaner only handled OLAP databases and tables.
External databases and tables were removed without verification, so external stats could be stored for no more than 2 days, which is the interval of the stats cleaner thread.
This PR adds verification for external databases and tables.
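A hedged sketch of the verification: cached statistics are dropped only when the table can be found in neither the internal catalog nor an external catalog (the id sets are stand-ins for the real catalog lookups):

```java
import java.util.Set;

public class StatsCleanerCheck {
    // Illustrative check; the real cleaner resolves ids through the catalog
    // manager rather than pre-built id sets.
    static boolean shouldDropStats(long tableId, Set<Long> internalTableIds, Set<Long> externalTableIds) {
        // Drop cached statistics only for tables that no longer exist anywhere.
        return !internalTableIds.contains(tableId) && !externalTableIds.contains(tableId);
    }
}
```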
2023-09-08 09:43:46 +08:00
cb43f07487 [Improvement](statistics)Support basic jdbc external table stats collection (#23963)
Support jdbc external table stats collection.
2023-09-08 09:40:13 +08:00
68acb8597b [fix](nested_loop_join) null value should be output in semi-anti join (#23971)
create table t1
        (k1 bigint, k2 bigint)
        ENGINE=OLAP
DUPLICATE KEY(k1, k2)
COMMENT 'OLAP'
DISTRIBUTED BY HASH(k2) BUCKETS 1
PROPERTIES (
"replication_allocation" = "tag.location.default: 1",
"is_being_synced" = "false",
"storage_format" = "V2",
"light_schema_change" = "true",
"disable_auto_compaction" = "false",
"enable_single_replica_compaction" = "false"
);
create table t3
        (k1 bigint, k2 bigint)
        ENGINE=OLAP
DUPLICATE KEY(k1, k2)
COMMENT 'OLAP'
DISTRIBUTED BY HASH(k2) BUCKETS 1
PROPERTIES (
"replication_allocation" = "tag.location.default: 1",
"is_being_synced" = "false",
"storage_format" = "V2",
"light_schema_change" = "true",
"disable_auto_compaction" = "false",
"enable_single_replica_compaction" = "false"
);
Data:

insert into t1 values (1,null),(null,1),(1,2), (null,2),(1,3), (2,4), (2,5), (3,3), (3,4), (20,2), (22,3), (24,4),(null,null);
insert into t3 values (1,null),(null,1),(1,4), (1,2), (null,3), (2,4), (3,7), (3,9),(null,null),(5,1);
Query:

 select t1.* from t1 where not exists ( select k1 from t3 where t1.k2 < t3.k2 );
Result:

Empty set
Expected result:

+------+------+
| k1   | k2   |
+------+------+
| NULL | NULL |
|    1 | NULL |
+------+------+
2023-09-08 09:28:55 +08:00
0dee7246bc Revert "[opt](stats) remove table stats when table has been removed (#23803)" (#24058)
This reverts commit 66d3371400207f568c7ff6ff6bf5f4f0da32bd2c.
Reverts #23803
2023-09-07 23:25:09 +08:00
26337543bf [fix](Nereids) make TVF's distribution spec always be RANDOM (#24020)
Nereids makes a TVF node's distribution Gather if the backend num is 1.
But the coordinator cannot process a Gather fragment with a scan node.
In the long run, we need the coordinator to support this scenario,
but that requires a lot of refactoring. So, we just forbid Gather
distribution for ScanNode for now.
2023-09-07 22:06:45 +08:00
Pxl
ab7c2b9d22 [Bug](type) fix wildcard char's tostring get wrong result (#24041)
Fix wildcard char's toString producing a wrong result.
2023-09-07 20:25:38 +08:00
f0bd2c9c53 [opt](Nereids) optimize error msg of unbound slot (#23933)
for example:
```sql
select avg(c3) from (select c2 from t2) v;
```
the error msg before this PR
```
Invalid call to c3.getDataType() on unbound object
```
the error msg after this PR
```
Unknown column 'c3' in 'table list' in AGGREGATE clause
```
2023-09-07 20:15:59 +08:00
4b6552e929 [fix](regression) create table failed in 'map_agg' (#24030)
Co-authored-by: yiguolei <676222867@qq.com>
2023-09-07 20:14:07 +08:00
b2ca281395 [fix](Nereids) record wrong best plan properties (#23973)
When the output meets the ORDER BY requirement but not the distribution requirement, we use a trick
to do the enforcement: set the current output to ANY. But when we do enforcement later,
we still use the old output. So when we choose the best plan, we cannot
find the old output's plan, since we have replaced it with ANY.
For example:

```
  lowest Plan(cost, properties, plan, childrenRequires)

    18.0 ANY
     id:138#4 cost=0 [0/0/0/] estRows=4 children=[@0 ] (plan=PhysicalWindow[139]@4 ( windowFrameGroup=(Funcs=[row_number() WindowSpec(PARTITION BY b#1, a#0 ROWS BETWEEN UNBOUNDED_PRECEDING AND CURRENT_ROW) AS `r1`#2], PartitionKeys=[b#1, a#0], OrderKeys=[], WindowFrame=WindowFrame(ROWS, UNBOUNDED_PRECEDING, CURRENT_ROW)), requiredProperties=[DistributionSpecHash ( orderedShuffledColumns=[1, 0], shuffleType=REQUIRE, tableId=-1, selectedIndexId=-1, partitionIds=[], equivalenceExprIds=[[1], [0]], exprIdToEquivalenceSet={0=1, 1=0} ) Order: ([b#1 asc, a#0 asc])], stats=null ))
     [DistributionSpecHash ( orderedShuffledColumns=[0], shuffleType=NATURAL, tableId=3547296, selectedIndexId=3547297, partitionIds=[3547295], equivalenceExprIds=[[0]], exprIdToEquivalenceSet={0=0} ) Order: ([b#1 asc, a#0 asc])]

    32.01171875 DistributionSpecHash ( orderedShuffledColumns=[1], shuffleType=REQUIRE, tableId=-1, selectedIndexId=-1, partitionIds=[], equivalenceExprIds=[[1]], exprIdToEquivalenceSet={1=0} ) Order: ([b#1 asc])
     id:161#4 cost=14 [4/4/4/] estRows=4 children=[@4 ] (plan=PhysicalQuickSort[162]@4 ( orderKeys=[b#1 asc], phase=LOCAL_SORT, stats=null ))
     [DistributionSpecHash ( orderedShuffledColumns=[0], shuffleType=NATURAL, tableId=3547296, selectedIndexId=3547297, partitionIds=[3547295], equivalenceExprIds=[[0]], exprIdToEquivalenceSet={0=0} ) Order: ([b#1 asc, a#0 asc])]

    32.01171875 DistributionSpecHash ( orderedShuffledColumns=[1], shuffleType=EXECUTION_BUCKETED, tableId=-1, selectedIndexId=-1, partitionIds=[], equivalenceExprIds=[[1]], exprIdToEquivalenceSet={1=0} ) Order: ([b#1 asc])
     id:161#4 cost=14 [4/4/4/] estRows=4 children=[@4 ] (plan=PhysicalQuickSort[162]@4 ( orderKeys=[b#1 asc], phase=LOCAL_SORT, stats=null ))
     [DistributionSpecHash ( orderedShuffledColumns=[1], shuffleType=EXECUTION_BUCKETED, tableId=-1, selectedIndexId=-1, partitionIds=[], equivalenceExprIds=[[1]], exprIdToEquivalenceSet={1=0} ) Order: ([])]

    18.01171875 DistributionSpecHash ( orderedShuffledColumns=[1], shuffleType=EXECUTION_BUCKETED, tableId=-1, selectedIndexId=-1, partitionIds=[], equivalenceExprIds=[[1]], exprIdToEquivalenceSet={1=0} ) Order: ([])
     id:157#4 cost=0 [0/0/0/] estRows=4 children=[@4 ] (plan=PhysicalDistribute[158]@4 ( distributionSpec=DistributionSpecHash ( orderedShuffledColumns=[1], shuffleType=EXECUTION_BUCKETED, tableId=-1, selectedIndexId=-1, partitionIds=[], equivalenceExprIds=[[1]], exprIdToEquivalenceSet={1=0} ), stats=null ))
     [DistributionSpecHash ( orderedShuffledColumns=[0], shuffleType=NATURAL, tableId=3547296, selectedIndexId=3547297, partitionIds=[3547295], equivalenceExprIds=[[0]], exprIdToEquivalenceSet={0=0} ) Order: ([b#1 asc, a#0 asc])]
```

The last one requires a NATURAL shuffle type property from this group,
but this property has already been removed when we do
enforceDistributionButMeetSort. So the following exception is thrown:

```
Caused by: org.apache.doris.nereids.exceptions.AnalysisException: Failed to choose best plan
    at org.apache.doris.nereids.NereidsPlanner.chooseBestPlan(NereidsPlanner.java:340) ~[classes/:?]
    at org.apache.doris.nereids.NereidsPlanner.chooseBestPlan(NereidsPlanner.java:323) ~[classes/:?]
    ... 18 more
Caused by: org.apache.doris.nereids.exceptions.AnalysisException: lowestCostPlans with physicalProperties(DistributionSpecHash ( orderedShuffledColumns=[0], shuffleType=NATURAL, tableId=3547296, selectedIndexId=3547297, partitionIds=[3547295], equivalenceExprIds=[[0]], exprIdToEquivalenceSet={0=0} ) Order: ([b#1 asc, a#0 asc])) doesn't exist in root group
    at org.apache.doris.nereids.NereidsPlanner.lambda$chooseBestPlan$1(NereidsPlanner.java:318) ~[classes/:?]
    at java.util.Optional.orElseThrow(Optional.java:408) ~[?:?]
    at org.apache.doris.nereids.NereidsPlanner.chooseBestPlan(NereidsPlanner.java:317) ~[classes/:?]
    at org.apache.doris.nereids.NereidsPlanner.chooseBestPlan(NereidsPlanner.java:323) ~[classes/:?]
    ... 18 more
```
2023-09-07 20:12:53 +08:00