[fix](statistics) statistics inaccurate after analyzing the same table more than once (#14279)

If a table has already been analyzed and is then analyzed again, the new statistics are larger than expected. The incremental aggregation also picks up the values from the table-level statistics, because the SQL lacks a predicate on the nullability of part_id.
Kikyou1997
2022-11-17 20:18:14 +08:00
committed by GitHub
parent a382bb95e7
commit 98956dfa19


@@ -124,7 +124,8 @@ public class AnalysisJob {
             + " FROM ${internalDB}.${columnStatTbl}"
             + " WHERE ${internalDB}.${columnStatTbl}.db_id = '${dbId}' AND "
             + " ${internalDB}.${columnStatTbl}.tbl_id='${tblId}' AND "
-            + " ${internalDB}.${columnStatTbl}.col_id='${colId}'"
+            + " ${internalDB}.${columnStatTbl}.col_id='${colId}' AND "
+            + " ${internalDB}.${columnStatTbl}.part_id IS NOT NULL"
             + " ) t1, \n"
             + " (SELECT NDV(${colName}) AS ndv FROM `${dbName}`.`${tblName}`) t2\n";
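The double counting described in the commit message can be illustrated with a minimal, self-contained sketch. It does not use any Doris code: the `StatRow` class, the row values, and the method names are all hypothetical, and it assumes that the table-level row is stored with a NULL `part_id` and that the rollup sums a per-row count.

```java
import java.util.Arrays;
import java.util.List;

public class StatsRollupSketch {
    /** Hypothetical statistics row; partId == null marks the table-level row. */
    static final class StatRow {
        final Long partId;
        final long rowCount;

        StatRow(Long partId, long rowCount) {
            this.partId = partId;
            this.rowCount = rowCount;
        }
    }

    /** Rows assumed to exist after a first analyze: two partitions plus their table-level total. */
    static List<StatRow> rowsAfterFirstAnalyze() {
        return Arrays.asList(
                new StatRow(1L, 10),
                new StatRow(2L, 20),
                new StatRow(null, 30));
    }

    /** Mirrors the query before the fix: no part_id predicate, so the table-level row is summed too. */
    static long sumAllRows(List<StatRow> rows) {
        long total = 0;
        for (StatRow r : rows) {
            total += r.rowCount;
        }
        return total;
    }

    /** Mirrors the query after the fix: only rows with a non-null part_id are summed. */
    static long sumPartitionRows(List<StatRow> rows) {
        long total = 0;
        for (StatRow r : rows) {
            if (r.partId != null) {
                total += r.rowCount;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<StatRow> rows = rowsAfterFirstAnalyze();
        System.out.println("without part_id filter: " + sumAllRows(rows));
        System.out.println("with part_id filter:    " + sumPartitionRows(rows));
    }
}
```

Under these assumptions, the unfiltered sum counts the previous table-level total on top of the partition rows, which is the inflation the added `part_id IS NOT NULL` predicate is meant to prevent.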