[fix](statistics) statistics inaccurate after analyze same table more than once (#14279)
If a table has already been analyzed and is then analyzed again, the new statistics are larger than expected: the incremental aggregation also picks up the values from the existing table-level statistics, because the SQL lacks a predicate on the nullability of part_id.
@@ -124,7 +124,8 @@ public class AnalysisJob {
             + " FROM ${internalDB}.${columnStatTbl}"
             + " WHERE ${internalDB}.${columnStatTbl}.db_id = '${dbId}' AND "
             + " ${internalDB}.${columnStatTbl}.tbl_id='${tblId}' AND "
-            + " ${internalDB}.${columnStatTbl}.col_id='${colId}'"
+            + " ${internalDB}.${columnStatTbl}.col_id='${colId}' AND "
+            + " ${internalDB}.${columnStatTbl}.part_id IS NOT NULL"
             + " ) t1, \n"
             + " (SELECT NDV(${colName}) AS ndv FROM `${dbName}`.`${tblName}`) t2\n";
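
The effect of the missing predicate can be illustrated with a minimal in-memory sketch. Everything in it is hypothetical and not taken from the Doris source: the `StatRow` class, the `count` field, the sample numbers, and the assumption that the query sums a per-row value. It only assumes, as the commit message describes, that an earlier analyze leaves a table-level row whose `part_id` is NULL next to the per-partition rows.

```java
import java.util.List;

public class PartIdFilterSketch {
    // Hypothetical stand-in for one row of the column statistics table.
    // partId == null marks the table-level row written by an earlier analyze.
    static final class StatRow {
        final Long partId;
        final long count;

        StatRow(Long partId, long count) {
            this.partId = partId;
            this.count = count;
        }
    }

    // Aggregation without the part_id predicate: the table-level row is summed too.
    static long sumAllRows(List<StatRow> rows) {
        long total = 0;
        for (StatRow r : rows) {
            total += r.count;
        }
        return total;
    }

    // Aggregation with the equivalent of "part_id IS NOT NULL".
    static long sumPartitionRows(List<StatRow> rows) {
        long total = 0;
        for (StatRow r : rows) {
            if (r.partId != null) {
                total += r.count;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Two partition-level rows plus the table-level row from a previous analyze.
        List<StatRow> rows = List.of(
                new StatRow(1L, 100),
                new StatRow(2L, 50),
                new StatRow(null, 150));

        System.out.println(sumAllRows(rows));       // 300: table-level row counted again
        System.out.println(sumPartitionRows(rows)); // 150: partitions only
    }
}
```

With the filter in place, a second analyze aggregates only the partition-level rows, so the result no longer grows with each run.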