[enhancement](recover) Support skipping bad tablet in select by session variable (#30241)

In some scenarios, a user has a huge amount of data in a table that was created with only a single replica. If one of the tablets is damaged, the table can no longer be queried. If the user does not care about the integrity of the data, they can set this variable to temporarily skip the bad tablet when querying and load the remaining data into a new table.
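
A minimal sketch of the recovery flow described above, not part of this commit. The class `SkipBadTabletRecovery`, its helper methods, and the table names `damaged_tbl` and `recovered_tbl` are hypothetical, and the target table is assumed to exist already with a compatible schema. Because `skip_bad_tablet` is a session variable, all statements are assumed to run on the same connection.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.List;

public class SkipBadTabletRecovery {
    // Hypothetical helper: builds the statements for a one-off recovery copy.
    // Table names are supplied by the caller and are not validated here.
    public static List<String> recoveryStatements(String damagedTable, String newTable) {
        return List.of(
                // Skip tablets that have no queryable replica (this session only).
                "SET skip_bad_tablet = true",
                // Copy whatever data is still readable into the new table.
                "INSERT INTO " + newTable + " SELECT * FROM " + damagedTable,
                // Restore the default so later queries fail loudly again.
                "SET skip_bad_tablet = false");
    }

    // Runs the statements in order on one connection, i.e. in one session.
    public static void run(Connection conn, String damagedTable, String newTable)
            throws SQLException {
        try (Statement stmt = conn.createStatement()) {
            for (String sql : recoveryStatements(damagedTable, newTable)) {
                stmt.execute(sql);
            }
        }
    }

    public static void main(String[] args) {
        // Print the statements only; no cluster is contacted here.
        recoveryStatements("damaged_tbl", "recovered_tbl").forEach(System.out::println);
    }
}
```

The copy is incomplete by design: rows stored in the skipped tablets are silently absent from the new table, so the variable is best switched back off as soon as the copy finishes.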
zxealous
2024-01-23 16:56:01 +08:00
committed by yiguolei
parent 1b9f1f6483
commit 4cbacb5b39
5 changed files with 27 additions and 0 deletions

@ -579,6 +579,10 @@ Note that the comment must start with /*+ and can only follow the SELECT.
In some scenarios, all replicas of a tablet have missing versions and the tablet cannot be recovered. This variable controls the behavior of queries in that case. When it is enabled, the query ignores the visible version recorded in the FE partition and uses the replica version instead. If a replica on a BE has missing versions, the query skips those versions and returns only the data of the versions that exist. In addition, the query always tries to select the replica with the highest lastSuccessVersion among all surviving BE replicas, so as to recover as much data as possible. Enable it only in the emergency scenarios described above, and only for temporary recovery queries. Note that this variable conflicts with the use_fix_replica variable: when use_fix_replica is not -1, this variable has no effect.
* `skip_bad_tablet`
In some scenarios, a user has a huge amount of data in a table that was created with only a single replica. If one of the tablets is damaged, the table can no longer be queried. If the user does not care about the integrity of the data, they can set this variable to temporarily skip the bad tablet when querying and load the remaining data into a new table.
* `default_password_lifetime`
Default password expiration time. The default value is 0, which means passwords never expire. The unit is days. This parameter takes effect only when the user's password expiration property is set to DEFAULT. For example:

@ -567,6 +567,10 @@ try (Connection conn = DriverManager.getConnection("jdbc:mysql://127.0.0.1:9030/
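In some extreme scenarios, all replicas of a table's tablets have missing versions, so these tablets cannot be recovered and the whole table becomes unqueryable. This variable controls query behavior in that case. When it is set to `true`, the query ignores the visibleVersion recorded in the FE partition and uses the replica version instead. If a replica on a BE has missing versions, the query skips those missing versions and returns only the data of the versions that still exist. In addition, the query always selects the replica with the largest lastSuccessVersion among all replicas on surviving BEs, so that as much data as possible can be recovered. This variable should be set to `true` only in the emergency scenarios described above, and only to temporarily make the table queryable again. Note that this variable conflicts with the use_fix_replica variable: when use_fix_replica is not equal to -1, this variable has no effect.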
* `skip_bad_tablet`
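In some cases, a user has a large amount of data in a single-replica table, and if one of its tablets is damaged, the whole table becomes unqueryable. If the user does not care about the integrity of the data, they can use this variable to temporarily skip the bad tablet when querying and load the remaining data into a new table.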
* `default_password_lifetime`
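Default password expiration time. The default value is 0, which means passwords never expire. The unit is days. This parameter takes effect only when the user's password expiration property is set to DEFAULT. For example: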

@ -765,6 +765,9 @@ public class OlapScanNode extends ScanNode {
// random shuffle List && only collect one copy
List<Replica> replicas = tablet.getQueryableReplicas(visibleVersion, skipMissingVersion);
if (replicas.isEmpty()) {
    if (ConnectContext.get().getSessionVariable().skipBadTablet) {
        continue;
    }
    LOG.warn("no queryable replica found in tablet {}. visible version {}", tabletId, visibleVersion);
    StringBuilder sb = new StringBuilder(
            "Failed to get scan range, no queryable replica found in tablet: " + tabletId);

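The control flow in the hunk above can be modelled in isolation. The following is a simplified sketch, not the actual `OlapScanNode` code: the class `SkipBadTabletSketch`, the method `planScanRanges`, and the map from tablet id to replica names are all hypothetical stand-ins for the real planner structures.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SkipBadTabletSketch {
    // Returns the ids of tablets that would be scanned.
    // A tablet whose replica list is empty stands in for a "bad" tablet.
    public static List<Long> planScanRanges(Map<Long, List<String>> queryableReplicas,
                                            boolean skipBadTablet) {
        List<Long> scannable = new ArrayList<>();
        for (Map.Entry<Long, List<String>> entry : queryableReplicas.entrySet()) {
            long tabletId = entry.getKey();
            if (entry.getValue().isEmpty()) {
                if (skipBadTablet) {
                    // Variable enabled: drop this tablet and keep planning.
                    continue;
                }
                // Variable disabled (the default): the whole query fails.
                throw new IllegalStateException(
                        "Failed to get scan range, no queryable replica found in tablet: "
                                + tabletId);
            }
            scannable.add(tabletId);
        }
        return scannable;
    }

    public static void main(String[] args) {
        Map<Long, List<String>> tablets = new LinkedHashMap<>();
        tablets.put(10001L, List.of("be-1"));
        tablets.put(10002L, List.of()); // damaged tablet, no queryable replica
        tablets.put(10003L, List.of("be-2"));

        System.out.println(planScanRanges(tablets, true)); // prints [10001, 10003]
        try {
            planScanRanges(tablets, false);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Note that in this model the skip happens before the warning is logged, mirroring the hunk: with the variable enabled, a skipped tablet leaves no trace in the planner output.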
@ -320,6 +320,8 @@ public class SessionVariable implements Serializable, Writable {
    public static final String SKIP_MISSING_VERSION = "skip_missing_version";
    public static final String SKIP_BAD_TABLET = "skip_bad_tablet";
    public static final String ENABLE_PUSH_DOWN_NO_GROUP_AGG = "enable_push_down_no_group_agg";
    public static final String ENABLE_CBO_STATISTICS = "enable_cbo_statistics";
@ -1146,6 +1148,14 @@ public class SessionVariable implements Serializable, Writable {
    @VariableMgr.VarAttr(name = SKIP_MISSING_VERSION)
    public boolean skipMissingVersion = false;
    // This variable controls whether to skip bad tablets.
    // In some scenarios, a user has a huge amount of data in a table that was created with only a single
    // replica. If one of the tablets is damaged, the table can no longer be queried. If the user does not care
    // about the integrity of the data, they can set this variable to temporarily skip the bad tablet when
    // querying and load the remaining data into a new table.
    @VariableMgr.VarAttr(name = SKIP_BAD_TABLET)
    public boolean skipBadTablet = false;
    // This variable is used to avoid FE fallback to the original parser. When we execute SQL in regression tests
    // for nereids, fallback will cause the Doris return the correct result although the syntax is unsupported
    // in nereids for some mistaken modification. You should set it on the
@ -2853,6 +2863,7 @@ public class SessionVariable implements Serializable, Writable {
        tResult.setEnableParallelScan(enableParallelScan);
        tResult.setParallelScanMaxScannersCount(parallelScanMaxScannersCount);
        tResult.setParallelScanMinRowsPerScanner(parallelScanMinRowsPerScanner);
        tResult.setSkipBadTablet(skipBadTablet);
        return tResult;
    }

@ -269,6 +269,11 @@ struct TQueryOptions {
  96: optional i32 parallel_scan_max_scanners_count = 0;
  97: optional i64 parallel_scan_min_rows_per_scanner = 0;
  98: optional bool skip_bad_tablet = false;
  // For cloud, to control if the content would be written into file cache
  1000: optional bool disable_file_cache = false
}