Currently, an index change job and a clone task can run at the same time. If the clone task gets stuck, the index change job gets stuck as well and keeps retrying.

To solve this, follow the approach used by alter jobs: make the index change job mutually exclusive with clone tasks, and introduce a timeout so that build index cannot retry indefinitely.

Add the following checks and status in FE:

1. Check that the table is stable (build index is not allowed while a clone is in progress).
   1.1. Every tablet is HEALTHY.
   1.2. No tablet is in the tablet scheduler; a tablet that is in the scheduler is currently being cloned.
2. Set the timeout when the index change job is created.

pick from master #35724
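The checks listed above can be illustrated with a minimal, self-contained sketch. It is not the FE code from this change: every class, method, and enum name below (`IndexChangeSketch`, `TabletHealth`, `isTableStable`, `createJob`, and so on) is hypothetical, and the tablet scheduler is modelled simply as a set of tablet ids.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: none of these names are taken from the Doris code base.
final class IndexChangeSketch {
    // Assumed health states; only HEALTHY is named in the description above.
    enum TabletHealth { HEALTHY, REPLICA_MISSING, VERSION_INCOMPLETE }

    static final class Tablet {
        final long id;
        final TabletHealth health;

        Tablet(long id, TabletHealth health) {
            this.id = id;
            this.health = health;
        }
    }

    // Check 1: the table is stable only if every tablet is HEALTHY (1.1)
    // and no tablet is currently held by the tablet scheduler (1.2).
    static boolean isTableStable(List<Tablet> tablets, Set<Long> scheduledTabletIds) {
        for (Tablet tablet : tablets) {
            if (tablet.health != TabletHealth.HEALTHY) {
                return false;
            }
            if (scheduledTabletIds.contains(tablet.id)) {
                return false; // the tablet is being cloned
            }
        }
        return true;
    }

    static final class IndexChangeJob {
        final long createTimeMs;
        final long timeoutMs;

        IndexChangeJob(long createTimeMs, long timeoutMs) {
            this.createTimeMs = createTimeMs;
            this.timeoutMs = timeoutMs;
        }

        // Lets the caller stop retrying once the timeout has elapsed.
        boolean isTimedOut(long nowMs) {
            return nowMs - createTimeMs > timeoutMs;
        }
    }

    // Check 2: the timeout is fixed at the moment the job is created.
    static IndexChangeJob createJob(List<Tablet> tablets, Set<Long> scheduledTabletIds,
                                    long nowMs, long timeoutMs) {
        if (!isTableStable(tablets, scheduledTabletIds)) {
            throw new IllegalStateException(
                    "table is not stable; build index is not allowed while a clone is in progress");
        }
        return new IndexChangeJob(nowMs, timeoutMs);
    }

    public static void main(String[] args) {
        List<Tablet> tablets = Arrays.asList(
                new Tablet(1L, TabletHealth.HEALTHY),
                new Tablet(2L, TabletHealth.HEALTHY));
        Set<Long> cloning = new HashSet<>(Collections.singletonList(2L));

        System.out.println("stable with no clone: "
                + isTableStable(tablets, Collections.<Long>emptySet()));
        System.out.println("stable while tablet 2 is cloning: "
                + isTableStable(tablets, cloning));
    }
}
```

In this sketch the stability check runs once at job creation, and the timeout bounds how long a job that was admitted can keep retrying. Where the real change performs these checks is not stated in the description.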
Notes for adding new cases

- Declare variables with def; otherwise they are global and can be affected by other cases when cases run in parallel.
  Problematic code:
      ret = ***
  Correct code:
      def ret = ***
- Avoid setting session variables globally or changing cluster configuration in a case, since this can affect other cases.
  Problematic code:
      sql """set global enable_pipeline_x_engine=true;"""
  Correct code:
      sql """set enable_pipeline_x_engine=true;"""
- If you must set a global variable or change cluster configuration, specify that the case runs in nonConcurrent mode.
- For anything time-related in a case, use a fixed time rather than a dynamic value such as now(), so that the case does not start failing after some time has passed.
  Problematic code:
      sql """select count(*) from table where created < now();"""
  Correct code:
      sql """select count(*) from table where created < '2023-11-13';"""
- After a streamLoad in a case, add a sync to avoid unstable results in a multi-FE environment.
  Problematic code:
      streamLoad { ... }
      sql """select count(*) from table """
  Correct code:
      streamLoad { ... }
      sql """sync"""
      sql """select count(*) from table """