This is part of my snakemake workflow:
Rule bagel2bf occasionally fails on some datasets, but I don't want the whole snakemake run to fail.
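How can I adjust the workflow so that, if rule bagel2bf exits with code 1, the rules that depend on its output are skipped (and perhaps a warning message is printed)?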
Here is the code for the rules:

rule install_bagel2:
    output:
        directory("bagel2_software/"),
    shell:
        "git clone https://github.com/hart-lab/bagel.git {output}"

rule convert_count_table:
    input:
        "count/counts-aggregated.tsv"
    output:
        "count/counts-aggregated-bagel2.tsv"
    params:
        fa=fasta,
    resources:
        runtime=config["resources"]["stats"]["time"]
    conda:
        "envs/stats.yaml"
    script:
        "scripts/convert_count_table.py"

rule bagel2fc:
    input:
        "bagel2_software/",
        "count/counts-aggregated-bagel2.tsv",
    output:
        "bagel2/{bcomparison}/{bcomparison}.foldchange"
    resources:
        runtime=config["resources"]["stats"]["time"]
    conda:
        "envs/stats.yaml"
    log:
        "logs/bagel2/fc/{bcomparison}.log"
    script:
        "scripts/bagel2fc.py"

rule bagel2bf:
    input:
        "bagel2_software/",
        "bagel2/{bcomparison}/{bcomparison}.foldchange",
    output:
        "bagel2/{bcomparison}/{bcomparison}.bf"
    params:
        species=config["lib_info"][library]["species"],
    resources:
        runtime=config["resources"]["stats"]["time"]
    conda:
        "envs/stats.yaml"
    log:
        "logs/bagel2/bf_{bcomparison}.log"
    script:
        "scripts/bagel2bf.py"

rule bagel2pr:
    input:
        "bagel2_software/",
        "bagel2/{bcomparison}/{bcomparison}.bf",
    output:
        report(
            "bagel2/{bcomparison}/{bcomparison}.pr",
            caption="report/bagel2.rst",
            category="BAGEL2",
        )
    params:
        species=config["lib_info"][library]["species"]
    resources:
        runtime=config["resources"]["stats"]["time"]
    conda:
        "envs/stats.yaml"
    log:
        "logs/bagel2/pr_{bcomparison}.log"
    script:
        "scripts/bagel2pr.py"

rule plot_bf:
    input:
        "bagel2/{bcomparison}/{bcomparison}.bf"
    output:
        report(
            "bagel2_plots/{bcomparison}/{bcomparison}.bf.pdf",
            caption="report/bagel2_plots.rst",
            category="BAGEL2 plots",
            subcategory="{bcomparison}",
            labels={"Comparison": "{bcomparison}", "Figure": "BF plot"},
        )
    conda:
        "envs/stats.yaml"
    script:
        "scripts/plot_bf.py"

rule plot_pr:
    input:
        "bagel2/{bcomparison}/{bcomparison}.pr"
    output:
        report(
            "bagel2_plots/{bcomparison}/{bcomparison}.pr.pdf",
            caption="report/bagel2_plots.rst",
            category="BAGEL2 plots",
            subcategory="{bcomparison}",
            labels={"Comparison": "{bcomparison}", "Figure": "Precision-recall plot"},
        )
    conda:
        "envs/stats.yaml"
    script:
        "scripts/plot_pr.py"

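I would catch the exception inside rule bagel2bf, for example with a Python try/except or a similar mechanism. Then write the expected output file anyway, either empty or containing some useful, parseable information about the exception. Downstream rules would handle such files in a similar way, by writing an empty output file or some flag file of their own.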
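As a rough illustration of the first half of this idea, here is a minimal sketch of what the failing step could do. It assumes that scripts/bagel2bf.py launches BAGEL2 as an external command (the script's contents are not shown in the question), and the helper name run_or_placeholder is hypothetical, not part of snakemake or BAGEL2.

```python
import subprocess
import sys
from pathlib import Path


def run_or_placeholder(cmd, output_path, log_path):
    """Run cmd; on a non-zero exit code, write an empty placeholder output.

    Hypothetical helper. Returns True if the command succeeded and
    False if a placeholder file was written instead.
    """
    output = Path(output_path)
    log = Path(log_path)
    output.parent.mkdir(parents=True, exist_ok=True)
    log.parent.mkdir(parents=True, exist_ok=True)

    # Send the command's stdout and stderr to the rule's log file.
    with log.open("w") as log_handle:
        result = subprocess.run(cmd, stdout=log_handle, stderr=subprocess.STDOUT)

    if result.returncode == 0:
        return True

    # The failure is anticipated: create the declared output as a zero-byte
    # file so that snakemake still finds it and the run continues.
    output.write_text("")
    message = (
        f"WARNING: command exited with code {result.returncode}; "
        f"wrote empty placeholder {output}"
    )
    with log.open("a") as log_handle:
        log_handle.write(message + "\n")
    print(message, file=sys.stderr)
    return False
```

Inside a snakemake script this could be called as run_or_placeholder(cmd, snakemake.output[0], snakemake.log[0]), where cmd is whatever command list the script already builds. Because the helper exits normally after writing the placeholder, snakemake treats the job as successful.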
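I am not aware of a mechanism that explicitly tells snakemake to skip part of the pipeline when a failure occurs, and I think that would go against snakemake's principles. I think the solution above is sensible because, if failures are to some degree expected, downstream rules should handle them accordingly rather than be skipped.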
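On the downstream side, handling an expected failure could look roughly like the sketch below. It assumes the upstream rule signals failure with a zero-byte file; the names is_placeholder and process_or_propagate are hypothetical, and process stands in for whatever the downstream script normally does with its input.

```python
import sys
from pathlib import Path


def is_placeholder(path):
    """Return True when the upstream file is a zero-byte placeholder."""
    return Path(path).stat().st_size == 0


def process_or_propagate(input_path, output_path, process):
    """Call process(input_path, output_path) unless the input is a placeholder.

    Hypothetical helper. For a placeholder input, an empty output file is
    written and a warning is printed. Returns True if real work was done.
    """
    output = Path(output_path)
    output.parent.mkdir(parents=True, exist_ok=True)

    if is_placeholder(input_path):
        print(
            f"WARNING: {input_path} is empty (upstream step failed); "
            f"writing empty {output}",
            file=sys.stderr,
        )
        # Pass the failure marker on so later rules can react the same way.
        output.write_text("")
        return False

    process(input_path, output_path)
    return True
```

In scripts such as scripts/bagel2pr.py or scripts/plot_bf.py this could wrap the existing logic, with snakemake.input and snakemake.output supplying the paths. Note that a zero-byte .pdf is not a valid PDF, so for the plotting rules the report entry would only be a marker that the comparison failed.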