Submitted Snakemake rules start running and fail immediately without throwing any errors, but Snakemake keeps running

Question

I wrote a Snakemake pipeline for my project, part of which looks like this:

SAMPLES, = glob_wildcards("/absolute/path/to/samples/{sample}.bam")
rule all:
    input:
       expand("splits/sample_check/{sample}_done.txt", sample=SAMPLES)

rule GVCFSplit:
    input:
        "gvcf/{SAMPLES}/",
        #"chr_pos_test/chr{c}/chr{c}_reg{i}.txt"
    output:
        "splits/sample_check/{SAMPLES}_done.txt"
    log:
        "logs/GVCFSplit/{SAMPLES}_done.log"
    benchmark:
        "benchmarks/GVCFSplit/{SAMPLES}_done.benchmark.txt"
    envmodules:
        "bcftools"
    resources:
        mem='1g',
        time='4:00:00',
        threads=1
    shell:
        r"""
            python3 /absolute/path/to/python/script/GVCF_split.py {wildcards.SAMPLES}
        """

Using the Python script below, this rule splits each chromosome file into 50 Mb chunks:

from pathlib import Path 
import subprocess
from sys import argv
import os,sys

sample_id=argv[1].strip()
chrs=list(range(1,23))

for c in chrs:
    sample_file=("/path/to/chromosome/files/per/sample/%s/%s_chr%i.g.vcf.gz") % (sample_id,sample_id,c)
    for r in range(1,chr_reg[c]+1):
        reg_file=("/path/to/per/chromosome/regions/chr%i/chr%i_reg%i.txt") % (c,c,r)
        #out_file=("try/gvcf/splits/chr%i/%s_reg%i.g.vcf.gz") % (c,sample_id,r)
        out_file=("/path/to/spiltted/vcf/files/chr%i/%s_reg%i.g.vcf.gz") % (c,sample_id,r)
        #Path(out_file).touch()
        proc = subprocess.run(["bcftools", "view", sample_file, "-Oz", "-o", out_file, "-R", reg_file])
        result = proc.returncode
        exit += result

if exit == 0:
    Path("splits/sample_check/"+sample_id+"_done.txt").touch() #creates a file for snakemake to track the changes if everything went fine
    sys.exit(0)
else:
    sys.exit(1)

When I run the Python script manually:

python3 GVCF_split.py "sample_id"

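it runs smoothly. However, when I submit this Snakemake file to the cluster with --profile, the rules are submitted per sample as expected, but they fail immediately after they start running. After that, Snakemake keeps running and does not raise any error. This is the config file I use with the --profile flag: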

cluster: mkdir -p slurm_snake/`basename {workflow.main_snakefile}`/{rule} &&
  sbatch
  --partition={resources.partition}
  --cpus-per-task={resources.threads}
  --mem={resources.mem}
  --time={resources.time}
  --job-name=smk-{rule}-{wildcards}
  --output=try/slurm_snake/`basename {workflow.main_snakefile}`/{rule}/{rule}-{wildcards}-%j.out
default-resources:
  - partition=main
  - mem='4G'
  - time="24:0:0"
  - threads=1
restart-times: 0
max-jobs-per-second: 5
max-status-checks-per-second: 1
local-cores: 1
latency-wait: 60
jobs: 1000
keep-going: True
rerun-incomplete: True
printshellcmds: True
scheduler: greedy

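My original Snakemake file has a similar setup (this one is a copy, made so I can try things out before changing the original). With the original file, a separate slurm file for every submission of every rule is saved in the slurm_snake folder. For these rules, however, no slurm files are written. What is the reason, and what am I doing wrong when submitting these rules to the cluster?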

Here is an example of the slurm output from the main Snakemake cluster submission:

Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cluster nodes: 1000
Job stats:
job          count    min threads    max threads
---------  -------  -------------  -------------
GVCFSplit        5              1              1
all              1              1              1
total            6              1              1

Select jobs to execute...

[Thu Mar 14 14:47:00 2024]
rule GVCFSplit:
    input: gvcf/12_19264_20
    output: splits/sample_check/12_19264_20_done.txt
    log: logs/GVCFSplit/12_19264_20_done.log
    jobid: 5
    benchmark: benchmarks/GVCFSplit/12_19264_20_done.benchmark.txt
    reason: Missing output files: splits/sample_check/12_19264_20_done.txt
    wildcards: SAMPLES=12_19264_20
    resources: mem_mb=1000, disk_mb=1000, tmpdir=/tmp, partition=main, mem=1g, time=4:00:00, threads=1


            python3 /path/to/script/GVCF_split.py 12_19264_20

When I run the Python script manually from the terminal, it runs without any errors.

Answer 1

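I have run into a similar problem before, where Snakemake kept waiting on jobs that were no longer running, without reporting an error. It turned out that I had not allocated enough time and memory to the jobs through Snakemake, and my SLURM profile was not set up to report this. You can get more details about failed jobs in SLURM by running

seff <jobid>

which may tell you more about why the jobs failed.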
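One way to make a profile report such failures, in Snakemake versions that accept a cluster: command like the one in the question, is a status script passed through the cluster-status option. Snakemake calls that script with the job ID and expects it to print success, failed, or running. The sketch below is a hypothetical script, not part of the original setup: the file name, the list of failure states, and the helper names classify_state and query_state are assumptions. It also assumes that sacct accounting is enabled on the cluster and that sbatch is called with --parsable, so that Snakemake receives a bare job ID.

```python
#!/usr/bin/env python3
"""Hypothetical status script for Snakemake's cluster-status option (a sketch)."""
import subprocess
import sys

# SLURM states treated as "finished and failed" (assumed list; adjust per site).
FAILED_STATES = {
    "BOOT_FAIL", "CANCELLED", "DEADLINE", "FAILED",
    "NODE_FAIL", "OUT_OF_MEMORY", "PREEMPTED", "TIMEOUT",
}


def classify_state(raw_state: str) -> str:
    """Map a SLURM job state string to the word Snakemake expects."""
    # sacct may report e.g. "CANCELLED by 1234" or a truncated "CANCELLED+".
    words = raw_state.split()
    state = words[0].rstrip("+") if words else ""
    if state == "COMPLETED":
        return "success"
    if state in FAILED_STATES:
        return "failed"
    # PENDING, RUNNING, or not yet visible in accounting: keep waiting.
    return "running"


def query_state(jobid: str) -> str:
    """Ask sacct for the state of the job; return "" if nothing is reported."""
    result = subprocess.run(
        ["sacct", "-j", jobid, "--format=State", "--noheader", "--parsable2"],
        capture_output=True, text=True, check=False,
    )
    lines = result.stdout.splitlines()
    return lines[0] if lines else ""


# Only query SLURM when called as a script with exactly one job ID.
if __name__ == "__main__" and len(sys.argv) == 2:
    print(classify_state(query_state(sys.argv[1])))
```

With a script like this saved as, say, status.py next to the profile and made executable, the profile would gain a line such as cluster-status: status.py, and a job that times out or runs out of memory would be marked as failed instead of being waited on.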
