I am trying to get a tutorial pipeline running on GKE. LINK
The specific rule that fails is:
rule salmon_quant:
    output: directory("salmon.{sample}")
    input:
        index = "Saccharomyces_cerevisiae.R64-1-1.salmon_index",
        fq1 = "trimmed/{sample}_1.fq",
        fq2 = "trimmed/{sample}_2.fq",
    shell:
        "salmon quant -i {input.index} -l A -1 {input.fq1} -2 {input.fq2} --validateMappings -o {output}"

rule salmon_index:
    output:
        idx = directory("{strain}.salmon_index")
    input:
        fasta = "transcriptome/{strain}.cdna.all.fa.gz"
    shell:
        "salmon index -t {input.fasta} -i {output.idx} -k 31"
The error I get from the Pod logs is:
Downloading from remote: smk_demo/trimmed/ref_3_2.fq
Finished download.
Downloading from remote: smk_demo/Saccharomyces_cerevisiae.R64-1-1.salmon_index
Finished download.
Downloading from remote: smk_demo/trimmed/ref_3_1.fq
Finished download.
Activating conda environment: .snakemake/conda/c7537b64e219a0e5525266c7cc140f93_
Version Info: Could not resolve upgrade information in the alotted time.
Check for upgrades manually at https://combine-lab.github.io/salmon
### salmon (mapping-based) v1.2.1
#
# other salmon output omitted
#
[2023-07-26 05:28:49.492] [jointLog] [info] There is 1 library.
Exception : [Error: The index version file smk_demo/Saccharomyces_cerevisiae.R64-1-1.salmon_index/versionInfo.json doesn't seem to exist. Please try re
salmon quant was invoked improperly.
I checked inside the container:
k exec -it snakejob-21242d20-1417-50c4-a83b-8e5e4ae5f8d3 -- ls /workdir/smk_demo/Saccharomyces_cerevisiae.R64-1-1.salmon_index/
ls: cannot access '/workdir/smk_demo/Saccharomyces_cerevisiae.R64-1-1.salmon_index/': Not a directory
command terminated with exit code 2
I confirmed that the directory generated by rule salmon_index (and the files in it) was indeed uploaded to the bucket.
Since GCS buckets don't have real "directories", I think the cause is this line:
Downloading from remote: smk_demo/Saccharomyces_cerevisiae.R64-1-1.salmon_index
The index was downloaded as a file.
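To illustrate what I mean by "no real directories", here is a toy model in plain Python. It is only a sketch of my understanding: it does not call GCS or snakemake, the `bucket` dict and both helper functions are made up for illustration, and the only index file name I am sure of is `versionInfo.json`, taken from the salmon error above.

```python
# Toy model of a flat object store: keys are plain strings and "/" has no
# special meaning. This is NOT the GCS or snakemake API, just an illustration.
bucket = {
    "smk_demo/trimmed/ref_3_1.fq": b"",
    "smk_demo/trimmed/ref_3_2.fq": b"",
    # Only file name known from the salmon error message; a real index has more.
    "smk_demo/Saccharomyces_cerevisiae.R64-1-1.salmon_index/versionInfo.json": b"{}",
}


def exists_as_object(store, name):
    """True only if an object with exactly this key exists."""
    return name in store


def list_prefix(store, prefix):
    """Return every key under 'prefix/', which is the closest thing to a directory listing."""
    if not prefix.endswith("/"):
        prefix += "/"
    return sorted(key for key in store if key.startswith(prefix))


index = "smk_demo/Saccharomyces_cerevisiae.R64-1-1.salmon_index"
print(exists_as_object(bucket, index))  # the "directory" itself is not an object
print(list_prefix(bucket, index))       # its contents only show up via the prefix
```

In this model, fetching the index by its exact name cannot return a directory; something would have to list the prefix and download each object under it.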
I tried adding a trailing / to the input:
index = "Saccharomyces_cerevisiae.R64-1-1.salmon_index/",
This did not help.
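I am using snakemake 7.28.3, and I could not find any bug report or ticket that matches my problem.
Since directory() only applies to outputs, not inputs, is there another way to tell snakemake that this input needs to be downloaded from GCS as a directory?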