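Question
I want to pass the 'value' variable from my dictionary (built from a simple csv file) to a sed call made through subprocess in Python. The problem is that I get this error:
sed: -e expression #1, char 1: unknown command: `''
when I run the following script: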
import sys
import subprocess
speciesdictfile = open("speciesfiletest.csv",'r')
file = sys.argv[1]
dict = {}
for line in speciesdictfile:
    fields = line.split(',')
    dict[fields[0]] = fields[1]
for line in file:
    for key, value in dict.items():
        if file == key:
            subprocess.call(["sed", "'s/>/>" + value + "_/g'", file])
When I try this instead:
subprocess.call(['sed', 's/>/>' + value + '_/g', file])
I get the following error:
sed: -e expression #1, char 30: unterminated `s' command
Example input
Dictionary CSV file:
file,Species
GCF_000006175.1_ASM617v2_genomic.faa,Methanococcus voltae
GCF_000006805.1_ASM680v1_genomic.faa,Halobacterium sp.
The file I want to run the search and replace on, for example one named GCF_000006175.1_ASM617v2_genomic.faa:
>NZ_LT985082.1_1_1
EQVWKSIKKYMAYYLFDTIEFMEKLFEKEFYRIVNRDSYYKNWISKFIMIN*
>NZ_LT985082.1_2_1
MKFNISKLWNPTGFFISFFMSFLMPIMFAVPFGYIPIDIFLYQQLIRWPVAYFIVTLIVI
PISLYLAKSFFTFPPTDRFFNPVTFFISLQMSFIMPFLLGYGFGSMSLNILFLMWPMRWV
VAYFMVNFAIRPLSISLARIVFNVEPQHLIIKF*
Desired output
A working sed command that replaces every instance of '>' on a line with '>' followed by the value variable, with no spaces, e.g.:
>Methanococcus_voltae_NZ_LT985082.1_1_1
EQVWKSIKKYMAYYLFDTIEFMEKLFEKEFYRIVNRDSYYKNWISKFIMIN*
>Methanococcus_voltae_NZ_LT985082.1_2_1
MKFNISKLWNPTGFFISFFMSFLMPIMFAVPFGYIPIDIFLYQQLIRWPVAYFIVTLIVI
PISLYLAKSFFTFPPTDRFFNPVTFFISLQMSFIMPFLLGYGFGSMSLNILFLMWPMRWV
VAYFMVNFAIRPLSISLARIVFNVEPQHLIIKF*
The problem was that a newline was being picked up from the csv file. I solved it with:
import sys
import subprocess
speciesdictfile = open("speciesfiletest.csv",'r')
file = sys.argv[1]
dict = {}
for line in speciesdictfile:
    fields = line.rstrip().split(',')
    dict[fields[0]] = fields[1]
for line in file:
    for key, value in dict.items():
        if file == key:
            subprocess.call("sed -e 's/>/>" + value + "_/g' " + file, shell=True)
The line
fields = line.rstrip().split(',')
stops the trailing newline from being stored in the dictionary, so the values can be used in the subprocess.call sed command.
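To make the cause concrete, here is a minimal sketch of how the newline ends up in the value and how the sed arguments could be built without shell=True. The helper name build_sed_args is hypothetical, and replacing spaces with underscores is an assumption based on the desired output shown in the question (the code above does not do it). When the arguments are passed as a list, no shell is involved, so the expression must not be wrapped in extra single quotes; those quotes reach sed literally, which would explain the first "unknown command" error.

```python
def build_sed_args(value, path):
    """Return an argv list for sed (hypothetical helper, not from the original post).

    The list form is run without a shell, so the sed expression needs no
    surrounding quotes.
    """
    # Strip stray whitespace/newlines; the space -> underscore step is an
    # assumption taken from the desired output in the question.
    label = value.strip().replace(" ", "_")
    return ["sed", "-e", "s/>/>" + label + "_/g", path]


# One line as it would be read from the csv file, newline included.
line = "GCF_000006175.1_ASM617v2_genomic.faa,Methanococcus voltae\n"

raw = line.split(",")[1]             # still ends with "\n"
clean = line.rstrip().split(",")[1]  # newline removed before splitting

args = build_sed_args(raw, "GCF_000006175.1_ASM617v2_genomic.faa")
```

The resulting list could then be passed as subprocess.call(args). This sketch assumes the species names contain no '/', '&' or backslash characters, which have special meaning in a sed replacement and would need escaping.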