I have a folder containing paired-end sequencing data named like this:
Bk1-ITS2_S102_L001_R1_001.fastq.gz
Bk1-ITS2_S102_L001_R2_001.fastq.gz
Bk2-ITS2_S103_L001_R1_001.fastq.gz
Bk2-ITS2_S103_L001_R2_001.fastq.gz
Fl1-ITS2_S201_L001_R1_001.fastq.gz
Fl1-ITS2_S201_L001_R2_001.fastq.gz
Fl2-ITS2_S202_L001_R1_001.fastq.gz
Fl2-ITS2_S202_L001_R2_001.fastq.gz
Mn1-ITS2_S401_L001_R1_001.fastq.gz
Mn1-ITS2_S401_L001_R2_001.fastq.gz
Mn2-ITS2_S402_L001_R1_001.fastq.gz
Mn2-ITS2_S402_L001_R2_001.fastq.gz
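I would like to run the following command on the whole folder using a loop or wildcards, because doing it by hand for each read pair is tedious and takes a lot of time: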
reformat.sh in=Bk1-ITS2_S102_L001_R1_001.fastq.gz in2=Bk1-ITS2_S102_L001_R2_001.fastq.gz out=./reformat/Bk1-ITS2_S102_L001_R1_001_reformatted.fastq.gz out2=./reformat/Bk1-ITS2_S102_L001_R2_001_reformatted.fastq.gz mincalledquality=2 maxcalledquality=41 qin=33
That way I would end up with the following for all files, with modified quality scores and an extra "_reformatted" string added to each file name:
Bk1-ITS2_S102_L001_R1_001_reformatted.fastq.gz
Bk1-ITS2_S102_L001_R2_001_reformatted.fastq.gz
Bk2-ITS2_S103_L001_R1_001_reformatted.fastq.gz
Bk2-ITS2_S103_L001_R2_001_reformatted.fastq.gz
Fl1-ITS2_S201_L001_R1_001_reformatted.fastq.gz
Fl1-ITS2_S201_L001_R2_001_reformatted.fastq.gz
Fl2-ITS2_S202_L001_R1_001_reformatted.fastq.gz
Fl2-ITS2_S202_L001_R2_001_reformatted.fastq.gz
Mn1-ITS2_S401_L001_R1_001_reformatted.fastq.gz
Mn1-ITS2_S401_L001_R2_001_reformatted.fastq.gz`
Mn2-ITS2_S402_L001_R1_001_reformatted.fastq.gz
Mn2-ITS2_S402_L001_R2_001_reformatted.fastq.gz
At the moment I can only process one pair of reads at a time. Can someone help me do this for the whole folder?
Here is one way to do the renaming in bash (I assume the "`" is a typo):
for f in *.fastq.gz; do mv -- "$f" "${f%.fastq.gz}_reformatted.fastq.gz"; done
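The one-liner above only renames files; it does not run reformat.sh. If the aim is to run the command from the question on every R1/R2 pair, a loop along these lines might work. This is only a sketch: it assumes every forward file ends in _R1_001.fastq.gz and has a mate ending in _R2_001.fastq.gz in the same folder. The function name reformat_pairs and the DRY_RUN switch are made up for this example and are not part of BBTools. The reformat.sh arguments are copied unchanged from the question.

```shell
# Hypothetical helper: process every R1/R2 pair in the current directory.
# With DRY_RUN=1 (the default chosen for this sketch) it only prints each
# command; with DRY_RUN=0 it creates ./reformat and runs reformat.sh.
reformat_pairs() {
    for r1 in *_R1_001.fastq.gz; do
        [ -e "$r1" ] || continue                 # glob matched nothing
        base=${r1%_R1_001.fastq.gz}              # e.g. Bk1-ITS2_S102_L001
        r2=${base}_R2_001.fastq.gz               # expected mate file
        if [ ! -e "$r2" ]; then
            echo "skipping $r1: $r2 not found" >&2
            continue
        fi
        # Build the command as positional parameters so file names stay quoted.
        set -- reformat.sh \
            in="$r1" in2="$r2" \
            out="./reformat/${base}_R1_001_reformatted.fastq.gz" \
            out2="./reformat/${base}_R2_001_reformatted.fastq.gz" \
            mincalledquality=2 maxcalledquality=41 qin=33
        if [ "${DRY_RUN:-1}" = 1 ]; then
            echo "$*"                            # show what would be run
        else
            mkdir -p reformat
            "$@"
        fi
    done
}

reformat_pairs
```

Running it once as-is prints the commands so they can be checked; setting DRY_RUN=0 in the environment before running it again executes them. Looping over the R1 files only and deriving the R2 name from each one keeps the two mates together, which a plain loop over *.fastq.gz would not do.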