I want to write Perl code that checks for stop codons and replaces them with NNN. The code I wrote is below:
#!/usr/bin/perl
use strict;
use warnings;
# Check if the file name is provided as an argument
my $file = $ARGV[0];
open(my $fh, "<", $file) or die "Unable to open file";
my $sequence = "";
my $id = "";
while (my $line = <$fh>) {
    chomp($line);
    if ($line =~ /^>/) {
        if ($sequence ne "") {
            # Split sequence into codons
            my @codon = $sequence =~ /.{1,3}/g;
            print join(" ", @codon), "\n";
            print $id, "\n";
            # Check for stop codons and replace them with "NNN"
            foreach my $codon (@codon) {
                if ($codon =~ /^(TAG|TGA|TAA)/) {
                    $codon = "NNN";
                }
            }
        }
        $sequence = "";
        $id = $line;
    } else {
        $sequence .= $line;
    }
}
# Print last sequence
if ($sequence) {
    my @codon = $sequence =~ /.{1,3}/g;
    print join(" ", @codon), "\n";
    print $id, "\n";
}
close($fh) or die "Unable to close file";
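It should take its input, a FASTA file, from the command line and process it: split each sequence into codons (groups of three bases) and replace stop codons with NNN.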
My input sequence is:
>header
ATGGACCAGCAGCAGCAGCAGCAGTAA
I am expecting something like this:
>header
ATGGACCAGCAGCAGCAGCAGCAGNNN
It also does not process the last sequence in the file. The output I get is:
>header
ATG GAC CAG CAG CAG CAG CAG CAG TAA
except that the header of the first sequence and the sequence under the last header are missing.
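The substitution does not happen because your program logic is incorrect. With your single-record input, the following condition is never true (when the only header line is read, $sequence is still empty), so your substitution code is never executed: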
if ($sequence ne "")
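Then, in the code under the # Print last sequence comment, you never attempt the substitution. Note also that inside the loop the codons are printed before the substitution runs, so even with several records the printed codons would be unchanged.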
Here is a self-contained example that performs the substitution:
use warnings;
use strict;
while (my $line = <DATA>) {
    chomp($line);
    if ($line =~ /^>/) {
        print "$line\n";
    } else {
        # Split sequence into codons
        my @codon = $line =~ /.{1,3}/g;
        # Check for stop codons and replace them with "NNN"
        foreach my $codon (@codon) {
            if ($codon =~ /^(TAG|TGA|TAA)/) {
                $codon = "NNN";
            }
        }
        print join(" ", @codon), "\n";
    }
}
__DATA__
>header
ATGGACCAGCAGCAGCAGCAGCAGTAA
Output:
>header
ATG GAC CAG CAG CAG CAG CAG CAG NNN
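The example above handles each sequence line on its own and prints the codons separated by spaces. To get closer to what the question asks for (file names taken from the command line, sequences that may wrap over several lines, output without spaces), the same idea could be extended roughly as follows. This is only a sketch: the sub names mask_stop_codons and mask_fasta are made up for illustration, and it assumes upper-case DNA with the reading frame starting at the first base of each record.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Replace every in-frame stop codon with NNN and return the re-joined sequence.
# Assumption: the frame starts at the first base and the sequence is upper-case.
sub mask_stop_codons {
    my ($sequence) = @_;
    my @codons = $sequence =~ /.{1,3}/g;
    for my $codon (@codons) {
        # $codon is an alias, so assigning to it updates @codons in place
        $codon = "NNN" if $codon =~ /^(?:TAG|TGA|TAA)$/;
    }
    return join("", @codons);
}

# Read FASTA records from filehandle $in and write masked records to $out.
sub mask_fasta {
    my ($in, $out) = @_;
    my ($id, $sequence) = ("", "");

    # Emit the record collected so far, if there is one.
    my $flush = sub {
        print {$out} "$id\n" if $id ne "";
        print {$out} mask_stop_codons($sequence), "\n" if $sequence ne "";
    };

    while (my $line = <$in>) {
        $line =~ s/\s+\z//;    # strip the newline and any trailing whitespace
        next if $line eq "";
        if ($line =~ /^>/) {
            $flush->();        # a new header ends the previous record
            ($id, $sequence) = ($line, "");
        } else {
            $sequence .= $line;    # join wrapped sequence lines before splitting
        }
    }
    $flush->();    # the last record has no following header to trigger it
}

warn "Usage: $0 input.fasta [more.fasta ...]\n" unless @ARGV;
for my $file (@ARGV) {
    open(my $fh, "<", $file) or die "Unable to open '$file': $!\n";
    mask_fasta($fh, \*STDOUT);
    close($fh) or die "Unable to close '$file': $!\n";
}
```

Saved under a name of your choice and run with a FASTA file as its argument, this should print each header followed by its masked sequence on one line. Collecting the whole record before splitting keeps the codon frame intact when a sequence is wrapped across lines.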
Are you familiar with BioPerl?