Eliminating uninitialized values in my Perl hash of arrays

Question

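I have successfully created a hash of arrays, which I use to calculate a log-odds score for each DNA sequence in a file (the input file format is described in "Creating a hash of arrays for DNA sequences, Perl"). I get a score for each sequence, but I also get a warning for every calculation. Naturally, I want to clear up the warnings. The warning is: Use of uninitialized value in string eq at line 148.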

Here is a condensed version of the code (I can post the full code if needed):

use strict;
use warnings;
use Data::Dumper;

#USER SPECIFICATIONS
print "Please enter the filename of the fasta sequence data: ";
my $filename1 = <STDIN>;

#Remove newline from the filename
chomp $filename1;

#Open the file and store each dna seq in hash
my %id2seq = ();
my %HoA = ();
my %loscore = ();
my $id = '';
open (FILE, '<', $filename1) or die "Cannot open $filename1.",$!;
my $dna;
while (<FILE>)
{
    if($_ =~ /^>(.+)/)
    {
         $id = $1; #Stores 'Sequence 1' as the first $id, for example
    }
    else
    {
        $HoA{$id} = [ split(//) ]; #Splits the contents to allow for position reference later
        $id2seq{$id} .= $_; #Creates a hash with each seq associated to an id number, used for calculating tables that have been omitted for space
        $loscore{$id} .= 0; #Creates a hash with each id number to have a log-odds score
    }
}
close FILE;

#User specifies motif width
print "Please enter the motif width:\n";
my $width = <STDIN>;

#Remove newline from the width input
chomp $width;

#Default width is 3 (arbitrary number chosen)
if ($width eq '')
{
    $width = 3;
}

#Omitting code about $width<=0, creation of log-odds score hash to save space

foreach $id (keys %HoA, %loscore)
{
    for my $pos (0..($width-1))
    {
        for my $base (qw( A C G T))
        {
            if ($HoA{$id}[$pos] eq $base) #ERROR OCCURS HERE
            {
                $loscore{$id} += $logodds{$base}[$pos];
            }
            elsif ( ! defined $HoA{$id}[$pos]) 
            {
                print "$pos\n"; 
            }
        }
    }
}
print Dumper(\%loscore);

The output I get is:

Use of uninitialized value in string eq at line 148, <STDIN> line 2.
2
(This error repeats 4 times for each position - most likely due to matching to each $base?)

$VAR1 = {
         'Sequence 15' => '-1.27764697876093',
         'Sequence 4' => '0.437512962981119',
         (continues for 29 sequences)
        }
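
One possible source of the warning is the loop header foreach $id (keys %HoA, %loscore). In list context a bare hash flattens into its keys and its values, so the values of %loscore (here 0) would also be used as ids, and $HoA{0} has no array behind it. The sketch below reproduces that effect in isolation. It is only an illustration under that assumption: the helper count_undefined and the one-sequence data are made up for the example and are not part of the original script. A sequence shorter than the motif width would produce the same warning.

```perl
use strict;
use warnings;

# Tiny stand-in data: one sequence, already split into bases.
my %HoA     = ( 'Sequence 1' => [qw(C A G)] );
my %loscore = ( 'Sequence 1' => 0 );

# A bare hash in list context flattens to key/value pairs, so this list
# holds the keys of %HoA plus both the keys AND the values of %loscore.
my @ids_buggy = ( keys %HoA, %loscore );

# Intended list: only the sequence ids.
my @ids_fixed = sort keys %HoA;

# Hypothetical helper: counts lookups that would hit an undefined element.
sub count_undefined {
    my ($hoa, $ids, $width) = @_;
    my $count = 0;
    for my $id (@$ids) {
        # exists() avoids autovivifying an entry for an unknown id
        my $seq = exists $hoa->{$id} ? $hoa->{$id} : [];
        for my $pos (0 .. $width - 1) {
            $count++ unless defined $seq->[$pos];
        }
    }
    return $count;
}

print "ids in the flattened list: @ids_buggy\n";
printf "undefined lookups with the flattened list: %d\n",
    count_undefined(\%HoA, \@ids_buggy, 3);
printf "undefined lookups with keys only: %d\n",
    count_undefined(\%HoA, \@ids_fixed, 3);
```

If this is the cause, iterating over keys %HoA alone (and testing defined before eq) should silence the warning.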

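To summarize, I want to calculate the log-odds score of each sequence. I have a hash of log-odds scores, %logodds (indexed as $logodds{$base}[$pos] in the code above), which holds the score of each base at each position in the motif. The log-odds score of a sequence is calculated by summing the values it references. For example, with a log-odds table that scores C at 7 in the first position, A at 3 in the second, and G at 2 in the third, the log-odds score of the sequence CAG would be 7+3+2=12.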
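
The arithmetic for CAG can be sketched as a small function. This is a minimal illustration, not the original script: the sub name score_motif is hypothetical, and only the three table entries used by CAG (7, 3 and 2) come from the example above. Every other number in the table is an arbitrary placeholder.

```perl
use strict;
use warnings;

# Hypothetical log-odds table: base => [score at position 0, 1, 2].
# Only C at 0 = 7, A at 1 = 3 and G at 2 = 2 come from the worked example;
# all remaining numbers are placeholders chosen for illustration.
my %logodds = (
    A => [ 1, 3, 0 ],
    C => [ 7, 0, 1 ],
    G => [ 0, 1, 2 ],
    T => [ 0, 0, 0 ],
);

# Sum the table entry for each base at its position in the motif.
sub score_motif {
    my ($seq, $table) = @_;
    my @bases = split //, $seq;
    my $score = 0;
    for my $pos (0 .. $#bases) {
        my $row = $table->{ $bases[$pos] };
        # Skip characters with no table entry (e.g. N or a stray newline)
        next unless defined $row && defined $row->[$pos];
        $score += $row->[$pos];
    }
    return $score;
}

print score_motif('CAG', \%logodds), "\n";
```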

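At the moment I think the error comes from the way I split the DNA string into the hash of arrays. As I said earlier, I can provide all of the code if you want to copy and paste it. I think the solution is fairly simple and I just need someone to point me in the right direction. Any and all help is appreciated, and I can clarify as questions come up. Also, any tips that would help me post a more concise question are welcome (I know this one is long; I just wanted to give enough background).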

Answer 1

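This is the code I used to iterate over %HoA. It calculates the log-odds score for each sequence and then works through each sequence to find its maximum score. Many thanks to everyone for the help!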

foreach $id (keys %HoA)
{
    for my $pos1 (0..$#{ $HoA{$id} }) #Last index of the array; length() on a reference measures its string form
    {
        for my $pos2 ($pos1..$pos1+($width-1))
        {
            for my $base (qw( A C G T))
            {
                if ( ! defined $HoA{$id}[$pos2]) #Test defined first so eq never sees undef
                {
                    print "$pos2\n";
                }
                elsif ($HoA{$id}[$pos2] eq $base)
                {
                    for my $pos3 (0..$width-1)
                    {
                        $loscore{$id} += $logodds{$base}[$pos3];

                        if ($loscore{$id} > $maxscore{$id})
                        {
                            $maxscore{$id} = $loscore{$id};
                        }
                    }
                }
            }
        }
    }
}
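
For comparison, here is a minimal sketch of one way to take the best score over all windows of the motif width without ever reading past the end of the array. It assumes the intent is to score each window by the position of each base within that window, which may differ from what the loop above does. The sub name best_window_score, the table values, and the test sequence are all hypothetical.

```perl
use strict;
use warnings;

# Hypothetical log-odds table: base => [score at position 0, 1, 2].
my %logodds = (
    A => [ 1, 3, 0 ],
    C => [ 7, 0, 1 ],
    G => [ 0, 1, 2 ],
    T => [ 0, 0, 0 ],
);

# Returns the highest score over all full windows of length $width,
# or undef when the sequence is shorter than $width.
sub best_window_score {
    my ($bases, $table, $width) = @_;
    my $last_start = @$bases - $width;    # last index where a full window fits
    my $best;
    for my $start (0 .. $last_start) {    # empty range if $last_start < 0
        my $score = 0;
        for my $offset (0 .. $width - 1) {
            my $row = $table->{ $bases->[ $start + $offset ] };
            next unless defined $row;     # skip bases with no table row
            $score += $row->[$offset] // 0;
        }
        $best = $score if !defined $best || $score > $best;
    }
    return $best;
}

my @bases = split //, 'TCAGT';
print best_window_score(\@bases, \%logodds, 3), "\n";
```

Bounding the window start at the array length minus the width keeps every lookup inside the array, so no defined test is needed inside the loop.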
