Perl: split a hash into equal parts by count and send them off for parallel execution

Question (votes: 0, answers: 2)

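I have a hash (%hash) that contains a list of nodes and the commands that need to be executed for each node.

Before that, I have a list of hosts (@alive_hosts) on which the commands should be executed.

Here is my code: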

#!/usr/bin/perl

use strict;
use warnings;

use Data::Dumper;

my @alive_hosts = qw/10.0.0.1 10.0.0.2/;
print Dumper(\@alive_hosts);

my %hash = (
          'Node1' => 'cmd1 | cmd2 | cmd3',
          'Node2' => 'cmd2 | cmd3',
          'Node3' => 'cmd4 | cmd1',
          'Node4' => 'cmd1',
          'Node5' => 'cmd2',
          'Node6' => 'cmd1 | cmd2',
          'Node7' => 'cmd3 | cmd4',
);
print Dumper(\%hash);

my $num_buckets = scalar @alive_hosts;
print "num_buckets:$num_buckets\n"; 

my $no_of_nodes = scalar keys %hash;

my $per_bucket  = int( $no_of_nodes / $num_buckets ); 
print "per_bucket:$per_bucket\n";

my $num_extras  =      $no_of_nodes % $num_buckets; 
print "num_extras:$num_extras\n";

I want to divide the hash (%hash) according to the number of alive hosts, so that one share can be distributed to each host. In the example above, Host1 (10.0.0.1) should contain:

'Node1' => 'cmd1 | cmd2 | cmd3',
'Node2' => 'cmd2 | cmd3',
'Node3' => 'cmd4 | cmd1',
'Node4' => 'cmd1'

Host2 (10.0.0.2) should contain:

'Node5' => 'cmd2',
'Node6' => 'cmd1 | cmd2',
'Node7' => 'cmd3 | cmd4'

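These two sets of values can be stored in a new hash. I then need to run a shell script in parallel, passing the values above (i.e. the node and its cmds) as arguments. For the parallel execution I would like to use Parallel::Loops or Parallel::ForkManager. Any ideas or suggestions would be much appreciated.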


Update

Once the new hash has been created, I need to run a shell script, script.sh, which should take these values as arguments.

my $trigger_script = "script.sh";

my $pm = Parallel::ForkManager->new(5);

DATA_LOOP:
foreach my $n (1 .. $num_buckets) {
    if( exists $node_hash{$n} ) {
        my $pid = $pm->start and next DATA_LOOP;
    } 

    foreach my $ip( keys %node_hash){
        for my $node (keys %{$node_hash{$ip}}){
            say $ip." ".$trigger_script." ".$node." ".$node_hash{$ip}{$node};
        }
    }
    $pm->finish;
    print "*******\n";
}
$pm->wait_all_children;

I have used the Parallel::ForkManager module to make this script run script.sh in parallel.

The output I get looks like the following, which is incorrect.

10.0.0.2 script.sh Node6 cmd1 | cmd2
10.0.0.2 script.sh Node2 cmd2 | cmd3
10.0.0.2 script.sh Node3 cmd4 | cmd1
10.0.0.1 script.sh Node1 cmd1 | cmd2 | cmd3
10.0.0.1 script.sh Node5 cmd2
10.0.0.1 script.sh Node4 cmd1
10.0.0.1 script.sh Node7 cmd3 | cmd4
*******
10.0.0.2 script.sh Node6 cmd1 | cmd2
10.0.0.2 script.sh Node2 cmd2 | cmd3
10.0.0.2 script.sh Node3 cmd4 | cmd1
10.0.0.1 script.sh Node1 cmd1 | cmd2 | cmd3
10.0.0.1 script.sh Node5 cmd2
10.0.0.1 script.sh Node4 cmd1
10.0.0.1 script.sh Node7 cmd3 | cmd4

I want to get output like this:

10.0.0.2 script.sh Node6 cmd1 | cmd2
10.0.0.2 script.sh Node2 cmd2 | cmd3
10.0.0.2 script.sh Node3 cmd4 | cmd1
10.0.0.1 script.sh Node1 cmd1 | cmd2 | cmd3
10.0.0.1 script.sh Node5 cmd2
10.0.0.1 script.sh Node4 cmd1
10.0.0.1 script.sh Node7 cmd3 | cmd4

I think I am missing something in the iteration part. Expert advice is needed.

perl hash parallel-processing fork
2 Answers
3
votes

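You have already worked out how many nodes each new hash needs. So you can take the list of keys from the big hash and simply splice() that many off it on each pass through the loop.

Something like this: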

#!/usr/bin/perl

use strict;
use warnings;
use 5.020; # For the new key/value hash slices ("use 5.20" would be read as 5.200)
use feature 'say';

use Data::Dumper;

my @alive_hosts = qw/10.0.0.1 10.0.0.2/;
print Dumper(\@alive_hosts);

my %hash = (
          'Node1' => 'cmd1 | cmd2 | cmd3',
          'Node2' => 'cmd2 | cmd3',
          'Node3' => 'cmd4 | cmd1',
          'Node4' => 'cmd1',
          'Node5' => 'cmd2',
          'Node6' => 'cmd1 | cmd2',
          'Node7' => 'cmd3 | cmd4',
);
print Dumper(\%hash);

my $no_of_nodes = scalar keys %hash;
my $num_buckets = scalar @alive_hosts;

my $per_bucket  = int( $no_of_nodes / $num_buckets );
$per_bucket++ if $no_of_nodes % $num_buckets;

my @keys = sort keys %hash; # keys() order is not guaranteed; sort for a repeatable split

my %node_hash;

for (1 .. $num_buckets) {
  my @newkeys = splice @keys, 0, $per_bucket;

  $node_hash{$alive_hosts[$_ - 1]} = { %hash{@newkeys} }; # New hash slice syntax
}

say Dumper \%node_hash;

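Note: I am using the new(ish) %hash{...} key/value hash slice syntax (available since Perl 5.20). If you are on an earlier version of Perl, you will need to adjust that line.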
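For Perls older than 5.20, one possible adjustment is to fill each bucket with a classic hash slice. The sketch below is not part of the original answer. It also reuses the $num_extras remainder from the question so that the first few hosts take one extra node each, and it sorts the keys so the split is repeatable. The loop index $i and the %bucket variable are names introduced here for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @alive_hosts = qw/10.0.0.1 10.0.0.2/;

my %hash = (
    'Node1' => 'cmd1 | cmd2 | cmd3',
    'Node2' => 'cmd2 | cmd3',
    'Node3' => 'cmd4 | cmd1',
    'Node4' => 'cmd1',
    'Node5' => 'cmd2',
    'Node6' => 'cmd1 | cmd2',
    'Node7' => 'cmd3 | cmd4',
);

my @keys        = sort keys %hash;          # fixed order, so the split is repeatable
my $no_of_nodes = scalar @keys;
my $num_buckets = scalar @alive_hosts;
my $per_bucket  = int( $no_of_nodes / $num_buckets );
my $num_extras  = $no_of_nodes % $num_buckets;

my %node_hash;
for my $i ( 0 .. $#alive_hosts ) {
    # The first $num_extras hosts each take one node more than the rest.
    my $count   = $per_bucket + ( $i < $num_extras ? 1 : 0 );
    my @newkeys = splice @keys, 0, $count;

    # Classic hash slice (assumption: this replaces the 5.20-only %hash{...} line).
    my %bucket;
    @bucket{@newkeys} = @hash{@newkeys};
    $node_hash{ $alive_hosts[$i] } = \%bucket;
}

for my $host (@alive_hosts) {
    print "$host: @{[ sort keys %{ $node_hash{$host} } ]}\n";
}
```

With the sample data this puts Node1 to Node4 on 10.0.0.1 and Node5 to Node7 on 10.0.0.2, which is the grouping the question asks for.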


-1
votes

See whether the following approach is acceptable to you.

use strict;
use warnings;
use feature 'say';

use Data::Dumper;

my @alive_hosts = qw/10.0.0.1 10.0.0.2/;
print Dumper(\@alive_hosts);

my %hash = (
          'Node1' => 'cmd1 | cmd2 | cmd3',
          'Node2' => 'cmd2 | cmd3',
          'Node3' => 'cmd4 | cmd1',
          'Node4' => 'cmd1',
          'Node5' => 'cmd2',
          'Node6' => 'cmd1 | cmd2',
          'Node7' => 'cmd3 | cmd4',
);
print Dumper(\%hash);

my %dispatch;
my @hosts;

while( my($node,$cmd) = each %hash ) {
    @hosts = @alive_hosts unless @hosts;
    my $host = shift @hosts;
    $dispatch{$host}{$node} = $cmd;

}

say Dumper(\%dispatch);

Output

$VAR1 = [
          '10.0.0.1',
          '10.0.0.2'
        ];
$VAR1 = {
          'Node1' => 'cmd1 | cmd2 | cmd3',
          'Node4' => 'cmd1',
          'Node6' => 'cmd1 | cmd2',
          'Node5' => 'cmd2',
          'Node3' => 'cmd4 | cmd1',
          'Node7' => 'cmd3 | cmd4',
          'Node2' => 'cmd2 | cmd3'
        };
$VAR1 = {
          '10.0.0.1' => {
                          'Node6' => 'cmd1 | cmd2',
                          'Node1' => 'cmd1 | cmd2 | cmd3',
                          'Node3' => 'cmd4 | cmd1',
                          'Node2' => 'cmd2 | cmd3'
                        },
          '10.0.0.2' => {
                          'Node4' => 'cmd1',
                          'Node5' => 'cmd2',
                          'Node7' => 'cmd3 | cmd4'
                        }
        };
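
Once a per-host hash such as %dispatch (or %node_hash from the other answer) exists, the parallel step from the question's update could look roughly like the sketch below. The loop in the question iterates over 1 .. $num_buckets while the hash is keyed by IP address, so the exists test appears never to succeed, no child is forked, and every pass prints all hosts. Here the outer loop walks the hosts directly and forks one child per host. The sample data and the HOST label are assumptions made for illustration, and the say line only prints the command, as in the question, rather than running script.sh.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use feature 'say';

use Parallel::ForkManager;

# Sample data in the same shape as %dispatch above: host => { node => cmds }.
my %dispatch = (
    '10.0.0.1' => { 'Node1' => 'cmd1 | cmd2 | cmd3', 'Node2' => 'cmd2 | cmd3' },
    '10.0.0.2' => { 'Node5' => 'cmd2' },
);

my $trigger_script = 'script.sh';
my $pm = Parallel::ForkManager->new(5);    # at most 5 children at a time

HOST:
for my $ip ( sort keys %dispatch ) {
    # In the parent, start() returns the child's PID, so skip to the next host.
    $pm->start and next HOST;

    # In the child: handle only the nodes that belong to this host.
    for my $node ( sort keys %{ $dispatch{$ip} } ) {
        say join ' ', $ip, $trigger_script, $node, $dispatch{$ip}{$node};
    }

    $pm->finish;    # the child exits here
}
$pm->wait_all_children;
```

Each host's lines are now printed exactly once, by that host's child. Because the children run concurrently, the order of the hosts in the combined output is not guaranteed.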