Perl: list all keys in a hash that share the same value

Question

If I have a colon-delimited file named FILE, I can do this:

cat FILE|perl -F: -lane 'my %hash = (); $hash{@F[0]} = @F[2]'

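to assign the first and third tokens as a key => value pair of a hash.

1) Is this a sensible way to assign key/value pairs to a hash?

2) What is now the simplest way to find all keys that share a value and list them?

Suppose FILE looks like this: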

 Mike:34:Apple:Male
 Don:23:Corn:Male
 Jared:12:Apple:Male
 Beth:56:Maize:Female
 Sam:34:Apple:Male
 David:34:Apple:Male

Desired output: Keys with value "Apple": Mike,Jared,David,Sam

Answer 1

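Your example won't work the way you want, because the -n option puts a while loop around your one-line program, so a hash declared with my is created and destroyed for every record in the file. You can get around that by not declaring the hash, which makes it a persistent package variable that retains all the values stored in it.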
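
To illustrate the scoping issue, here is a minimal sketch of roughly what -n and -a wrap around the one-liner. It loops over an in-memory list instead of reading FILE; the @records data and the %kept and $max_lexical names are assumptions made for this example.

```perl
use strict;
use warnings;

# Stand-in for the lines of FILE (assumed sample data).
my @records = ("Mike:34:Apple:Male", "Don:23:Corn:Male");

our %kept;            # package variable: survives across iterations
my $max_lexical = 0;  # largest size the lexical hash ever reaches

for my $line (@records) {       # -n supplies a loop around the code
    my @F = split /:/, $line;   # -a with -F: supplies this split
    my %hash = ();              # a fresh, empty hash on every iteration
    $hash{ $F[0] } = $F[2];
    $max_lexical = keys %hash if keys(%hash) > $max_lexical;
    $kept{ $F[0] } = $F[2];     # accumulates because %kept is not redeclared
}

printf "lexical hash peaked at %d entry, package hash holds %d\n",
    $max_lexical, scalar keys %kept;
```

The lexical hash never holds more than the current record, while the package hash keeps every record seen so far.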

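You can then write push @{ $hash{$F[2]} }, $F[0]. Note that it should be $F[0] and so on, not @F[0], and that I have used push to build a list of column-1 values for each column-3 value, rather than a one-to-one mapping from each column-1 value to its column-3 value.

To be clear, your method generates a hash like the following, which must then be searched to produce the display you want.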

(
  Beth  => "Maize",
  David => "Apple",
  Don   => "Corn",
  Jared => "Apple",
  Mike  => "Apple",
  Sam   => "Apple",
)

Mine, on the other hand, creates this, which as you can see is almost in the desired form already.

(
  Apple => ["Mike", "Jared", "Sam", "David"],
  Corn  => ["Don"],
  Maize => ["Beth"],
)
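
For comparison, a one-to-one hash like the first one can still be regrouped into the second form after the fact. The following is only a sketch; the %by_name and %by_value names are chosen for illustration, and because a hash does not remember the order of the input file, the names are sorted to make the result deterministic.

```perl
use strict;
use warnings;

my %by_name = (
    Beth  => "Maize", David => "Apple", Don => "Corn",
    Jared => "Apple", Mike  => "Apple", Sam => "Apple",
);

# Group the names under their value (sorted, since file order is lost).
my %by_value;
for my $name (sort keys %by_name) {
    push @{ $by_value{ $by_name{$name} } }, $name;
}

# Report only the values that are shared by more than one key.
for my $value (sort keys %by_value) {
    my @names = @{ $by_value{$value} };
    next unless @names > 1;
    printf qq{Keys with value "%s": %s\n}, $value, join ',', @names;
}
```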

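However, I think this problem is too big to be solved with a one-line Perl program. The solution below expects the path to the input file as a command-line parameter, like this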

> perl prog.pl colons.csv

but it will default to myfile.csv if no file is specified.

use strict;
use warnings;

our @ARGV = 'myfile.csv' unless @ARGV;

my %data;
while (<>) {
  my @fields = split /:/;
  push @{ $data{$fields[2]} }, $fields[0];
}

while (my ($k, $v) = each %data) {
  next unless @$v > 1;
  printf qq{Keys with value "%s": %s\n}, $k, join ', ', @$v;
}

Output

Keys with value "Apple": Mike, Jared, Sam, David

Answer 2

use strict;
use warnings;

open my $in, '<', 'in.txt' or die "Cannot open in.txt: $!";
my %data;
while(<$in>){
    chomp;
    my @split = split/:/;
    $data{$split[0]} = $split[2];
}

my $query = 'Apple';

print "Keys with value $query = ";
foreach my $name (keys %data){
    print "$name " if $data{$name} eq $query;
}
print "\n";

Answer 3

Arrays are for holding lists of values, so use an array.

perl -F: -lane'
   push @{ $h{$F[2]} }, $F[0];
   END {
      for my $fruit (keys %h) {
         next if @{ $h{$fruit} } < 2;
         print "$fruit: ", join(",", @{ $h{$fruit} });
      }
   }
' FILE

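The END block is executed on exit. In it, we iterate over the hash keys. If the value of the current hash element is an array with only one element, we skip it. Otherwise we print the key, followed by the contents of the array referenced by the hash element.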

Answer 4

Here is another way:

perl -F: -lane'
    push @{ $h{$F[2]} }, $F[0];
}{
    print "$_: ", join(",", @{ $h{$_} }) for grep { @{$h{$_}} > 1 } keys %h;
' file

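We read each line and build a hash of arrays, using the third column as the key and collecting the first-column values for that key. The }{ closes the implicit loop that -n adds and opens a block that runs once after the last line, much like an END block. In that block we use grep to keep only the keys whose array holds more than one element, then print each key with its array elements.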
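
To spell out what the }{ does, here is a rough sketch of the equivalent full program. It reads from an in-memory string instead of a file; the $input data and the @report array are assumptions for this example, and the keys are sorted only to make the order predictable.

```perl
use strict;
use warnings;

# Assumed sample data, opened as an in-memory file.
my $input = "Mike:34:Apple:Male\nDon:23:Corn:Male\nJared:12:Apple:Male\n";
open my $fh, '<', \$input or die "Cannot open in-memory file: $!";

my %h;
my @report;
while (my $line = <$fh>) {          # the loop that -n supplies
    chomp $line;                    # -l strips the line ending
    my @F = split /:/, $line;       # -a -F: splits the line into @F
    push @{ $h{ $F[2] } }, $F[0];
}                                   # the "}" of "}{" closes the loop
{                                   # the "{" opens a block run once afterwards
    for my $key (grep { @{ $h{$_} } > 1 } sort keys %h) {
        push @report, "$key: " . join(",", @{ $h{$key} });
    }
    print "$_\n" for @report;
}
```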

Answer 5

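It doesn't have to be a one-liner,

Good. It won't be...

Is this a sensible way to assign key/value pairs to a hash?

You simply assign a key/value pair like this:

$hash{"key"} = "value";

It's that simple. There may be a way to do it with map. However, the main issue I see is what should happen if you have duplicate keys.

Suppose your file looks like this: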

Mike:34:Apple:Male
Don:23:Corn:Male
Jared:12:Apple:Male
Beth:56:Maize:Female
Sam:34:Apple:Male
David:34:Apple:Male   # Note this entry is here twice!
David:35:Wheat:Male   # Note this entry is here twice!

Let's do a simple assignment loop:

my %hash;
while ( my $line = <$fh> ) {
    chomp $line;
    my ($name, $age, $category, $sex) = split /:/, $line;
    $hash{$name} = $category;
}

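When you get to $hash{David}, it is first set to Apple, but its value is then changed to Wheat. There are four ways to handle this:

Use whatever the last value is. No change to the loop.
Use the first value and ignore subsequent ones. This is simple to do.
Treat it as an error if it happens. Abort the program and report the error.
Keep all the values.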
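
The first three options can be sketched as small variations on the assignment. This is only an illustration; the @records data and the %last, %first, %strict, and @errors names are assumptions made for this example.

```perl
use strict;
use warnings;

# Assumed sample data containing one duplicated name.
my @records = (
    "David:34:Apple:Male",
    "David:35:Wheat:Male",
);

my (%last, %first, %strict);
my @errors;

for my $record (@records) {
    my ($name, $age, $category, $sex) = split /:/, $record;

    # Option 1: last value wins (plain assignment overwrites).
    $last{$name} = $category;

    # Option 2: first value wins (assign only if the key is new).
    $first{$name} = $category unless exists $first{$name};

    # Option 3: a duplicate is an error. A real program would probably
    # die here; this sketch collects the message instead.
    if (exists $strict{$name}) {
        push @errors, "Duplicate entry for $name";
    }
    else {
        $strict{$name} = $category;
    }
}

print "last: $last{David}, first: $first{David}, errors: @errors\n";
```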

The last one is the most interesting, because it involves using references to arrays as the hash values:

my %hash;
while ( my $line = <$fh> ) {
    chomp $line;
    my ($name, $age, $category, $sex) = split /:/, $line;
    $hash{$name} = [] if not exists $hash{$name};   # I'm making this an array reference
    push @{ $hash{$name} }, $category;
}

Now each value in the hash is a reference to an array:

my @values = @{ $hash{David} };   # The values of David...
print "David is in categories " . join ( ", ", @values ) . "\n";

This will print David is in categories Apple, Wheat
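
As a side note, the explicit [] initialisation is optional, because push creates the array reference on first use (autovivification). A minimal sketch with assumed sample records:

```perl
use strict;
use warnings;

my %hash;
for my $record ("David:34:Apple:Male", "David:35:Wheat:Male") {
    my ($name, undef, $category) = split /:/, $record;
    # No "= []" needed: dereferencing the undefined slot inside push
    # creates the array reference automatically (autovivification).
    push @{ $hash{$name} }, $category;
}

print "David is in categories ", join(", ", @{ $hash{David} }), "\n";
```

The categories appear in the order they were pushed, that is, in file order.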

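What is now the simplest way to find all keys that share a value and list them?

The simplest way is to create a second hash that is keyed by your values. In this hash you will need to use array references. For now, assume there are no duplicate names: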

my %hash;
my %indexed_hash;
while ( my $line = <$fh> ) {
    chomp $line;
    my ($name, $age, $category, $sex) = split /:/, $line;
    $hash{$name} = $category;

    $indexed_hash{$category} = [] if not exists $indexed_hash{$category};
    push @{ $indexed_hash{$category} }, $name;
}

Now, if you want to find all the names that share Apple:

my @names = @{ $indexed_hash{Apple} };
print "The following are in 'Apple': " . join ( ", ", @names ) . "\n";

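Since we are working with references, we can go one step further and store all of the file's values in the hash. Again, for simplicity, I assume you have only one entry per name: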

my %hash;
while ( my $line = <$fh> ) {
    chomp $line;
    my ($name, $age, $category, $sex) = split /:/, $line;
    $hash{$name}->{AGE}      = $age;
    $hash{$name}->{CATEGORY} = $category;
    $hash{$name}->{SEX}      = $sex;
}

for my $name ( sort keys %hash ) {
    print "$name Information:\n";
    print "    Age: " . $hash{$name}->{AGE} . "\n";
    printf "Category: %s\n",  $hash{$name}->{CATEGORY};
    print "    Sex: @{[$hash{$name}->{SEX}]}\n\n";
}

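The last two statements are handy ways of interpolating complex data structures into strings. The printf is clear. The second, @{[...]}, is a neat little trick.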
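
The @{[...]} trick is most useful for things that do not interpolate on their own, such as an arithmetic expression or a function call. A small sketch, using a one-entry hash made up for this example:

```perl
use strict;
use warnings;

# Assumed sample record in the same shape as the hash built above.
my %hash = ( Mike => { AGE => 34, CATEGORY => "Apple", SEX => "Male" } );

# [ ... ] builds an anonymous array from the expression's result and
# @{ ... } dereferences it, which does interpolate inside a string.
my $line = "Next year Mike will be @{[ $hash{Mike}->{AGE} + 1 ]}";
print "$line\n";
```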

Answer 6

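What have you tried?

If you reverse the hash into a list of value => key pairs and then use pairs() from List::Util on that list, you can convert the hash into a hash of value => arrayrefs of keys, i.e. ( foo => [ 'bar', 'baz' ] ). Then grep {@{$hash{$_}} > 1} keys %hash and print the results.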
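
One possible reading of that suggestion, as a sketch only: the %hash contents and the %by_value name are assumptions for this example, and pairs() requires List::Util 1.29 or newer.

```perl
use strict;
use warnings;
use List::Util qw(pairs);   # pairs() needs List::Util 1.29 or newer

# Assumed name => value hash, as built by the question's approach.
my %hash = (
    Mike => "Apple", Don => "Corn", Jared => "Apple", Beth => "Maize",
);

# reverse %hash yields a flat (value, key, value, key, ...) list;
# pairs() chops it into [value, key] array references.
my %by_value;
for my $pair (pairs reverse %hash) {
    my ($value, $key) = @$pair;
    push @{ $by_value{$value} }, $key;
}

# Keep only the values shared by more than one key and print them.
for my $value (grep { @{ $by_value{$_} } > 1 } sort keys %by_value) {
    print "$value: ", join(",", sort @{ $by_value{$value} }), "\n";
}
```

The keys are sorted before printing because hash order is not predictable.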

Answer 1: score 3, accepted, 2014-08-05 15:23:49
Answer 2: score 1, 2014-08-05 15:18:54
Answer 3: score 1, 2014-08-05 15:45:58
Answer 4: score 1, 2014-08-05 17:00:47
Answer 5: score 1, 2014-08-05 21:06:05
Answer 6: score 0, 2014-08-05 15:15:21
