
Perl: Transform an array of hashes into a matrix

I have an array of hashes, many of which have shared keys.

I would like to convert this into a matrix for analysis in [R], such that each row represents a hash, and each unique key is a column, which is (blank) or '.' or 'NA' if the hash does not contain that particular key.

Currently I'm planning to find each unique key in the array of hashes, and then construct my matrix by looping through all of those keys for each hash. Is there a better way?

Thanks!

Example:

my %hash_A = (
  A=> 12,
  B=> 23,
  C=> 'a string'
  );
my %hash_B = (
  B=> 23,
  C=> 'a different string',
  D=> 99
  );

To give:

A,B,C,D
12,23,'a string',NA
NA, 23, 'a different string', 99

If you initialize every hash with "NA" for each possible key before filling it in, then you basically have a matrix already and can just print it out. (Real values overwrite the "NA" placeholders as the data is loaded.)
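A minimal sketch of that initialization idea, assuming the full list of columns is known up front. The helper name fill_missing is hypothetical and not part of the original answer; it copies each hash on top of a set of 'NA' defaults rather than modifying the input.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a copy of each hash in which every
# column that is absent from the original is set to 'NA'.
sub fill_missing {
    my ($columns, @hashes) = @_;
    my %defaults = map { $_ => 'NA' } @$columns;
    # Later pairs win, so real values overwrite the 'NA' defaults.
    # The leading + tells Perl this is a hash constructor, not a block.
    return map { +{ %defaults, %$_ } } @hashes;
}

my @columns = qw(A B C D);
my @filled  = fill_missing(
    \@columns,
    { A => 12, B => 23, C => 'a string' },
    { B => 23, C => 'a different string', D => 99 },
);

# Print each row using the same column order.
print join(',', @{$_}{@columns}), "\n" for @filled;
# 12,23,a string,NA
# NA,23,a different string,99
```

Note that a key which is present but holds undef will still come out as undef, since the merge only fills in keys that are missing entirely.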

If you can't initialize them, then simply collect all possible keys beforehand and loop over that list while printing your data structure (instead of looping over the keys of each individual hash).

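# %possibleKeys is assumed to already hold every key seen in any hash.
my @possibleKeys = keys %possibleKeys;
foreach my $hashref (@arrayOfHashes) {
    foreach my $key (@possibleKeys) {
        if (!defined $hashref->{$key}) {
            print "NA ";
        } else {
            print "$hashref->{$key} ";
        }
    }
    print "\n";    # end the row once all columns have been printed
}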

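Edit: keys %possibleKeys does not return the keys in any guaranteed order, and the order can change between runs or after the hash is modified (see http://perldoc.perl.org/functions/keys.html ). Therefore the keys should be stored in an array once and that array reused for every row, so that the columns line up; sort the array if you need the same column order on every run.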
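One way to build that key list with a stable order is sketched below. The helper name ordered_keys is an assumption made for illustration; it records each key the first time it is seen, and sorts within each hash so that the result does not depend on Perl's internal hash ordering.

```perl
use strict;
use warnings;

# Hypothetical helper: returns every key found in the given hashes,
# in order of first appearance, with no duplicates.
sub ordered_keys {
    my @hashes = @_;
    my (%seen, @order);
    for my $hash (@hashes) {
        # Sorting inside each hash keeps the result deterministic.
        for my $key (sort keys %$hash) {
            push @order, $key unless $seen{$key}++;
        }
    }
    return @order;
}

my @possibleKeys = ordered_keys(
    { A => 12, B => 23, C => 'a string' },
    { B => 23, C => 'a different string', D => 99 },
);
print "@possibleKeys\n";    # A B C D
```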

This should convert an array of hashes into a 2D array ( @output1 ).

All output cells that have no corresponding input value will be populated with 'NA'. (If you don't mind those cells being undef instead, this can be done more concisely; see @output2.)

The array @keys records which hash key corresponds to each index position in the output rows.

my @array_of_hashes = ...;

my %keys;

# Collect every key that appears in any hash (the values are irrelevant).
for my $hash (@array_of_hashes) {
    @keys{keys %$hash} = ();
}

my @keys = sort keys %keys;

my @output1 = map {
    my $hash = $_;

    [ map { exists $$hash{$_} ? $$hash{$_} : 'NA' } @keys ];
} @array_of_hashes;

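# Concise variant: missing keys become undef instead of 'NA'.
my @output2 = map [ @$_{@keys} ] => @array_of_hashes;

my @a = ( keys %hash_A, keys %hash_B );
my %r;
@r{@a} = @a;    # %r maps each key to itself, so it doubles as the header row
for my $h ( \%r, \%hash_A, \%hash_B ) {
    # // (Perl 5.10+) substitutes 'NA' only for missing or undef values,
    # so 0 and '' are kept and the input hashes are not modified.
    print join( ', ', map { $$h{$_} // 'NA' } sort keys %r ), "\n";
}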
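Since the goal is to load the result into R, it may help to emit proper CSV. The sketch below is an illustration only: the helper names csv_field and csv_lines are hypothetical, and the quoting follows the common convention of wrapping a field in double quotes and doubling any embedded quotes. It assumes the column list has already been built by one of the approaches above.

```perl
use strict;
use warnings;

# Hypothetical helper: format one value as a CSV field.
# Missing (undef) values become NA; fields containing a comma,
# a double quote, or a newline are quoted.
sub csv_field {
    my ($value) = @_;
    return 'NA' unless defined $value;
    if ($value =~ /[",\n]/) {
        (my $escaped = $value) =~ s/"/""/g;
        return qq{"$escaped"};
    }
    return $value;
}

# Hypothetical helper: header line followed by one line per hash.
sub csv_lines {
    my ($columns, @hashes) = @_;
    my @lines = join(',', map { csv_field($_) } @$columns);
    for my $hash (@hashes) {
        # Reading a missing key yields undef, which csv_field maps to NA.
        push @lines, join(',', map { csv_field($hash->{$_}) } @$columns);
    }
    return @lines;
}

print "$_\n" for csv_lines(
    [qw(A B C D)],
    { A => 12, B => 23, C => 'a string' },
    { B => 23, C => 'a different string', D => 99 },
);
```

With its default settings, R's read.csv should treat the unquoted NA fields as missing values.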
