Finding common elements in arrays

Question

I have a hash whose values are arrays. I need to find the elements common to those arrays, i.e. the elements that are present in every array. I extracted the values of the hash into a multidimensional array in which each row corresponds to one array in the hash. I then copied the first row of this matrix into another array (@arr1) and iterated through it, checking whether each element also appears in all the remaining rows. If it does, it is pushed onto another array that holds the final list of common elements. The code is as follows (I hope it is clear enough):

sub construct_arr {
    my %records = @_;
    my $len = keys %records;
    my @matrix;
    my $i = 0;

    # Extract the values of the hash into a matrix
    foreach my $key (keys %records) {
        $matrix[$i] = $records{$key};
        $i++;   
    }

    my @arr1 = $matrix[0];
    my @final;

    # Iterate through each element of arr1
    for my $j (0..$#{$arr1[0]}) {
        my $count = 1;

        # Iterate through each row of the matrix, starting from the second
        for ( my $i = 1; $i < $len ; $i++ ) {
            my $flag = 0;

            # Iterate through each element of the row
            for my $k (0..$#{$matrix[$i]}) {
                if ($arr1[0][$j] eq $matrix[$i][$k]) {
                    $flag = 1;
                    $count++;
                }
            }

            # On finding the first instance of the element in a row, go to the next row
            if (!$flag == 1) {
                last;
            }       
        }

        # If element is in all the rows, push it on to the final array
        if ($count == $len) {
            push(@final, $arr1[0][$j]);
        }
    }
    return @final;
}

I know that the above works, but I would like to know if there is a more idiomatic (Perlish) way to do this. I am just starting to learn Perl, and I am very interested in anything that makes this kind of work easier in Perl than in other languages. If my code is already the best that can be done, please let me know that too. Any guidance would be appreciated. Thanks!

Answer 1

Take a look at Chris Charley's link for calculating the intersection of arrays.

Hashes are the clear way to go for problems like this. Together with map and grep, a solution can be reduced to just a few lines.

This program uses sundar's data for want of anything better, and seems to do what you need.

use strict;
use warnings;

my %records = (
  a => [ qw/ A B C / ],
  b => [ qw/ C D E A / ],
  c => [ qw/ A C E / ],
);

print "$_\n" for construct_arr(\%records);

sub construct_arr {
  my $records = shift;
  my %seen;
  $seen{$_}++ for map @$_, values %$records;
  grep $seen{$_} == keys %$records, keys %seen;
}

output

A
C

Edit

I thought it might help to see a more Perlish, tidied version of your own solution.

use strict;
use warnings;

my %records = (
  a => [ qw/ A B C / ],
  b => [ qw/ C D E A / ],
  c => [ qw/ A C E / ],
);

print "$_\n" for construct_arr(\%records);

sub construct_arr {

  my $records = shift;
  my @matrix = values %$records;
  my @final;

  # iterate through each element of the first row
  for my $i ( 0 .. $#{$matrix[0]} ) {

    my $count = 1;

    # look for this value in all the rest of the rows, dropping
    # out to the next row as soon as a match is found
    ROW:
    for my $j ( 1 .. $#matrix ) {
      for my $k (0 .. $#{$matrix[$j]}) {
        next unless $matrix[0][$i] eq $matrix[$j][$k];
        $count++;
        next ROW;
      }
    }

    # If element is in all the rows, push it on to the final array
    push @final, $matrix[0][$i] if $count == @matrix;
  }

  return @final;
}

The output is the same as for my own program, but the behaviour is slightly different because mine assumes the values in each row are unique. If the same value appears more than once in a row, my solution will break (the same applies to sundar's). Please let me know if that is acceptable.
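To make the duplicate problem concrete, here is a minimal sketch of one possible adjustment: reduce each row to its unique values before tallying, so a value is counted at most once per row. The sub name common_elements and the extra "E" in row c are assumptions added for illustration; they are not part of the original programs.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original answer): tallies in how many
# rows each value occurs, counting a value at most once per row.
sub common_elements {
    my $records = shift;
    my %rows_containing;

    for my $row ( values %$records ) {
        my %in_row = map { $_ => 1 } @$row;    # unique values of this row
        $rows_containing{$_}++ for keys %in_row;
    }

    my $rows = keys %$records;                 # number of rows
    return sort { $a cmp $b }
           grep { $rows_containing{$_} == $rows } keys %rows_containing;
}

# Illustrative data: "E" occurs twice in row c and never in row a.
my %records = (
    a => [ qw/ A B C / ],
    b => [ qw/ C D E A / ],
    c => [ qw/ A C E E / ],
);

print "$_\n" for common_elements( \%records );    # prints A, then C
```

With a plain tally, "E" would be counted three times here (once in b, twice in c) and would wrongly be reported as common to all three rows.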

Answer 2

Although the poster explained that there aren't duplicates within a single array, here is my attempt, which handles that case too (note the slightly modified test data: "5" should not be printed):

#!/usr/bin/env perl
use warnings;
use strict;

my %records = (
    a => [1, 2, 3],
    b => [3, 4, 5, 1],
    c => [1, 3, 5, 5]
);

my %seen;
while (my ($key, $vals) = each %records) {
    $seen{$_}{$key} = 1 for @$vals;
}

print "$_\n" for grep { keys %{$seen{$_}} == keys %records } keys %seen;

Answer 3

You can find the size of the hash easily using scalar(keys %hash);
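As a small illustration of that point, the sketch below shows keys evaluated in scalar context. The variable names $explicit and $implicit are made up for this example.

```perl
use strict;
use warnings;

my %records = ( a => [1, 2, 3], b => [3, 4, 5, 1], c => [1, 3, 5] );

# keys returns the number of keys when evaluated in scalar context.
my $explicit = scalar( keys %records );    # scalar context forced explicitly
my $implicit = keys %records;              # assigning to a scalar has the same effect

print "$explicit $implicit\n";             # prints "3 3"
```

A numeric comparison such as == also supplies scalar context, which is why an expression like $n == keys %records works without an explicit scalar().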

Here's some example code that does what you need:

#!/usr/bin/perl

use strict;
use warnings;

my %records = ( a => [1, 2, 3],
                b => [3, 4, 5, 1],
                c => [1, 3, 5]
              );

my %count;
foreach my $arr_ref (values %records) {
    foreach my $elem (@$arr_ref) {
        $count{$elem}++;
    }
}

my @intersection;
my $num_arrays = scalar(keys %records);
foreach my $elem (keys %count) {
    # Keep elements counted at least once per array. This assumes values
    # are unique within each array; duplicates would inflate the count.
    if ($count{$elem} >= $num_arrays) {
        push @intersection, $elem;
    }
}

Feel free to comment if you need any clarification of this code. The second foreach, which builds the @intersection array, is written this way only for clarity; if you're learning Perl, I'd suggest you study and rewrite it using the map construct, since that's arguably more idiomatic Perl.
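One way that rewrite might look is sketched below. Since the loop only filters a list, grep is the more usual fit; a map version is shown alongside it for comparison. The %count values are hard-coded to the tallies the loops above would produce for the sample data, and @via_map is a name made up for this sketch.

```perl
use strict;
use warnings;

# Tallies as the nested foreach above would build them for the sample data.
my %count      = ( 1 => 3, 2 => 1, 3 => 3, 4 => 1, 5 => 2 );
my $num_arrays = 3;

# grep keeps the keys for which the block returns true.
my @intersection = grep { $count{$_} >= $num_arrays } keys %count;

# map can do the same by returning an empty list for rejected keys.
my @via_map = map { $count{$_} >= $num_arrays ? $_ : () } keys %count;

# Hash key order is not predictable, so sort before printing.
print join( ' ', sort { $a <=> $b } @intersection ), "\n";    # prints "1 3"
```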

3 answers

solution1
6 ACCEPTED 2012-04-28 17:51:53

solution2
3 2012-04-29 01:17:32

solution3
1 2012-04-28 16:42:56
