
How to transform an array of integers into regions in Perl?

I have an integer array such as (1,13,3,5,6,7,11,10,8,2,12), and I want to extract the runs of consecutive integers from it. For the array above, the result should be (1,3), (5,8), (10,13).

The array may be very large. Does anybody have any ideas?

Thanks a lot.

You can use the Set::IntSpan module:

use strict;
use warnings;

use Data::Dump;
use Set::IntSpan;

my @values = (1, 13, 3, 5, 6, 7, 11, 10, 8, 2, 12);
my $set = Set::IntSpan->new(@values);
my @spans = $set->spans;

dd @spans;

Outputs:

([1, 3], [5, 8], [10, 13])

If a run contains only one number, the lower and upper bounds will be the same, e.g. [42, 42].

Performance

As salva pointed out in the comments, Set::IntSpan does not perform well with a large number of ranges. An alternative is Set::IntSpan::Fast, which according to its documentation uses binary searches and tends toward O(log N) performance. If you also install Set::IntSpan::Fast::XS, you will get even better performance (there is no need to change the use statement; the XS version is used automatically if it is installed).

The following iterates through the ranges and pushes them onto an array:

use strict;
use warnings;

use Data::Dump;
use Set::IntSpan::Fast;

my @values = (1, 13, 3, 5, 6, 7, 11, 10, 8, 2, 12);
my $set = Set::IntSpan::Fast->new;
$set->add(@values);

my @ranges;
my $iter = $set->iterate_runs;
while (my ($from, $to) = $iter->()) {
    push @ranges, [ $from, $to ];
}

dd @ranges;

Outputs:

([1, 3], [5, 8], [10, 13])

Note that to do anything useful with the ranges, you'll have to iterate through this array; it would be more efficient to do the work as you iterate through the set the first time instead of iterating through two different structures.
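
As a rough sketch of doing the work during that first pass, the loop below processes each run as the iterator yields it instead of collecting the runs into @ranges. The per-run work shown here (counting runs and the integers they cover) is only a placeholder assumption; substitute whatever your application needs.

```perl
use strict;
use warnings;
use Set::IntSpan::Fast;

my @values = (1, 13, 3, 5, 6, 7, 11, 10, 8, 2, 12);
my $set = Set::IntSpan::Fast->new;
$set->add(@values);

# Placeholder work: count the runs and the integers they cover.
my ($runs, $covered) = (0, 0);

my $iter = $set->iterate_runs;
while (my ($from, $to) = $iter->()) {
    # Handle each run here, without storing it in an intermediate array.
    $runs++;
    $covered += $to - $from + 1;
    print "$from-$to\n";
}

print "$runs runs covering $covered integers\n";
```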

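Alternatively, load the integers into a hash, pick an arbitrary key, and extend the run downwards and upwards for as long as the neighbouring integers exist. Each key is deleted as it is visited, so every integer is handled once, but the ranges come out in no particular order: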

use feature 'say';

my @ints = (...);
my %ints = map { $_ => 1 } @ints;     # integers not yet assigned to a range
my @ranges;
while (keys %ints) {
  my $bottom = my $top = each %ints;  # arbitrary starting element
  delete $ints{$bottom};
  1 while (delete $ints{--$bottom});  # extend the run downwards
  1 while (delete $ints{++$top});     # extend the run upwards
  push @ranges, [$bottom + 1, $top - 1];
}
say join ', ', map "$_->[0]-$_->[1]", @ranges;
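
For reference, here is a self-contained sketch of the same hash technique applied to the sample data from the question. The helper name ranges_from_hash is hypothetical, and sorting the result and resetting the each iterator explicitly are additions of this sketch, not part of the snippet above.

```perl
use strict;
use warnings;

# Hypothetical helper: returns [low, high] pairs sorted by lower bound.
sub ranges_from_hash {
    my %seen = map { $_ => 1 } @_;    # duplicates collapse into one key
    my @ranges;
    while (%seen) {
        keys %seen;                   # void context: reset the each() iterator
        my $seed = each %seen;        # an arbitrary remaining integer
        delete $seen{$seed};
        my $low = my $high = 0 + $seed;
        $low--  while delete $seen{ $low - 1 };   # extend the run downwards
        $high++ while delete $seen{ $high + 1 };  # extend the run upwards
        push @ranges, [ $low, $high ];
    }
    my @sorted = sort { $a->[0] <=> $b->[0] } @ranges;
    return @sorted;
}

my @ranges = ranges_from_hash(1, 13, 3, 5, 6, 7, 11, 10, 8, 2, 12);
print join(', ', map "$_->[0]-$_->[1]", @ranges), "\n";   # 1-3, 5-8, 10-13
```

The final sort costs O(R log R) for R ranges, which is usually far fewer than the number of input integers.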

If you want to get the job done well and quickly with minimal effort on your part, use Set::IntSpan, as ThisSuitIsBlackNot suggests in his answer.

If you want a DIY job, then you can consider using this code as a basis:

#!/usr/bin/env perl
use strict;
use warnings;

$, = " ";

my @data = (1, 13, 3, 5, 6, 7, 11, 10, 8, 2, 12);

sub pr_region
{
    my($lo, $hi) = @_;
    print "($lo";
    print ", $hi" if ($lo != $hi);
    print ")\n";
}

sub print_regions
{
    my(@data) = @_;
    print "Raw: ", @data, "\n";

    my @sorted = sort { $a <=> $b } @data;
    #print "Sorted: ", @sorted, "\n";

    my $lo = $sorted[0];
    for my $i (1 .. scalar(@sorted)-1)
    {
        if ($sorted[$i-1] != $sorted[$i] - 1 &&
            $sorted[$i-1] != $sorted[$i])
        {
            pr_region($lo, $sorted[$i-1]);
            $lo = $sorted[$i];
        }
    }
    pr_region($lo, $sorted[$#sorted]);
}

print_regions(@data);
print_regions(1);
print_regions(1, 10);
print_regions(1, 2, 10);
print_regions(1, 9, 10);
print_regions(@data, 11, 3, 19, -3);

The output from it is:

Raw:  1 13 3 5 6 7 11 10 8 2 12 
(1, 3)
(5, 8)
(10, 13)
Raw:  1 
(1)
Raw:  1 10 
(1)
(10)
Raw:  1 2 10 
(1, 2)
(10)
Raw:  1 9 10 
(1)
(9, 10)
Raw:  1 13 3 5 6 7 11 10 8 2 12 11 3 19 -3 
(-3)
(1, 3)
(5, 8)
(10, 13)
(19)

I've made no effort at minimizing the code. It prints its results rather than packaging them in a data structure for reuse. It doesn't handle an empty array.
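
If you need the regions as data rather than printed output, one possible variant is sketched below. The name find_regions is hypothetical and not part of the code above; it returns a list of [low, high] array references and returns an empty list for an empty array.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a list of [low, high] array refs,
# or an empty list when called with no data.
sub find_regions {
    my @sorted = sort { $a <=> $b } @_;
    return () unless @sorted;

    my @regions;
    my $lo = my $hi = $sorted[0];
    for my $n (@sorted[1 .. $#sorted]) {
        if ($n > $hi + 1) {           # gap found: close the current region
            push @regions, [$lo, $hi];
            $lo = $n;
        }
        $hi = $n;                     # duplicates and successors extend it
    }
    push @regions, [$lo, $hi];
    return @regions;
}

my @regions = find_regions(1, 13, 3, 5, 6, 7, 11, 10, 8, 2, 12, 11, 3, 19, -3);
for my $region (@regions) {
    my ($lo, $hi) = @$region;
    my $text = $lo == $hi ? "($lo)" : "($lo, $hi)";
    print "$text\n";
}
```

With the last test case from above, this prints the same five regions, from (-3) to (19).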
