
Perl splitting File into array and reading from subroutine

I have created a subroutine that builds a HoA (hash of arrays) out of the tab-delimited file below.

header_map.txt:

account_number_header   account
account_number_header   Account #
account_number_header   Account No.
account_number_header   Account number
account_number_header   Account_Id
first_Name_header   name1
first_Name_header   first name
first_Name_header   account name1
first_Name_header   first_name
first_Name_header   f name
last_Name_header    name2
last_Name_header    last name
last_Name_header    account name2
last_Name_header    last_name
last_Name_header    l name
address_header  address1
address_header  address
address_header  addresses
address_header  place of residency
address_header  location

The sub then checks the array against the values of a given key (shown below). Where a value matches an array element, the index of that element is returned. Instead of searching a predefined constant array, I want to search an array that is read from a file, or in this case from DATA. The working code for the constant array is below.

my @fields = ('Account No.','name1','name2','location'); #array being searched
my $hm = "header_map.txt"; #declare variable to file
my $fh = (readfile($hm));  #declare variable to sub routine call

my $address_header = 'address_header'; #my given key
my $address = hashofarray($fh,$address_header); #looking for($fh,key) in sub
my $account_number_header = 'account_number_header'; #my given key
my $account_number = hashofarray($fh,$account_number_header); #looking for($fh,key) in sub
print $address,",",$account_number,"\n"; #prints desired array indexes of given keys

sub hashofarray {
    my $fh = shift;
    my $key = shift;
    my %hash;
    while (<$fh>) { # creating HoA
        chomp;
        my ( $key, $value  ) = split /\t/;
        push (@{ $header_map{$key} }, $value);
    }
    foreach my $key1 (@{$header_map{$key}}) {
        if (my @index = grep { $fields[$_] eq $key1 } 0..$#fields) {
            return $index[0];
        }
    }
}

sub readfile {
    my $file = shift;
    open my $f, '<', $file or die $!;
    return $f;
}

RESULTS

location,Account No.

This is good and what I want; however, I would like to read the array @fields from the DATA section instead. Here is my attempt at reading DATA.

Failed Attempt

my $hm = "O:/josh/trade_data/mock_header_map.txt"; # declare variable to file
my $fh = (readfile($hm)); # declare variable to sub routine call

while (<DATA>) { # calling the subroutine after reading DATA
    my @fields = split /\t/;
    my $address_header = 'address_header'; # my given key
    my $address = hashofarray($fh, $address_header); # looking for($fh, key) in sub
    my $account_number_header = 'account_number_header'; # my given key
    # looking for($fh, key) in sub
    my $account_number = hashofarray($fh, $account_number_header);
    # prints desired array indexes of given keys
    print $address, ",", $account_number, "\n";
}

sub hashofarray {
    my $fh = shift;
    my $key = shift;
    my %hash;
    while (<$fh>) {  #creating HoA
        chomp;
        my ( $key, $value  ) = split /\t/;
        push (@{ $header_map{$key} }, $value);
    }
    foreach my $key1 (@{$header_map{$key}}) {
        if(my @index = grep { $fields[$_] eq $key1 } 0..$#fields) {
            return $index[0];
        } else {
            print "not found";
        }
    }
}

sub readfile {
    my $file = shift;
    open my $f, '<', $file or die $!;
    return $f;
}


__DATA__
Account No  name1   name2   location
1   josh    smith   411 s chirris ave. sometown st 12345
1   josh    smith   411 s chirris ave. sometown st 12345
1   josh    smith   411 s chirris ave. sometown st 12345
1   josh    smith   411 s chirris ave. sometown st 12345

My results

,
,
,
,
,

Desired Results

1   411 s chirris ave. sometown st 12345
1   411 s chirris ave. sometown st 12345
1   411 s chirris ave. sometown st 12345
1   411 s chirris ave. sometown st 12345

In the end, I would like to print the desired columns, which I could do if I could read DATA into the array. Instead I am getting empty strings because the sub does not recognize @fields. I know I need to do something with array references, but I'm a little shaky on those. Any suggestions? I hope this is clear.

OK, so. The core problem here is that your hashofarray function reads from the file handle. It iterates to the end of the file. And then... you call it again, when there's no more file left to read.
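To make the exhausted-handle point concrete, here is a minimal sketch. It assumes an in-memory file as a stand-in for header_map.txt, and count_lines is a hypothetical helper, not part of the original code. The second read sees nothing until the handle is rewound with seek.

```perl
use strict;
use warnings;

# Hypothetical stand-in for header_map.txt, held in memory so the sketch is self-contained.
my $text = "address_header\taddress\naddress_header\tlocation\n";
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";

# Count the lines still available on the handle (hypothetical helper).
sub count_lines {
    my ($handle) = @_;
    my $count = 0;
    while ( my $line = <$handle> ) {
        $count++;
    }
    return $count;
}

my $first  = count_lines($fh);    # reads through to end of file
my $second = count_lines($fh);    # handle is already at EOF, nothing left
seek $fh, 0, 0 or die "Cannot seek: $!";
my $third  = count_lines($fh);    # rewound, so the lines are readable again

print "first=$first second=$second third=$third\n";    # first=2 second=0 third=2
```

Rewinding works, but reading the file once into a data structure and reusing that is usually the simpler design.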

But that isn't the only problem here; there are several. If you're grepping keys out of a hash of arrays... why not use a hash of hashes instead? The way you're doing it, you're effectively searching through an array, but then returning only the first matching index anyway.
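As a rough illustration of the hash-of-hashes idea, the sketch below stores each alias as a second-level key, so testing whether a field name belongs to a header becomes a single exists check. The @map_lines array is a hypothetical excerpt of header_map.txt, used only to keep the example self-contained.

```perl
use strict;
use warnings;

# Hypothetical excerpt of header_map.txt: canonical key, tab, accepted alias.
my @map_lines = (
    "account_number_header\tAccount No.",
    "account_number_header\tAccount #",
    "address_header\taddress",
    "address_header\tlocation",
);

# Hash of hashes: $header_map{$key}{$alias} exists when $alias is valid for $key.
my %header_map;
for my $line (@map_lines) {
    my ( $key, $alias ) = split /\t/, $line;
    $header_map{$key}{$alias} = 1;
}

my @fields = ( 'Account No.', 'name1', 'name2', 'location' );

# Keep the first index whose field name is a known alias for the given key.
my ($account_idx) = grep { exists $header_map{account_number_header}{ $fields[$_] } } 0 .. $#fields;
my ($address_idx) = grep { exists $header_map{address_header}{ $fields[$_] } } 0 .. $#fields;

print "$account_idx,$address_idx\n";    # prints "0,3"
```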

Likewise, @fields isn't globally scoped, so when you try to reuse it in hashofarray, it's always going to be empty.
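One way around the scoping issue is to pass the array to the sub by reference rather than relying on a shared variable. This is only a sketch: index_of_any and @address_aliases are hypothetical names, and the two input lines are made-up samples.

```perl
use strict;
use warnings;

# Return the index of the first element of @$fields that equals any of @$aliases.
sub index_of_any {
    my ( $fields, $aliases ) = @_;    # both arguments are array references
    for my $alias ( @{$aliases} ) {
        for my $i ( 0 .. $#{$fields} ) {
            return $i if $fields->[$i] eq $alias;
        }
    }
    return;    # no match: undef in scalar context
}

my @address_aliases = ( 'address1', 'address', 'addresses', 'place of residency', 'location' );

for my $line ( "Account No\tname1\tname2\tlocation", "id\taddress\tname" ) {
    my @fields = split /\t/, $line;    # lexical to this loop body
    my $idx = index_of_any( \@fields, \@address_aliases );
    print defined $idx ? "$idx\n" : "not found\n";    # prints 3, then 1
}
```

Because the sub receives everything it needs as arguments, it no longer matters where @fields is declared.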

Can I suggest taking a step back? Update your question (or ask a new one) with your actual problem spec, including input data and expected output.

I think you've gone through a couple of cycles of fixing this code, and it's getting messy, so it's time to draw back a little and start over. I think you'll find there's a much cleaner and more elegant solution.

That said - if you're simply looking at extracting the 'header' line from your existing data block:

chomp( my $header_line = <DATA> ); # read the first line and strip its newline
my @fields = split /\t/, $header_line; # split the header into an array
while ( <DATA> ) { #etc.

You can, for example, translate your 'data' segment into a data structure like so:

use strict;
use warnings;
use Data::Dumper;
my @all_records;
my $header_line = <DATA>;
chomp($header_line);
my @headers = split /\t/, $header_line;
while (<DATA>) {
    chomp;
    my @columns = split /\t/;
    my %record;
    @record{@headers} = @columns;
    print Dumper \%record;
    push( @all_records, \%record );
}

print Dumper \@all_records;

foreach my $record ( @all_records ) {
   # join only the two columns, then append the newline, so no trailing comma is printed
   print join( ",", $record->{'Account No'}, $record->{'location'} ), "\n";
}

__DATA__
Account No  name1   name2   location
1   josh    smith   411 s chirris ave. sometown st 12345
1   josh    smith   411 s chirris ave. sometown st 12345
1   josh    smith   411 s chirris ave. sometown st 12345
1   josh    smith   411 s chirris ave. sometown st 12345

I would suggest, though, that you could probably use 'account number' as a unique key, in which case you wouldn't actually need an array. In this sample the account number repeats, so you do need one here, and that is what my code uses.

Apart from the Data::Dumper debugging output, this will print:

1,411 s chirris ave. sometown st 12345
1,411 s chirris ave. sometown st 12345
1,411 s chirris ave. sometown st 12345
1,411 s chirris ave. sometown st 12345

You are declaring @fields with my inside the while loop.

while (<DATA>) { # calling the subroutine after reading DATA
my @fields = split /\t/;

So the scope of that variable is that while loop only. Instead, try declaring the array @fields above the while loop.

Also, please put these at the top of your code.

use strict;
use warnings;

You would have found this error if these lines had been at the top.

You also need to improve the way you read the file. The first time you read from $fh, the file pointer reaches the end of the file, and after that your code never reads anything more from it. Later calls only work on the hash built during the first call. So if reading the file once is enough, move the reading part out of the sub; if you want to read it again and again, close $fh and reopen it each time.
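A minimal sketch of the read-once approach follows. It assumes the map is loaded by a separate sub (load_header_map is a hypothetical name) and uses an in-memory file in place of header_map.txt so that it runs on its own.

```perl
use strict;
use warnings;

# Read a tab-delimited "key<TAB>alias" handle once and return a hash-of-arrays reference.
sub load_header_map {
    my ($fh) = @_;
    my %map;
    while ( my $line = <$fh> ) {
        chomp $line;
        next unless length $line;    # skip blank lines
        my ( $key, $alias ) = split /\t/, $line, 2;
        push @{ $map{$key} }, $alias;
    }
    return \%map;
}

# Hypothetical stand-in for header_map.txt.
my $text = "address_header\taddress\naddress_header\tlocation\naccount_number_header\tAccount No.\n";
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
my $header_map = load_header_map($fh);    # the handle is read exactly once
close $fh;

# The map can now be consulted as often as needed without touching the file again.
for my $key ( sort keys %{$header_map} ) {
    print "$key: ", join( ' | ', @{ $header_map->{$key} } ), "\n";
}
```

Any lookup sub can then take $header_map as an argument instead of a file handle.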

if(my @index = grep { $fields[$_] eq $key1 } 0..$#fields) { does not return the matching word from @fields; it returns the index of the matched word in @fields. So when printing, use the index to look the value up:

print $fields[$address],",", $fields[$account_number], "\n";
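The same index-then-lookup idea extends to whole rows with an array slice. The sketch below assumes the header aliases have already been resolved into the hypothetical %wanted hash, and uses made-up in-memory rows instead of DATA.

```perl
use strict;
use warnings;

# Hypothetical input: a header row followed by one data row, tab-delimited.
my @rows = (
    "Account No.\tname1\tname2\tlocation",
    "1\tjosh\tsmith\t411 s chirris ave. sometown st 12345",
);

# Assumption: the header names to keep have already been resolved via the header map.
my %wanted = ( 'Account No.' => 1, 'location' => 1 );

# Work out the column indexes once, from the header row.
my @header  = split /\t/, shift @rows;
my @indexes = grep { $wanted{ $header[$_] } } 0 .. $#header;    # (0, 3)

# Use the indexes as an array slice on every data row.
my @out;
for my $row (@rows) {
    my @columns = split /\t/, $row;
    push @out, join "\t", @columns[@indexes];
}
print "$_\n" for @out;
```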

I hope that after these changes you will be able to write a correct solution to your problem.

