
How do I read lines from a file into a hash in Perl?

I'm working with a file format that has lines of information that look like this:

ATOM 1 N LYS A 56 20.508 14.774 -7.432 1.00 50.83 N

All I want is the first number and the three numbers following '56' in the example above, so I'm using a regular expression to get that information. How do I then put that info into a hash?

So far I have:

my $pdb_file = $ARGV[0];
open (PDBFILE, "<$pdb_file") or die ("$pdb_file not found");
while (<PDBFILE>) {
    if ($_=~ /^ATOM\s+(\d+)\s+\w+\s+\w+\s+\w+\s+\d+\s+(\d+\.\d+)\s+(\d+\.\d+)\s+(\d+\.\d+)/) {
        my $atom = $1;
        my $xcor = $2;
        my $ycor = $3;
        my $zcor = $4;
        print "AtomNumber: $atom\t  xyz: $xcor $ycor $zcor\n";
    }
}

Instead of using a regex, I would recommend using split to break the line into fields on whitespace. This should be faster and more robust, because it doesn't depend on detailed knowledge of the format of each field. That format could change, or you could overlook a case: your regex, for instance, doesn't allow for a minus sign, so it won't match the -7.432 in your example line. Splitting is also a lot easier to understand.

my @fields = split /\s+/, $line;

Then you can pick out the fields by index (array indexes start at 0, so the first number is the second field, $fields[1]) and put them into your hash.

my %coordinate = (
    atom => $fields[1],
    x    => $fields[6],
    y    => $fields[7],
    z    => $fields[8]
);
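If you want to double-check those index positions, a quick sketch like the one below prints every field of the sample line next to its index. It uses split ' ' rather than split /\s+/, which behaves the same here but also ignores any leading whitespace; that choice is mine, not something the question requires.

```perl
use strict;
use warnings;

# Sample line taken from the question.
my $line = 'ATOM 1 N LYS A 56 20.508 14.774 -7.432 1.00 50.83 N';

# split ' ' splits on runs of whitespace and skips leading whitespace.
my @fields = split ' ', $line;

# Print each field next to its zero-based index.
for my $i (0 .. $#fields) {
    print "$i: $fields[$i]\n";
}
```

For this line, indexes 1, 6, 7 and 8 hold 1, 20.508, 14.774 and -7.432, which are the atom number and the x, y and z values.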

You're reading a bunch of lines, so you're going to make a bunch of hashes, and they have to go somewhere. I'd recommend putting them all in another hash, keyed by some field that is unique to each line, possibly the atom field.

$atoms{$coordinate{atom}} = \%coordinate;
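Putting the pieces together, here is a minimal sketch of the whole loop, not a definitive implementation. The read_atoms helper name is hypothetical, and the demo reads from an in-memory filehandle so it runs without a file; the second ATOM line and the REMARK line are made-up sample data. It assumes the atom number is unique and that fields are always separated by whitespace.

```perl
use strict;
use warnings;

# Hypothetical helper: read ATOM records from a filehandle into a
# hash of hashes keyed by atom number, and return a reference to it.
sub read_atoms {
    my ($fh) = @_;
    my %atoms;
    while (my $line = <$fh>) {
        next unless $line =~ /^ATOM\s/;    # skip anything that is not an ATOM record
        my @fields = split ' ', $line;

        # Declared with "my" inside the loop, so every iteration gets a
        # fresh hash and every stored reference points to its own data.
        my %coordinate = (
            atom => $fields[1],
            x    => $fields[6],
            y    => $fields[7],
            z    => $fields[8],
        );
        $atoms{ $coordinate{atom} } = \%coordinate;
    }
    return \%atoms;
}

# Demo input: the first ATOM line is from the question, the rest is made up.
my $sample = <<'END';
REMARK made-up line that should be skipped
ATOM 1 N LYS A 56 20.508 14.774 -7.432 1.00 50.83 N
ATOM 2 CA LYS A 56 21.000 15.500 -6.250 1.00 49.10 C
END

open(my $fh, '<', \$sample) or die "Cannot open in-memory file: $!";
my $atoms = read_atoms($fh);
close $fh;

# Look the coordinates up again by atom number.
for my $n (sort { $a <=> $b } keys %$atoms) {
    my $c = $atoms->{$n};
    print "AtomNumber: $n\t xyz: $c->{x} $c->{y} $c->{z}\n";
}
```

To read a real file instead, open it with the three-argument form, for example open(my $fh, '<', $pdb_file) or die "Cannot open $pdb_file: $!", and pass that handle to read_atoms.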

