Perl: parsing and assigning multiple values to a hash key using an array

I have files that I'm trying to parse to build a hash, and then look up values from a third file. File format:

File 1:

ID2
ID4

File 2:

x1 y1 z1 ID1
x2 y2 z2 ID2
x3 y3 z3 ID2
x4 y4 z4 ID4

File 3:

a1 b1
a2 b2
a3 b3

What I'm trying to do:

For each ID in File1, look up the x and y coordinates using the ID field in File2, and see whether 'a' in File3 lies between x and y.

What I have thought of so far:

  1. Take file 2; parse it into a hash using ID as the key
  2. Take file 1; if the ID exists in file 2, open file 3, check the coordinate ranges for 'a', and print it

How far have I got? Not too far. I am trying to read File 2 and parse all its elements into a hash, but I'm stuck:

while (<FILE>){
    chomp $_;
    my $line = $_;
    my @arr  = split ("\t", $line);
    my $id = $arr[3];

    if (exists ($hash{$id})) {
        my $x = $arr[0];
        my $y  = $arr[1];
        my $z   = $arr[2];
        push @{$hash{$id}{'x'}, $x;
        push @{$hash{$id}{'y'}, $y;
        push @{$hash{$id}{'y'}, $y;
    } else {
        $hash{'id'} = $id;
        $hash{$id}{'x'} = $arr[0];   
        $hash{$id}{'y'} = $arr[1];
        $hash{$id}{'z'} = $arr[2];
    }
}
print Dumper %hash;
close FILE;

But of course, I'm doing something wrong here.

This is how to read your file2 into a hash. Note that I think it may be easier to use three-element arrays to hold the x, y and z values rather than a hash.

I would show more, but I'm very unclear about how your file3 works and how it is related to file1. Do you want to process all values in file1 and find, for each of them, which values in file3 are between the corresponding limits?

use strict;
use warnings;
use autodie;

use Data::Dump;

open my $fh, '<', 'file2.txt';

my %data;

while (<$fh>){
    chomp;
    my @fields = split /\t/;
    my $id = pop @fields;
    for ('x' .. 'z') {
      push @{$data{$id}{$_}}, shift @fields;
    }
}

dd \%data;

Output:

{
  ID1 => { x => ["x1"], y => ["y1"], z => ["z1"] },
  ID2 => { x => ["x2", "x3"], y => ["y2", "y3"], z => ["z2", "z3"] },
  ID4 => { x => ["x4"], y => ["y4"], z => ["z4"] },
}
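
To read the coordinates for one ID back out of this layout, the x, y and z arrays have to be walked in parallel by index. A minimal sketch, assuming %data has the shape shown in the dump (the literal below is just that dump pasted in, and rows_for is a hypothetical helper name, not part of the original answer):

```perl
use strict;
use warnings;

# Same shape as the dump above, pasted in so the sketch runs on its own.
my %data = (
    ID1 => { x => ["x1"], y => ["y1"], z => ["z1"] },
    ID2 => { x => ["x2", "x3"], y => ["y2", "y3"], z => ["z2", "z3"] },
    ID4 => { x => ["x4"], y => ["y4"], z => ["z4"] },
);

# Hypothetical helper: return one [x, y, z] triple per input line for an ID.
sub rows_for {
    my ($data, $id) = @_;
    my $entry = $data->{$id} or return;
    # The three arrays have the same length, so index them in step.
    return map {
        my $i = $_;
        [ map { $entry->{$_}[$i] } qw/ x y z / ]
    } 0 .. $#{ $entry->{x} };
}

# Print each reassembled line for ID2, space-separated.
print join(' ', @$_), "\n" for rows_for(\%data, 'ID2');
```

The index bookkeeping is the cost of keeping three separate arrays per ID: every consumer has to reassemble the rows before it can compare x and y from the same line.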

Update

Although the storage format from the above code is what I think you intended, I don't think it's very workable. I think it would be easier for you to code the rest of the program if you use this

while (<$fh>){
    chomp;
    my @fields = split /\t/;
    my $id = pop @fields;
    push @{$data{$id}}, \@fields;
}

resulting in this

{
  ID1 => [["x1", "y1", "z1"]],
  ID2 => [["x2", "y2", "z2"], ["x3", "y3", "z3"]],
  ID4 => [["x4", "y4", "z4"]],
}

or even this

while (<$fh>){
    chomp;
    my @fields = split /\t/;
    my $id = pop @fields;
    my %item;
    @item{qw/ x y z /} = @fields;
    push @{$data{$id}}, \%item;
}

which results in this data

{
  ID1 => [{ x => "x1", y => "y1", z => "z1" }],
  ID2 => [
           { x => "x2", y => "y2", z => "z2" },
           { x => "x3", y => "y3", z => "z3" },
         ],
  ID4 => [{ x => "x4", y => "y4", z => "z4" }],
}
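
With one hash per input line, the rest of the task (filter by the IDs in file 1, then test each 'a' from file 3 against x and y) can be sketched as below. This is only an illustration under several assumptions that the question does not confirm: the coordinates are numeric, "between" means inclusive bounds in either order, and the sample contents are made up. The three "files" are in-memory strings so the sketch runs on its own, and it splits on whitespace rather than tabs so the samples can be typed inline.

```perl
use strict;
use warnings;

# Hypothetical sample contents; real code would open the files by name.
my $file1 = "ID2\nID4\n";
my $file2 = "1 5 0 ID1\n10 20 0 ID2\n30 40 0 ID2\n50 60 0 ID4\n";
my $file3 = "15 b1\n35 b2\n45 b3\n55 b4\n";

# Build { ID => [ { x, y, z }, ... ] } from file 2.
my %data;
open my $fh2, '<', \$file2 or die $!;
while (<$fh2>) {
    chomp;
    my @fields = split ' ';
    my $id = pop @fields;
    my %item;
    @item{qw/ x y z /} = @fields;
    push @{ $data{$id} }, \%item;
}
close $fh2;

# IDs of interest from file 1.
open my $fh1, '<', \$file1 or die $!;
chomp(my @ids = <$fh1>);
close $fh1;

# 'a' values from file 3 (first column of each line).
open my $fh3, '<', \$file3 or die $!;
my @a_values = map { my @f = split ' '; $f[0] } <$fh3>;
close $fh3;

# Collect "ID a low high" for every a inside a range belonging to a wanted ID.
my @hits;
for my $id (@ids) {
    for my $item (@{ $data{$id} || [] }) {
        # Assumption: bounds are inclusive and may appear in either order.
        my ($lo, $hi) = sort { $a <=> $b } @{$item}{qw/ x y /};
        push @hits, "$id $_ $lo $hi"
            for grep { $_ >= $lo && $_ <= $hi } @a_values;
    }
}
print "$_\n" for @hits;
```

With the made-up sample data this reports 15 and 35 for ID2 and 55 for ID4; 45 falls in no range, and ID1 is skipped because it is not listed in file 1.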

I would approach your task this way:

  1. load file 1 into memory and keep it as a filter of IDs

  2. tie file 3 (or load it into memory) and keep it around to index into

  3. process file 2 as a stream of requests.

Thus:

#! /usr/bin/env perl
use common::sense;
use Tie::File;
use autodie;

tie my @table, 'Tie::File', 'f3' or die $!;

my %filter;
open my $f, '<', 'f1';
while (<$f>) {
  chomp;
  $filter{$_}++
}
close $f;

while (<>) {
  next unless /^x(\d+) y(\d+) z\d+ (ID\d+)$/;
  next unless exists $filter{$3};
  say((split ' ', $table[$2])[$1])
}

untie @table;

Usage:

$ ./example
x1 y1 z2 ID2
b2
x0 y0 z5 ID2
a1

$ ./example file2
<three blank lines, because your examples are 1-indexed>
