Perl script to find matching lines in two files

I have two files that look like the ones below. I want to look up the fields from the first file in the second file, and print every field of each matching line in the second file.

#rs116801199 720381
#rs138295790 16057310
#rs131531 16870251
#rs131546 16872281
#rs140375 16873251
#rs131552 16873461

and

#--- rs116801199 720381 0.026 0.939 0.996 0 -1 -1 -1
#1 rs12565286 721290 0.028 1.000 1.000 2 0.370 0.934 0.000
#1 rs3094315 752566 0.432 1.000 1.000 2 0.678 0.671 0.435
#--- rs3131972 752721 0.353 0.906 0.938 0 -1 -1 -1
#--- rs61770173 753405 0.481 0.921 0.950 0 -1 -1 -1

My script looks like this:

#! perl -w

my $file1 = shift@ARGV;

my @filtered_snps;
open (IN, $file1) or die "couldn't read file one";
while(<IN>){
    my@L=split;
    #next if ($L[0] =~ m/peak/);
    push @filtered_snps,[$L[0],$L[1]];

}
close IN;

my $file2 = shift@ARGV;

my @snps;
open (IN, $file2);
while (<IN>){
    my@L=split;
    foreach (@filtered_snps){

        if (($L[1] eq ${$_}[0]) && ($L[2] == ${$_}[1])) {

            print "@L\n";

            next;
        }
    }
}

I get no output, although I should be finding every line from file 1. I've also tried grep, with no success.

In the first while loop you are assigning to the wrong array; you meant @L there.

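Also, the strings in your first array (from the first file) are quite different from the ones you compare them with in the second file. Try printing both inside your foreach loop and you'll see they can't match. For example, if the leading # really is part of the data, $L[0] from the first file is "#rs116801199", while $L[1] from the second file is "rs116801199".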
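The check suggested above can be sketched as follows. This is only an illustration: it assumes the leading `#` shown in the question is really present in both files, and `compared_fields` is a hypothetical helper name, not something from the original script.

```perl
use strict;
use warnings;

# Sample lines copied from the question; the leading '#' is assumed
# to be part of the data.
my $line1 = '#rs116801199 720381';
my $line2 = '#--- rs116801199 720381 0.026 0.939 0.996 0 -1 -1 -1';

# Hypothetical helper: return the two name fields that the original
# script compares ($L[0] of file 1 and $L[1] of file 2).
sub compared_fields {
    my ($first_line, $second_line) = @_;
    my @f1 = split ' ', $first_line;
    my @f2 = split ' ', $second_line;
    return ($f1[0], $f2[1]);
}

my ($name1, $name2) = compared_fields($line1, $line2);
print "file 1 name: '$name1'\n";    # file 1 name: '#rs116801199'
print "file 2 name: '$name2'\n";    # file 2 name: 'rs116801199'

# Stripping the leading '#' makes the two names comparable.
(my $clean = $name1) =~ s/^#//;
print "match after stripping '#'\n" if $clean eq $name2;
```

If the real files have no leading `#`, the printed values will show whatever else differs (stray whitespace, a different column order, and so on).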

Create a hash table of the items from the first file, then iterate over the second file and check whether each rs-name exists in it. I'm also confirming that the number matches the name.

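use strict;
use warnings;

my %hash;
# Capture the rs-name and the number that follows it.
my $regex = qr/#.* *(rs\d+) (\d+) *.*/;

# Index the first file: rs-name => number.
open my $file1, '<', shift @ARGV or die "couldn't read file one: $!";
while (<$file1>) {
    my ($name, $num) = $_ =~ $regex;
    next unless defined $name;    # skip lines that don't match
    $hash{$name} = $num;
}
close $file1;

# Print each line of the second file whose name and number both match.
open my $file2, '<', shift @ARGV or die "couldn't read file two: $!";
while (<$file2>) {
    my ($name, $num) = $_ =~ $regex;
    next unless defined $name;
    print if exists $hash{$name} and $hash{$name} == $num;    # ==, not =
}
close $file2;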
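To try the hash-lookup idea without creating any files, the same logic can be wrapped in a function that takes two lists of lines. This is a minimal sketch under assumptions: `matching_lines` is a hypothetical name, and the regex is a simplified one that only looks for an rs-name followed by a number.

```perl
use strict;
use warnings;

# Hypothetical helper: return the lines of @$second whose rs-name and
# number both appear in @$first.
sub matching_lines {
    my ($first, $second) = @_;
    my $regex = qr/(rs\d+)\s+(\d+)/;

    # Index the first list: rs-name => number.
    my %num_for;
    for my $line (@$first) {
        next unless $line =~ $regex;
        $num_for{$1} = $2;
    }

    # Keep only lines whose name is known and whose number agrees.
    return grep {
        $_ =~ $regex && exists $num_for{$1} && $num_for{$1} == $2
    } @$second;
}

# Sample data taken from the question.
my @file1 = ('#rs116801199 720381', '#rs138295790 16057310');
my @file2 = (
    '#--- rs116801199 720381 0.026 0.939 0.996 0 -1 -1 -1',
    '#1 rs12565286 721290 0.028 1.000 1.000 2 0.370 0.934 0.000',
);

print "$_\n" for matching_lines(\@file1, \@file2);
# prints: #--- rs116801199 720381 0.026 0.939 0.996 0 -1 -1 -1
```

Each line of the second list costs one hash lookup, instead of a scan over every entry of the first list as in the nested-loop version.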
