
Perl script to join two files on a common field

I am writing a Perl script to join fields from two different files into a third, new file, based on a common field value in the two original files.

I have written the following script, but it seems to go into an infinite loop. Any suggestions on what I need to change?

#!/usr/bin/perl
#
open FILE, ">location.txt" or die$!;
open FILE1, "./checkins.txt" or die$!;
open FILE2, "./locations.txt" or die$!;

while (my $line1 = <FILE1> and my $line2 = <FILE2>) {
    chomp $line1;
    chomp $line2;
    @lines1 = split("\t", $line1);
    @lines2 = split("\t", $line2);

    while($lines2[0] = $lines1[5]) {
        print FILE
            "$lines2[0]"."\t"."$lines2[1]"."\t"."$lines2[2]"."\t"."$lines1[6]"."\t".
            "$lines1[7]"."\t"."$lines1[8]"."\n";
    }
}
close(FILE);
close(FILE1);
close(FILE2);

The 1990s called, and they want their Perl syntax back...

Sorry about that. It's not your fault.

Perl syntax has changed quite a bit since its early days, and for some reason, most people still write in the older syntax form. It's taught in schools, and people pick it up from examples in their workplace. Python developers decry the unreadable Perl syntax as proof that Perl is an old, decrepit language that now belongs in the dustbin of history. But, in many ways, awful Perl syntax is proof of how easy it is to pick up Perl and learn it.

Always put use strict; and use warnings; at the top of your program. This will catch about 90% of the errors in Perl. strict would have flagged the undeclared @lines1 and @lines2 arrays, and warnings reports many cases where = is used instead of eq or == in a condition (that warning fires when the right-hand side is a constant, so it likely would not flag your particular while statement on its own). Get a new copy of Learning Perl (aka the Llama Book). Go through it and pick up the new syntax. This will greatly improve your coding skills.
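To make the strict point concrete, here is a small sketch of the kind of error it reports for an undeclared array such as the question's @lines1. The string eval is only there so the demo can capture the compile error instead of aborting; it is not something you would put in the real script.

```perl
use strict;
use warnings;

# Compile a snippet that assigns to an undeclared array, the way the
# question's script uses @lines1 without "my". String eval is used here
# only so the compile error can be captured and printed.
my $ok = eval q{
    use strict;
    @lines1 = split /\t/, "a\tb";
    1;
};

# On failure, eval returns undef and leaves the message in $@.
my $error = $ok ? '' : $@;
print "strict rejected the snippet:\n$error" if $error;
```

Declaring the array with my @lines1 makes the snippet compile, which is the change the cleaned-up versions of the script make.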

Another issue is that your inner while loop is an infinite loop. You're not changing the value of anything inside it, so it loops over and over again. The loop below does the same thing:

while ( $foo ne $bar ) {
    print "Are we there yet?\n";
}

If $foo doesn't equal $bar, the above loop will go on printing Are we there yet? for billions of years, until the sun uses up its last bit of helium fuel and expands into a massive star that swallows up Earth's orbit (or until you get tired of it and hit Control-C).

If you don't want an infinite loop, you have to change at least one of the values you use in your while statement:

while ( $foo ne $bar ) {
    print "Are we there yet?\n";
    $foo = $bar;    # One more peep, and I'll stop the car!
}

Also, what happens if one file contains more lines than the other? I have a feeling that what you want to do is read one file into a hash, then loop through the other file. If a key from the second file exists in that hash, you then want to combine the lines. Unfortunately, your question doesn't make exactly clear what you want to do.
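A minimal sketch of that hash-based approach might look like the following. The column layout is an assumption taken from the indexes in your script (location ID in column 0 of locations.txt and in column 5 of checkins.txt), and join_on_location is a hypothetical helper name. The sample data is made up and read from in-memory strings so the sketch runs without any files.

```perl
use strict;
use warnings;

# Hypothetical helper: join check-in rows to location rows on a shared ID.
# Assumed layout (guessed from the indexes in the question's script):
#   locations: column 0 is the ID, columns 1-2 are extra location fields
#   checkins:  column 5 is the location ID, columns 6-8 are fields to keep
sub join_on_location {
    my ($loc_fh, $checkin_fh, $out_fh) = @_;

    # Pass 1: index every location row by its ID.
    my %location_for;
    while (my $line = <$loc_fh>) {
        chomp $line;
        my @fields = split /\t/, $line, -1;
        next unless @fields >= 3;          # skip blank or short rows
        $location_for{ $fields[0] } = [ @fields[0 .. 2] ];
    }

    # Pass 2: stream the check-ins and look up each one's location.
    my $matched = 0;
    while (my $line = <$checkin_fh>) {
        chomp $line;
        my @fields = split /\t/, $line, -1;
        next unless @fields >= 9;          # skip blank or short rows
        my $loc = $location_for{ $fields[5] } or next;   # no such location
        print {$out_fh} join("\t", @$loc, @fields[6 .. 8]), "\n";
        $matched++;
    }
    return $matched;
}

# Made-up sample data, held in strings instead of real files.
my $locations = "L1\tCafe\tParis\nL2\tPark\tOslo\n";
my $checkins  = join("\t", qw(u1 a b c d L2 x y z)) . "\n"
              . join("\t", qw(u2 a b c d L9 p q r)) . "\n";

open my $loc_fh, '<', \$locations or die $!;
open my $chk_fh, '<', \$checkins  or die $!;
my $output = '';
open my $out_fh, '>', \$output    or die $!;

my $count = join_on_location($loc_fh, $chk_fh, $out_fh);
close $out_fh;

print "matched $count row(s)\n";
print $output;
```

Because the lookup goes through the hash, the two files no longer need the same number of lines or the same order. Check-ins whose location ID is missing from the hash are simply skipped here; whether that is the right behavior depends on what you want the output to contain.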

Can you edit your question to better explain what you're attempting to accomplish? For example, you could give us a sample of your two input files and what you want your output file to look like. You only need to give us a few lines of each, but this will help us better understand what you want to do.

You're using an assignment = instead of an equality test eq, and it should be an if instead of a while.

while($lines2[0] = $lines1[5]) {

Change it to:

if ($lines2[0] eq $lines1[5]) {
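As a quick illustration of why the original condition never turns false, here is a small sketch with made-up values. It shows that eq only compares, while = copies the right-hand value and then evaluates as true whenever that value is true.

```perl
use strict;
use warnings;

my ($left, $right) = ('abc', 'xyz');

# eq compares two strings and leaves both operands untouched.
my $same = ($left eq $right) ? 'yes' : 'no';

# = assigns; the condition is true whenever the copied value is true,
# no matter what the left-hand side held before.
my $copy   = $left;
my $truthy = ($copy = $right) ? 'yes' : 'no';

print "eq says equal: $same\n";
print "= evaluated as true: $truthy, and the copy is now '$copy'\n";
```

For numeric fields you would use == instead of eq; with use warnings, applying == to non-numeric strings produces a warning.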

By the way, ALWAYS include use strict; and use warnings; at the top of every script. And if you're doing file processing, use autodie; as well.

Here is a cleaned-up version of your script with those pragmas and lexical file handles:

#!/usr/bin/perl

use strict;
use warnings;
use autodie;

open my $outfh, ">", "location.txt";
open my $infh1, '<', "./checkins.txt";
open my $infh2, '<', "./locations.txt";

while (my $line1 = <$infh1> and my $line2 = <$infh2>) {
    chomp $line1;
    chomp $line2;
    my @lines1 = split("\t", $line1);
    my @lines2 = split("\t", $line2);

    if ($lines2[0] eq $lines1[5]) {
        print $outfh join("\t", @lines2[0,1,2], @lines1[6,7,8]), "\n";
    }
}
close($outfh);
close($infh1);
close($infh2);
