In Perl, how can I read from multiple filehandles in one loop?

I was wondering how I could implement this in Perl:

while ( not the end of the files )
    $var1 = read a line from file 1
    $var2 = read a line from file 2
    # operate on variables
end while

I'm not sure how to read one line at a time from two files in one while loop.

Seems like you almost wrote the answer yourself. Just check for eof on both filehandles, like so:

while (not eof $fh1 and not eof $fh2) {
    my $var1 = <$fh1>;
    my $var2 = <$fh2>;
    # do stuff
}
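
To try this loop without any files on disk, here is a minimal sketch. The sub name read_pairs and the in-memory filehandles (opened on string references) are assumptions made for illustration; they are not part of the answer above.

```perl
use strict;
use warnings;

# Hypothetical helper: read two filehandles in lockstep and return a list
# of [line1, line2] pairs, stopping as soon as either handle is at EOF.
sub read_pairs {
    my ($fh1, $fh2) = @_;
    my @pairs;
    while (not eof $fh1 and not eof $fh2) {
        my $var1 = <$fh1>;
        my $var2 = <$fh2>;
        chomp($var1, $var2);
        push @pairs, [$var1, $var2];
    }
    return @pairs;
}

# In-memory filehandles stand in for real files here.
open my $fh1, '<', \"a\nb\nc\n" or die "Cannot open in-memory file: $!";
open my $fh2, '<', \"1\n2\n"    or die "Cannot open in-memory file: $!";

# prints "a 1" then "b 2"; the third line of the first handle is never read
print "@$_\n" for read_pairs($fh1, $fh2);
```

The loop ends after two passes because the second handle runs out first.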

Note: I expanded my answer in response to @zostay and @jm666's comments.

The first step toward an efficient, clear, and concise answer to this question is the idea that related variables go in an aggregate. So, the array @fh will contain the filehandles from which we are reading simultaneously.

Then, we can read a line from each filehandle and store them in an array using the <> operator in conjunction with map. map takes a transformation rule and a list, and returns another list. Hence:

my @lines = map scalar <$_>, @fh;

takes the filehandles in @fh, reads a single line from each (note the scalar), and puts those lines in @lines. This is a one-to-one transformation of @fh.

As the documentation for <> indicates, <> returns an undefined value if end-of-file is reached or there is an error.
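
As a small illustration of that behavior, the sketch below reads two rounds from a short and a long input. The sub name read_round and the in-memory filehandles are hypothetical and used only so the example is self-contained.

```perl
use strict;
use warnings;

# Hypothetical helper: read exactly one line from each filehandle.
# A handle that is already at end-of-file contributes undef.
sub read_round {
    my @fh = @_;
    return map scalar <$_>, @fh;
}

open my $short, '<', \"only line\n"     or die "Cannot open in-memory file: $!";
open my $long,  '<', \"first\nsecond\n" or die "Cannot open in-memory file: $!";

for my $round (1 .. 2) {
    my @lines = read_round($short, $long);
    # Make undef visible in the output instead of printing nothing.
    my @shown = map { defined $_ ? $_ : 'undef' } @lines;
    chomp @shown;
    print "round $round: @shown\n";
}
# round 1: only line first
# round 2: undef second
```

The result always has one element per filehandle, which is what makes counting the defined elements meaningful.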

Now, one way to check if we successfully read from all files is to check if the number of defined lines is the same as the number of filehandles. grep selects the elements of a list that satisfy a certain criterion. Hence

@fh == grep defined, my @lines = map scalar <$_>, @fh;

would check if the number of filehandles in @fh is the same as the number of defined elements in @lines. However, @fh appearing on both sides of this comparison can indeed be confusing, so an alternative way of checking that there are no undefined elements in @lines is:

0 == grep !defined, my @lines = map scalar <$_>, @fh;

If you want to put that condition in a while loop, you have to write:

while (0 == grep !defined, my @lines = map scalar <$_>, @fh) {

whereas if you go with an until, you can simply write:

until (grep !defined, my @lines = map scalar <$_>, @fh) {

This means "until at least one of the readlines returns an undefined value, execute the body of the loop".

Now, note that Perl's eof is different from C's eof. The documentation for Perl's eof notes that:

Practical hint: you almost never need to use eof in Perl, because the input operators typically return undef when they run out of data or encounter an error.

If you check eof every time through the loop, you're doubling your file IO because "this function actually reads a character and then ungetcs it."

I almost always give a self-contained runnable example with my code. Below, I did not want to rely on any specific files existing on your system, so I use the always-available DATA and STDIN handles.

As opposed to using the eof function, when you use this method, you don't have to worry about where you're reading from: all you care about is whether a readline on any one of the files returned an undefined value. It can also be used with any number of filehandles.

Also, you really don't have to put the filehandles in an array, but as I said, related variables belong in an aggregate, so if you find yourself typing stuff like

my $var1 = <$fh1>;
my $var2 = <$fh2>;

realize that you should have used an array to store the filehandles.

#!/usr/bin/env perl

use strict; use warnings;

my @fh = (\*DATA, \*STDIN);

until (grep !defined, my @lines = map scalar <$_>, @fh) {
    print for @lines;
}

__DATA__
one
two
three

This example script will stop asking for your input on STDIN when the lines in DATA are exhausted. If you do not have any trailing blank lines in the script, you have to enter four lines before the script terminates: on the pass where DATA returns undef, map still reads one more line from STDIN.

Now, if you want to know which filehandles reached the end, you'd switch to using something like:

#!/usr/bin/env perl

use strict; use warnings;

my @fh = (\*DATA, \*STDIN);

while (1) {
    my @lines = map scalar <$_>, @fh;

    if (my @eof = grep !defined($lines[$_]), 0 .. $#fh) {
        warn "Could not read from filehandle(s) '@eof'";
        last;
    }

    print for @lines;
}

__DATA__
one
two
three

Important

The loops above are designed to stop when any one of the files is exhausted. On the other hand, you might want the loops to run until all of the files are exhausted. In that case, you'd use:

while (grep defined, my @lines = map scalar <$_>, @fh) {
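
With this variant, @lines can contain undef for any handle that has already run out, so the loop body has to cope with that. A minimal sketch, assuming you want an empty string in place of a missing line (the sub name read_all_rows and the in-memory filehandles are hypothetical):

```perl
use strict;
use warnings;

# Hypothetical helper: read all handles in lockstep until every one of
# them is exhausted. A handle that ran out early contributes '' to the
# remaining rows.
sub read_all_rows {
    my @fh = @_;
    my @rows;
    while (grep defined, my @lines = map scalar <$_>, @fh) {
        # Replace undef (exhausted handle) with '' before chomping,
        # so that chomp does not warn about undefined values.
        my @row = map { defined $_ ? $_ : '' } @lines;
        chomp @row;
        push @rows, \@row;
    }
    return @rows;
}

open my $fh1, '<', \"a\nb\nc\n" or die "Cannot open in-memory file: $!";
open my $fh2, '<', \"1\n"       or die "Cannot open in-memory file: $!";

# prints "a|1", then "b|", then "c|"
print join('|', @$_), "\n" for read_all_rows($fh1, $fh2);
```

Whether an empty string, a skip, or an error is the right reaction to a missing line depends on the data, so treat the substitution here as one choice among several.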

Another easy solution without explicit eof() checking would go like this:

while (defined(my $var1 = <$fh1>) and defined(my $var2 = <$fh2>)) {
    # do stuff
}

This uses the fact that <> returns undef only when there is nothing more to read, that is, at the end of the file or on a read error.
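
The defined() calls matter here: in a compound condition like this one, a line that Perl treats as false (for example a final "0" with no trailing newline) would end the loop early if the lines were tested for truth instead. The sketch below compares both tests; the sub names and the in-memory input are hypothetical and exist only for the demonstration.

```perl
use strict;
use warnings;

# Hypothetical helper: count line pairs, stopping only when a read returns undef.
sub count_pairs_defined {
    my ($fh1, $fh2) = @_;
    my $n = 0;
    while (defined(my $var1 = <$fh1>) and defined(my $var2 = <$fh2>)) {
        $n++;
    }
    return $n;
}

# Hypothetical helper: the same loop, but testing the lines for truth.
sub count_pairs_truthy {
    my ($fh1, $fh2) = @_;
    no warnings;    # Perl may warn about exactly this mistake; silenced for the demo
    my $n = 0;
    while ((my $var1 = <$fh1>) and (my $var2 = <$fh2>)) {
        $n++;
    }
    return $n;
}

# Hypothetical helper: open a fresh in-memory filehandle on a copy of $text.
sub fh_for {
    my ($text) = @_;
    open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
    return $fh;
}

my $text = "1\n0";    # the last line is "0" with no trailing newline

# prints "defined test: 2 pair(s)" and "truth test:   1 pair(s)"
printf "defined test: %d pair(s)\n", count_pairs_defined(fh_for($text), fh_for($text));
printf "truth test:   %d pair(s)\n", count_pairs_truthy(fh_for($text), fh_for($text));
```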
