Reading a file line by line in Perl

I want to read a file one line at a time, but my code only reads the first line. How do I read all of the lines?

My code:

open(file_E, $file_E);

while ( <file_E> ) {
    /([^\n]*)/;
    print $line1;
}

close($file_E);

Let's start by looking at your code.

open(file_E, $file_E);

while ( <file_E> ) {
    /([^\n]*)/;
    print $line1;
}

close($file_E);

On the first line you open the file named in $file_E using the bareword filehandle file_E. This works as long as the file opens successfully. It would be better to also check that the operation succeeded, in one of two ways: either put use autodie; at the top of your script (at the risk of applying its semantics in places where your code is incompatible with that level of error handling), or change your open to look like this:

open(file_E, $file_E) or die "Failed to open $file_E: $!\n";

Now if the file fails to open, you will get an error message that helps you track down the problem.
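For comparison, here is a minimal sketch of the use autodie; route mentioned above. The path in $missing is a made-up value, chosen so that the open is expected to fail, and the eval is only there so the example can report the exception instead of terminating.

```perl
use strict;
use warnings;
use autodie;    # open, close, and other built-ins now die on failure

# Hypothetical path, chosen so that the open is expected to fail.
my $missing = '/no/such/dir/input.txt';

# No "or die" is needed: autodie raises an exception for us.
my $ok = eval {
    open(my $fh, '<', $missing);
    close($fh);
    1;
};

print "open failed: $@" unless $ok;
```

In a real script you would normally leave out the eval and let the exception end the program with its message.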

Next, let's look at the while loop, because this is where the bug you are experiencing comes from. The first line of the while loop is:

while ( <file_E> ) {

By consulting perldoc perlsyn you will see that this line is special-cased to actually do this:

while (defined($_ = <file_E>)) {

So your code implicitly assigns each line to $_ on successive iterations. By consulting perldoc perlop you will also find that when the match operator (/.../ or m/.../) is invoked without being bound explicitly with =~, it matches against $_. So far, so good. However, you are not actually doing anything useful with the match. The match operator returns true or false depending on whether the match succeeded, and because your pattern contains capturing parentheses, a successful match stores the captured text in the capture variable $1. But you never test whether the match succeeded, and you never refer to $1 again.
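To illustrate testing the match and then using $1, here is a small sketch. The sample lines are made up and stand in for what the filehandle would return, and the pattern uses + rather than the * from the question so that an empty line actually fails to match (an assumption made purely for the demonstration).

```perl
use strict;
use warnings;

# Sample lines standing in for what <file_E> would return (made-up data).
my @lines = ("first line\n", "second line\n", "\n");

my @captured;
for (@lines) {                 # each element is aliased to $_
    if (/([^\n]+)/) {          # matches against $_ because there is no =~
        push @captured, $1;    # $1 is only meaningful after a successful match
    }
}

print "$_\n" for @captured;
```

Only the two non-empty lines end up in @captured, because the third line gives the pattern nothing to capture.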

On the line that follows, you do this: print $line1. Where in your code is $line1 assigned a value? It is never assigned one in what you have shown us.

I can only guess that your intent is to iterate over the lines of the file, capture each line without its trailing newline, and then print it. It seems that you wish to print the lines without any newlines, so that the whole input file is printed as a single line of output. That could be written like this:

open my $input_fh_e, '<', $file_E or die "Failed to open $file_E: $!\n";

while(my $line = <$input_fh_e>) {
    chomp $line;
    print $line;
}

close $input_fh_e or die "Failed to close $file_E: $!\n";

No need to capture anything: if all the capture does is grab everything up to the newline, you can simply strip off the newline with chomp to begin with.
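If you would rather keep the lines apart than run them together, the same loop can collect them into an array. This is a self-contained sketch: the temporary file and its three lines are made up for the example, and File::Temp is used only to create something to read.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a throwaway input file (its contents are made up for this example).
my ($tmp_fh, $file_E) = tempfile(UNLINK => 1);
print {$tmp_fh} "alpha\nbeta\ngamma\n";
close $tmp_fh or die "Failed to close $file_E: $!\n";

open my $input_fh_e, '<', $file_E or die "Failed to open $file_E: $!\n";

my @lines;
while (my $line = <$input_fh_e>) {
    chomp $line;            # strip the trailing newline
    push @lines, $line;
}

close $input_fh_e or die "Failed to close $file_E: $!\n";

print join('|', @lines), "\n";    # prints: alpha|beta|gamma
```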

In my example I used a lexical filehandle (a filehandle that is lexically scoped, declared with my). This is generally better practice in modern Perl, as it avoids using a bareword, avoids possible namespace collisions, and ensures that the handle gets closed as soon as the lexical scope closes.

I also used the three-argument version of open, which is safer because it eliminates the potential for $file_E to be used to open a pipe or to do some other nefarious or simply unintended shell manipulation.
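A small sketch of why this matters. The value of $file_E here is a made-up hostile "filename". With two-argument open the trailing | would be read as a request to run a command, whereas with the mode given separately the string can only ever be a path. The sketch assumes no file with that odd name exists in the current directory.

```perl
use strict;
use warnings;

# A hostile "filename" (made up for this example). Two-argument open would
# treat the trailing | as a request to run the command and read its output.
my $file_E = 'echo pwned |';

# Three-argument open: the mode is fixed to '<', so the string is only a path.
my $opened = open(my $fh, '<', $file_E);
my $error  = $opened ? '' : "$!";
close($fh) if $opened;

print $opened ? "unexpectedly opened a file\n" : "refused: $error\n";
```

The open simply fails with an ordinary file error, and no command is run.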

I suggest also starting your script with use strict;, because had you done so, you would have gotten an error message at compile time telling you that $line1 was never declared. Also start your script with use warnings;, so that you get a warning when you try to print $line1 before assigning a value to it.
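To see the strict error without breaking a whole script, the offending statement can be compiled in a string eval, which inherits the surrounding pragmas. This is only a demonstration device; in your own script the error would appear directly at compile time.

```perl
use strict;
use warnings;

# Compile the problematic statement in a string eval so that this script
# itself still runs and can report what strict says about it.
my $ok = eval q{ print $line1; 1 };

print "strict says: $@" unless $ok;
```

The message names $line1 as a global symbol that requires an explicit package name, which is strict's way of saying the variable was never declared.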

Most of the issues in your code are discussed in perlintro, which you can read from your command line simply by typing perldoc perlintro, assuming you have Perl installed. It typically takes 20-40 minutes to read through perlintro. If ever there were a document that should constitute required reading before starting to write Perl code, it would probably be perlintro.

Another alternative. Note that $_ will include the newline, so you will need to chomp it if you don't want the newline in $line:

open(file_E, $file_E);

while ( <file_E> ) {
    my $line = $_;
    print $line;
}

close(file_E);    # close the filehandle, not the filename held in $file_E
