How can I count paragraphs in a text file using Perl?
I need to write Perl code that counts the paragraphs in a text file. I tried this, but it doesn't work:
open(READFILE, "<$filename")
or die "could not open file \"$filename\":$!";
$paragraphs = 0;
my($c);
while($c = getc(READFILE))
{
if($C ne"\n")
{
$paragraphs++;
}
}
close(READFILE);
print("Paragraphs: $paragraphs\n");
See perlfaq5: How can I read in a file by paragraphs?
local $/ = ''; # enable paragraph mode
open my $fh, '<', $file or die "can't open $file: $!";
1 while <$fh>;
my $count = $.;
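As a self-contained sketch of the paragraph-mode approach above, the snippet below wraps it in a helper. The name count_paragraphs and the temporary sample file are illustrative assumptions, not part of perlfaq5. Setting $/ with local inside the sub keeps paragraph mode from leaking into the rest of the program.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: count the paragraphs in a file using paragraph mode.
sub count_paragraphs {
    my ($file) = @_;
    local $/ = '';    # paragraph mode: a run of empty lines ends a record
    open my $fh, '<', $file or die "can't open $file: $!";
    my $count = 0;
    while (defined(my $paragraph = <$fh>)) {
        $count++;     # each read returns one whole paragraph
    }
    close $fh;
    return $count;
}

# Made-up sample input: three paragraphs, with extra empty lines before the last.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
print {$tmp_fh} "First paragraph,\nsecond line.\n\nSecond paragraph.\n\n\n\nThird paragraph.\n";
close $tmp_fh;

print "Paragraphs: ", count_paragraphs($tmp_name), "\n";
```

Paragraph mode only treats truly empty lines as separators, so a line holding just spaces or tabs does not end a paragraph.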
Have a look at the Beginning Perl book at http://www.perl.org/books/beginning-perl/ . In particular, the following chapter will help you: http://docs.google.com/viewer?url=http%3A%2F%2Fblob.perl.org%2Fbooks%2Fbeginning-perl%2F3145_Chap06.pdf
If you're determining paragraphs by a double newline ("\n\n"), then this will do it:
open READFILE, "<$filename"
or die "cannot open file `$filename' for reading: $!";
my @paragraphs;
{local $/; @paragraphs = split "\n\n", <READFILE>} # slurp-split
my $num_paragraphs = scalar @paragraphs;
__END__
Otherwise, just change the "\n\n" in the code to use your own paragraph separator. It may even be a good idea to use the pattern \n{2,}, just in case someone went crazy on the Enter key.
If you are worried about memory consumption, then you may want to do something like this (sorry for the hard-to-read code):
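To illustrate why \n{2,} is the more forgiving separator, here is a minimal sketch that splits the same in-memory string both ways. The sample text is made up for the example.

```perl
use strict;
use warnings;

# Made-up sample: two paragraphs separated by three empty lines (four newlines).
my $text = "First paragraph.\n\n\n\nSecond paragraph.\n";

# A literal "\n\n" separator matches twice in that run of newlines,
# which leaves an empty field between the two real paragraphs.
my @strict = split /\n\n/, $text;

# \n{2,} treats any run of two or more newlines as a single separator.
my @lenient = split /\n{2,}/, $text;

print "split on \\n\\n:   ", scalar(@strict),  " pieces\n";
print "split on \\n{2,}: ", scalar(@lenient), " pieces\n";
```

With the fixed two-newline separator the empty field is counted as a paragraph, so the total comes out one too high for this input.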
my $num_paragraphs;
{local $/; $num_paragraphs = @{[ <READFILE> =~ /\n\n/g ]} + 1}
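Although, if you want to keep using your own code, you can change if($C ne"\n") to if($c eq "\n"). Perl variable names are case-sensitive, so $C is not the $c that getc assigns. Note that the corrected test counts every newline character, so it matches the paragraph count only when each paragraph is exactly one line with no blank lines in between.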
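If you would rather scan the file yourself, as the original attempt does, one option is to read it line by line and count each transition from a blank line to a non-blank line. The sketch below does that; the name count_paragraphs_by_line is hypothetical, and it assumes that a line containing only whitespace counts as blank. Only one line is held in memory at a time.

```perl
use strict;
use warnings;

# Hypothetical helper: count paragraphs by watching for the first
# non-blank line after a blank line (or after the start of the file).
sub count_paragraphs_by_line {
    my ($fh) = @_;
    my $count        = 0;
    my $in_paragraph = 0;
    while (my $line = <$fh>) {
        if ($line =~ /\S/) {              # line has some non-whitespace text
            $count++ unless $in_paragraph;
            $in_paragraph = 1;
        }
        else {                            # empty or whitespace-only line
            $in_paragraph = 0;
        }
    }
    return $count;
}

# Made-up sample read through an in-memory filehandle.
my $sample = "One.\nStill one.\n\n   \nTwo.\n\nThree.";
open my $fh, '<', \$sample or die "can't open in-memory file: $!";
print "Paragraphs: ", count_paragraphs_by_line($fh), "\n";
close $fh;
```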