How can I count paragraphs in a text file using Perl?

I need to write Perl code that counts the paragraphs in a text file. I tried this, but it doesn't work:

open(READFILE, "<$filename")
    or die "could not open file \"$filename\":$!";

$paragraphs = 0;

my($c);

while($c = getc(READFILE))
{
    if($C ne"\n")
    {
        $paragraphs++;
    }
}

close(READFILE);

print("Paragraphs: $paragraphs\n");

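See perlfaq5: How can I read in a file by paragraphs? Setting $/ to the empty string enables paragraph mode, so each read returns one paragraph and $. ends up holding the number of paragraphs read.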

local $/ = '';  # enable paragraph mode
open my $fh, '<', $file or die "can't open $file: $!";
1 while <$fh>;
my $count = $.;
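To try paragraph mode without a file on disk, here is a minimal self-contained sketch. The sub name count_paragraphs and the sample text are made up for illustration; the in-memory filehandle stands in for a real file, and the /\S/ check is an added assumption that whitespace-only records should not count.

```perl
use strict;
use warnings;

# Count paragraphs from any filehandle using paragraph mode.
# (count_paragraphs is a hypothetical helper name, not from perlfaq5.)
sub count_paragraphs {
    my ($fh) = @_;
    local $/ = '';    # paragraph mode: runs of empty lines end a record
    my $count = 0;
    while (my $para = <$fh>) {
        $count++ if $para =~ /\S/;    # skip records with no visible text
    }
    return $count;
}

# Hypothetical sample text, opened as an in-memory file for the demo.
my $text = "First paragraph,\nsecond line.\n\n\n\nSecond paragraph.\n\nThird.\n";
open my $fh, '<', \$text or die "can't open in-memory file: $!";
print "Paragraphs: ", count_paragraphs($fh), "\n";    # prints "Paragraphs: 3"
close $fh;
```

Note that paragraph mode only treats truly empty lines as separators; a line that contains spaces or tabs does not end a paragraph.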

If you delimit paragraphs with a double newline ("\n\n"), then this will do it:

open READFILE, "<$filename"
    or die "cannot open file `$filename' for reading: $!";
my @paragraphs;
{local $/; @paragraphs = split "\n\n", <READFILE>} # slurp-split
my $num_paragraphs = scalar @paragraphs;
__END__

Otherwise, just change the "\n\n" in the code to your own paragraph separator. It may even be a good idea to use the pattern \n{2,} instead, in case someone went crazy on the Enter key.
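As a rough sketch of the \n{2,} variant: the sub name count_paragraphs_in_string and the sample string below are hypothetical, and the grep filter is an added assumption so that blank lines at the start of the text do not count as an empty paragraph.

```perl
use strict;
use warnings;

# Count paragraphs in a string, treating two or more newlines as one separator.
# (count_paragraphs_in_string is a hypothetical helper name.)
sub count_paragraphs_in_string {
    my ($text) = @_;
    # Keep only pieces with visible text, so leading blank lines
    # do not produce an empty "paragraph".
    my @paragraphs = grep { /\S/ } split /\n{2,}/, $text;
    return scalar @paragraphs;
}

# Hypothetical sample: four newlines separate the first two paragraphs.
my $sample = "\n\nOne.\n\n\n\nTwo,\nstill two.\n\nThree.\n";
print "Paragraphs: ", count_paragraphs_in_string($sample), "\n";    # prints "Paragraphs: 3"
```

With a plain "\n\n" separator, the four-newline gap in the sample would yield an extra empty field between the first two paragraphs, which is what the {2,} quantifier avoids.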

If you are worried about memory consumption, you may want to do something like this (sorry for the hard-to-read code):

my $num_paragraphs;
{local $/; $num_paragraphs = @{[ <READFILE> =~ /\n\n/g ]} + 1}
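The snippet above avoids building an array of paragraphs, but it still reads the whole file into one string. If that is a concern, one alternative (a sketch, not the answerer's method) is to read line by line and count each transition from a blank line to a non-blank line. The sub name count_paragraphs_streaming and the sample text are hypothetical, and treating whitespace-only lines as blank is an assumption.

```perl
use strict;
use warnings;

# Count paragraphs one line at a time: a paragraph starts whenever a
# non-blank line follows a blank line (or the start of the input).
# (count_paragraphs_streaming is a hypothetical helper name.)
sub count_paragraphs_streaming {
    my ($fh) = @_;
    local $/ = "\n";    # make sure we read ordinary lines
    my $count   = 0;
    my $in_para = 0;    # true while inside a paragraph
    while (my $line = <$fh>) {
        if ($line =~ /\S/) {
            $count++ unless $in_para;
            $in_para = 1;
        }
        else {
            $in_para = 0;    # blank or whitespace-only line ends the paragraph
        }
    }
    return $count;
}

# Hypothetical sample text, opened as an in-memory file for the demo.
my $text = "One.\n\n\nTwo,\nstill two.\n \nThree.\n";
open my $fh, '<', \$text or die "can't open in-memory file: $!";
print "Paragraphs: ", count_paragraphs_streaming($fh), "\n";    # prints "Paragraphs: 3"
close $fh;
```

Only one line is held in memory at a time, so the cost does not grow with the size of the file.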

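Alternatively, if you want to keep using your own code, you can change if($C ne"\n") to if($c eq "\n"). Perl variable names are case-sensitive, so $C is a different variable from $c and is never assigned. Note that this change counts every newline character, so the result matches the paragraph count only when each paragraph sits on a single line.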
