Perl - How to omit lines from a text file?
I have a text file, and I want to leave out some of its lines and use what remains as a string to create a new file. The nice thing is that each chunk of text I need starts with a line that includes "START" and ends with a line that includes "END".
For example, my text file looks like:
1
2
3
Start
4
5
6
End
7
8
Start
9
10
End
The desired output would be two strings that I can write to text files, which look like:
Start
4
5
6
End
Start
9
10
End
What I have so far:
open(RH, '<', $fileName) or die $!;
while(<RH>) {
#print $_;
chomp $_;
if ($_ eq 'START') {
$str = "$str"."$_\n";
}
}
But I'm not sure how to continue.
EDIT: I answered this question using:
my $cmd = q(awk '/Start/,/End/ {print}' foo.txt);
my $output = qx($cmd);
# Split right after each "End" line, so every element holds one block
my @cards = split /(?<=End\n)/, $output;
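To see what the split step does without shelling out to awk, here is a minimal sketch on an in-memory string. The sample data and the sub name split_cards are assumptions for illustration; the lookbehind is written as (?<=End\n) so that each piece ends right after its End line.

```perl
use strict;
use warnings;

# Hypothetical helper: split awk-style output into one string per block.
sub split_cards {
    my ($output) = @_;
    # Zero-width split point immediately after each "End\n".
    return split /(?<=End\n)/, $output;
}

my $output = "Start\n4\n5\n6\nEnd\nStart\n9\n10\nEnd\n";
my @cards  = split_cards($output);
print scalar(@cards), " cards\n";
print "--- card ---\n$_" for @cards;
```

Because the split point is zero-width, the End line stays attached to the block it closes.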
You can use some of Perl's heritage from AWK and just do this (assuming your file is called foo.txt):
perl -ne'print if /Start/../End/' foo.txt
The expression /Start/../End/ means "from the first line that matches /Start/ up to and including the next line that matches /End/".
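As a sketch of how the range operator could build the separate strings the question asks for, the following collects each Start..End run into its own array element. The sub name collect_blocks and the in-memory filehandle are assumptions made so the example runs on its own.

```perl
use strict;
use warnings;

# Hypothetical helper: return one string per Start..End block.
sub collect_blocks {
    my ($fh) = @_;
    my @blocks;
    my $current = '';
    while (my $line = <$fh>) {
        # In scalar context the range operator returns a sequence number
        # while it is true; on the line that ends the range the number
        # has "E0" appended.
        my $state = ($line =~ /Start/) .. ($line =~ /End/);
        next unless $state;
        $current .= $line;
        if ($state =~ /E0\z/) {
            push @blocks, $current;
            $current = '';
        }
    }
    return @blocks;
}

my $text = "1\n2\n3\nStart\n4\n5\n6\nEnd\n7\n8\nStart\n9\n10\nEnd\n";
open my $fh, '<', \$text or die "Can't open in-memory file: $!";
my @blocks = collect_blocks($fh);
close $fh;
print "Block ", $_ + 1, ":\n$blocks[$_]" for 0 .. $#blocks;
```

Checking for the "E0" suffix avoids matching /End/ a second time to detect the end of a block.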
The equivalent code for awk would be:
awk '/Start/,/End/ {print}' foo.txt
# Read the entire file into a string `$str`:
open my $fh, '<', 'file_name' or die "Can't open file: $!";
my $str = do { local $/; <$fh> };
close $fh;
# /m anchors ^ and $ at line boundaries, /s lets . match newlines,
# and the non-greedy .*? keeps each Start...End block separate.
while ($str =~ m{^(Start\n.*?^End)$}msg) {
    # Do something with each Start...End set of lines
    print "$1\n";
}
Notes: the local $/; step could also be done with something like undef $/; although local restores the previous value of $/ when the enclosing block exits.
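Building on the slurp idiom, a global match in list context returns every captured block at once, which may be handier than a while loop when the blocks are needed as an array. This is a sketch under assumptions: the sub name extract_blocks is made up, the markers are spelled Start and End as in the sample data, and the file is replaced by an in-memory string.

```perl
use strict;
use warnings;

# Hypothetical helper: return every Start..End block found in $str.
# Call it in list context to get the captured blocks.
sub extract_blocks {
    my ($str) = @_;
    # /m lets ^ and $ match at line boundaries, /s lets . match newlines,
    # and .*? stops at the first End line instead of the last one.
    return $str =~ m{^(Start\n.*?^End)$}msg;
}

my $text = "1\n2\n3\nStart\n4\n5\n6\nEnd\n7\n8\nStart\n9\n10\nEnd\n";

# Slurp from an in-memory filehandle; local $/ is undone when the do block ends.
open my $fh, '<', \$text or die "Can't open in-memory file: $!";
my $str = do { local $/; <$fh> };
close $fh;

my @blocks = extract_blocks($str);
print "$_\n---\n" for @blocks;
```

Each element of @blocks can then be written to its own file or processed further.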
Use GNU grep:
grep -Poz '(?ms)^Start.*?^End\n' in_file
Here, GNU grep uses the following options:
-P : Use Perl regexes.
-o : Print the matches only (1 match per line), not the entire lines.
-z : Treat input and output data as sequences of lines, each terminated by a zero byte (the ASCII NUL character) instead of a newline. Thus, you can match newlines in the input.
(?ms) : Enable the m and s pattern-match modifiers, to allow multiline matches, and to allow . to match a newline, respectively.
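The effect of the two modifiers can be checked in Perl itself, where /m and /s are the flag forms of the inline (?ms). The sketch below uses a made-up three-line string and compares matches with and without each modifier.

```perl
use strict;
use warnings;

my $text = "Start\n4\nEnd\n";

# Without /m, ^ only matches at the very beginning of the string.
my $without_m = () = $text =~ /^End/g;
# With /m, ^ also matches right after each newline.
my $with_m    = () = $text =~ /^End/mg;

# Without /s, . does not match a newline, so the match cannot span lines.
my $without_s = $text =~ /Start.*End/  ? 1 : 0;
# With /s, . matches newlines too.
my $with_s    = $text =~ /Start.*End/s ? 1 : 0;

print "^End without m: $without_m, with m: $with_m\n";
print "Start.*End without s: $without_s, with s: $with_s\n";
```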
SEE ALSO:
grep manual
perlre - Perl regular expressions
Use .. as a "flip-flop" operator.
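# Switch to a lexical filehandle
# (as this is modern best practice)
open(my $rh, '<', $fileName) or die $!;
# Open the first output filehandle
my $x = 1;
open my $out, '>', "$fileName.out.$x" or die $!;
while (<$rh>) {
    print $out $_ if /Start/ .. /End/;
    # Switch to a new output file once we've seen 'End'.
    # Reuse the outer $out here: a second "my $out" would create a new
    # variable that goes out of scope at the end of the if block.
    # Note that this also leaves one empty file after the final 'End'.
    if (/End/) {
        ++$x;
        open $out, '>', "$fileName.out.$x" or die $!;
    }
}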
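One possible variation, sketched here under assumptions, opens each output file only when a Start line is seen, so no empty file is left behind after the final End. It tracks the state with an explicit variable rather than the .. operator. The sub name split_blocks_to_files, the temporary-directory demo, and the .out.N naming are illustrative choices.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: write each Start..End block of $in_name to
# "$in_name.out.1", "$in_name.out.2", ... and return the names written.
sub split_blocks_to_files {
    my ($in_name) = @_;
    open my $rh, '<', $in_name or die "Can't open $in_name: $!";
    my ($out, @written);
    while (my $line = <$rh>) {
        # Open the next output file only when a new block begins.
        if ($line =~ /Start/ && !$out) {
            my $name = "$in_name.out." . (@written + 1);
            open $out, '>', $name or die "Can't open $name: $!";
            push @written, $name;
        }
        print {$out} $line if $out;
        # Close the file as soon as the block ends.
        if ($line =~ /End/ && $out) {
            close $out or die "Can't close output: $!";
            undef $out;
        }
    }
    close $rh;
    return @written;
}

# Demo in a throwaway directory.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/foo.txt";
open my $fh, '>', $file or die "Can't create $file: $!";
print {$fh} "1\n2\n3\nStart\n4\n5\n6\nEnd\n7\n8\nStart\n9\n10\nEnd\n";
close $fh;
print "$_\n" for split_blocks_to_files($file);
```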