Perl script to read content between marks
In Perl, how can I read the contents between two markers? The source data looks like this:
START_HEAD
ddd
END_HEAD
START_DATA
eee|234|ebf
qqq| |ff
END_DATA
--Generate at 2011:23:34
I only want to get the data between "START_DATA" and "END_DATA". How can I do this?
sub readFile(){
    open(FILE, "<datasource.txt") or die "file is not found";
    while(<FILE>){
        if(/START_DATA/){
            record(\*FILE);#start record;
        }
    }
}
sub record($){
    my $fileHandle = $_[0];
    while(<fileHandle>){
        print $_."\n";
        if(/END_DATA/) return ;
    }
}
I wrote this code, but it doesn't work. Do you know why?
Thanks
You can use the range operator:
perl -ne 'print if /START_DATA/ .. /END_DATA/'
The output will include the START_DATA and END_DATA lines too, but it should not be hard to get rid of them.
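One way to drop the marker lines is to look at the value the range operator returns. In scalar context it yields a sequence number: 1 on the first line of the range, and a value with "E0" appended on the last line. The following is a minimal sketch under that reading of perlop; the helper name lines_between is hypothetical, and the sample lines are copied from the question.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the lines between the markers, markers excluded.
# Note: the flip-flop keeps its state between calls, so each call is assumed
# to receive input in which every START_DATA has a matching END_DATA.
sub lines_between {
    my @found;
    for my $line (@_) {
        # Sequence number of this line within the range, or false outside it.
        my $seq = ($line =~ /START_DATA/) .. ($line =~ /END_DATA/);
        next unless $seq;                      # outside the range
        next if $seq == 1 or $seq =~ /E0$/;    # the START_DATA / END_DATA lines
        push @found, $line;
    }
    return @found;
}

# Sample input from the question, one element per line.
my @input = map { "$_\n" } (
    'START_HEAD', 'ddd', 'END_HEAD',
    'START_DATA', 'eee|234|ebf', 'qqq| |ff', 'END_DATA',
    '--Generate at 2011:23:34',
);

print lines_between(@input);
```

With the sample data this prints only the two rows between the markers.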
Besides a few typos, your code is not too far off. Had you used
use strict;
use warnings;
you might have figured it out yourself. Here's what I found:
Normal sub declaration is sub my_function (prototype) {, but you can leave out the prototype and just use sub my_function {.
while (<fileHandle>) { is missing the $ sign that marks it as a (scalar) variable rather than a global filehandle. It should be while (<$fileHandle>) {.
print $_."\n"; will add an extra newline, because each line read from the file still ends with its own newline. print; will do what you expect.
if(/END_DATA/) return; is a syntax error. Use either:
return if (/END_DATA/);
or
if (/END_DATA/) { return }
Below is the cleaned-up version. I commented out your open() while testing and read from the DATA filehandle instead, so this is a self-contained, functional code example.
use strict;
use warnings;
readFile();

sub readFile {
    #open(FILE, "<datasource.txt") or die "file is not found";
    while(<DATA>) {
        if(/START_DATA/) {
            recordx(\*DATA); #start record;
        }
    }
}

sub recordx {
    my $fileHandle = $_[0];
    while(<$fileHandle>) {
        print;
        if (/END_DATA/) { return }
    }
}
__DATA__
START_HEAD
ddd
END_HEAD
START_DATA
eee|234|ebf
qqq| |ff
END_DATA
--Generate at 2011:23:34
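For comparison, the same idea can be written as a single loop with a lexical filehandle and a flag, which also leaves out both marker lines. This is only a sketch, not the answer's own code: the helper name read_data_section is hypothetical, and an in-memory string stands in for datasource.txt so that the example runs as-is.

```perl
use strict;
use warnings;

# Hypothetical helper: collects the lines strictly between the two markers.
sub read_data_section {
    my ($fh) = @_;
    my @rows;
    my $in_data = 0;
    while (my $line = <$fh>) {
        if ($line =~ /^START_DATA/) { $in_data = 1; next }   # start recording
        last if $in_data and $line =~ /^END_DATA/;           # stop at the end marker
        push @rows, $line if $in_data;
    }
    return @rows;
}

# In-memory stand-in for the file from the question. For a real file, assume:
#   open my $fh, '<', 'datasource.txt' or die "Cannot open datasource.txt: $!";
my $sample = join '', map { "$_\n" } (
    'START_HEAD', 'ddd', 'END_HEAD',
    'START_DATA', 'eee|234|ebf', 'qqq| |ff', 'END_DATA',
    '--Generate at 2011:23:34',
);
open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
print read_data_section($fh);
close $fh;
```

Passing the filehandle as a lexical variable avoids the \*FILE glob reference, and use strict would catch the missing $ sigil if the variable name were mistyped.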
This is pretty simple to do with regular expressions: just use the /s or /m (single-line or multi-line) flags. /s allows the . operator to match newlines, so, once the whole input is in a single string, you can do /start_data(.+)end_data/is.
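A minimal sketch of that approach follows. It differs from the pattern above in two ways, both of which are my assumptions rather than part of the answer: it uses a non-greedy .*? so the match stops at the first END_DATA, and it adds /m with ^ and $ so the markers must sit on their own lines. The helper name data_block is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the text between the markers, or undef if absent.
sub data_block {
    my ($text) = @_;
    # /s lets . match newlines; /m lets ^ and $ match at line boundaries.
    return $text =~ /^START_DATA\n(.*?)^END_DATA$/ms ? $1 : undef;
}

# Sample input from the question. For a real file, the contents could be
# slurped into one string first, e.g. with: local $/; my $text = <$fh>;
my $text = <<'END_TEXT';
START_HEAD
ddd
END_HEAD
START_DATA
eee|234|ebf
qqq| |ff
END_DATA
--Generate at 2011:23:34
END_TEXT

my $block = data_block($text);
print $block if defined $block;
```

This reads the whole input into memory, so for very large files a line-by-line approach is likely the better fit.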