In Perl, how can I read the contents between two markers? The source data looks like this:
START_HEAD
ddd
END_HEAD
START_DATA
eee|234|ebf
qqq| |ff
END_DATA
--Generate at 2011:23:34
I only want to get the data between "START_DATA" and "END_DATA". How can I do this?
sub readFile(){
    open(FILE, "<datasource.txt") or die "file is not found";
    while(<FILE>){
        if(/START_DATA/){
            record(\*FILE);#start record;
        }
    }
}
sub record($){
    my $fileHandle = $_[0];
    while(<fileHandle>){
        print $_."\n";
        if(/END_DATA/) return ;
    }
}
I wrote this code, but it doesn't work. Do you know why?
Thanks
You can use the range operator:
perl -ne 'print if /START_DATA/ .. /END_DATA/'
The output will include the *_DATA lines, too, but it should not be so hard to get rid of them.
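One way to drop the marker lines is to look at the value the range operator returns in scalar context: it is a sequence number that starts at 1, and the last number in a range has "E0" appended. The following is a minimal sketch under that approach; extract_data is a hypothetical helper name, and the sample text is copied from the question and read through an in-memory filehandle.

```perl
use strict;
use warnings;

# Return the lines strictly between START_DATA and END_DATA.
# extract_data is a hypothetical helper name, not part of the original post.
sub extract_data {
    my ($fh) = @_;
    my @lines;
    while (my $line = <$fh>) {
        # In scalar context the range operator acts as a flip-flop: it is true
        # from the START_DATA line through the END_DATA line, false elsewhere.
        my $seq = ($line =~ /START_DATA/ .. $line =~ /END_DATA/);
        next unless $seq;
        next if $seq == 1;         # first line of the range: START_DATA
        next if $seq =~ /E0\z/;    # last line of the range: END_DATA
        push @lines, $line;
    }
    return @lines;
}

# Demo on an in-memory copy of the sample data from the question.
my $sample = "START_HEAD\nddd\nEND_HEAD\nSTART_DATA\neee|234|ebf\n"
           . "qqq| |ff\nEND_DATA\n--Generate at 2011:23:34\n";
open my $fh, '<', \$sample or die "cannot open in-memory file: $!";
my @data = extract_data($fh);
close $fh;
print @data;
```

For a real file, the in-memory open would be replaced by an open on the file name. Note that the flip-flop keeps its state between calls, so this sketch assumes every START_DATA has a matching END_DATA.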
Besides a few typos, your code is not too far off. Had you used
use strict;
use warnings;
you might have figured it out yourself. Here's what I found:
The normal sub declaration is sub my_function (prototype) { but you can leave out the prototype and just use sub my_function {
The line while (<fileHandle>) { is missing the $ sign that marks fileHandle as a scalar variable. Without it, Perl reads from a global bareword filehandle named fileHandle. It should be <$fileHandle>
The statement print $_."\n"; will add an extra newline, because each line read from the file still ends with its own newline. Just print; will do what you expect.
The statement if(/END_DATA/) return; is a syntax error. The braces around the block are not optional in Perl in this case, unless you reverse the statement into its statement-modifier form. Use either:
return if (/END_DATA/);
or
if (/END_DATA/) { return }
Below is the cleaned-up version. I commented out your open() while testing and read from the DATA filehandle instead, so this is a functional code example.
use strict;
use warnings;

readFile();

sub readFile {
    #open(FILE, "<datasource.txt") or die "file is not found";
    while(<DATA>) {
        if(/START_DATA/) {
            recordx(\*DATA); #start record;
        }
    }
}

sub recordx {
    my $fileHandle = $_[0];
    while(<$fileHandle>) {
        print;
        if (/END_DATA/) { return }
    }
}
__DATA__
START_HEAD
ddd
END_HEAD
START_DATA
eee|234|ebf
qqq| |ff
END_DATA
--Generate at 2011:23:34
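To go back from the DATA filehandle to a real file, the same two-sub structure can be kept with a lexical filehandle and three-argument open. This is only a sketch: read_file and record are illustrative names, the subs return the lines instead of printing them, and, unlike the version above, the END_DATA line itself is left out. The demo passes a reference to a string so that it runs without datasource.txt.

```perl
use strict;
use warnings;

# Scan the input for START_DATA and hand the filehandle to record().
# $source may be a file name, or a reference to a string (in-memory file).
sub read_file {
    my ($source) = @_;
    open(my $fh, '<', $source) or die "cannot open data source: $!";
    my @data;
    while (my $line = <$fh>) {
        push @data, record($fh) if $line =~ /START_DATA/;
    }
    close $fh;
    return @data;
}

# Collect lines from the current read position up to, but not including,
# the END_DATA line.
sub record {
    my ($fh) = @_;
    my @lines;
    while (my $line = <$fh>) {
        return @lines if $line =~ /END_DATA/;
        push @lines, $line;
    }
    return @lines;    # END_DATA never seen: return what was read
}

# Demo on an in-memory copy of the sample data from the question.
my $sample = "START_HEAD\nddd\nEND_HEAD\nSTART_DATA\neee|234|ebf\n"
           . "qqq| |ff\nEND_DATA\n--Generate at 2011:23:34\n";
my @data = read_file(\$sample);
print @data;
```

With the file from the question, the call would be read_file("datasource.txt").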
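This is a pretty simple thing to do with regular expressions once the whole file has been read into a single string. Use the /s (single-line) flag, which allows . to match newlines, so you can do /start_data(.+)end_data/is. The /m (multi-line) flag only changes where ^ and $ match, so this pattern does not need it.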
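A minimal sketch of the regex approach, assuming the file is small enough to read into memory in one go. data_block is a hypothetical helper name. Compared with the pattern in the answer, this variant is case-sensitive, anchors the markers to whole lines with /m, and uses a non-greedy .*? so that matching stops at the first END_DATA.

```perl
use strict;
use warnings;

# Return the text between the START_DATA and END_DATA lines,
# or undef if no such block exists. data_block is a hypothetical name.
sub data_block {
    my ($text) = @_;
    # /s lets . match newlines; /m lets ^ and $ match at line boundaries.
    if ($text =~ /^START_DATA\n(.*?)^END_DATA$/ms) {
        return $1;
    }
    return undef;
}

# Demo: slurp an in-memory copy of the sample data into one string.
my $sample = "START_HEAD\nddd\nEND_HEAD\nSTART_DATA\neee|234|ebf\n"
           . "qqq| |ff\nEND_DATA\n--Generate at 2011:23:34\n";
open my $fh, '<', \$sample or die "cannot open in-memory file: $!";
my $text = do { local $/; <$fh> };    # undefined $/ reads the whole file
close $fh;

my $block = data_block($text);
print $block if defined $block;
```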