Perl script to read the content between markers

Question

In Perl, how do I read the contents between two markers? The source data looks like this:

START_HEAD
ddd
END_HEAD

START_DATA
eee|234|ebf
qqq|              |ff
END_DATA

--Generate at 2011:23:34

I only want to get the data between "START_DATA" and "END_DATA". How can I do this?

sub readFile(){ 
    open(FILE, "<datasource.txt") or die "file is not found";

    while(<FILE>){      
        if(/START_DATA/){           
            record(\*FILE);#start record;
        }
    }
}

sub record($){
    my $fileHandle = $_[0];

    while(<fileHandle>){
        print $_."\n";      
        if(/END_DATA/) return ;         
    }
}

I wrote this code, but it doesn't work. Do you know why?

Thanks

Answer 1

You can use the range operator:

perl -ne 'print if /START_DATA/ .. /END_DATA/'

The output will include the *_DATA lines too, but it should not be hard to get rid of them.
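
One way to drop the marker lines, sketched under the assumption that the input is already available as a list of lines: in scalar context the range operator returns a sequence number, which is 1 on the line that opens the range and ends in "E0" on the line that closes it. The helper name lines_between_markers is made up for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return only the lines strictly between the markers.
sub lines_between_markers {
    my (@lines) = @_;
    my @kept;
    for (@lines) {
        # Scalar-context range operator (flip-flop): false outside the range,
        # otherwise a sequence number counting lines inside the range.
        my $seq = /START_DATA/ .. /END_DATA/;
        next unless $seq;            # outside the range
        next if $seq == 1;           # the START_DATA line itself
        next if $seq =~ /E0$/;       # the END_DATA line itself
        push @kept, $_;
    }
    return @kept;
}

my @sample = (
    "START_HEAD\n", "ddd\n", "END_HEAD\n", "\n",
    "START_DATA\n", "eee|234|ebf\n", "qqq|              |ff\n", "END_DATA\n",
    "\n", "--Generate at 2011:23:34\n",
);
print lines_between_markers(@sample);
```

Note that the flip-flop keeps its state between calls, so this sketch assumes every START_DATA in the input is eventually followed by an END_DATA.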

Answer 2

Besides a few typos, your code is not too far off. Had you used

use strict;
use warnings;

you might have figured it out yourself. Here's what I found:

Don't use prototypes unless you need them and know what they do.

A sub declaration with a prototype is sub my_function (prototype) { , but you can leave out the prototype and just use sub my_function { .

while (<fileHandle>) { is missing the $ sigil that marks fileHandle as a scalar variable; without it, Perl reads from a global bareword filehandle named fileHandle. It should be <$fileHandle> .
print $_."\n"; will add an extra newline, because $_ still ends with the newline read from the file. Just print; will do what you expect.
if(/END_DATA/) return; is a syntax error. Braces are not optional in Perl in this case, unless you reverse the statement.

Use either:

return if (/END_DATA/);

or

if (/END_DATA/) { return }

Below is the cleaned-up version. I commented out your open() while testing, so that this is a self-contained, working code example.

use strict;
use warnings;

readFile();

sub readFile { 
    #open(FILE, "<datasource.txt") or die "file is not found";
    while(<DATA>) {      
        if(/START_DATA/) {
            recordx(\*DATA); #start record;
        }
    }
}

sub recordx {
    my $fileHandle = $_[0];
    while(<$fileHandle>) {
        print;
        if (/END_DATA/) { return }         
    }
}

__DATA__
START_HEAD
ddd
END_HEAD

START_DATA
eee|234|ebf
qqq|              |ff
END_DATA

--Generate at 2011:23:34
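
To read from an actual file rather than the DATA handle, something like the following could work. It is only a sketch: it swaps in a lexical filehandle and the three-argument form of open, skips both marker lines, and uses a hypothetical helper name, read_data_section. The temporary file stands in for datasource.txt so the example runs on its own.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: return the lines after START_DATA, up to but not
# including END_DATA, from the file at $path.
sub read_data_section {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    my @section;
    my $recording = 0;
    while (my $line = <$fh>) {
        if ($recording) {
            last if $line =~ /END_DATA/;   # stop at the closing marker
            push @section, $line;
        }
        elsif ($line =~ /START_DATA/) {
            $recording = 1;                # start recording on the next line
        }
    }
    close($fh);
    return @section;
}

# Demo: a temporary file stands in for datasource.txt (an assumption).
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
print {$tmp_fh} "START_HEAD\nddd\nEND_HEAD\n\n",
                "START_DATA\neee|234|ebf\nqqq|              |ff\nEND_DATA\n\n",
                "--Generate at 2011:23:34\n";
close($tmp_fh);

print read_data_section($tmp_path);
```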

Answer 3

This is a pretty simple thing to do with regular expressions: just use the /s or /m (single-line or multi-line) flags. /s allows the . metacharacter to match newlines, so once the whole input is in a single string you can do /start_data(.+)end_data/is .
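
A minimal sketch of that approach, assuming the whole input is already held in one string. The helper name extract_data_section is invented for the example, and the non-greedy .+? and the case-sensitive match are choices made here rather than part of the answer: they make the capture stop at the first END_DATA.

```perl
use strict;
use warnings;

# Hypothetical helper: return the text between the markers, or undef if
# the markers are not found. /s lets . match newlines.
sub extract_data_section {
    my ($text) = @_;
    return $text =~ /START_DATA\n(.+?)END_DATA/s ? $1 : undef;
}

my $text = <<'END_TEXT';
START_HEAD
ddd
END_HEAD

START_DATA
eee|234|ebf
qqq|              |ff
END_DATA

--Generate at 2011:23:34
END_TEXT

my $section = extract_data_section($text);
print $section if defined $section;
```

For large files the line-by-line approaches are likely a better fit, since this one needs the entire file in memory.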
