How to grep and capture a multiline pattern from a file in Perl

Question

I have a file that looks something like this:

Random words go here
/attribute1
/attribute2
/attribute3="all*the*things*I'm*interested*in*are*inside*here**
and*it*goes*into*the*next*line.*blah*blah*blah*foo*foo*foo*foo*
bar*bar*bar*bar*random*words*go*here*until*the*end*of*the*sente
nce.*I*think*we*have*enough*words"

I want to grep the file for the line beginning with /attribute3= and then save the string found inside the quotation marks to a separate variable.

Here's what I have so far:

#!/bin/perl
use warnings; use strict;
my $file = "data.txt";
open(my $fh, '<', $file) or die $!;
while (my $line = <$fh>) {
    if ($line =~ /\/attribute3=/g){
        print $line . "\n";
    }
}

That prints only the first line, /attribute3="all*the*things*I'm*interested*in*are*inside*here**, but I want the whole quoted string joined across lines:

all*the*things*I'm*interested*in*are*inside*here**and*it*goes*into*the*next*line.*blah*blah*blah*foo*foo*foo*foo*bar*bar*bar*bar*random*words*go*here*until*the*end*of*the*sentence.*I*think*we*have*enough*words

So what I did next is:

#!/bin/perl
use warnings; use strict;
my $file = "data.txt";
open(my $fh, '<', $file) or die $!;
my $part_I_want;
while (my $line = <$fh>) {
    if ($line =~ /\/attribute3=/g){
        $line =~ /^/\attribute3=\"(.*?)/;   # capture everything after the quotation mark
        $part_I_want .= $1;   # the capture group; save the stuff on line 1
        # keep adding to the string until we reach the closing quotation marks
        next (unless $line =~ /\"/){
             $part_I_want .= $_;    
        }
    }
}

The code above doesn't work. How do I capture a multiline pattern between two characters (in this case, quotation marks)?

Answer 1

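use strict; use warnings;

# Slurp everything after __DATA__ into one string by undefining $/ locally.
my $str = do { local $/; <DATA> };

# Capture the text between the quotes; [^"]* also matches newlines.
my ($part_I_want) = $str =~ /attribute3="([^"]*)"/
    or die "attribute3 not found\n";

# Remove the line breaks so wrapped words such as "sente\nnce" are rejoined.
$part_I_want =~ s/\n//g;
print "$part_I_want\n";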

__DATA__
Random words go here
/attribute1
/attribute2
/attribute3="all*the*things*I'm*interested*in*are*inside*here**
and*it*goes*into*the*next*line.*blah*blah*blah*foo*foo*foo*foo*
bar*bar*bar*bar*random*words*go*here*until*the*end*of*the*sente
nce.*I*think*we*have*enough*words"
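
If the file is too large to slurp comfortably, the same capture can be done line by line, which is closer to what the question attempted. The following is only a sketch under that assumption; the helper name capture_attribute3 is made up for illustration, and it expects the attribute to start at the beginning of a line as in the sample data.

```perl
use strict;
use warnings;

# Hypothetical helper: scan a filehandle line by line and return the quoted
# value of /attribute3, joined across lines, or undef if the attribute or
# its closing quote is missing.
sub capture_attribute3 {
    my ($fh) = @_;
    my $value;
    while (my $line = <$fh>) {
        chomp $line;
        if (!defined $value) {
            # Not inside the value yet: look for the opening quote.
            next unless $line =~ m{^/attribute3="(.*)};
            $line  = $1;
            $value = '';
        }
        # Inside the value: stop at the closing quote if this line has one.
        if ($line =~ /^([^"]*)"/) {
            return $value . $1;
        }
        $value .= $line;
    }
    return;    # attribute missing or closing quote missing
}

# Usage: perl script.pl data.txt
if (@ARGV) {
    open(my $fh, '<', $ARGV[0]) or die "Cannot open $ARGV[0]: $!";
    my $part_I_want = capture_attribute3($fh);
    print defined($part_I_want) ? "$part_I_want\n" : "attribute3 not found\n";
}
```

The undefined $value doubles as the "not yet inside the quotes" state, so no separate flag is needed.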

Answer 2

Read the whole file into a variable, then match it against /attribute3=\"([^\"]*)\"/ms
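
A minimal sketch of this suggestion, assuming the data lives in a file such as the question's data.txt. The helper names slurp_file and extract_attribute3 are hypothetical, and stripping the newlines afterwards is an added step based on the output the question asks for.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: read a whole file into one string.
sub slurp_file {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    local $/;    # undef record separator: the next read returns the entire file
    my $text = <$fh>;
    close($fh);
    return $text;
}

# Hypothetical helper: return the quoted value of attribute3 with line
# breaks removed, or undef if the attribute is not present.
sub extract_attribute3 {
    my ($text) = @_;
    my ($value) = $text =~ /attribute3="([^"]*)"/ or return;
    $value =~ s/\n//g;
    return $value;
}

# Usage: perl script.pl data.txt
if (@ARGV) {
    my $value = extract_attribute3(slurp_file($ARGV[0]));
    print defined($value) ? "$value\n" : "attribute3 not found\n";
}
```

Because the negated class [^"] already matches newlines, the capture spans lines even without the /m and /s modifiers.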

Answer 3

From the command line:

perl -n0e '/\/attribute3="(.*)"/s && print $1' foo.txt

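This is basically what you had, but the -0 flag changes the input record separator so that the whole file is read as one record, much like undef $/ within the code. Strictly speaking, a bare -0 sets the separator to the null character, so this works for files that contain no null bytes; -0777 is the conventional way to slurp a file whole. From the man page: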

-0[octal/hexadecimal]

specifies the input record separator ($/) as an octal or hexadecimal number. If there are no digits, the null character is the separator.
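
To illustrate the difference between the two separator settings inside a script, here is a small sketch that reads the same text once with the null character as the separator and once with no separator at all. The helper name first_record and the sample string are assumptions made for this example only.

```perl
use strict;
use warnings;

# Hypothetical helper: read the first record from a string using the given
# record separator (undef means no separator, i.e. slurp everything).
sub first_record {
    my ($content, $separator) = @_;
    open(my $fh, '<', \$content) or die "Cannot open in-memory file: $!";
    local $/ = $separator;
    my $record = <$fh>;
    close($fh);
    return $record;
}

my $text = qq{/attribute3="one\ntwo"\n};

# Like -0 with no digits: records end at the null character.
my $nul_mode = first_record($text, "\0");
# Like -0777 or "undef $/": no separator at all.
my $slurp_mode = first_record($text, undef);

# The sample text has no null byte, so both reads return all of it.
print "same result\n" if $nul_mode eq $slurp_mode;

# Same match as the one-liner, applied to the slurped text.
print "$1\n" if $slurp_mode =~ /\/attribute3="(.*)"/s;
```

Note that the greedy (.*) with /s runs to the last double quote in the input, so [^"]* is the safer choice when other quoted values follow.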

Answer 1: score 2, accepted, 2015-11-04 20:39:56

Answer 2: score 1, 2015-11-04 19:28:52

Answer 3: score 1, 2015-11-04 20:55:52
