Perl regular expression: get the string between two strings

Question

I am new to Perl and am trying to use a regex to get a piece of a string between two tags that I know will be present in that string. I have already tried various answers from Stack Overflow, but none of them seems to work for me. Here's my example.

The required data is in the $info variable, from which I want to extract the useful data:

my $info = "random text i do not want\n|BIRTH PLACE=Boston, MA\n|more unwanted random text";

The useful data in the above string is Boston, MA. I removed the newlines from the string with $info =~ s/\n//g; so $info now holds the string "random text i do not want|BIRTH PLACE=Boston, MA|more unwanted random text". I thought doing this would help me capture the required data easily.

Please help me get the required data. I am sure that the data will always be preceded by |BIRTH PLACE= and followed by |. Everything before and after that is unwanted text. If a question like this has already been answered, please point me to it as well. Thanks.

Answer 1

Instead of substituting away everything around it, you could also search for /\|BIRTH PLACE=([^\|]+)\n\|/ , where [^\|]+ means one or more of anything that is not a pipe.
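As a minimal sketch of this answer, assuming the sample string from the question with its newlines still in place, the negated character class can be used like this. The variable name $birth_place is only an illustration.

```perl
use strict;
use warnings;

# Sample input copied from the question, newlines left in place.
my $info = "random text i do not want\n|BIRTH PLACE=Boston, MA\n|more unwanted random text";

# [^|]+ matches one or more characters that are not a pipe.
# Inside a character class the pipe does not need a backslash.
# In list context the match returns the captured group.
my ($birth_place) = $info =~ /\|BIRTH PLACE=([^|]+)\n\|/;

print defined($birth_place) ? "$birth_place\n" : "no match\n";
```

Because the pattern requires a newline before the closing pipe, it is meant for the original string, not the one with the newlines stripped.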

Answer 2

$info =~ m{\|BIRTH PLACE=(.*?)\|} or die "There is no data in \$info?!";
my $birth_place = $1;

That should do the trick.
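One detail worth checking is that this pattern relies on the newlines having been removed, since "." does not match a newline unless the /s modifier is used. A small sketch, assuming the sample string from the question (the names $flat and $matched_raw are made up for the example):

```perl
use strict;
use warnings;

my $info = "random text i do not want\n|BIRTH PLACE=Boston, MA\n|more unwanted random text";

# On the unmodified string, "." cannot cross the newline that sits
# between "MA" and the next pipe, so the non-greedy match fails.
my $matched_raw = ($info =~ m{\|BIRTH PLACE=(.*?)\|}) ? 1 : 0;

# Strip the newlines from a copy, as the question does, then match again.
(my $flat = $info) =~ s/\n//g;
my ($birth_place) = $flat =~ m{\|BIRTH PLACE=(.*?)\|};

print "raw string matched: $matched_raw\n";
print "birth place: $birth_place\n";
```

The non-greedy .*? stops at the first pipe after the tag, which is what keeps the trailing unwanted text out of the capture.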

Answer 3

You know, actually, those newlines might have helped you. I would have gone for an initial regular expression of:

/^\|BIRTH PLACE=(.*)$/m

This uses the multiline modifier ( m ) to match ^ at the beginning of a line and $ at the end of it, instead of only at the beginning and end of the string. Heck, you can even get really crazy and match:

/(?<=^\|BIRTH PLACE=).+$/m

This captures only the information you want, using a lookbehind ( (?<= ... ) ) to assert that it is the birth place information.

Why curse the string twice when you can do it once?

So, in Perl:

if ($info =~ m/(?<=^\|BIRTH PLACE=).+$/m) {
    print "Born in $&.\n";
} else {
    print "From parts unknown";
}
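For comparison, here is a hedged sketch that runs both patterns from this answer against the sample string from the question and stores the result in variables instead of reading $&. The variable names are illustrative, and the lookbehind pattern is wrapped in one extra group only so that the list-context assignment returns the matched text.

```perl
use strict;
use warnings;

my $info = "random text i do not want\n|BIRTH PLACE=Boston, MA\n|more unwanted random text";

# /m lets ^ and $ match at line boundaries inside the string.
# "." does not match a newline, so (.*) stops at the end of the line.
my ($with_capture) = $info =~ /^\|BIRTH PLACE=(.*)$/m;

# Lookbehind variant: the tag is asserted but not consumed.
# The outer parentheses are added here to return the match itself.
my ($with_lookbehind) = $info =~ /((?<=^\|BIRTH PLACE=).+)$/m;

print "capture group: $with_capture\n";
print "lookbehind:    $with_lookbehind\n";
```

Both forms should yield the same value here; the capture-group form is usually the easier one to read.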

Answer 4

You have presumably read this data from a file, which is a bad start. Your program should look like this:

use strict;
use warnings;

use autodie;

open my $fh, '<', 'myfile';

my $pob;
while (<$fh>) {
  if (/BIRTH PLACE=(.+)/) {
    $pob = $1;
    last;
  }
}

print $pob;

output

Boston, MA
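To try the line-by-line approach without creating a file, the following sketch reads from an in-memory filehandle. The contents of $data are an assumption, copied from the sample string in the question as a stand-in for 'myfile', and the pattern is anchored to the start of the line as an extra precaution that the answer's code does not include.

```perl
use strict;
use warnings;

# Stand-in for the file: an in-memory filehandle over the sample text.
my $data = "random text i do not want\n|BIRTH PLACE=Boston, MA\n|more unwanted random text\n";
open my $fh, '<', \$data or die "Cannot open in-memory handle: $!";

my $pob;
while (my $line = <$fh>) {
    # "." does not match the trailing newline, so no chomp is needed.
    if ($line =~ /^\|BIRTH PLACE=(.+)/) {
        $pob = $1;
        last;    # stop reading once the value is found
    }
}
close $fh;

print defined($pob) ? "$pob\n" : "not found\n";
```

Reading one line at a time means the newlines act as natural record separators, so there is nothing to strip before matching.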
