Need to match multiple patterns in the same line - Perl

I need to match multiple patterns in the same line. For example, in this file:

Hello, Chester [McAllister;Scientist] lives in Boston [Massachusetts;USA;Fenway Park] # McAllister works in USA
I'm now working in New-York [NYC;USA] # I work in USA
...

First, I want to match every string inside the brackets, knowing that a line can contain more than one bracketed group and that each group holds 1 to n strings, always separated by semicolons.

Finally, for each line I need to compare those values with the string located after the #. For example, in the first line I want to compare:

[McAllister;Scientist] & [Massachusetts;USA;Fenway Park] TO "McAllister works in USA"

The tidiest way is probably to use a regex to find all the embedded sequences delimited by square brackets, and then use map with split to separate those sequences into terms.

This program demonstrates.

Note that I have assumed that all of the data in the file has been read into a single scalar variable. You can alter this to process a single line at a time, but only if the bracketed subsequences are never split across multiple lines.

use strict;
use warnings;

my $s = <<END_TEXT;
Hello, Chester [McAllister;Scientist] lives in Boston [Massachusetts;USA;Fenway Park] # McAllister works in USA
I'm now working in New-York [NYC;USA] # I work in USA
END_TEXT

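# Capture the text inside each [...] pair, then split each capture on ';'
# so that every bracketed group becomes its own anonymous array
my @data = map [ split /;/ ], $s =~ / \[ ( [^\[\]]+ ) \] /xg;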

use Data::Dump;
dd \@data;

Output:

[
  ["McAllister", "Scientist"],
  ["Massachusetts", "USA", "Fenway Park"],
  ["NYC", "USA"],
]
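The question also asks for the bracketed values to be compared with the text after the #, which the program above does not do. The following is only a sketch of one way to do that line by line. It assumes that "compare" means a literal substring test and that the first # on a line starts the comparison text. The helper name terms_found_in_comment is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: for one line, return the bracketed terms that also
# occur literally in the text after the first '#'.
sub terms_found_in_comment {
    my ($line) = @_;

    # Split the line into the part before the first '#' and the text after it
    my ($body, $comment) = split /\s*#\s*/, $line, 2;
    return () unless defined $comment;

    # Same extraction as above: every [...] group, split on ';'
    my @terms = map { split /;/, $_ } $body =~ / \[ ( [^\[\]]+ ) \] /xg;

    # Keep the terms that appear in the comment; \Q...\E quotes metacharacters
    return grep { $comment =~ /\Q$_\E/ } @terms;
}

my @lines = (
    "Hello, Chester [McAllister;Scientist] lives in Boston [Massachusetts;USA;Fenway Park] # McAllister works in USA",
    "I'm now working in New-York [NYC;USA] # I work in USA",
);

for my $line (@lines) {
    my @hits = terms_found_in_comment($line);
    print join(', ', @hits), "\n";
}
```

For the two sample lines this prints "McAllister, USA" and then "USA". A substring test also matches inside longer words, so a stricter comparison (for example with \b word boundaries) may be needed depending on the data.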

Try this:

This also gives what you expect.

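use strict;
use warnings;

# Open the input with a lexical filehandle and stop if it cannot be read
open my $fh, '<', 'file.txt' or die "Cannot open file.txt: $!";
# Collect every bracketed group from every line, e.g. "[NYC;USA]"
my @z = map { m/\[[\w;\s]+\]/g } <$fh>;
close $fh;
print "$_ ," for @z;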

You actually need to match the words separated by ; within the [].
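As a rough sketch of that last step, the same character class can be wrapped in a capture group so that only the text between the brackets is returned, and each capture can then be split on ;. The helper name bracket_words is hypothetical, and the class [\w;\s] is kept from the snippet above, so terms containing hyphens or other punctuation would not be matched.

```perl
use strict;
use warnings;

# Hypothetical helper: return the individual words found inside every
# [...] group of one line, rather than the bracketed groups themselves.
sub bracket_words {
    my ($line) = @_;

    # Capture only the text between the brackets, then split it on ';'
    return map { split /;/, $_ } $line =~ /\[([\w;\s]+)\]/g;
}

my @words = bracket_words("I'm now working in New-York [NYC;USA] # I work in USA");
print join(' ,', @words), "\n";
```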
