Perl pattern match not working as expected
I'm trying to match values, which may be comma-separated, using a regex. Basically, I want to return true if any value in the string does NOT have 3g or 3k starting at the 3rd position.
My test code is as follows:
use feature 'say';   # needed for say
my @a = ('in3g123456,dh3k123456,dhec110101','dhec110101,dhec123456','in3g123456,dh3k123456', 'c3kasdf', 'usdfusdufs3gsdf' );
foreach (@a) {
    print $_;
    say $_ =~ /(?:^|,)\w{2}[^(?:3G|3K)]/i ? " true" : " false";
}
This returns:
in3g123456,dh3k123456,dhec110101 true
dhec110101,dhec123456 true
in3g123456,dh3k123456 false
c3kasdf false <- whaaaaaaaat?
usdfusdufs3gsdf true
I don't understand why the 4th one is not true. Any help would be appreciated.
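[^(?:3G|3K)] reads as "any character but (, ?, etc." It is a negated character class, so it matches one character that is none of ( ? : 3 G | K ). In c3kasdf, \w{2} consumes c3, and the next character, k, is excluded by the class (the /i flag makes K cover k), so the match fails: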
 failed
   v
c3 kasdf
/(?:^|,)\w{2}[^(?:3G|3K)]/i
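To make the character-class reading concrete, here is a minimal sketch that tests single characters against the class. The helper name class_accepts is made up for illustration and is not part of the original code.

```perl
use strict;
use warnings;

# Hypothetical helper: does the class [^(?:3G|3K)] accept this one character?
sub class_accepts {
    my ($char) = @_;
    return $char =~ /^[^(?:3G|3K)]$/i ? 1 : 0;
}

# Every character listed inside the brackets is rejected individually;
# the class has no notion of the two-character strings "3G" and "3K".
for my $char ('k', 'g', '3', '(', '|', 'e', 'd') {
    printf "%s => %s\n", $char, class_accepts($char) ? 'accepted' : 'rejected';
}
```

Only e and d are accepted here, which is why dhec110101 matches the original pattern while c3kasdf does not.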
Use this:
/(?:^|,)\w{2}(?!3G|3K)/i
Demo: https://regex101.com/r/P2XsgN/1
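As a quick check, the lookahead-based pattern can be run against the five strings from the question. This is only a sketch; the wrapper name has_non_3g3k_value is an assumption introduced here.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the lookahead-based pattern.
sub has_non_3g3k_value {
    my ($string) = @_;
    return $string =~ /(?:^|,)\w{2}(?!3G|3K)/i ? 1 : 0;
}

my @samples = (
    'in3g123456,dh3k123456,dhec110101',
    'dhec110101,dhec123456',
    'in3g123456,dh3k123456',
    'c3kasdf',
    'usdfusdufs3gsdf',
);

# Print each string followed by true/false, as in the question.
for my $string (@samples) {
    printf "%s %s\n", $string, has_non_3g3k_value($string) ? 'true' : 'false';
}
```

With this pattern only in3g123456,dh3k123456 comes out false, and c3kasdf is reported as true.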
How about /\b\w{2}(?!3g|3k)/i.
\b matches the empty string at the beginning or end of a word. In this situation it is a slightly simpler equivalent of (^|,).
(?!foo) is a zero-width negative lookahead assertion. It matches the empty string as long as that position is not followed by a substring that matches foo.
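The zero-width nature of the lookahead can be seen by capturing what the pattern actually consumes. A small sketch, with the helper name leading_pair invented for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: return the two leading word characters of the first
# value whose 3rd and 4th characters are not 3g/3k, or undef if none exists.
sub leading_pair {
    my ($string) = @_;
    return $1 if $string =~ /\b(\w{2})(?!3g|3k)/i;
    return;
}

for my $string ('dhec110101', 'in3g123456', 'in3g123456,c3kasdf') {
    my $pair = leading_pair($string);
    print "$string => ", (defined $pair ? "matched '$pair'" : 'no match'), "\n";
}
```

The lookahead only inspects the characters after the pair; the match itself is just the two characters captured by \w{2}.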
You can also split the string first, instead of parsing everything with a single regex. That is more flexible, easier to maintain, and simpler.
When processing the list of extracted values, you can match any two characters and then your pattern: /^..$patt/. The module List::MoreUtils is useful (and fast) for list manipulations, and its notall function is tailor-made for your condition.
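use warnings 'all';
use strict;
use List::MoreUtils qw(notall);
my $file = '...';
open my $fh, '<', $file or die "Can't open $file: $!";
while (<$fh>)
{
    chomp;  # drop the trailing newline so the line prints cleanly
    # true if at least one comma-separated value lacks 3k/3g at position 3
    my $res = notall { /^..(?:3k|3g)/ } split /,/;
    print "$_: " . ($res ? 'true' : 'false'), "\n";
}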
I presume that you read from a file. If not, replace while (<$fh>) with for (@strings).
The notall function returns true if any element of the list fails the condition.
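For readers without List::MoreUtils installed, the same check can be sketched with the core List::Util module, which also exports notall (in version 1.33 and later). Swapping the module is my substitution, not the answer's, and the helper name any_value_fails is hypothetical.

```perl
use strict;
use warnings;
use List::Util qw(notall);   # assumption: List::Util 1.33 or later

# Hypothetical helper: true if at least one comma-separated value
# does not have 3k or 3g starting at its 3rd character.
sub any_value_fails {
    my ($line) = @_;
    my $failed = notall { /^..(?:3k|3g)/ } split /,/, $line;
    return $failed ? 1 : 0;
}

# The same condition written with grep: count the values that fail.
sub count_failing_values {
    my ($line) = @_;
    my $count = grep { !/^..(?:3k|3g)/ } split /,/, $line;
    return $count;
}

for my $line ('in3g123456,dh3k123456', 'c3kasdf') {
    printf "%s: notall=%d, failing values=%d\n",
        $line, any_value_fails($line), count_failing_values($line);
}
```

The grep version does not short-circuit, but it shows what notall is testing: whether the number of failing values is greater than zero.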
The split uses $_ by default, so we only need to give it the pattern. Here it is simply a comma, but since the pattern is a regex, separators can be matched flexibly. For example, /[,\s]+/ splits on any run of commas and/or whitespace. So ",, ," in a string is matched as a single separator, as is a lone comma or one or more spaces.
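A short sketch of the flexible separator in action. The sample string and the helper name split_values are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: split a line on any run of commas and/or whitespace.
sub split_values {
    my ($line) = @_;
    return split /[,\s]+/, $line;
}

# ",, ," and the single space are each treated as one separator.
my @values = split_values('in3g123456,, ,dh3k123456 dhec110101');
print scalar(@values), " values: @values\n";
```

Note that a string beginning with a separator would still yield an empty leading field, so input that may start with a comma or space needs that field filtered out.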
When applied to the array with your strings, the above prints:
in3g123456,dh3k123456,dhec110101: true
dhec110101,dhec123456: true
in3g123456,dh3k123456: false
c3kasdf: true
usdfusdufs3gsdf: true
You could use substr to get the characters at the 3rd and 4th positions and then compare them with (3g|3k):
substr $_,2,2
#!/usr/bin/perl
use strict;
use warnings;
my @a = ('in3g123456,dh3k123456,dhec110101','dhec110101,dhec123456','in3g123456,dh3k123456', 'c3kasdf', 'usdfusdufs3gsdf' );
foreach (@a) {
    my @inputs = split /,/,$_;
    my $flag = 0;
    # set the flag if any value lacks 3g/3k at positions 3 and 4
    foreach (@inputs){
        $flag = 1 unless ((substr $_,2,2) =~ /(3g|3k)/);
    }
    $flag ? print "$_: True\n" : print "$_: False\n";
}
Output:
in3g123456,dh3k123456,dhec110101: True
dhec110101,dhec123456: True
in3g123456,dh3k123456: False
c3kasdf: True
usdfusdufs3gsdf: True