Perl pattern match not working as expected

I'm trying to match values, which may be comma separated, using a regex. Basically, I want to return true if any value in the string does NOT have 3g or 3k starting at the 3rd position.

My test code is as follows:

my @a = ('in3g123456,dh3k123456,dhec110101','dhec110101,dhec123456','in3g123456,dh3k123456', 'c3kasdf', 'usdfusdufs3gsdf' );

foreach (@a) {
  print $_;
  say $_ =~ /(?:^|,)\w{2}[^(?:3G|3K)]/i ? " true" : " false";
}

This returns:

in3g123456,dh3k123456,dhec110101 true
dhec110101,dhec123456 true
in3g123456,dh3k123456 false
c3kasdf false   <- whaaaaaaaat?
usdfusdufs3gsdf true

I don't understand why the 4th one is not true. Any help would be appreciated.

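Inside square brackets, (?:3G|3K) is not a group. [^(?:3G|3K)] is a negated character class that reads as "any single character except ( , ? , : , 3 , G , | , K or )". In c3kasdf, \w{2} consumes c3, and the next character, k, is excluded by the class (the match is case-insensitive), so the match fails: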

                      failed
                      v
        c3            kasdf
/(?:^|,)\w{2}[^(?:3G|3K)]/i
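
To see that the brackets form a character class rather than a group, the following minimal sketch tests a few single characters against the class on its own. The helper name outside_class is made up for illustration and is not part of the question's code.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the single character $ch is accepted by
# the negated character class from the question.
sub outside_class {
    my ($ch) = @_;
    return $ch =~ /\A[^(?:3G|3K)]\z/i ? 1 : 0;
}

# Each character listed inside the brackets is rejected on its own;
# characters that are not listed (such as e and a) are accepted.
for my $ch ('k', '3', 'g', '(', '|', 'e', 'a') {
    printf "%s %s\n", $ch, outside_class($ch) ? 'accepted' : 'rejected';
}
```

Because k is rejected, the class can never match the third character of c3kasdf, which is why that string reported false.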

Use a negative lookahead instead:

/(?:^|,)\w{2}(?!3G|3K)/i

Demo: https://regex101.com/r/P2XsgN/1
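
As a quick check, the lookahead version can be run against the test strings from the question. This is only a sketch; the helper name has_other_value is an assumption introduced here for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: true if some comma-separated value in $str is not
# followed by 3G or 3K (any case) after its first two word characters.
sub has_other_value {
    my ($str) = @_;
    return $str =~ /(?:^|,)\w{2}(?!3G|3K)/i ? 1 : 0;
}

# Test strings copied from the question.
my @a = (
    'in3g123456,dh3k123456,dhec110101',
    'dhec110101,dhec123456',
    'in3g123456,dh3k123456',
    'c3kasdf',
    'usdfusdufs3gsdf',
);

print "$_ ", (has_other_value($_) ? 'true' : 'false'), "\n" for @a;
```

With this pattern only the third string reports false, and c3kasdf reports true.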

How about /\b\w{2}(?!3g|3k)/i ?

\b matches the empty string at the beginning or end of a word. In this situation it is a slightly simpler equivalent of (^|,) .

(?!foo) is a zero-width negative lookahead assertion. It matches the empty string as long as it is not followed by a substring that matches foo .
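
The zero-width behaviour can be seen by capturing what the pattern actually consumes. In the sketch below, the helper name first_match and the added capture group are assumptions made for illustration; the rest of the pattern is the one suggested above.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the two characters consumed by the \b
# variant, or undef when no value qualifies. The lookahead only inspects
# the following text and adds nothing to the captured match.
sub first_match {
    my ($str) = @_;
    return $str =~ /\b(\w{2})(?!3g|3k)/i ? $1 : undef;
}

for my $s ('dhec110101', 'in3g123456,dh3k123456', 'c3kasdf') {
    my $m = first_match($s);
    print "$s: ", (defined $m ? "matched '$m'" : 'no match'), "\n";
}
```

The captured text is always exactly two characters long, which shows that the lookahead itself consumes nothing.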

You can also split the string first, instead of parsing everything with a regex. That is far more flexible and maintainable, and easier.

When processing the list of extracted values, you can match any two characters and then your pattern: /^..$patt/ . The List::MoreUtils module is useful (and fast) for list manipulation, and its notall function is tailor-made for your condition.

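use warnings 'all';
use strict;
use List::MoreUtils qw(notall);

my $file = '...';
open my $fh, '<', $file or die "Can't open $file: $!";

while (<$fh>)
{
    chomp;  # drop the trailing newline so it stays out of the values and the output

    # True if at least one comma-separated value lacks 3k/3g at positions 3-4
    my $res = notall { /^..(?:3k|3g)/ } split /,/;

    print "$_: " . ($res ? 'true' : 'false'), "\n";
}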

I presume that you read from a file. If not, replace while (<$fh>) with for (@strings) .

The notall function returns true if any element of the list fails the condition.

By default split operates on $_ , so we only need to supply the pattern. Here it is simply a comma, but the pattern is a regex, so separators can be matched flexibly. For example, /[,\s]+/ splits on any run of commas and/or whitespace, so ",, ," in a string is treated as a single separator, as is a single comma or a run of spaces.
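
The flexible separator can be tried on its own. The sketch below wraps it in a helper named split_values, which is a made-up name for illustration, and uses an input string invented for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: split a line on runs of commas and/or whitespace.
sub split_values {
    my ($line) = @_;
    return split /[,\s]+/, $line;
}

# The run ",, ," and the single space each act as one separator.
my @values = split_values('in3g123456,, ,dh3k123456 dhec110101');
print join('|', @values), "\n";
```

One caveat: if the string starts with a separator, split returns an empty leading field, so such input may need trimming first.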

When applied to the array with your strings, the above prints:

in3g123456,dh3k123456,dhec110101: true
dhec110101,dhec123456: true
in3g123456,dh3k123456: false
c3kasdf: true
usdfusdufs3gsdf: true
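
For anyone who wants to try this without a file or without installing List::MoreUtils, here is a self-contained sketch that swaps notall for the built-in grep, which counts the values failing the condition. The helper name any_value_fails is an assumption introduced for illustration; the pattern mirrors the one in the answer.

```perl
use strict;
use warnings;

# Hypothetical helper: true if at least one comma-separated value lacks
# 3k/3g at positions 3-4. grep in scalar context returns the number of
# values for which the block is true, i.e. the number of failures.
sub any_value_fails {
    my ($str) = @_;
    my $failures = grep { !/^..(?:3k|3g)/ } split /,/, $str;
    return $failures ? 1 : 0;
}

# Test strings copied from the question.
my @strings = (
    'in3g123456,dh3k123456,dhec110101',
    'dhec110101,dhec123456',
    'in3g123456,dh3k123456',
    'c3kasdf',
    'usdfusdufs3gsdf',
);

print "$_: ", (any_value_fails($_) ? 'true' : 'false'), "\n" for @strings;
```

Unlike notall, grep does not stop at the first failing value, which should not matter for short lists like these.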

You could use substr to get the characters at the 3rd and 4th positions and then compare them with (3g|3k) .

substr $_,2,2

#!/usr/bin/perl
use strict;
use warnings;

my @a = ('in3g123456,dh3k123456,dhec110101','dhec110101,dhec123456','in3g123456,dh3k123456', 'c3kasdf', 'usdfusdufs3gsdf' );

foreach my $line (@a) {
  my $flag = 0;
  foreach my $value (split /,/, $line) {
    # Flag the line if characters 3-4 of this value are not 3g or 3k
    $flag = 1 unless substr($value, 2, 2) =~ /3g|3k/;
  }
  print "$line: ", ($flag ? 'True' : 'False'), "\n";
}

Output:

in3g123456,dh3k123456,dhec110101: True
dhec110101,dhec123456: True
in3g123456,dh3k123456: False
c3kasdf: True
usdfusdufs3gsdf: True
