Conditional .NET regex

I want to write an F#/.NET Boolean function named IsStrValid that uses a regex to decide whether a given string s conforms to the following rules:

  • s is 4 characters long
  • 1st and 3rd characters are either A, B, or C
  • 2nd character is either 1, 2, or 3
  • 4th character is either 1, 2, or 3, except when the 1st and 3rd characters are the same; then the 4th character cannot be the same as the 2nd one.

E.g.:

IsStrValid "A3B3" //true
IsStrValid "A3A3" //false

This is how far I've gotten; I'm stuck at the conditional part (???):

let IsStrValid (s : string) =   
    Regex.IsMatch(s, @"^([ABC])([123])([ABC])(?(???)[123]|[???])$")

While regular expressions support backreferences, complex logic like 'if X then Y' is pretty difficult to express. You could do something like this with a negative lookahead assertion:

let IsStrValid (s : string) = 
    Regex.IsMatch(s, @"^([ABC][123])(?!\1)[ABC][123]$")

However, as Matthew suggests, if it gets more complicated than this, it's probably easier to simply test the conditions directly, like this:

let IsStrValid (s : string) =
    let isLet i = Seq.exists ((=) s.[i]) ['A'; 'B'; 'C']
    let isNum i = Seq.exists ((=) s.[i]) ['1'; '2'; '3']
    // The length check comes first, so the indexing below is safe.
    (s.Length = 4) && (isLet 0) && (isNum 1) && (isLet 2) && (isNum 3) &&
    // Parenthesized because && binds tighter than ||.
    ((s.[0] <> s.[2]) || (s.[1] <> s.[3]))

This sounds like a perfect application for conditionals, but they actually make this job more difficult, not less. It's much easier to do it the old-fashioned way, with lookaheads.

^
([ABC])
([123])
(?:
   (?!\1)[ABC][123]
 |
   \1(?!\2)[123]
)
$

This implements your description almost verbatim. If the third character is one of the allowed letters, but not the same as the first one, grab it and any of the allowed digits. If the third character is the same as the first, grab it followed by one of the allowed digits, provided that digit is not the same as the first digit.
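As a rough sketch, the multi-line pattern above could be used from F# as follows. The function name isStrValidVerbose and the inline comments are illustrative additions, not part of the original answer, and the sketch assumes RegexOptions.IgnorePatternWhitespace so that the layout whitespace and the # comments are ignored by the regex engine.

```fsharp
open System.Text.RegularExpressions

// Hypothetical wrapper name; the pattern itself is the one shown above.
let isStrValidVerbose (s : string) =
    let pattern = @"
        ^
        ([ABC])                  # 1st character, captured as group 1
        ([123])                  # 2nd character, captured as group 2
        (?:
            (?!\1)[ABC][123]     # 3rd differs from 1st: any allowed digit
          |
            \1(?!\2)[123]        # 3rd equals 1st: 4th must differ from 2nd
        )
        $"
    // IgnorePatternWhitespace drops the whitespace and # comments in the pattern.
    Regex.IsMatch(s, pattern, RegexOptions.IgnorePatternWhitespace)
```

With this definition, isStrValidVerbose "A3B3" should return true and isStrValidVerbose "A3A3" should return false, matching the examples in the question.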

But I think even that's more complex than it needs to be. It seems to me your requirements can be restated as two consecutive pairs of characters that each match the regex [ABC][123] but must not match each other:

^([ABC])([123])(?!\1\2)[ABC][123]$
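One way to gain confidence in this restatement is to compare the one-line regex against a direct check of the four rules over every 4-character string drawn from the six allowed characters. The sketch below does that; all names in it (isStrValidRegex, isStrValidDirect, candidates, disagreements) are made up for illustration.

```fsharp
open System.Text.RegularExpressions

// The compact pattern from above.
let isStrValidRegex (s : string) =
    Regex.IsMatch(s, @"^([ABC])([123])(?!\1\2)[ABC][123]$")

// A direct restatement of the rules from the question, used as a reference.
let isStrValidDirect (s : string) =
    s.Length = 4
    && Seq.contains s.[0] "ABC" && Seq.contains s.[1] "123"
    && Seq.contains s.[2] "ABC" && Seq.contains s.[3] "123"
    && (s.[0] <> s.[2] || s.[1] <> s.[3])

// Every 4-character string over the alphabet A, B, C, 1, 2, 3 (6^4 = 1296 strings).
let candidates =
    let alphabet = "ABC123"
    [ for a in alphabet do
        for b in alphabet do
          for c in alphabet do
            for d in alphabet do
              yield System.String([| a; b; c; d |]) ]

// Strings on which the two implementations give different answers.
let disagreements =
    candidates |> List.filter (fun s -> isStrValidRegex s <> isStrValidDirect s)
```

If the restatement is right, disagreements is empty. The check only covers 4-character inputs over this alphabet, so it is a sanity test rather than a proof.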

I agree with Matthew that there's no reason to use a regex for this. However, it's certainly possible, even without backreferences. First of all, there's a finite set of valid strings, so you could just list them all out (or write a simple function to generate this string):

A1A2|A1A3|A1B1|A1B2|...
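A simple generator for that alternation might look like the sketch below. The names (pairs, validStrings, enumeratedPattern, isStrValidEnumerated) are hypothetical, and wrapping the alternation in ^(?:...)$ is an assumption added here so that Regex.IsMatch tests the whole string rather than any substring.

```fsharp
open System.Text.RegularExpressions

// All letter/digit pairs: A1, A2, A3, B1, ..., C3.
let pairs =
    [ for letter in "ABC" do
        for digit in "123" do
          yield sprintf "%c%c" letter digit ]

// Every valid string is two pairs that are not identical.
let validStrings =
    [ for p in pairs do
        for q in pairs do
          if p <> q then yield p + q ]

// "A1A2|A1A3|A1B1|A1B2|..." wrapped in anchors (the anchoring is an addition).
let enumeratedPattern = "^(?:" + String.concat "|" validStrings + ")$"

let isStrValidEnumerated (s : string) =
    Regex.IsMatch(s, enumeratedPattern)
```

There are 9 pairs, so the list has 9 × 8 = 72 entries.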

But it's also easy to come up with a slightly less absurd approach. You require that the first and third characters differ or the second and fourth ones do (or both), so you can write that out explicitly (use RegexOptions.IgnorePatternWhitespace):

[A-C]1[A-C](2|3) |
[A-C]2[A-C](1|3) |
[A-C]3[A-C](1|2) |
A[1-3](B|C)[1-3] |
B[1-3](A|C)[1-3] |
C[1-3](A|B)[1-3]

Or you can group this a bit differently to factor out some of the repetition:

[A-C](1[A-C](2|3)|
      2[A-C](1|3)|
      3[A-C](1|2))|
(A[1-3](B|C)|
 B[1-3](A|C)|
 C[1-3](A|B))[1-3]
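For completeness, here is a sketch of how the factored pattern might be called from F#. The patterns in this answer carry no anchors, so the sketch assumes you want a whole-string match and wraps the body in ^(?:...)$. The non-capturing group matters: without it, ^ would bind only to the first alternative and $ only to the last. The name isStrValidExplicit is made up.

```fsharp
open System.Text.RegularExpressions

// Hypothetical wrapper around the factored, backreference-free pattern.
let isStrValidExplicit (s : string) =
    let body = @"
        [A-C](1[A-C](2|3)|
              2[A-C](1|3)|
              3[A-C](1|2))|
        (A[1-3](B|C)|
         B[1-3](A|C)|
         C[1-3](A|B))[1-3]"
    // Anchors are an addition here, so that substrings do not count as matches.
    let pattern = "^(?:" + body + ")$"
    Regex.IsMatch(s, pattern, RegexOptions.IgnorePatternWhitespace)
```

As with the other variants, isStrValidExplicit "A3B3" should be true and isStrValidExplicit "A3A3" false.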
