
Conditional .NET regex

I want to write an F#/.NET Boolean function named IsStrValid that uses a regex to decide whether a given string s conforms to the following rules:

  • s is 4 characters long
  • 1st and 3rd characters are either A, B, or C
  • 2nd character is either 1, 2, or 3
  • 4th character is either 1, 2, or 3, except for when the 1st and 3rd characters are the same - then the 4th character cannot be the same as the 2nd one.

Eg:

IsStrValid "A3B3" //true
IsStrValid "A3A3" //false

This is how far I've gotten; stuck at the conditional part ( ??? ):

let IsStrValid (s : string) =   
    Regex.IsMatch(s, @"^([ABC])([123])([ABC])(?(???)[123]|[???])$")

While regular expressions support backreferences, expressing logic like 'if X then Y' with them is pretty difficult. You could do something like this with a negative lookahead assertion, which rejects the string when the second letter-digit pair repeats the first:

open System.Text.RegularExpressions

let IsStrValid (s : string) = 
    Regex.IsMatch(s, @"^([ABC][123])(?!\1)[ABC][123]$")

However, as Mathew suggests, if it gets more complicated than this, it's probably easier to simply test the conditions directly, like this:

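let IsStrValid (s : string) =
    let isLet i = Seq.exists ((=) s.[i]) ['A'; 'B'; 'C']
    let isNum i = Seq.exists ((=) s.[i]) ['1'; '2'; '3']
    // && binds tighter than ||, so the final "pairs differ" test needs its own parentheses;
    // without them a short string would also reach s.[1] and s.[3] and throw.
    (s.Length = 4) && (isLet 0) && (isNum 1) && (isLet 2) && (isNum 3) &&
    ((s.[0] <> s.[2]) || (s.[1] <> s.[3]))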

This sounds like a perfect application for conditionals, but they actually make this job more difficult, not less. It's much easier to do it the old-fashioned way, with lookaheads.

^
([ABC])
([123])
(?:
   (?!\1)[ABC][123]
 |
   \1(?!\2)[123]
)
$

This implements your description almost verbatim. If the third character is one of the allowed letters, but not the same as the first one, grab it and any of the allowed digits. If the third character is the same as the first, grab it followed by one of the allowed digits, unless it's the same as the first digit.
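To run the multi-line pattern above as written, the layout whitespace has to be ignored by the engine. A minimal sketch, assuming RegexOptions.IgnorePatternWhitespace is passed; the names verbosePattern and isStrValidVerbose are only illustrative:

```fsharp
open System.Text.RegularExpressions

// The pattern from above, kept in its spread-out layout inside a verbatim string.
let verbosePattern = @"
^
([ABC])
([123])
(?:
   (?!\1)[ABC][123]
 |
   \1(?!\2)[123]
)
$"

// IgnorePatternWhitespace makes the engine skip the line breaks and indentation.
let isStrValidVerbose (s : string) =
    Regex.IsMatch(s, verbosePattern, RegexOptions.IgnorePatternWhitespace)

printfn "%b" (isStrValidVerbose "A3B3") // true
printfn "%b" (isStrValidVerbose "A3A3") // false
```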

But I think even that's more complex than it needs to be. It seems to me your requirements can be restated as two consecutive pairs of characters that each match the regex [ABC][123], but must not match each other:

^([ABC])([123])(?!\1\2)[ABC][123]$
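One way to gain confidence in this restatement is to compare the compact pattern against the original rules on every 4-character string over a small alphabet. The sketch below assumes the alphabet A, B, C, 1, 2, 3 is enough to exercise the pattern; isValidDirect is a hypothetical reference check written only for this comparison:

```fsharp
open System.Text.RegularExpressions

let compact = Regex(@"^([ABC])([123])(?!\1\2)[ABC][123]$")

let letters = [ 'A'; 'B'; 'C' ]
let digits  = [ '1'; '2'; '3' ]

// Direct statement of the four rules from the question.
let isValidDirect (s : string) =
    s.Length = 4
    && List.contains s.[0] letters
    && List.contains s.[1] digits
    && List.contains s.[2] letters
    && List.contains s.[3] digits
    && (s.[0] <> s.[2] || s.[1] <> s.[3])

// All 6^4 = 1296 strings of length 4 over the combined alphabet.
let alphabet = letters @ digits
let candidates =
    [ for a in alphabet do
        for b in alphabet do
          for c in alphabet do
            for d in alphabet do
              yield sprintf "%c%c%c%c" a b c d ]

// Strings on which the regex and the direct check disagree.
let mismatches =
    candidates |> List.filter (fun s -> compact.IsMatch(s) <> isValidDirect s)

let validCount =
    candidates |> List.filter (fun s -> compact.IsMatch(s)) |> List.length

printfn "%d valid, %d mismatches" validCount (List.length mismatches)
// 72 valid, 0 mismatches
```

The count of 72 is the 81 letter-digit-letter-digit strings minus the 9 in which the second pair repeats the first.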

I agree with Matthew that there's no reason to use a regex for this. However, it's certainly possible, even without backreferences. First of all, there's a finite set of strings, so you could just list them all out (or write a simple function to generate this string):

A1A2|A1A3|A1B1|A1B2|...
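The "simple function to generate this string" could look roughly like the following. This is only a sketch: the names validStrings, enumeratedPattern and isStrValidEnumerated are made up for illustration, and the ^(?: ... )$ wrapper is an added assumption so that IsMatch tests the whole string rather than a substring.

```fsharp
open System.Text.RegularExpressions

let letters = [ "A"; "B"; "C" ]
let digits  = [ "1"; "2"; "3" ]

// Every letter-digit-letter-digit string whose two pairs are not identical.
let validStrings =
    [ for l1 in letters do
        for d1 in digits do
          for l2 in letters do
            for d2 in digits do
              if l1 <> l2 || d1 <> d2 then
                yield l1 + d1 + l2 + d2 ]

// Join the alternatives with | and anchor the whole alternation.
let enumeratedPattern = "^(?:" + String.concat "|" validStrings + ")$"

let isStrValidEnumerated (s : string) = Regex.IsMatch(s, enumeratedPattern)

printfn "%d alternatives" (List.length validStrings) // 72 alternatives
printfn "%b" (isStrValidEnumerated "A3B3")           // true
printfn "%b" (isStrValidEnumerated "A3A3")           // false
```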

But it's also easy to come up with a slightly less absurd approach. You require that the first and third character differ or the second and fourth ones do (or both), so you can write that out explicitly (use RegexOptions.IgnorePatternWhitespace):

[A-C]1[A-C](2|3) |
[A-C]2[A-C](1|3) |
[A-C]3[A-C](1|2) |
A[1-3](B|C)[1-3] |
B[1-3](A|C)[1-3] |
C[1-3](A|B)[1-3]

or you can group this a bit differently to factor out some of the repetition:

[A-C](1[A-C](2|3)|
      2[A-C](1|3)|
      3[A-C](1|2))|
(A[1-3](B|C)|
 B[1-3](A|C)|
 C[1-3](A|B))[1-3]
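Note that these alternations carry no anchors, so passed to Regex.IsMatch as they stand they would also accept a longer string that merely contains a valid substring. A minimal sketch of one way to use the grouped version, assuming it is wrapped in ^(?: ... )$; the names body, anchored and isStrValidFactored are illustrative only:

```fsharp
open System.Text.RegularExpressions

// The grouped alternation from above, unchanged.
let body = @"
[A-C](1[A-C](2|3)|
      2[A-C](1|3)|
      3[A-C](1|2))|
(A[1-3](B|C)|
 B[1-3](A|C)|
 C[1-3](A|B))[1-3]
"

// Wrap the whole alternation in a non-capturing group and anchor both ends.
let anchored = "^(?:" + body + ")$"

let isStrValidFactored (s : string) =
    Regex.IsMatch(s, anchored, RegexOptions.IgnorePatternWhitespace)

printfn "%b" (isStrValidFactored "A3B3") // true
printfn "%b" (isStrValidFactored "A3A3") // false

// Without the anchors, a valid substring is enough for a match.
printfn "%b" (Regex.IsMatch("xA1B2x", body, RegexOptions.IgnorePatternWhitespace)) // true
printfn "%b" (isStrValidFactored "xA1B2x")                                          // false
```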
