
Regex to match capital letters from the end of a string with a twist

Given a string such as "fooBAR", I wish to match the capital letters at the end of the string (i.e. "BAR"), with the following twists:

  1. the match must contain at least two letters
  2. the match must not contain any of the following: 1D, 2D, 3D, 4D

Examples:

"fooB" -> ""
"fooBAR" -> "BAR"
"foo64BAR" -> "BAR"
"foo64BR" -> "BR"
"fooDBAR" -> "DBAR"
"foo12BAR" -> "BAR"
"foo1DBAR" -> "BAR"

The trivial regex

[A-Z][A-Z]+

fails the last example (it returns "DBAR" instead of "BAR").

A negative lookbehind such as

(?<![1-4D])[A-Z][A-Z]+

also fails the last example (it returns "AR" instead of "BAR").
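To make the two failures concrete, here is a small check of both attempts. It is only a sketch: it assumes Python's `re` module (the patterns use nothing that differs between Python and .NET here), and the helper name `first_match` is made up for the illustration.

```python
import re

# The two attempts from the question, tried on the problematic input.
TRIVIAL = re.compile(r"[A-Z][A-Z]+")
LOOKBEHIND = re.compile(r"(?<![1-4D])[A-Z][A-Z]+")

def first_match(pattern, text):
    """Return the first match as a string, or "" when nothing matches."""
    m = pattern.search(text)
    return m.group(0) if m else ""

print(first_match(TRIVIAL, "foo1DBAR"))     # DBAR (wanted: BAR)
print(first_match(LOOKBEHIND, "foo1DBAR"))  # AR   (wanted: BAR)
```

The second result shows the problem with a single-character lookbehind: "B" is rejected because it follows "D", even though that "D" is the one that should be skipped.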

I have a feeling that this should be straightforward, but for the life of me I cannot find the solution. Any ideas?

I'd just go with some simple patterns combined with a bit of code, along these lines:

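using System.Text.RegularExpressions;

string GetMatch(string input)
{
    // Capture the trailing run of capitals, plus the digit before it (if any).
    var match = Regex.Match(input, @"\d?([A-Z]{2,})$");
    var letters = match.Groups[1].Value;  // empty string when nothing matched

    // If the run starts with 1D, 2D, 3D or 4D, drop the leading D.
    if (Regex.IsMatch(match.Value, @"^[1-4]D"))
        letters = letters.Substring(1);

    // Rule 1: the result must still contain at least two letters ("foo1DB" -> "").
    return letters.Length >= 2 ? letters : "";
}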

The negative lookbehind does not work because it only requires that the character before the match is something other than 1, 2, 3, 4, or D. In "foo1DBAR" the first position that satisfies this is the "A" (it is preceded by "B"), so the match starts there and you get "AR".

You need the match to start with either a D that is not preceded by a digit from 1 to 4, or any capital letter other than D. In both cases it must be followed by at least one more uppercase letter, up to the end of the string.

((?<![1-4])D|[ABCE-Z])[A-Z]+$
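As a sanity check, the pattern above can be run against the seven examples from the question. This is a sketch in Python's `re` (my choice, not something the answer specifies); `trailing_capitals` and `EXAMPLES` are names invented for the test.

```python
import re

# Either a D not preceded by 1-4, or any capital other than D,
# followed by one or more capitals up to the end of the string.
PATTERN = re.compile(r"((?<![1-4])D|[ABCE-Z])[A-Z]+$")

# Expected results, copied from the question.
EXAMPLES = {
    "fooB": "",
    "fooBAR": "BAR",
    "foo64BAR": "BAR",
    "foo64BR": "BR",
    "fooDBAR": "DBAR",
    "foo12BAR": "BAR",
    "foo1DBAR": "BAR",
}

def trailing_capitals(text):
    """Return the matched capitals, or "" when the pattern finds nothing."""
    m = PATTERN.search(text)
    return m.group(0) if m else ""

for text, expected in EXAMPLES.items():
    print(text, "->", repr(trailing_capitals(text)), "expected", repr(expected))
```

All seven inputs should produce the expected value, including the "foo1DBAR" case that defeated the two attempts in the question.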

Use the RegexOptions.RightToLeft option so that matching works backwards from the end of the string:

(?<![1-4])D?[A-Z-[D]]{2,}

The following should work:

(?!(?<=[1-4])D)[A-Z]{2,}$

The primary regex here is [A-Z]{2,}$, which will match two or more uppercase characters at the end of a string. The negative lookahead (?!(?<=[1-4])D) covers your other requirement. It can be read as "fail if the previous character is a digit from 1 to 4 and the next character is a D".

If you want to match at the end of a line instead of the end of a string, use RegexOptions.Multiline.

Example: http://rubular.com/r/XgKv9pavJd
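A rough way to see both the guard and the end-of-line behaviour is the sketch below. It assumes Python's `re`, where `re.MULTILINE` plays the role that RegexOptions.Multiline plays in .NET; `end_matches` and its `per_line` parameter are hypothetical names for this illustration.

```python
import re

# The guarded pattern from this answer.
GUARDED = r"(?!(?<=[1-4])D)[A-Z]{2,}$"

def end_matches(text, per_line=False):
    """List every match; with per_line=True, $ also matches at each line end."""
    flags = re.MULTILINE if per_line else 0
    return [m.group(0) for m in re.finditer(GUARDED, text, flags)]

print(end_matches("foo1DBAR"))                         # ['BAR']
print(end_matches("fooDBAR"))                          # ['DBAR']
print(end_matches("foo1DBAR\nfooBR"))                  # ['BR']
print(end_matches("foo1DBAR\nfooBR", per_line=True))   # ['BAR', 'BR']
```

Without the multiline flag only the capitals at the very end of the input are found; with it, each line contributes its own match.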

FJ's regex pattern

(?!(?<=[1-4])D)[A-Z]{2,}$

is not correct for all possible inputs, such as

fooBar1DDBAR
fooBar1DDB

A small correction should do the trick:

(?<![A-Z])(?!(?<=[1-4])D)[A-Z]{2,}$

See the difference at http://rubular.com/r/dGJWj7lE79 (FJ) versus http://rubular.com/r/mOux7d4zv3 (corrected).
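For anyone who wants to compare the two patterns without the Rubular links, here is a minimal side-by-side sketch. It assumes Python's `re`; the names `ORIGINAL`, `STRICTER` and `found` are invented for the comparison.

```python
import re

ORIGINAL = re.compile(r"(?!(?<=[1-4])D)[A-Z]{2,}$")
# Same pattern, but the match may not start in the middle of a run of capitals.
STRICTER = re.compile(r"(?<![A-Z])(?!(?<=[1-4])D)[A-Z]{2,}$")

def found(pattern, text):
    """Return the match as a string, or "" when nothing matches."""
    m = pattern.search(text)
    return m.group(0) if m else ""

for text in ["fooBar1DDBAR", "fooBar1DDB", "fooDBAR", "foo1DBAR"]:
    print(text, repr(found(ORIGINAL, text)), repr(found(STRICTER, text)))
# fooBar1DDBAR 'DBAR' ''
# fooBar1DDB 'DB' ''
# fooDBAR 'DBAR' 'DBAR'
# foo1DBAR 'BAR' ''
```

Note the last row: because the stricter pattern refuses to start inside a run of capitals, it returns nothing for "foo1DBAR", where the question asks for "BAR". Which of the two is "correct" therefore depends on whether a trailing run that begins with 1D to 4D should be trimmed or rejected outright.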
