
C# Matching letters, numbers and special characters in regex

I have the following regex:

[A-Z]{2}[0-9]{4}

and it matches a string like AB1234 perfectly. But I have to improve this regex so that it follows these specific rules:

  • The string must have exactly two sharps (##) between each group (AB1234##AB1234)
  • It may have up to 8 groups (AB1234##AB1234##AB1234##AB1234##AB1234##AB1234##AB1234##AB1234)
  • Regardless of the number of groups, the last one cannot have the sharps (##) at the end. So, if I have 3 groups, it will look like this: AB1234##AB1234##AB1234

If I use the sample string from the second bullet point, my regex finds matches in it, but it doesn't validate the characters between the groups.

Can anyone help me to improve this regex?

Try this:

^([A-Z]{2}[0-9]{4}##){0,7}([A-Z]{2}[0-9]{4})$
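
A minimal sketch of how the anchored pattern could be used from C#. The class name GroupListValidator and the method IsValid are hypothetical names chosen for illustration and are not part of the original answer:

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, named for illustration only.
public static class GroupListValidator
{
    // Zero to seven "group + ##" prefixes, then one final group with no trailing "##".
    private static readonly Regex Pattern =
        new Regex(@"^([A-Z]{2}[0-9]{4}##){0,7}([A-Z]{2}[0-9]{4})$");

    public static bool IsValid(string input)
    {
        // Treat null as invalid instead of letting IsMatch throw.
        return input != null && Pattern.IsMatch(input);
    }
}
```

With this sketch, GroupListValidator.IsValid("AB1234##AB1234") returns true and GroupListValidator.IsValid("AB1234##AB1234##") returns false, because the final group may not be followed by ##. Note that in .NET, $ also matches just before a trailing newline; if that matters, \z can be used in place of $.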

You can combine Regex and LINQ, then use an extension method like this:

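using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class StringExtensions
{
    public static bool Validate(this string source)
    {
        // Anchored, so each piece must be exactly two capitals followed by four digits.
        string pattern = "^[A-Z]{2}[0-9]{4}$";
        // Empty pieces are kept, so a leading, trailing, or doubled "##" fails the check.
        return source.Split(new[] { "##" }, StringSplitOptions.None)
                     .All(x => Regex.IsMatch(x, pattern));
    }
}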

Usage:

bool t1 = "AB1234##AB1234".Validate(); // true
bool t2 = "AB1234##AB1234##AB1234".Validate(); // true
bool t3 = "AB1234##AB1234##".Validate(); // false
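
The question allows at most eight groups, which a split-based check does not enforce by itself. A hedged sketch of one way to add that limit follows; the class name GroupLimitExtensions, the method name ValidateWithLimit, and the maxGroups parameter are assumptions made for illustration:

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical variant, named for illustration only.
public static class GroupLimitExtensions
{
    // Split-based check plus an upper bound on the number of groups.
    public static bool ValidateWithLimit(this string source, int maxGroups = 8)
    {
        if (source == null) return false;

        // StringSplitOptions.None keeps empty pieces, so "##" at either end
        // or a doubled "####" produces an empty piece that fails the regex below.
        string[] groups = source.Split(new[] { "##" }, StringSplitOptions.None);

        return groups.Length <= maxGroups &&
               groups.All(g => Regex.IsMatch(g, "^[A-Z]{2}[0-9]{4}$"));
    }
}
```

Under these assumptions, a string of eight groups joined by ## passes and a string of nine groups fails.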

^(?:[A-Z]{2}[0-9]{4})(?:##(?:[A-Z]{2}[0-9]{4})){0,7}$
 ^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^ ^^^^^
         (1)                    (2)             (3)
  1. Match one group: two capital letters followed by 4 decimal digits.
  2. Optionally follow that with ## and a repetition of the first pattern.
  3. Allow part (2) to occur from zero to 7 times.

Result: (* indicates a match)

* AB1234
  AB1234x
* AB1234##AB1234
* AB1234##AB1234##AB1234
  AB1234##AB1234##AB1234x

See the live demo.
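
As a rough sketch, the result list above can be reproduced in C# as follows. The class name ResultTable and the method Mark are hypothetical and exist only for this illustration:

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, named for illustration only.
public static class ResultTable
{
    private const string Pattern =
        @"^(?:[A-Z]{2}[0-9]{4})(?:##(?:[A-Z]{2}[0-9]{4})){0,7}$";

    // Returns one line per sample, prefixed with "* " when the sample matches
    // and with two spaces when it does not.
    public static string[] Mark(params string[] samples)
    {
        var lines = new string[samples.Length];
        for (int i = 0; i < samples.Length; i++)
        {
            bool ok = Regex.IsMatch(samples[i], Pattern);
            lines[i] = (ok ? "* " : "  ") + samples[i];
        }
        return lines;
    }
}
```

Calling ResultTable.Mark with the five sample strings and printing each returned line with Console.WriteLine gives the same marking as the list above.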

Note: This answer is quite similar to the other regex answer on this page. However, the answer here begins with the assumption that at least one sequence of AB1234 is present, and then allows for it to be followed zero to 7 times by ##AB1234. In the end, both regular expressions are fine. It comes down to personal preference.

Also note that I used non-capturing groups (?:...) to avoid the unnecessary overhead of creating capture groups that aren't needed in this situation. (Capture groups are what back-references such as \1 refer to.)
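
To make the difference concrete, here is a small sketch that counts the groups the .NET engine tracks for each pattern. The class name GroupCountDemo and the method CountGroups are hypothetical; Regex.Match, Match.Groups, and RegexOptions.ExplicitCapture are standard .NET members:

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, named for illustration only.
public static class GroupCountDemo
{
    public const string Capturing =
        @"^([A-Z]{2}[0-9]{4}##){0,7}([A-Z]{2}[0-9]{4})$";
    public const string NonCapturing =
        @"^(?:[A-Z]{2}[0-9]{4})(?:##(?:[A-Z]{2}[0-9]{4})){0,7}$";

    // Number of groups reported for a match, including group 0 (the whole match).
    public static int CountGroups(string pattern, string input,
                                  RegexOptions options = RegexOptions.None)
    {
        return Regex.Match(input, pattern, options).Groups.Count;
    }
}
```

For the input AB1234##AB1234, the capturing pattern reports 3 groups (the whole match plus two numbered groups), while the non-capturing pattern reports 1. Passing RegexOptions.ExplicitCapture with the capturing pattern also reports 1, since that option makes unnamed parentheses non-capturing.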


 