
Regex to match exactly n occurrences of letters and m occurrences of digits

I have to match an 8-character string that must contain exactly 2 letters (1 uppercase and 1 lowercase) and exactly 6 digits, but they can be permuted arbitrarily.

So, basically:

  • K82v6686 would pass
  • 3w28E020 would pass
  • 1276eQ900 would fail (too long)
  • 98Y78k9k would fail (three letters)
  • A09B2197 would fail (two capital letters)

I've tried using positive lookaheads to make sure that the string contains digits, uppercase and lowercase letters, but I have trouble limiting each to a certain number of occurrences. I suppose I could go about it by listing all possible combinations of where the letters and digits can occur:

(?=.*[0-9])(?=.*[A-Z])(?=.*[a-z]) ([A-Z][a-z][0-9]{6})|([A-Z][0-9][a-z][0-9]{5})| ... | ([0-9]{6}[a-z][A-Z])

But that's a very roundabout way of doing it, and I'm wondering if there's a better solution.

You can use

^(?=[^A-Z]*[A-Z][^A-Z]*$)(?=[^a-z]*[a-z][^a-z]*$)(?=(?:\D*\d){6}\D*$)[a-zA-Z0-9]{8}$

See the regex demo (a bit modified due to the multiline input). In a Java string literal, do not forget to use double backslashes (e.g. \\d to match a digit).

Here is a breakdown:

  • ^ - start of string (assuming the multiline flag is not used)
  • (?=[^A-Z]*[A-Z][^A-Z]*$) - checks that there is exactly 1 uppercase letter (use \p{Lu} to match any Unicode uppercase letter and \P{Lu} to match any character other than that)
  • (?=[^a-z]*[a-z][^a-z]*$) - a similar check that there is exactly 1 lowercase letter (alternatively, use \p{Ll} and \P{Ll} to match Unicode letters)
  • (?=(?:\D*\d){6}\D*$) - checks that there are exactly six digits in the string: starting from the beginning, 0 or more non-digit characters (\D matches any character but a digit; you may also replace it with [^0-9]) followed by a digit (\d), repeated six times, and then 0 or more non-digit characters (\D*) up to the end of the string ($)
  • [a-zA-Z0-9]{8} - matches exactly 8 alphanumeric characters
  • $ - end of string
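
As a quick check of the breakdown above, here is a minimal Java sketch that runs the full pattern against the five sample strings from the question. The class name ExactCountCheck and the method isValid are illustrative names, not part of the original answer.

```java
import java.util.regex.Pattern;

class ExactCountCheck {
    // Full pattern from the answer; backslashes are doubled for the Java string literal.
    private static final Pattern FULL = Pattern.compile(
        "^(?=[^A-Z]*[A-Z][^A-Z]*$)"   // exactly 1 uppercase letter
      + "(?=[^a-z]*[a-z][^a-z]*$)"    // exactly 1 lowercase letter
      + "(?=(?:\\D*\\d){6}\\D*$)"     // exactly 6 digits
      + "[a-zA-Z0-9]{8}$");           // 8 alphanumeric characters in total

    static boolean isValid(String s) {
        return FULL.matcher(s).matches();
    }

    public static void main(String[] args) {
        // Sample strings taken from the question.
        String[] samples = {"K82v6686", "3w28E020", "1276eQ900", "98Y78k9k", "A09B2197"};
        for (String s : samples) {
            System.out.println(s + " -> " + isValid(s));
        }
    }
}
```

With these inputs, only the first two samples should be reported as valid.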

Following the logic, we can even reduce this to just

^(?=[^a-z]*[a-z][^a-z]*$)(?=(?:\D*\d){6}\D*$)[a-zA-Z0-9]{8}$

One condition can be removed because [a-zA-Z0-9] only allows lowercase letters, uppercase letters and digits: once the string has exactly 8 such characters, exactly 6 digits and exactly 1 lowercase letter, the one remaining character must be an uppercase letter, so the third condition is enforced automatically.
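
To illustrate that dropping the uppercase lookahead does not change the result, the following sketch compares the full and reduced patterns on the question's samples plus a few extra inputs made up for this example. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

class ReducedPatternCheck {
    static final Pattern FULL = Pattern.compile(
        "^(?=[^A-Z]*[A-Z][^A-Z]*$)(?=[^a-z]*[a-z][^a-z]*$)(?=(?:\\D*\\d){6}\\D*$)[a-zA-Z0-9]{8}$");

    // Same pattern without the uppercase lookahead.
    static final Pattern REDUCED = Pattern.compile(
        "^(?=[^a-z]*[a-z][^a-z]*$)(?=(?:\\D*\\d){6}\\D*$)[a-zA-Z0-9]{8}$");

    // True when both patterns give the same verdict for s.
    static boolean agree(String s) {
        return FULL.matcher(s).matches() == REDUCED.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] inputs = {
            "K82v6686", "3w28E020", "1276eQ900", "98Y78k9k", "A09B2197", // from the question
            "12345678", "ab123456", "aB123456", "a_123456"               // extra, made-up inputs
        };
        for (String s : inputs) {
            System.out.println(s + " reduced=" + REDUCED.matcher(s).matches()
                + " agree=" + agree(s));
        }
    }
}
```

This is only a spot check on a handful of strings, not a proof; the counting argument in the paragraph above is what actually justifies the reduction.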

When using it with Java's matches() method, there is no need for the ^ and $ anchors at the start and end of the pattern, since matches() requires the whole string to match, but you still need $ inside the lookaheads:

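String s = "K82v6686";
String rx = "(?=[^a-z]*[a-z][^a-z]*$)" +      // 1 lowercase letter check
            "(?=(?:\\D*\\d){6}\\D*$)" +       // 6 digits check
            "[a-zA-Z0-9]{8}";                 // matching 8 alphanum chars exactly
if (s.matches(rx)) {
    System.out.println("Valid");
}

// Alternative from another answer; requires: import java.util.regex.Pattern;
boolean valid = Pattern.matches(".*[A-Z].*", s) &&           // at least 1 uppercase letter
                Pattern.matches(".*[a-z].*", s) &&           // at least 1 lowercase letter
                Pattern.matches(".*(\\D*\\d){6}.*", s) &&    // at least 6 digits
                Pattern.matches(".{8}", s);                  // exactly 8 characters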

As we need an alternating automaton to be created for this task, it's much simpler to use a conjunction of regexps for the constituent character types.

We require the string to have at least one lowercase letter, at least one uppercase letter and at least 6 digits; these three classes are mutually exclusive. With the last condition we require the length of the string to be exactly the sum of these counts, which leaves no room for extra characters beyond the desired types. Of course we could use s.length() == 8 as the last term of the condition, but this would break the style :).
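
The conjunction can be wrapped in a small helper. The sketch below uses s.length() == 8 as the final term, which the answer mentions as an option; the class name ConjunctionCheck and the method isValid are hypothetical.

```java
import java.util.regex.Pattern;

class ConjunctionCheck {
    static boolean isValid(String s) {
        return Pattern.matches(".*[A-Z].*", s)           // at least one uppercase letter
            && Pattern.matches(".*[a-z].*", s)           // at least one lowercase letter
            && Pattern.matches(".*(\\D*\\d){6}.*", s)    // at least six digits
            && s.length() == 8;                          // 1 + 1 + 6 leaves no room for anything else
    }

    public static void main(String[] args) {
        // Sample strings taken from the question.
        String[] samples = {"K82v6686", "3w28E020", "1276eQ900", "98Y78k9k", "A09B2197"};
        for (String s : samples) {
            System.out.println(s + " -> " + isValid(s));
        }
    }
}
```

Each "at least" check on its own is loose; it is the length term that turns the three lower bounds into exact counts.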

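Sort the string's characters lexically, then match against ^[0-9]{6}[A-Z][a-z]$ (in ASCII order digits sort before uppercase letters, which sort before lowercase letters).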
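
A minimal sketch of the sort-then-match idea, assuming a plain code-unit sort such as Arrays.sort on a char[]. Under that ordering digits come first, then uppercase letters, then lowercase letters, so the sorted string is matched against [0-9]{6}[A-Z][a-z]. The class name SortedCheck and the method isValid are hypothetical.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class SortedCheck {
    // After sorting by code unit: six digits, then one uppercase, then one lowercase letter.
    private static final Pattern SORTED = Pattern.compile("[0-9]{6}[A-Z][a-z]");

    static boolean isValid(String s) {
        char[] chars = s.toCharArray();
        Arrays.sort(chars);                       // orders by UTF-16 code unit value
        return SORTED.matcher(new String(chars)).matches();
    }

    public static void main(String[] args) {
        // Sample strings taken from the question.
        String[] samples = {"K82v6686", "3w28E020", "1276eQ900", "98Y78k9k", "A09B2197"};
        for (String s : samples) {
            System.out.println(s + " -> " + isValid(s));
        }
    }
}
```

For example, "K82v6686" sorts to "266688Kv", which the pattern accepts. A locale-aware collation could order the characters differently, in which case the pattern would need to be adjusted.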
