Regex to match exactly n occurrences of letters and m occurrences of digits
I have to match an 8-character string that must contain exactly 2 letters (1 uppercase and 1 lowercase) and exactly 6 digits, but they can be permuted arbitrarily.
So, basically: 8 characters in total, made up of 6 digits, 1 uppercase letter, and 1 lowercase letter, in any order.
I've tried using positive lookaheads to make sure that the string contains digits, uppercase letters, and lowercase letters, but I have trouble limiting each of them to a certain number of occurrences. I suppose I could go about it by listing every possible arrangement of the letters and digits:
(?=.*[0-9])(?=.*[A-Z])(?=.*[a-z]) ([A-Z][a-z][0-9]{6})|([A-Z][0-9][a-z][0-9]{5})| ... | ([0-9]{6}[a-z][A-Z])
But that's a very roundabout way of doing it, and I'm wondering if there's a better solution.
You can use
^(?=[^A-Z]*[A-Z][^A-Z]*$)(?=[^a-z]*[a-z][^a-z]*$)(?=(?:\D*\d){6}\D*$)[a-zA-Z0-9]{8}$
See the regex demo (slightly modified because of the multiline input). In Java, do not forget to double the backslashes in the string literal (e.g. \\d to match a digit).
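As a minimal sketch of the Java note above, the three-lookahead pattern can be written as a string literal with each regex backslash doubled. The class and method names (`FullPatternDemo`, `isValid`) are made up for illustration, and `find()` is used here so that the outer `^` and `$` anchors actually matter.

```java
import java.util.regex.Pattern;

final class FullPatternDemo {
    // The pattern from the answer as a Java string literal:
    // \D and \d become \\D and \\d.
    static final Pattern FULL = Pattern.compile(
        "^(?=[^A-Z]*[A-Z][^A-Z]*$)"      // exactly 1 uppercase letter
        + "(?=[^a-z]*[a-z][^a-z]*$)"     // exactly 1 lowercase letter
        + "(?=(?:\\D*\\d){6}\\D*$)"      // exactly 6 digits
        + "[a-zA-Z0-9]{8}$");            // 8 alphanumeric characters

    // Hypothetical helper name, not part of the original answer.
    static boolean isValid(String s) {
        return FULL.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(isValid("K82v6686")); // true
        System.out.println(isValid("KK2v6686")); // false: two uppercase letters
    }
}
```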
Here is a breakdown:
^ - start of string (assuming the multiline flag is not used)
(?=[^A-Z]*[A-Z][^A-Z]*$) - checks that there is exactly 1 uppercase letter (use \p{Lu} to match any Unicode uppercase letter and \P{Lu} to match any character other than that)
(?=[^a-z]*[a-z][^a-z]*$) - a similar check that there is exactly 1 lowercase letter (alternatively, use \p{Ll} and \P{Ll} to match Unicode letters)
(?=(?:\D*\d){6}\D*$) - checks that there are exactly six digits in the string: starting from the beginning of the string, zero or more non-digit characters (\D matches any character but a digit; you may also replace it with [^0-9]) followed by a digit (\d), repeated six times, and then zero or more non-digit characters (\D*) up to the end of the string ($)
[a-zA-Z0-9]{8} - matches exactly 8 alphanumeric characters
$ - end of string
Following the logic, we can even reduce this to just
^(?=[^a-z]*[a-z][^a-z]*$)(?=(?:\D*\d){6}\D*$)[a-zA-Z0-9]{8}$
One condition can be removed because [a-zA-Z0-9] only allows lowercase letters, uppercase letters, and digits. Once the string is known to consist of 8 such characters with exactly 1 lowercase letter and exactly 6 digits, the third condition is satisfied automatically: the one remaining character must be an uppercase letter.
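To make the reduction argument concrete, here is a small sketch that runs both patterns over a handful of inputs and checks that they agree. Apart from "K82v6686", the sample strings are made up for illustration, and the class name is hypothetical; agreement on a few samples illustrates the counting argument but does not prove it.

```java
import java.util.regex.Pattern;

final class ReducedPatternCheck {
    static final Pattern FULL = Pattern.compile(
        "^(?=[^A-Z]*[A-Z][^A-Z]*$)(?=[^a-z]*[a-z][^a-z]*$)"
        + "(?=(?:\\D*\\d){6}\\D*$)[a-zA-Z0-9]{8}$");
    static final Pattern REDUCED = Pattern.compile(
        "^(?=[^a-z]*[a-z][^a-z]*$)(?=(?:\\D*\\d){6}\\D*$)[a-zA-Z0-9]{8}$");

    // Illustrative inputs: valid strings, wrong letter mix, wrong digit count,
    // wrong length, and a character outside [a-zA-Z0-9].
    static final String[] SAMPLES = {
        "K82v6686", "123456aB", "aB123456", "ab123456", "AB123456",
        "1234567a", "K82v668", "K82v66860", "_82v6686"
    };

    // Returns true if both patterns give the same verdict on every sample.
    static boolean agreeOnSamples() {
        for (String s : SAMPLES) {
            if (FULL.matcher(s).find() != REDUCED.matcher(s).find()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(agreeOnSamples()); // true
    }
}
```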
When using it with Java's matches() method, there is no need for the ^ and $ anchors at the start and end of the pattern, because matches() already requires the whole string to match, but you still need $ inside the lookaheads:
String s = "K82v6686";
String rx = "(?=[^a-z]*[a-z][^a-z]*$)" +  // 1 lowercase letter check
            "(?=(?:\\D*\\d){6}\\D*$)" +   // 6 digits check
            "[a-zA-Z0-9]{8}";             // matching 8 alphanum chars exactly
if (s.matches(rx)) {
    System.out.println("Valid");
}
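The following sketch shows why the `$` inside the lookaheads cannot be dropped even with `matches()`. The `UNANCHORED` variant is a deliberately broken pattern written only for comparison; it is not from the answer, and the class name is hypothetical.

```java
final class LookaheadAnchorDemo {
    // Pattern from the answer: each lookahead ends in $.
    static final String ANCHORED =
        "(?=[^a-z]*[a-z][^a-z]*$)(?=(?:\\D*\\d){6}\\D*$)[a-zA-Z0-9]{8}";

    // Broken variant for comparison only: $ removed from both lookaheads,
    // so they only confirm "at least one lowercase" and "at least six digits".
    static final String UNANCHORED =
        "(?=[^a-z]*[a-z][^a-z]*)(?=(?:\\D*\\d){6}\\D*)[a-zA-Z0-9]{8}";

    public static void main(String[] args) {
        String twoLowercase = "ab123456"; // two lowercase letters, no uppercase
        System.out.println(twoLowercase.matches(ANCHORED));   // false
        System.out.println(twoLowercase.matches(UNANCHORED)); // true (wrongly accepted)
    }
}
```

Without the `$`, a lookahead is satisfied as soon as some prefix of the string fits, so it can no longer put an upper bound on the count.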
Pattern.matches(".*[A-Z].*", s) &&
Pattern.matches(".*[a-z].*", s) &&
Pattern.matches(".*(\\D*\\d){6}.*", s) &&
Pattern.matches(".{8}", s)
Since a single pattern for this task would require an alternating automaton, it's much simpler to use a conjunction of regexps, one for each constituent type of character.
We require the string to have at least one lowercase letter, at least one uppercase letter, and at least 6 digits; these three classes are mutually exclusive. With the last condition we require the length of the string to be exactly the sum of these counts, which leaves no room for extra characters beyond the desired types. Of course, we could write s.length() == 8 as the last term of the condition, but this would break the style :).
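For reference, here is the conjunction wrapped in a method, using the `s.length() == 8` form of the last term. This is a sketch only: the class and method names are hypothetical, and `Pattern.matches` recompiles each pattern on every call, which is fine for illustration but not for hot paths.

```java
import java.util.regex.Pattern;

final class ConjunctionCheck {
    // Each term is an "at least" check; the length check turns them into "exactly",
    // because 1 + 1 + 6 already accounts for all 8 characters.
    static boolean isValid(String s) {
        return s.length() == 8
            && Pattern.matches(".*[A-Z].*", s)          // at least 1 uppercase letter
            && Pattern.matches(".*[a-z].*", s)          // at least 1 lowercase letter
            && Pattern.matches(".*(\\D*\\d){6}.*", s);  // at least 6 digits
    }

    public static void main(String[] args) {
        System.out.println(isValid("K82v6686")); // true
        System.out.println(isValid("ab123456")); // false: no uppercase letter
    }
}
```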
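Sort the string lexically, then match ^(?:[a-z][A-Z]|[A-Z][a-z])[0-9]{6}$. This assumes a sort order that places letters before digits; under plain ASCII ordering the digits come first, so the pattern would instead be ^[0-9]{6}[A-Z][a-z]$.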
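A minimal sketch of the sort-then-match idea in Java, assuming `Arrays.sort` on a `char[]`, which orders by code unit ('0'-'9' before 'A'-'Z' before 'a'-'z'). Because of that ordering, the pattern used here is the digits-first form rather than the letters-first one; the class and method names are hypothetical.

```java
import java.util.Arrays;

final class SortThenMatch {
    static boolean isValid(String s) {
        char[] chars = s.toCharArray();
        Arrays.sort(chars); // code-unit order: digits, then uppercase, then lowercase
        // After sorting, a valid string is always 6 digits, 1 uppercase, 1 lowercase.
        return new String(chars).matches("[0-9]{6}[A-Z][a-z]");
    }

    public static void main(String[] args) {
        System.out.println(isValid("K82v6686")); // true: sorts to "266688Kv"
        System.out.println(isValid("ab123456")); // false: sorts to "123456ab"
    }
}
```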