
Regex to validate 3 repeating characters

I'm trying to validate a password, which should not allow any character to repeat more than 3 times, regardless of where the occurrences are in the string.

For example:

121121 - not accepted, since 1 appears more than 3 times.

121212 - accepted, since 1 and 2 each appear only 3 times.

I tried this:

([0-9])\1{2,}

But it only catches consecutive repeated digits.

I don't recommend using regular expressions for something like this, as it would be easier to just collect the password into a Map that maintains a count of each character. Then you can check whether any character has a count of more than 3:

password.chars()
        .boxed()
        .collect(Collectors.groupingBy(i -> i, Collectors.counting()))
        .values()
        .stream()
        .anyMatch(i -> i > 3);

This returns true if some character in password appears more than 3 times, and false otherwise.

Use a regex with a negative lookahead containing a back-reference:
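For reference, the stream expression above can be wrapped into a self-contained program. This is only a sketch: the class name `CharCountCheck`, the method name `exceedsLimit`, and the `maxCount` parameter are hypothetical names introduced for illustration.

```java
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical wrapper around the stream expression shown above.
class CharCountCheck {
    // Returns true if any character occurs more than maxCount times in password.
    static boolean exceedsLimit(String password, int maxCount) {
        return password.chars()                       // IntStream of char values
                .boxed()                              // Stream<Integer>
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()))
                .values()                             // occurrence count per character
                .stream()
                .anyMatch(count -> count > maxCount);
    }

    public static void main(String[] args) {
        System.out.println(exceedsLimit("121121", 3)); // true: '1' occurs 4 times
        System.out.println(exceedsLimit("121212", 3)); // false: '1' and '2' occur 3 times each
    }
}
```

Note that a true result here means the password should be rejected.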

boolean ok = str.matches("((.)(?!(.*\\2){3}))+");

See the live demo.

In English, this regex says "every character must not appear 3 more times after itself".

The regex solution for this is very inefficient. Please treat this answer as being of purely academic interest.

The pattern that fails strings having 4 or more occurrences of the same char is
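A quick way to try the pattern on the strings from the question is sketched below. The class name `LookaheadCheck` and the method name `isAcceptable` are made up for this example. Because the outer group uses `+`, an empty string is rejected as well.

```java
// Hypothetical harness for the lookahead pattern shown above.
class LookaheadCheck {
    // Group 2 captures the current character; the negative lookahead fails
    // if that character occurs three more times later in the string.
    static boolean isAcceptable(String s) {
        return s.matches("((.)(?!(.*\\2){3}))+");
    }

    public static void main(String[] args) {
        System.out.println(isAcceptable("121212")); // true: no character occurs more than 3 times
        System.out.println(isAcceptable("121121")); // false: '1' occurs 4 times
        System.out.println(isAcceptable(""));       // false: '+' needs at least one character
    }
}
```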

^(?!.*(.).*\1.*\1.*\1).*

The last .* may be replaced with a more restrictive pattern if you need to make the pattern more precise.

See the regex demo.

The main part here is the (?!.*(.).*\1.*\1.*\1) negative lookahead. It matches any 0+ chars (if Pattern.DOTALL is used, any char including newlines), as many as possible, then it matches and captures (with (.) ) any char into Group 1, and then matches, three times over, any 0+ chars followed by that same char. If the pattern is found (matched), the whole string match fails.

Why is it inefficient? The pattern relies heavily on backtracking. .* grabs all chars to the end of the string, then the engine backtracks, trying to accommodate some text for the subsequent subpatterns. You may see the backtracking steps here. The more .* there are, the more resource-consuming the pattern is.

Why is the lazy variant not any better? The ^(?!.*?(.).*?\1.*?\1.*?\1).* pattern looks faster with some strings, and it will be faster if the repeating chars appear close to each other and to the start of the string. If they are at the end of the string, the efficiency will degrade. So, if the previous regex matches 121212 in 77 steps, the current one will take the same number of steps. However, if you test it against 1212124444, you will see that the lazy variant fails after 139 steps, while the greedy variant fails after 58 steps. And vice versa, 4444121212 makes the lazy regex fail quicker: 14 steps vs. 211 steps with the greedy variant.

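In Java, you may use it with matches(). Since matches() requires the whole string to match, the ^ anchor can be dropped, but the trailing .* must stay (a lookahead alone consumes nothing, so it would only ever match an empty string):

s.matches("(?!.*(.).*\\1.*\\1.*\\1).*")

or

s.matches("(?!.*?(.).*?\\1.*?\\1.*?\\1).*")

Use Jacob's solution in production.

You can use a map instead: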
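A small sketch that runs both variants against the strings discussed above. It keeps the trailing `.*` from the original pattern so that `matches()` can consume the whole input. The class name `LookaheadVariants`, the constant names, and `acceptedBy` are hypothetical. Both variants should always agree on the result; they differ only in how much backtracking they do, which this sketch does not measure.

```java
// Hypothetical comparison harness for the greedy and lazy lookahead patterns.
class LookaheadVariants {
    static final String GREEDY = "(?!.*(.).*\\1.*\\1.*\\1).*";
    static final String LAZY = "(?!.*?(.).*?\\1.*?\\1.*?\\1).*";

    // True if the whole string matches, i.e. no char occurs 4 or more times.
    static boolean acceptedBy(String regex, String s) {
        return s.matches(regex);
    }

    public static void main(String[] args) {
        String[] samples = {"121212", "121121", "1212124444", "4444121212"};
        for (String s : samples) {
            System.out.println(s + " greedy=" + acceptedBy(GREEDY, s)
                    + " lazy=" + acceptedBy(LAZY, s));
        }
        // Only 121212 is accepted; the other three contain a char 4 times.
    }
}
```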

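import java.util.HashMap;

// Class name is arbitrary; added so the snippet compiles on its own.
public class PasswordCheck
{
    public static void main(String[] args)
    {
        System.out.println(validate("121121")); // false: '1' appears 4 times
        System.out.println(validate("121212")); // true: '1' and '2' appear 3 times each
    }

    // Returns true if no character appears more than 3 times in s.
    static boolean validate(String s)
    {
        // Count the occurrences of each character.
        HashMap<Character, Integer> map = new HashMap<>();
        for (Character c : s.toCharArray())
        {
            if (map.containsKey(c))
            {
                map.put(c, map.get(c) + 1);
            }
            else
            {
                map.put(c, 1);
            }
        }

        // Reject as soon as any count exceeds 3.
        for (Integer count : map.values())
        {
            if (count > 3)
                return false;
        }
        return true;
    }
}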
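
As a possible variation on the map approach, the check can stop as soon as some character passes the limit, instead of counting everything first. This is only a sketch: the class name `EarlyExitValidator` and the `maxCount` parameter are hypothetical, and `Map.merge` is used here in place of the containsKey/put pair.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical early-exit variant of the map-based validation.
class EarlyExitValidator {
    // Returns true if no character appears more than maxCount times in s.
    static boolean validate(String s, int maxCount) {
        Map<Character, Integer> counts = new HashMap<>();
        for (char c : s.toCharArray()) {
            // merge inserts 1 for a new key, otherwise adds 1 to the stored count,
            // and returns the updated count.
            if (counts.merge(c, 1, Integer::sum) > maxCount) {
                return false; // limit exceeded, no need to scan further
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(validate("121121", 3)); // false
        System.out.println(validate("121212", 3)); // true
    }
}
```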
