
Regex for finding between 1 and 3 characters in a string

I am trying to write a regex that returns true if characters from [A-Za-z] occur between 1 and 3 times in a string, but I have not been able to get it to work.

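import java.util.regex.Pattern;

public class Main {
    public static void main(String[] args) {
        String regex = "(?:([A-Za-z]*){3}).*";
        // Look-ahead attempt (not used below)
        String regex1 = "(?=((([A-Za-z]){1}){1,3})).*";

        Pattern pattern = Pattern.compile(regex);
        // Prints true, although "AD1CDD" contains five letters
        System.out.println(pattern.matcher("AD1CDD").find());
    }
}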

Note: I am able to write this for 3 consecutive characters, but what I want is for the number of occurrences in the entire string to be between 1 and 3. If there are 4 characters, it should return false. I have tried using a look-ahead to achieve this.

If I understand your question correctly, you want to check that:

  • 1 to 3 characters from the range [a-zA-Z] are in the string
  • any other character can occur arbitrarily often

First of all, simply counting the characters without a regular expression is more efficient: this is a trivial counting problem rather than a pattern-matching one. There is nothing wrong with using a for loop here (except that loops in interpreted languages such as Python and R can be fairly slow).

Nevertheless, you can (ab)use extended regular expressions:

^([^A-Za-z]*[A-Za-z]){1,3}[^A-Za-z]*$

This is fairly straightforward once you also model the "other" characters. And that is what you should do when defining a pattern: model all accepted strings (i.e. the entire "language"), not only the characters you want to find.
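As a minimal sketch of how the pattern above could be used from Java (the class and method names here are made up for illustration), `matches()` requires the whole input to fit the pattern, so the anchors are redundant but harmless:

```java
import java.util.regex.Pattern;

// Hypothetical helper class; only the regex itself comes from the answer.
final class OneToThreeLettersRegex {
    // Zero or more non-letters followed by one letter, repeated 1 to 3 times,
    // then only non-letters until the end of the input.
    private static final Pattern ONE_TO_THREE =
            Pattern.compile("^([^A-Za-z]*[A-Za-z]){1,3}[^A-Za-z]*$");

    static boolean hasOneToThreeLetters(String input) {
        // matches() tests the entire input, unlike find().
        return ONE_TO_THREE.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(hasOneToThreeLetters("AD1CDD")); // false: five letters
        System.out.println(hasOneToThreeLetters("1A2B3"));  // true: two letters
        System.out.println(hasOneToThreeLetters("12345"));  // false: no letters
    }
}
```

Using `find()` instead of `matches()` would also work here because the pattern is anchored at both ends.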

Alternatively, you can "findAll" matches of ([A-Za-z]) and look at the length of the result. This may be more convenient if you also need the actual characters.
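Java has no method literally called "findAll", so the sketch below assumes the usual equivalent: calling `Matcher.find()` in a loop and collecting each match. The class and method names are illustrative only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class illustrating the "find all matches" approach.
final class LetterCollector {
    private static final Pattern LETTER = Pattern.compile("[A-Za-z]");

    // Returns every single-letter match, in order of appearance.
    static List<String> findLetters(String input) {
        List<String> letters = new ArrayList<>();
        Matcher m = LETTER.matcher(input);
        while (m.find()) {
            letters.add(m.group());
        }
        return letters;
    }

    static boolean hasOneToThreeLetters(String input) {
        int count = findLetters(input).size();
        return count >= 1 && count <= 3;
    }

    public static void main(String[] args) {
        System.out.println(findLetters("AD1CDD"));          // [A, D, C, D, D]
        System.out.println(hasOneToThreeLetters("AD1CDD")); // false
        System.out.println(hasOneToThreeLetters("1A2B3"));  // true
    }
}
```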

The for loop would look something like this:

public static boolean containsOneToThreeAlphabetic(String str) {
    int matched = 0;
    // str.length() is a method call on String, not a field
    for (int i = 0; i < str.length(); i++) {
        char c = str.charAt(i);
        // Count only ASCII letters A-Z and a-z
        if ((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')) matched++;
    }
    return matched >= 1 && matched <= 3;
}

This is straightforward, readable, extensible, and efficient (in compiled languages). You can also add if (matched >= 4) return false; (or break) to stop early.
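A sketch of the early-exit variant might look like the following. The wrapping class name is hypothetical; the logic is the same counting loop, returning as soon as a fourth letter is seen.

```java
// Hypothetical wrapper class for the early-exit version of the counting loop.
final class EarlyExitCounter {
    static boolean containsOneToThreeAlphabetic(String str) {
        int matched = 0;
        for (int i = 0; i < str.length(); i++) {
            char c = str.charAt(i);
            if ((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')) {
                matched++;
                // A fourth letter already decides the result; stop scanning.
                if (matched >= 4) return false;
            }
        }
        // At this point matched is at most 3, so only the lower bound remains.
        return matched >= 1;
    }

    public static void main(String[] args) {
        System.out.println(containsOneToThreeAlphabetic("AD1CDD")); // false
        System.out.println(containsOneToThreeAlphabetic("1A2B3"));  // true
    }
}
```

For short strings the difference is negligible; the early return only pays off when inputs are long and rejected early.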

Please stop playing with regex; you'll complicate not only your own life but also the lives of the people who have to maintain your code in the future. Choose a simpler approach: find all [A-Za-z]+ substrings, put them into a list, then check for every string whether its length is between 1 and 3 or beyond that.
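One possible reading of this suggestion, sketched in Java with hypothetical names: collect every run of consecutive letters into a list, then inspect the lengths. Note that checking each run separately is not the same as the question's whole-string requirement, so this sketch assumes the run lengths are summed to get the total letter count.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for the "list of letter runs" approach.
final class LetterRuns {
    private static final Pattern RUN = Pattern.compile("[A-Za-z]+");

    // Returns each maximal run of consecutive ASCII letters.
    static List<String> findRuns(String input) {
        List<String> runs = new ArrayList<>();
        Matcher m = RUN.matcher(input);
        while (m.find()) {
            runs.add(m.group());
        }
        return runs;
    }

    // Sums the run lengths to get the letter count for the whole string.
    static int totalLetters(String input) {
        int total = 0;
        for (String run : findRuns(input)) {
            total += run.length();
        }
        return total;
    }

    static boolean hasOneToThreeLetters(String input) {
        int total = totalLetters(input);
        return total >= 1 && total <= 3;
    }

    public static void main(String[] args) {
        System.out.println(findRuns("AD1CDD"));             // [AD, CDD]
        System.out.println(totalLetters("AD1CDD"));         // 5
        System.out.println(hasOneToThreeLetters("AD1CDD")); // false
    }
}
```

In "AD1CDD" each run is at most 3 letters long, yet the string holds five letters in total, which is why the per-run check alone would give a different answer.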

Regex

/([A-Za-z])(?=(?:.*\1){3})/s

This looks for a character followed by 3 more occurrences of the same character. So if it matches, there are 4 or more equal characters present.
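A sketch of how this pattern could be tried in Java, assuming the `/s` flag maps to `Pattern.DOTALL` and that the backreference is written `\\1` inside a Java string literal (class and method names are illustrative). It detects one specific letter occurring four or more times, compared case-sensitively, which is different from counting any four letters.

```java
import java.util.regex.Pattern;

// Hypothetical helper class for the backreference-based check.
final class RepeatedLetterCheck {
    // A letter, then a look-ahead requiring three later copies of that same letter.
    private static final Pattern FOUR_EQUAL =
            Pattern.compile("([A-Za-z])(?=(?:.*\\1){3})", Pattern.DOTALL);

    static boolean hasLetterRepeatedFourTimes(String input) {
        return FOUR_EQUAL.matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(hasLetterRepeatedFourTimes("aXaYaZa")); // true: 'a' occurs 4 times
        System.out.println(hasLetterRepeatedFourTimes("AD1CDD"));  // false: 'D' occurs only 3 times
        System.out.println(hasLetterRepeatedFourTimes("abcd"));    // false: four different letters
    }
}
```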


 