Regex for finding between 1 and 3 characters in a string

I am trying to write a regex that returns true if characters from [A-Za-z] occur between 1 and 3 times in a string, but I have not been able to get it working.

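import java.util.regex.Pattern;

public class Main {
    public static void main(String[] args) {
        String regex = "(?:([A-Za-z]*){3}).*";
        String regex1 = "(?=((([A-Za-z]){1}){1,3})).*"; // look-ahead attempt, not used below

        Pattern pattern = Pattern.compile(regex);
        // Prints true even though "AD1CDD" contains five letters
        System.out.println(pattern.matcher("AD1CDD").find());
    }
}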

Note: I am able to write this for 3 consecutive characters, but what I want is for the total number of occurrences in the entire string to be between 1 and 3. If there are 4 such characters, it should return false. I have tried to use a look-ahead to achieve this.

If I understand your question correctly, you want to check if

  • 1 to 3 characters of the range [a-zA-Z] are in the string
  • Any other character can occur arbitrarily often?

First of all, simply counting the characters without a regular expression is more efficient, because this is at heart a trivial counting problem rather than a pattern-matching one. There is nothing wrong with using a for loop here (except that explicit loops in interpreted languages such as Python and R can be fairly slow).

Nevertheless, you can (ab)use a regular expression:

^([^A-Za-z]*[A-Za-z]){1,3}[^A-Za-z]*$

This is fairly straightforward once you also model the "other" characters. And that is what you should do when defining a pattern: model all accepted strings (i.e. the entire "language"), not only the characters you want to find.
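As a minimal sketch of how the pattern above could be used from Java: the class and method names here are made up for illustration, and `matches()` is used instead of `find()` so that the whole input has to fit the pattern.

```java
import java.util.regex.Pattern;

final class OneToThreeLettersRegex {
    // Each repetition of the group consumes any non-letters followed by exactly
    // one letter; after 1 to 3 repetitions only non-letters may remain.
    private static final Pattern ONE_TO_THREE =
            Pattern.compile("^([^A-Za-z]*[A-Za-z]){1,3}[^A-Za-z]*$");

    // Hypothetical helper name, not part of the original question.
    static boolean hasOneToThreeLetters(String s) {
        // matches() requires the entire input to fit the pattern
        return ONE_TO_THREE.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(hasOneToThreeLetters("AD1CDD")); // false: five letters
        System.out.println(hasOneToThreeLetters("A1D2C3")); // true: three letters
        System.out.println(hasOneToThreeLetters("12345"));  // false: no letters
    }
}
```

Because the two character classes are disjoint, each letter can only be consumed by one repetition of the group, which is what makes the `{1,3}` quantifier count letters.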

Alternatively, you can "findAll" matches of ([A-Za-z]) and look at the length of the result. This may be more convenient if you also need the actual characters.
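Java has no method literally called "findAll", so the sketch below assumes the usual equivalent: calling `Matcher.find()` in a loop and collecting each match. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LetterMatches {
    private static final Pattern LETTER = Pattern.compile("[A-Za-z]");

    // Collects every single-letter match, in order of appearance.
    static List<String> findLetters(String s) {
        List<String> letters = new ArrayList<>();
        Matcher m = LETTER.matcher(s);
        while (m.find()) {
            letters.add(m.group());
        }
        return letters;
    }

    // The check itself is then just a size comparison.
    static boolean hasOneToThreeLetters(String s) {
        int n = findLetters(s).size();
        return n >= 1 && n <= 3;
    }

    public static void main(String[] args) {
        System.out.println(findLetters("A1D2C3"));          // [A, D, C]
        System.out.println(hasOneToThreeLetters("AD1CDD")); // false: five letters
    }
}
```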

The for loop would look something like this:

public static boolean containsOneToThreeAlphabetic(String str) {
    int matched = 0; // number of ASCII letters seen so far
    for (int i = 0; i < str.length(); i++) {
        char c = str.charAt(i);
        if ((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')) matched++;
    }
    return matched >= 1 && matched <= 3;
}

This is straightforward, readable, extensible, and efficient (in compiled languages). You can also add an if (matched >= 4) return false; (or a break) inside the loop to stop early.
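A sketch of that early-exit variant, wrapped in a hypothetical class so it compiles on its own; once a fourth letter has been seen the answer cannot change, so the loop stops there.

```java
final class EarlyExitLetterCheck {
    static boolean containsOneToThreeAlphabetic(String str) {
        int matched = 0;
        for (int i = 0; i < str.length(); i++) {
            char c = str.charAt(i);
            if ((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')) {
                matched++;
                // A fourth letter already rules the string out.
                if (matched >= 4) return false;
            }
        }
        // The upper bound was enforced inside the loop; only the lower bound remains.
        return matched >= 1;
    }

    public static void main(String[] args) {
        System.out.println(containsOneToThreeAlphabetic("AD1CDD")); // false
        System.out.println(containsOneToThreeAlphabetic("1a2b3c")); // true
    }
}
```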

Please stop playing with regex here: you will complicate not only your own life, but also the lives of the people who have to maintain your code in the future. Choose a simpler approach: find all [A-Za-z]+ substrings, put them into a list, then check for each one whether its length is between 1 and 3 or beyond that.

Regex

/([A-Za-z])(?=(?:.*\\1){3})/s

This looks for a character followed by 3 further occurrences of the same character. So if it matches, there are 4 or more equal characters present.
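A hedged Java rendering of this pattern, assuming the trailing /s flag corresponds to Pattern.DOTALL and that \\1 is the back-reference \1 written as a Java string literal. The class and method names are invented for the example. Note that this detects one particular letter occurring 4 or more times, which is not the same as counting all letters in the string.

```java
import java.util.regex.Pattern;

final class RepeatedLetterCheck {
    // Group 1 captures a letter; the look-ahead then demands three more
    // occurrences of that same letter anywhere later in the input.
    private static final Pattern FOUR_OF_SAME =
            Pattern.compile("([A-Za-z])(?=(?:.*\\1){3})", Pattern.DOTALL);

    static boolean hasLetterRepeatedFourTimes(String s) {
        return FOUR_OF_SAME.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(hasLetterRepeatedFourTimes("AxAyAzA")); // true: 'A' occurs four times
        System.out.println(hasLetterRepeatedFourTimes("AD1CDD"));  // false: no letter occurs four times
    }
}
```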
