
Need help writing a regular expression

I am weak at writing regular expressions, so I'm going to need some help with this one. I need a regular expression that validates that a string is a set of letters (each letter must be unique) delimited by commas.

Each element is exactly one character, and elements are separated by a single comma.

Examples:

A,E,R
R,A
E,R

Thanks

You can use a repeated group to validate it's a comma separated string.

^[AER](?:,[AER])*$
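As a rough illustration, the pattern above could be applied in Java along these lines. The class name FormatCheck and the method isCommaSeparated are made up for this sketch; only java.util.regex.Pattern comes from the standard library.

```java
import java.util.regex.Pattern;

class FormatCheck {
    // One of A, E, R, followed by zero or more ",<letter>" groups.
    private static final Pattern FORMAT = Pattern.compile("^[AER](?:,[AER])*$");

    // Returns true when the input is a comma-separated list of single letters.
    // Repeated letters are still accepted at this stage.
    static boolean isCommaSeparated(String input) {
        return FORMAT.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isCommaSeparated("A,E,R")); // true
        System.out.println(isCommaSeparated("A,E,"));  // false: trailing comma
        System.out.println(isCommaSeparated("A,A"));   // true: uniqueness is not checked here
    }
}
```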

To also require that the characters are unique, you could do something like:

^([AER])(?:,(?!\1)([AER])(?!.*\2))*$
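A minimal sketch of how this uniqueness pattern might be used from Java, assuming the backslashes are doubled for the string literal. The names UniqueCheck and isUniqueList are hypothetical.

```java
import java.util.regex.Pattern;

class UniqueCheck {
    // (?!\1) rejects an element equal to the first one;
    // (?!.*\2) rejects an element that shows up again later in the string.
    private static final Pattern UNIQUE =
            Pattern.compile("^([AER])(?:,(?!\\1)([AER])(?!.*\\2))*$");

    static boolean isUniqueList(String input) {
        return UNIQUE.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isUniqueList("A,E,R")); // true
        System.out.println(isUniqueList("A,E,A")); // false: A repeats
        System.out.println(isUniqueList("A,E,E")); // false: E repeats
    }
}
```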

If I understand it correctly, a valid string will be a series (possibly zero long) of two-character patterns, where each pattern is a letter followed by a comma; finally followed at the end by one letter.

Thus:

"^([A-Za-z],)*[A-Za-z]$"

Since you've clarified that the letters have to be A, E, or R:

"^([AER],)*[AER]$"

Something like this "^([AER],)*[AER]$"

@Edit: regarding the uniqueness, if you can drop the "last character cannot be a comma" requirement (which can be checked before the regex anyway in constant time) then this should work:

"^(?:([AER],?)(?!.*\\1))*$"

This will also match "A,E,R," (with a trailing comma), hence you need that check before performing the regex. I do not take responsibility for the performance but since it's only 3 letters anyway...

The above is a Java regex (written as a string literal, so the backslash is doubled). If you want a "pure" one: ^(?:([AER],?)(?!.*\1))*$

@Edit2: sorry, missed one thing: this actually requires that check and then you need to add a comma at the end since otherwise it will also match A,E,E . Kind of limited I know.
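The two-step idea described in the edits above might look roughly like this in Java. Note one substitution: the up-front check here is a second regex on the overall shape rather than the constant-time last-character test the answer mentions, because the uniqueness pattern on its own also lets strings such as "AER" through. The names TrailingCommaCheck and isValid are hypothetical.

```java
import java.util.regex.Pattern;

class TrailingCommaCheck {
    // Shape check done up front: single letters separated by commas.
    private static final Pattern FORMAT = Pattern.compile("^[AER](?:,[AER])*$");
    // Uniqueness pattern, written as a Java string literal.
    private static final Pattern UNIQUE = Pattern.compile("^(?:([AER],?)(?!.*\\1))*$");

    static boolean isValid(String input) {
        if (!FORMAT.matcher(input).matches()) {
            return false; // rejects "", "A,", "AER" and similar
        }
        // Append a comma so every element, including the last, is captured as "X,".
        return UNIQUE.matcher(input + ",").matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("A,E,R"));  // true
        System.out.println(isValid("A,E,E"));  // false: E repeats
        System.out.println(isValid("A,E,R,")); // false: trailing comma
    }
}
```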

Note: I'm going to answer the original question. That is, I don't care if the elements repeat.

We've had several suggestions for this regex:

^([AER],)*[AER]$

This does indeed work. However, to match a valid String the engine first has to back up one character: the ( )* group tries one more letter-plus-comma pair, consumes the final letter, and then finds that there is no , after it. So we switch it for this to increase performance:

^[AER](,[AER])*$

Notice that this will match a correct String the very first time it attempts to. But also note that we don't need to worry about the ( )* backing up at all; it will either match the first time, or it won't match the String at all. So we can further improve performance by using a possessive quantifier:

^[AER](,[AER])*+$

This will take the whole String and attempt to match it. If it fails, then it stops, saving time by not doing useless backing up.
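The three variants discussed above are meant to accept exactly the same strings and differ only in how much backing up the engine does. The sketch below checks the first part of that claim on a handful of inputs; it does not measure performance. The class name QuantifierComparison and its members are hypothetical.

```java
import java.util.regex.Pattern;

class QuantifierComparison {
    static final Pattern TRAILING_LETTER = Pattern.compile("^([AER],)*[AER]$");
    static final Pattern LEADING_LETTER  = Pattern.compile("^[AER](,[AER])*$");
    static final Pattern POSSESSIVE      = Pattern.compile("^[AER](,[AER])*+$");

    // True when all three patterns give the same verdict for the input.
    static boolean allAgree(String input) {
        boolean a = TRAILING_LETTER.matcher(input).matches();
        boolean b = LEADING_LETTER.matcher(input).matches();
        boolean c = POSSESSIVE.matcher(input).matches();
        return a == b && b == c;
    }

    public static void main(String[] args) {
        String[] samples = {"A,E,R", "R", "A,A", "A,", ",A", "AE", ""};
        for (String s : samples) {
            System.out.println("\"" + s + "\" matches: " + POSSESSIVE.matcher(s).matches()
                    + ", all agree: " + allAgree(s));
        }
    }
}
```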


If I were trying to ensure the String had no repeated elements, I would not use regex; it just complicates things. You end up with less-readable code (sadly, most people don't understand regex) and, oftentimes, slower code. So I would build my own validator:

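import java.util.HashSet;
import java.util.Set;

public static boolean isCommaDelimitedSet(String toValidate, Set<Character> toMatch) {
    // Letters sit at even indexes and commas at odd ones, so a valid String has odd length.
    // This also rejects the empty String and a trailing comma.
    if (toValidate.length() % 2 == 0) return false;
    Set<Character> seen = new HashSet<>(); // letters encountered so far
    for (int index = 0; index < toValidate.length(); index++) {
        if (index % 2 == 0) {
            char c = toValidate.charAt(index);
            // Reject a letter that is not allowed or that has already appeared.
            if (!toMatch.contains(c) || !seen.add(c)) return false;
        } else {
            if (toValidate.charAt(index) != ',') return false;
        }
    }
    return true;
}

This assumes that you want to be able to pass in a set of characters that are allowed. If you don't want that and have explicit chars you want to match, change the contents of the if (index % 2 == 0) block to:

char c = toValidate.charAt(index);
if (!(c == 'A' || c == 'E' || c == 'R' /* || and so on */) || !seen.add(c)) return false;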
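For comparison, the same kind of hand-written check can also be sketched with String.split and a HashSet instead of index arithmetic. This is a different technique from the loop in the answer, shown only as an alternative; the names SplitValidator and isUniqueCommaList are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class SplitValidator {
    static boolean isUniqueCommaList(String input, Set<Character> allowed) {
        // A limit of -1 keeps trailing empty strings, so "A," yields ["A", ""].
        String[] parts = input.split(",", -1);
        Set<Character> seen = new HashSet<>();
        for (String part : parts) {
            if (part.length() != 1) return false;   // empty or multi-character element
            char c = part.charAt(0);
            if (!allowed.contains(c)) return false; // letter not in the allowed set
            if (!seen.add(c)) return false;         // add() returns false for a duplicate
        }
        return true;
    }

    public static void main(String[] args) {
        Set<Character> allowed = new HashSet<>(Arrays.asList('A', 'E', 'R'));
        System.out.println(isUniqueCommaList("A,E,R", allowed)); // true
        System.out.println(isUniqueCommaList("A,E,E", allowed)); // false: E repeats
        System.out.println(isUniqueCommaList("A,", allowed));    // false: trailing comma
    }
}
```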

My own ugly but extensible solution, which disallows leading and trailing commas and checks that the characters are unique.

It uses a forward-declared backreference: note how the second capturing group appears after the reference made to it, (?!.*\2). On the first repetition the second capturing group hasn't captured anything yet, so Java treats any attempt to reference the text matched by it as a failure, and the negative lookahead therefore succeeds.

^([AER])(?!.*\1)(?:,(?!.*\2)([AER]))*+$

Demo on regex101 (PCRE flavor has the same behavior for this case)

Demo on RegexPlanet

Test cases:

A,E,R
A,R,E
E,R,A
A
R,E
R
E

A,
A,R,
A,A,R
E,A,E
A,E,E
X,R,E
R,A,E,
,A
AA,R,E
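The test cases above can be run against the forward-reference pattern with a small harness such as the following. It assumes the first group of cases is meant to match and the second group is meant to be rejected; the class name ForwardReferenceTest and its members are hypothetical.

```java
import java.util.regex.Pattern;

class ForwardReferenceTest {
    // The pattern from the answer, with backslashes doubled for the Java string literal.
    static final Pattern UNIQUE_LIST =
            Pattern.compile("^([AER])(?!.*\\1)(?:,(?!.*\\2)([AER]))*+$");

    static final String[] SHOULD_MATCH = {"A,E,R", "A,R,E", "E,R,A", "A", "R,E", "R", "E"};
    static final String[] SHOULD_FAIL =
            {"A,", "A,R,", "A,A,R", "E,A,E", "A,E,E", "X,R,E", "R,A,E,", ",A", "AA,R,E"};

    static boolean matches(String input) {
        return UNIQUE_LIST.matcher(input).matches();
    }

    // True when every case behaves as expected.
    static boolean allPass() {
        for (String s : SHOULD_MATCH) if (!matches(s)) return false;
        for (String s : SHOULD_FAIL) if (matches(s)) return false;
        return true;
    }

    public static void main(String[] args) {
        for (String s : SHOULD_MATCH) System.out.println("\"" + s + "\" -> " + matches(s));
        for (String s : SHOULD_FAIL) System.out.println("\"" + s + "\" -> " + matches(s));
        System.out.println("all cases as expected: " + allPass());
    }
}
```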
