
Java regex match all characters except

What is the correct syntax for matching all characters except specific ones?

For example, I'd like to match everything but letters [A-Z] [a-z] and numbers [0-9].

I have

string.matches("[^[A-Z][a-z][0-9]]")

Is this incorrect?

Yes, you don't need nested [] like that. Use this instead:

"[^A-Za-z0-9]"

It's all one character class.
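As a minimal sketch of how the single negated class behaves, the following collects every character that is not an ASCII letter or digit. The class name NonAlnumDemo and the helper nonAlnumChars are made up for illustration and are not part of the original answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NonAlnumDemo {
    // One negated class: any single character that is not A-Z, a-z, or 0-9.
    private static final Pattern NON_ALNUM = Pattern.compile("[^A-Za-z0-9]");

    // Hypothetical helper: returns every non-alphanumeric character of the input, in order.
    static String nonAlnumChars(String input) {
        StringBuilder found = new StringBuilder();
        Matcher m = NON_ALNUM.matcher(input);
        while (m.find()) {
            found.append(m.group());
        }
        return found.toString();
    }

    public static void main(String[] args) {
        System.out.println(nonAlnumChars("a-b_c 1!")); // prints "-_ !"
    }
}
```

Note that find() is used here because it scans for matches anywhere in the input, whereas String.matches() tests the whole string at once.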

If you want to match anything but letters, you should have a look into Unicode properties .

\\p{L} is any kind of letter from any language

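An uppercase "P" negates the property, so \\P{L} matches anything that is not a letter.

\\d or \\p{Nd} matches digits (in Java, \\d covers only ASCII 0-9 unless the UNICODE_CHARACTER_CLASS flag is set, while \\p{Nd} covers decimal digits from every script)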

So your expression in modern Unicode style would look like this

Either using a negated character class

[^\p{L}\p{Nd}]

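or an intersection of negated properties (a plain union such as [\P{L}\P{Nd}] would match every character, because no character is both a letter and a digit)

[\P{L}&&\P{Nd}]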
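The difference between these forms is easy to get wrong, so here is a small sketch comparing three spellings on single-character strings. It assumes Java's && intersection syntax for character classes. The class and method names are illustrative only. In Java source the backslashes have to be doubled.

```java
public class UnicodeClassDemo {
    // Negated class: not a letter and not a decimal digit.
    static boolean viaNegatedClass(String s) {
        return s.matches("[^\\p{L}\\p{Nd}]");
    }

    // Intersection of negated properties: equivalent to the negated class above.
    static boolean viaIntersection(String s) {
        return s.matches("[\\P{L}&&\\P{Nd}]");
    }

    // Union of negated properties: "not a letter OR not a digit",
    // which holds for every character.
    static boolean viaUnion(String s) {
        return s.matches("[\\P{L}\\P{Nd}]");
    }

    public static void main(String[] args) {
        // \u00e9 is a non-ASCII letter (e with acute accent).
        for (String s : new String[] {"!", "a", "5", "\u00e9"}) {
            System.out.println(s + " negated=" + viaNegatedClass(s)
                    + " intersection=" + viaIntersection(s)
                    + " union=" + viaUnion(s));
        }
    }
}
```

Only "!" passes the first two checks, while the union accepts all four inputs, which is why the union is not a usable substitute for the negated class.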

The next thing is, matches() matches the expression against the complete string, so your expression is only true with exactly one char in the string. So you would need to add a quantifier:

string.matches("[^\\p{L}\\p{Nd}]+")

(the backslashes must be doubled inside a Java string literal), which returns true when the complete string consists only of non-alphanumeric characters and contains at least one of them.
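To illustrate the effect of the quantifier, this sketch runs matches() with and without the trailing + on a few inputs. The class MatchesDemo and its method names are assumptions made for the example.

```java
public class MatchesDemo {
    // Without a quantifier, matches() is true only for a string of exactly one such character.
    static boolean exactlyOneNonAlnum(String s) {
        return s.matches("[^\\p{L}\\p{Nd}]");
    }

    // With +, the whole string must consist of one or more such characters.
    static boolean onlyNonAlnum(String s) {
        return s.matches("[^\\p{L}\\p{Nd}]+");
    }

    public static void main(String[] args) {
        System.out.println(exactlyOneNonAlnum("!"));   // true
        System.out.println(exactlyOneNonAlnum("!?#")); // false: three characters
        System.out.println(onlyNonAlnum("!?#"));       // true
        System.out.println(onlyNonAlnum("!a#"));       // false: contains a letter
        System.out.println(onlyNonAlnum(""));          // false: + needs at least one character
    }
}
```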

string.matches("[^A-Za-z0-9]")

Almost right. What you want is:

string.matches("[^A-Za-z0-9]")

Here's a good tutorial

Let's say you want to make sure that a String does not contain the _ symbol; then you could use something like this.

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    // find() succeeds if "_" occurs anywhere in stringName.
    Pattern pattern = Pattern.compile("_");
    Matcher matcher = pattern.matcher(stringName);
    if (!matcher.find()) {
        System.out.println("Valid String");
    } else {
        System.out.println("Invalid String");
    }
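The same check can also be written with a negated character class and matches(). The sketch below puts both variants side by side. The class UnderscoreCheck and its method names are hypothetical, and the empty string is treated as valid in both variants.

```java
import java.util.regex.Pattern;

public class UnderscoreCheck {
    private static final Pattern UNDERSCORE = Pattern.compile("_");

    // Valid when find() cannot locate an underscore anywhere in the input.
    static boolean validWithFind(String s) {
        return !UNDERSCORE.matcher(s).find();
    }

    // Valid when the whole string consists of zero or more non-underscore characters.
    static boolean validWithMatches(String s) {
        return s.matches("[^_]*");
    }

    public static void main(String[] args) {
        System.out.println(validWithFind("abc"));    // true
        System.out.println(validWithMatches("a_b")); // false
    }
}
```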

You can negate character classes:

"[^abc]"       // matches any character except a, b, or c (negation).
"[^a-zA-Z0-9]" // matches non-alphanumeric characters.

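A quick way to see what a negated class matches is to delete every match with replaceAll, which leaves only the characters the class excludes. This is a sketch with made-up names (NegatedClassDemo, keepOnly), not code from the answers above.

```java
public class NegatedClassDemo {
    // Hypothetical helper: removes every character matched by the given negated class.
    static String keepOnly(String input, String negatedClass) {
        return input.replaceAll(negatedClass, "");
    }

    public static void main(String[] args) {
        // [^abc] matches d and e, so only a, b, c remain.
        System.out.println(keepOnly("abcde", "[^abc]")); // prints "abc"
        // [^a-zA-Z0-9] matches the comma, the spaces, and the exclamation mark.
        System.out.println(keepOnly("Hello, World! 42", "[^a-zA-Z0-9]")); // prints "HelloWorld42"
    }
}
```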