
Check that string contains non-latin letters

I have the following method to check that a string contains only Latin symbols.

private boolean containsNonLatin(String val) {
    return val.matches("\\w+");
}

But it returns false if I pass the string "my string", because it contains a space. I need a method that returns false if the string contains letters that are not in the Latin alphabet, and true in all other cases.

Please help me improve my method.

Examples of valid strings:

w123.
w, 12
w#123
dsf%&@

You can use the \\p{IsLatin} script class, together with \\p{Digit} so that digits are accepted too:

return !(val.matches("[\\p{Punct}\\p{Space}\\p{Digit}\\p{IsLatin}]+"));

Java Regex Reference
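
As a rough sketch of the same idea, one could search for any letter whose script is not Latin instead of listing every allowed class. The class name LatinScriptCheck and the helper hasOnlyLatinLetters are hypothetical names introduced for illustration only; the non-ASCII test strings are written as Unicode escapes so that the source file encoding does not matter.

```java
import java.util.regex.Pattern;

public class LatinScriptCheck {
    // A Unicode letter (\p{L}) that does not belong to the Latin script,
    // e.g. a Cyrillic, Greek or Han character.
    private static final Pattern NON_LATIN_LETTER =
            Pattern.compile("[\\p{L}&&[^\\p{IsLatin}]]");

    // Hypothetical helper: true when the string has no letter outside the Latin script.
    static boolean hasOnlyLatinLetters(String val) {
        return !NON_LATIN_LETTER.matcher(val).find();
    }

    public static void main(String[] args) {
        System.out.println(hasOnlyLatinLetters("w, 12"));               // true
        System.out.println(hasOnlyLatinLetters("dsf%&@"));              // true
        System.out.println(hasOnlyLatinLetters("caf\u00e9"));           // true: e-acute is in the Latin script
        System.out.println(hasOnlyLatinLetters("w\u0414\u0432\u0432")); // false: contains Cyrillic letters
    }
}
```

Because digits, punctuation and whitespace are not letters, they never trigger the pattern, so all four example strings from the question pass.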

I need something like the negation of \\p{IsLatin}.

If you need to match all letters except ASCII Latin letters, you can use

"[\\p{L}\\p{M}&&[^\\p{Alpha}]]+"

The \\p{Alpha} POSIX class matches [A-Za-z]. \\p{L} matches any Unicode letter, and \\p{M} matches combining marks (diacritics). Adding &&[^\\p{Alpha}] subtracts [A-Za-z] from the set of all Unicode letters and marks.

The whole expression means: match one or more Unicode letters other than ASCII letters.

To allow whitespace as well, just add \\s:

"[\\s\\p{L}\\p{M}&&[^\\p{Alpha}]]+"

See the IDEONE demo:

List<String> strs = Arrays.asList("w123.", "w, 12", "w#123", "dsf%&@", "Двв");
for (String str : strs)
    System.out.println(!str.matches("[\\s\\p{L}\\p{M}&&[^\\p{Alpha}]]+")); // => 4 true, 1 false
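
Note that matches() has to consume the whole string, so a mixed string such as "wДвв" does not match the pattern and the demo above would print true for it. If the goal is to detect a non-ASCII letter anywhere in the string, a possible variation is to use find() with the same character class minus \\s. This is only a sketch: the class name NonAsciiLetterCheck and the helper containsNonAsciiLetter are hypothetical, and accented Latin letters such as "é" count as non-ASCII here.

```java
import java.util.regex.Pattern;

public class NonAsciiLetterCheck {
    // A Unicode letter or combining mark that is not one of [A-Za-z].
    private static final Pattern NON_ASCII_LETTER =
            Pattern.compile("[\\p{L}\\p{M}&&[^\\p{Alpha}]]");

    // Hypothetical helper: true as soon as one such character occurs anywhere.
    static boolean containsNonAsciiLetter(String val) {
        return NON_ASCII_LETTER.matcher(val).find();
    }

    public static void main(String[] args) {
        System.out.println(containsNonAsciiLetter("w#123"));                // false
        System.out.println(containsNonAsciiLetter("\u0414\u0432\u0432"));   // true: Cyrillic only
        System.out.println(containsNonAsciiLetter("w\u0414\u0432\u0432"));  // true: mixed ASCII and Cyrillic
    }
}
```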

Just add a space to your matcher:

private boolean isLatin(String val) {
    return val.matches("[ \\w]+");
}

Use this:

public static boolean isNoAlphaNumeric(String s) {
    return s.matches("[\\p{L}\\s]+");
}
  • \\p{L} matches any Unicode letter.
  • \\s matches any whitespace character.
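
To see what this pattern accepts, here is a small check, assuming the same regex wrapped in a hypothetical class LetterSpaceDemo. Since \\p{L} covers letters of every script, Cyrillic text passes, while digits and punctuation do not.

```java
public class LetterSpaceDemo {
    // Same regex as above: one or more Unicode letters or whitespace characters.
    static boolean isLettersAndSpaces(String s) {
        return s.matches("[\\p{L}\\s]+");
    }

    public static void main(String[] args) {
        System.out.println(isLettersAndSpaces("my string"));          // true
        System.out.println(isLettersAndSpaces("\u0414\u0432\u0432")); // true: Cyrillic letters are letters too
        System.out.println(isLettersAndSpaces("w123."));              // false: digits and '.' are not allowed
    }
}
```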
