
Replace non-ASCII characters with their character codes using Java regex

I have a string like this: T 8.ESTÜTESTतुम मेरी. Using Java regex, I want to replace each non-ASCII character (Ü, तुम मेरी) with its equivalent character code.

How can I achieve this?

I can already replace them with a fixed string, for example the empty string:

    String str = "T 8.ESTÜTESTतुम मेरी";
    // [^\p{ASCII}] matches any character outside the ASCII range
    String resultString = str.replaceAll("[^\\p{ASCII}]", "");
    System.out.println(resultString);

It prints T 8.ESTTEST

Sorry, I don't know how to do this with a single regex. Please check whether this works for you:

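    String str = "T 8.ESTÜTESTतुम मेरी";

    // StringBuilder is enough here; no synchronization is needed
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < str.length(); i++) {
        char c = str.charAt(i);
        // Test each UTF-16 char against the non-ASCII character class
        if (String.valueOf(c).matches("[^\\p{ASCII}]")) {
            // Replace a non-ASCII char with its numeric code
            sb.append("[CODE #").append((int) c).append("]");
        } else {
            sb.append(c);
        }
    }
    System.out.println(sb.toString());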

This prints:

T 8.EST[CODE #220]TEST[CODE #2340][CODE #2369][CODE #2350] [CODE #2350][CODE #2375][CODE #2352][CODE #2368]

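The remaining problem is how to tell the regex engine to convert each match into its code, since the replacement string passed to replaceAll is fixed and cannot be computed from the matched character.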
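One way to let the regex drive the conversion is to loop over the matches with Matcher.find() and build each replacement with Matcher.appendReplacement(). The following is only a minimal sketch under some assumptions: the class name NonAsciiEscaper and the method name escape are made up for illustration, and the "[CODE #n]" output format is borrowed from the loop above. The sample string is written with Unicode escapes so the source file stays pure ASCII.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the name is illustrative only.
class NonAsciiEscaper {
    // Matches one character (code point) outside the 7-bit ASCII range.
    private static final Pattern NON_ASCII = Pattern.compile("[^\\p{ASCII}]");

    // Replaces every non-ASCII character with "[CODE #<decimal code point>]".
    static String escape(String input) {
        Matcher m = NON_ASCII.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Read the matched character as a code point.
            int codePoint = m.group().codePointAt(0);
            // quoteReplacement guards against '$' and '\' being treated specially.
            m.appendReplacement(out,
                    Matcher.quoteReplacement("[CODE #" + codePoint + "]"));
        }
        // Copy whatever follows the last match.
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // The sample string from the question, spelled with Unicode escapes.
        String str = "T 8.EST\u00DCTEST\u0924\u0941\u092E \u092E\u0947\u0930\u0940";
        System.out.println(escape(str));
    }
}
```

For the sample string this should give the same result as the character loop. On Java 9 or later, Matcher.replaceAll also accepts a function of the match result, which would shorten the loop to a single call; the explicit find()/appendReplacement() form is used here because it also works on older Java versions.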

