
Java and Unicode trouble

I have a Java program that fetches rows from a SQL Server DB and inserts the same rows into an Informix DB. The Informix DB only supports the ISO 8859-1 character set. Sometimes users insert a row into the SQL Server DB by copying and pasting from Word or Excel, which causes some characters to end up as Unicode characters that 8859-1 cannot represent (some of them 3 bytes in size).

How can I write a filter function that replaces those Unicode characters with, for example, a '?' or something else?

/Jimmy

You could replace every non-ASCII character with '?':

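// Replaces every char above the 7-bit ASCII range (0-127) with '?'.
// The method name is illustrative only.
static String replaceNonAscii(String originalString) {
    StringBuilder buf = new StringBuilder(originalString.length());
    for (char ch : originalString.toCharArray()) {
        if (ch > 127) {
            buf.append('?');   // not ASCII: substitute
        } else {
            buf.append(ch);    // ASCII: keep as-is
        }
    }
    return buf.toString();
}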

Another way is to use a regular expression:

originalString.replaceAll("\\P{ASCII}", "?")

This replaces every character that is not an ASCII character with '?'. Note that replaceAll returns a new string; originalString itself is not modified.
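Both approaches above also replace characters such as 'é' or 'ü', which are not ASCII but which ISO 8859-1 can store. If the goal is to keep everything the Informix side can hold, a filter along the following lines might work better. This is only a sketch: the class name Latin1Filter and the method name toLatin1Safe are made up for illustration, and it assumes the target column really accepts the full ISO 8859-1 range. It walks the string by code point so that a surrogate pair becomes a single '?'.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

public class Latin1Filter {

    // Hypothetical helper: keeps every character that ISO 8859-1 can encode
    // and replaces anything else with a single '?'.
    public static String toLatin1Safe(String input) {
        CharsetEncoder encoder = StandardCharsets.ISO_8859_1.newEncoder();
        StringBuilder buf = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); ) {
            int cp = input.codePointAt(i);
            int len = Character.charCount(cp);  // 2 for a surrogate pair, else 1
            if (len == 1 && encoder.canEncode((char) cp)) {
                buf.append((char) cp);          // representable in ISO 8859-1
            } else {
                buf.append('?');                // not representable: substitute
            }
            i += len;
        }
        return buf.toString();
    }

    public static void main(String[] args) {
        // Curly quotes and the euro sign are typical Word/Excel paste artifacts.
        System.out.println(toLatin1Safe("caf\u00e9 \u201cquoted\u201d \u20ac"));
    }
}
```

With this filter the 'é' in the sample string survives, while the curly quotes and the euro sign are each replaced by '?'.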


 