
Java and Unicode trouble

I have a Java program that fetches rows from a SQL Server DB and inserts the same rows into an Informix DB. The Informix DB only supports the 8859-1 character set. Sometimes users insert a row in the SQL Server DB by copying and pasting from Word or Excel, and that causes some characters to end up as Unicode characters (some of them 3 bytes in size).

How can I write a filter function that replaces the Unicode characters with, for example, a '?' or something else?

/Jimmy

You could replace all non-ASCII characters with ?:

StringBuilder buf = new StringBuilder();
for (char ch : originalString.toCharArray()) {
    if (ch > 127) {
        buf.append('?');
    } else {
        buf.append(ch);
    }
}
return buf.toString();

Another way is to use a regular expression:

originalString.replaceAll("\\P{ASCII}", "?")

This replaces every character that is not an ASCII character with ?.
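Note that both ASCII-based filters also replace characters such as é or ü, which 8859-1 can store. If the goal is to keep everything the Informix side accepts, a variant along the following lines might be closer to what the question asks. This is only a sketch: the class name `Latin1Filter` and the method name `toLatin1Safe` are made up for illustration, and it assumes the target column really accepts the full ISO-8859-1 range as Java defines it.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; the names are illustrative, not from the original post.
class Latin1Filter {

    // Keeps every character that ISO-8859-1 can encode and replaces the rest with '?'.
    static String toLatin1Safe(String input) {
        CharsetEncoder encoder = StandardCharsets.ISO_8859_1.newEncoder();
        StringBuilder buf = new StringBuilder(input.length());
        int i = 0;
        while (i < input.length()) {
            // Read a whole code point so a surrogate pair yields a single '?'.
            int cp = input.codePointAt(i);
            int len = Character.charCount(cp);
            if (len == 1 && encoder.canEncode((char) cp)) {
                buf.append((char) cp);
            } else {
                buf.append('?');
            }
            i += len;
        }
        return buf.toString();
    }

    public static void main(String[] args) {
        // Curly quotes and the en dash (typical Word paste artifacts) are replaced; é is kept.
        System.out.println(toLatin1Safe("caf\u00e9 \u201cquoted\u201d \u2013 ok"));
    }
}
```

A regular-expression equivalent would probably be `originalString.replaceAll("[^\\u0000-\\u00FF]", "?")`, since ISO-8859-1 covers exactly the code points U+0000 to U+00FF.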


 