Store special characters as Unicode

We frequently have to deal with special characters. Sadly, no particular attention has been paid to encoding until now. As you can guess, we always end up with an encoding problem of one kind or another.

We are currently changing all special characters to Unicode, both in property files and in code (I know this does not comply with good coding practice, but we can't change it at the moment).

Now I am not quite sure how we should handle input from other systems with varying encodings. Should we convert special characters to Unicode, and is there a good API or convention for dealing with these?

If you know the original encoding, you can decode the input with it:

// "is" is the InputStream that delivers the bytes from the other system
String orig = "Cp1250";
BufferedReader r = new BufferedReader(new InputStreamReader(is, orig));

The string orig must be a name from Java's table of supported encodings. After that you can do whatever you want in code, because a Java String holds Unicode text internally (as UTF-16, not UTF-8), regardless of the encoding it was read from. If you want to persist it again in a different encoding, use the matching OutputStreamWriter with an explicitly specified encoding.
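
A minimal sketch of that round trip, assuming the source encoding is known to be windows-1250 (the Cp1250 of the snippet above) and the target is UTF-8. The class name Transcoder and the method transcode are hypothetical names chosen for illustration; only standard java.io and java.nio.charset classes are used.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class Transcoder {

    /** Copies text from in to out, decoding with "from" and encoding with "to". */
    public static void transcode(InputStream in, Charset from,
                                 OutputStream out, Charset to) throws IOException {
        // The reader turns bytes into chars, the writer turns chars back into bytes.
        try (Reader reader = new BufferedReader(new InputStreamReader(in, from));
             Writer writer = new BufferedWriter(new OutputStreamWriter(out, to))) {
            char[] buffer = new char[4096];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                writer.write(buffer, 0, n);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the other system sends windows-1250 (alias Cp1250).
        Charset source = Charset.forName("windows-1250");
        // Unicode escapes keep this source file independent of its own encoding.
        String text = "\u017Elu\u0165ou\u010Dk\u00FD";
        byte[] input = text.getBytes(source);

        ByteArrayOutputStream utf8 = new ByteArrayOutputStream();
        transcode(new ByteArrayInputStream(input), source, utf8, StandardCharsets.UTF_8);

        System.out.println("windows-1250 bytes: " + input.length);
        System.out.println("UTF-8 bytes: " + utf8.toByteArray().length);
    }
}
```

Passing a Charset object instead of a name string avoids the checked UnsupportedEncodingException, and naming the charset on both the reader and the writer means the platform default encoding never comes into play.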
