Java Internationalization
I have a Java string that I'm having trouble manipulating. I have a String, s, whose value is 丞 (a Chinese character I chose at random; I don't speak Chinese). If I call
String t = new String(s.getBytes());
if (s.equals(t))
    System.out.println("String unchanged");
else
    System.out.println("String changed");
then I get the "String changed" result. Does anyone know what's going on?
Because that method (getBytes() with no arguments):
Encodes this String into a sequence of bytes using the platform's default charset
If your default charset is, for example, US-ASCII, you won't get the same bytes used by that Chinese character.
I imagine a byte may be added or dropped in the process; in practice, a character the charset cannot represent is typically replaced with '?'.
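To make the lossy case concrete, here is a minimal sketch that forces US-ASCII explicitly instead of relying on whatever the platform default happens to be. The class and method names (AsciiRoundTrip, roundTripAscii) are made up for illustration, and the character is written as the escape \u4E1E so the result does not depend on the source file's encoding.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class AsciiRoundTrip {
    // Hypothetical helper: encodes s as US-ASCII and decodes the bytes again.
    static String roundTripAscii(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.US_ASCII);
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        String s = "\u4E1E"; // the character from the question

        // US-ASCII cannot represent this character, so getBytes(Charset)
        // substitutes the charset's replacement byte, '?' (63).
        byte[] bytes = s.getBytes(StandardCharsets.US_ASCII);
        System.out.println(Arrays.toString(bytes));      // [63]

        // The original character is gone after decoding.
        System.out.println(s.equals(roundTripAscii(s))); // false
    }
}
```

Plain ASCII text such as "abc" survives this round trip unchanged, which is why the problem often goes unnoticed until a non-ASCII character shows up.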
Try using getBytes(String charsetName):
public byte[] getBytes(String charsetName)
with the correct charsetName.
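As a rough sketch of that suggestion, the round trip from the question can be repeated with the same charset name on both the encoding and the decoding side. The class and helper names (Utf8RoundTrip, survivesRoundTrip) are assumptions for this example; wrapping the checked UnsupportedEncodingException in an unchecked one is just one way to keep the helper short.

```java
import java.io.UnsupportedEncodingException;

class Utf8RoundTrip {
    // Hypothetical helper: true if s is unchanged after encoding and
    // decoding with the named charset.
    static boolean survivesRoundTrip(String s, String charsetName) {
        try {
            byte[] bytes = s.getBytes(charsetName);
            String t = new String(bytes, charsetName);
            return s.equals(t);
        } catch (UnsupportedEncodingException e) {
            // Thrown when the JVM does not know the charset name.
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        String s = "\u4E1E"; // the character from the question
        if (survivesRoundTrip(s, "UTF-8"))
            System.out.println("String unchanged");
        else
            System.out.println("String changed");
    }
}
```

With "UTF-8" this prints "String unchanged". On Java 7 and later, the getBytes(Charset) overload with StandardCharsets.UTF_8 does the same thing without the checked exception.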
The getBytes() method uses the default encoding. According to the docs:
The CharsetEncoder class should be used when more control over the encoding process is required.
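One thing that extra control could look like, as a sketch only: a CharsetEncoder configured to report unmappable characters instead of silently replacing them. The names StrictEncode and encodeStrict are invented for this example, and returning null on failure is an arbitrary choice to keep it short.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class StrictEncode {
    // Hypothetical helper: returns the encoded bytes, or null if s contains
    // something the charset cannot represent.
    static byte[] encodeStrict(String s, Charset charset) {
        CharsetEncoder encoder = charset.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            ByteBuffer buf = encoder.encode(CharBuffer.wrap(s));
            byte[] bytes = new byte[buf.remaining()];
            buf.get(bytes);
            return bytes;
        } catch (CharacterCodingException e) {
            return null; // encoding failed instead of substituting '?'
        }
    }

    public static void main(String[] args) {
        String s = "\u4E1E"; // the character from the question
        System.out.println(encodeStrict(s, StandardCharsets.US_ASCII) == null); // true
        System.out.println(encodeStrict(s, StandardCharsets.UTF_8).length);     // 3
    }
}
```

Failing loudly like this makes a charset mismatch visible at the point of encoding rather than later, when the decoded string no longer equals the original.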
Actually, I figured this out, sorry for the post. I was using the default Java charset instead of explicitly specifying UTF-8. It works now.
String t = new String(s.getBytes()); may create the string using ASCII as the default charset. Use the following constructor to create the string with charsetName set to UTF-8:
String(byte[] bytes, int offset, int length, String charsetName)
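A small sketch of that constructor, assuming the bytes are already UTF-8. The class and method names (DecodeSlice, decodeUtf8) and the sample byte array are made up for illustration. Note that offset and length count bytes, not characters, so a slice should start and end on character boundaries.

```java
import java.io.UnsupportedEncodingException;

class DecodeSlice {
    // Hypothetical helper: decodes length bytes starting at offset as UTF-8.
    static String decodeUtf8(byte[] bytes, int offset, int length) {
        try {
            return new String(bytes, offset, length, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Should not happen: every Java platform is required to support UTF-8.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // UTF-8 bytes for 'A', the character from the question (3 bytes), and 'B'.
        byte[] bytes = {0x41, (byte) 0xE4, (byte) 0xB8, (byte) 0x9E, 0x42};

        // Decode the whole array.
        System.out.println(decodeUtf8(bytes, 0, bytes.length).equals("A\u4E1EB")); // true

        // Decode only the three bytes of the middle character.
        System.out.println(decodeUtf8(bytes, 1, 3).equals("\u4E1E")); // true
    }
}
```

To decode an entire array, the shorter String(byte[] bytes, String charsetName) constructor is enough.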