
Chinese character UTF-16 encoding string in Java

I am trying to encode a string in Java using the following code:

String s = "子";
byte[]   bytesEncoded = Base64.encodeBase64(s.getBytes("UTF-16"));
String stringEncoded = new String(bytesEncoded);

When I run this code in Eclipse, I get the value /v9bUA==.

But some online UTF-16 converters give values like 4E02.

Does anybody know how to convert Chinese characters to UTF-16?

I have already gone through most of the related Stack Overflow questions and still have no answer.

Thanks in advance!

This works fine.

You just need to convert the encoded bytes into their hex representation:

String encodeAsUcs2(String messageContent) throws UnsupportedEncodingException {
  byte[] bytes = messageContent.getBytes("UTF-16BE");

  StringBuilder sb = new StringBuilder();
  for (byte b : bytes) {
    sb.append(String.format("%02X", b));
  }

  return sb.toString();
}
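As a hedged illustration of why the method above asks for "UTF-16BE" rather than "UTF-16", the sketch below hex-encodes the same character with both charsets. The class name Utf16HexDemo and the helper toHex are made up for this example. Java's "UTF-16" encoder prepends a big-endian byte-order mark, which is where the extra FE FF comes from. For the character in the question (U+5B50) the result is 5B50 rather than the 4E02 quoted there, which suggests the online converter was given a different character.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the name is an assumption for illustration only.
final class Utf16HexDemo {

    // Same hex loop as in the answer above: two uppercase hex digits per byte.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The character from the question, written as an escape
        // so the result does not depend on the source file encoding.
        String s = "\u5B50";

        // "UTF-16" prepends a big-endian byte-order mark (FE FF) when encoding.
        System.out.println(toHex(s.getBytes(StandardCharsets.UTF_16)));   // FEFF5B50

        // "UTF-16BE" writes only the code units, with no byte-order mark.
        System.out.println(toHex(s.getBytes(StandardCharsets.UTF_16BE))); // 5B50
    }
}
```

Using the StandardCharsets constants instead of charset names also avoids the checked UnsupportedEncodingException.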

The code

String s = "子";
byte[] utf16encodedBytes = s.getBytes("UTF-16");

will give you the string encoded as UTF-16 bytes.

I think what is confusing you here is that you then encode those bytes to Base64, which represents them as the ASCII text /v9bUA==. The number 4E02 is a hex encoding instead. To see the hex encoding for your example, you could try:

String hexEncodedString =  DatatypeConverter.printHexBinary(utf16encodedBytes);
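To check that the Base64 text and a hex string are just two views of the same bytes, here is a minimal sketch that decodes the value from the question and prints it as hex. It assumes Java 8 or later and uses java.util.Base64 in place of the Commons Codec call from the question. It also uses a hand-written hex loop, because DatatypeConverter lives in javax.xml.bind, which is no longer bundled with the JDK from Java 11 onward. The class name Base64VsHexDemo is made up for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical demo class; the name is an assumption for illustration only.
final class Base64VsHexDemo {

    // Render each byte as two uppercase hex digits.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Decode the Base64 text from the question back into raw bytes.
        byte[] bytes = Base64.getDecoder().decode("/v9bUA==");

        // FE FF is the byte-order mark, 5B 50 is the character itself.
        System.out.println(toHex(bytes)); // FEFF5B50

        // Decoding as UTF-16 consumes the byte-order mark
        // and yields the original one-character string.
        String decoded = new String(bytes, StandardCharsets.UTF_16);
        System.out.println(decoded.equals("\u5B50")); // true
    }
}
```

So the original code was already producing valid UTF-16. It was only being displayed as Base64 instead of hex, with a byte-order mark in front.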

