
Chinese character UTF-16 Encoding string in Java

I am trying to encode a string in Java using the following method:

String s = "子";
byte[]   bytesEncoded = Base64.encodeBase64(s.getBytes("UTF-16"));
String stringEncoded = new String(bytesEncoded);

When I run this code in Eclipse, I get the value /v9bUA==

But some online UTF-16 converters give values like 4E02.

Does anybody know how to convert Chinese characters to UTF-16?

I have already gone through most of the related Stack Overflow questions and still found no answer.

Thanks in Advance!

This works fine.

You just need to convert the bytes into a hex representation:

String encodeAsUcs2(String messageContent) throws UnsupportedEncodingException {
  byte[] bytes = messageContent.getBytes("UTF-16BE");

  StringBuilder sb = new StringBuilder();
  for (byte b : bytes) {
    sb.append(String.format("%02X", b));
  }

  return sb.toString();
}
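
As a usage sketch of the method above (the class name Ucs2HexExample is hypothetical, and StandardCharsets.UTF_16BE is swapped in for the "UTF-16BE" string only to avoid the checked exception), this is what it prints for the character from the question, written as the escape \u5B50 so the result does not depend on the source file encoding:

```java
import java.nio.charset.StandardCharsets;

public class Ucs2HexExample {

    // Same idea as encodeAsUcs2 above: UTF-16BE bytes (no BOM), two hex digits per byte.
    static String encodeAsUcs2(String messageContent) {
        byte[] bytes = messageContent.getBytes(StandardCharsets.UTF_16BE);
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02X", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(encodeAsUcs2("\u5B50"));  // prints 5B50
        System.out.println(encodeAsUcs2("\u5B50A")); // prints 5B500041
    }
}
```

Since 子 is code point U+5B50, the hex form is 5B50; the 4E02 quoted in the question is a different code point, so it presumably came from a different input character.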

The code

String s = "子";
byte[] utf16encodedBytes = s.getBytes("UTF-16");

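will give you the string encoded as UTF-16 bytes. Note that Java's "UTF-16" charset writes a big-endian byte-order mark (FE FF) before the character data; use "UTF-16BE" if you do not want it.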

I think what is confusing you here is that you then encode those bytes to Base64, which represents them in ASCII as /v9bUA== . The value 4E02 is a hex encoding. To see the hex encoding for your example, you could try:

String hexEncodedString =  DatatypeConverter.printHexBinary(utf16encodedBytes);
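
To make the point concrete, here is a minimal sketch using only the JDK (java.util.Base64 instead of the Commons Codec class from the question, and a small hand-written toHex helper, because javax.xml.bind.DatatypeConverter is no longer bundled with the JDK from Java 11 onward). The class and helper names are hypothetical. It shows that the Base64 text and the hex text are two views of the same four bytes:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class Base64VersusHex {

    // Formats each byte as two uppercase hex digits.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02X", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "UTF-16" in Java means big-endian with a leading byte-order mark (FE FF).
        byte[] utf16 = "\u5B50".getBytes(StandardCharsets.UTF_16);

        String base64 = Base64.getEncoder().encodeToString(utf16);
        System.out.println(base64);       // prints /v9bUA==
        System.out.println(toHex(utf16)); // prints FEFF5B50

        // Decoding the Base64 text gives back exactly the same bytes.
        byte[] decoded = Base64.getDecoder().decode(base64);
        System.out.println(toHex(decoded)); // prints FEFF5B50
    }
}
```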

