Converting byte[] to String and back to byte[] with UTF-8 encoding does not give the same byte array

Question

To understand more about bytes, char, and String in Java, I took a sample byte[], converted it to a String, and then converted that String back to a byte[]. However, I found that the original byte[] and the new byte[] are not the same. Why? Any help is appreciated.

import java.io.UnsupportedEncodingException;
import java.util.Arrays;

public class HelloWorld {

    public static void main(String[] args) throws UnsupportedEncodingException {

        byte[] originalStringBytes = {39, -94, 17, -18, 43, 32, 50, -70, 31, -125, -46, 10, -23, 32, -112, 63};
        // Convert the bytes into a String, decoding them as UTF-8
        String convertedString = new String(originalStringBytes, "UTF-8");
        // Now get the bytes back from the String, encoding as UTF-8
        byte[] afterStringConversionBytes = convertedString.getBytes("UTF-8");
        // Compare the contents of the two arrays, not only their lengths
        if (Arrays.equals(originalStringBytes, afterStringConversionBytes)) {
            System.out.println("SAME");
        } else {
            System.out.println("DIFFERENT");
        }
    }
}

It printed "DIFFERENT" for me.

Answer 1

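A sequence of bytes has to follow strict rules to be valid UTF-8 encoded text. The bytes in your array do not follow these rules, so they cannot be converted into a String without losing information. When new String(bytes, "UTF-8") meets a malformed sequence, it substitutes the replacement character U+FFFD, and encoding that character back to UTF-8 yields different bytes from the ones you started with.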
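To make the information loss concrete, here is a minimal sketch that runs the array from the question through the same decode and encode steps and counts the replacement characters. The class name Utf8RoundTrip and the helper methods roundTrip and countReplacements are made up for illustration; only the standard String and StandardCharsets calls come from the JDK.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class Utf8RoundTrip {

    // Decode the bytes as UTF-8, then encode the resulting String again.
    // new String(byte[], Charset) replaces malformed input with U+FFFD
    // instead of throwing, so this step can silently change the data.
    static byte[] roundTrip(byte[] input) {
        String decoded = new String(input, StandardCharsets.UTF_8);
        return decoded.getBytes(StandardCharsets.UTF_8);
    }

    // Count how many U+FFFD replacement characters the decoder produced.
    static long countReplacements(byte[] input) {
        String decoded = new String(input, StandardCharsets.UTF_8);
        return decoded.chars().filter(c -> c == '\uFFFD').count();
    }

    public static void main(String[] args) {
        byte[] original = {39, -94, 17, -18, 43, 32, 50, -70, 31, -125, -46, 10, -23, 32, -112, 63};
        byte[] after = roundTrip(original);

        System.out.println("replacement characters: " + countReplacements(original));
        System.out.println("length before: " + original.length + ", after: " + after.length);
        System.out.println("equal: " + Arrays.equals(original, after));
    }
}
```

Each U+FFFD takes three bytes in UTF-8 (EF BF BD), which is why the array grows after the round trip. Bytes that form valid UTF-8, such as plain ASCII, come back unchanged.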

The rules are explained, for example, at https://en.wikipedia.org/wiki/UTF-8
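If you want Java to apply those rules for you rather than checking them by hand, one option is a strict decoder that reports malformed input instead of replacing it. The sketch below is only an illustration: the class name Utf8Validator and the method isValidUtf8 are hypothetical, while CharsetDecoder, CodingErrorAction.REPORT, and CharacterCodingException are standard java.nio.charset types.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class Utf8Validator {

    // Returns true only if the whole array is well-formed UTF-8.
    // With REPORT, the decoder throws on the first malformed sequence
    // instead of substituting a replacement character.
    static boolean isValidUtf8(byte[] input) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(input));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] fromQuestion = {39, -94, 17, -18, 43, 32, 50, -70, 31, -125, -46, 10, -23, 32, -112, 63};
        byte[] realText = "h\u00e9llo".getBytes(StandardCharsets.UTF_8);

        System.out.println("array from question valid: " + isValidUtf8(fromQuestion));
        System.out.println("encoded text valid: " + isValidUtf8(realText));
    }
}
```

The array from the question fails at its second byte: -94 is 0xA2, a continuation byte with no lead byte in front of it.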

Solution 1
2, accepted, 2016-01-26 21:52:45