
String(byte[], Charset) returns different results in Java 7 and Java 8

import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class Java87String {

    public static void main(String[] args) throws UnsupportedEncodingException {
        // Other inputs that were tried:
        //byte[] b = {-101, 53, -51, -26, 24, 60, 20, -31, -6, 45, 50, 103, -66, 28, 114, -39, 92, 23, -47, 32, -5, -122, -28, 79, 22, -76, 116, -122, -54, -122};
        //byte[] b = {-76, -55, 85, -50, 80, -23, 27, 62, -94, -74, 47, -123, -119, 94, 90, 61, -63, 73, 56, -48, -54, -4, 11, 79};

        byte[] b = { -5, -122, -28 };

        System.out.println("Input Array :" + Arrays.toString(b));
        System.out.println("Array Length : " + b.length);
        // Decode the raw bytes as UTF-8, then re-encode the result to see what the decoder produced.
        String target = new String(b, StandardCharsets.UTF_8);
        System.out.println(Arrays.toString(target.getBytes("UTF-8")));
        System.out.println("Final Key :" + target);
    }
}

The code above prints the following output in Java 7:

Input Array :[-5, -122, -28]
Array Length : 3
[-17, -65, -67]
Final Key :�

The same code prints the following output in Java 8:

Input Array :[-5, -122, -28]
Array Length : 3
[-17, -65, -67, -17, -65, -67, -17, -65, -67]
Final Key :���

It looks like Java 8 is doing the right thing by substituting the default replacement sequence [-17, -65, -67].
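For reference, [-17, -65, -67] is the UTF-8 encoding (EF BF BD) of U+FFFD, the Unicode replacement character, which is what the "�" in the output is. A minimal sketch to confirm this; the class and method names are illustrative, not from the original post:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class ReplacementCharDemo {

    // Returns the UTF-8 encoding of U+FFFD (REPLACEMENT CHARACTER).
    static byte[] replacementBytes() {
        return "\uFFFD".getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Java bytes are signed, so 0xEF 0xBF 0xBD prints as negative numbers.
        System.out.println(Arrays.toString(replacementBytes())); // [-17, -65, -67]
    }
}
```

So the Java 7 result contains one replacement character and the Java 8 result contains three, one per input byte.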

Why does the output differ? Is there a known bug in JDK 1.7 that was fixed in Java 8?

I think (-5, -122, -28) is an invalid UTF-8 byte sequence, so the JVM may output anything in this case. If it were a valid one, the different Java versions would presumably show the same output.
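To see why the sequence is invalid, it helps to look at each byte as an unsigned value. The sketch below classifies a byte by the ranges UTF-8 allows (per RFC 3629); the class and method names are hypothetical, and it only inspects single bytes rather than validating whole sequences:

```java
class Utf8ByteClassifier {

    // Classifies one byte by its role in UTF-8, based on its unsigned value.
    static String classify(byte b) {
        int u = b & 0xFF; // convert the signed Java byte to 0..255
        if (u < 0x80) return "ASCII";
        if (u < 0xC0) return "continuation byte";
        if (u < 0xC2) return "invalid (overlong 2-byte lead)";
        if (u < 0xE0) return "lead byte of a 2-byte sequence";
        if (u < 0xF0) return "lead byte of a 3-byte sequence";
        if (u < 0xF5) return "lead byte of a 4-byte sequence";
        return "invalid (never appears in UTF-8)";
    }

    public static void main(String[] args) {
        byte[] b = { -5, -122, -28 };
        for (byte x : b) {
            System.out.println(String.format("%4d = 0x%02X: %s", x, x & 0xFF, classify(x)));
        }
    }
}
```

Here -5 is 0xFB, which is never legal in UTF-8; -122 is 0x86, a continuation byte with no valid lead byte before it; and -28 is 0xE4, the lead byte of a 3-byte sequence whose continuation bytes are missing. None of the three bytes can be decoded, and how many replacement characters a decoder emits for such input is an implementation choice.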

Does this specific byte sequence have a meaning? Just curious.

Per the String Javadoc:

The behavior of this constructor when the given bytes are not valid in the given charset is unspecified. The CharsetDecoder class should be used when more control over the decoding process is required.
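As a rough sketch of what using CharsetDecoder could look like, the example below configures a UTF-8 decoder to report malformed input instead of silently replacing it, so the result no longer depends on how a given JDK substitutes bad bytes. The class name and the decodeStrict helper are illustrative assumptions, not part of the original code:

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class StrictUtf8Decode {

    // Decodes bytes as UTF-8 and throws CharacterCodingException
    // if the input is not well-formed.
    static String decodeStrict(byte[] bytes) throws CharacterCodingException {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        return decoder.decode(ByteBuffer.wrap(bytes)).toString();
    }

    public static void main(String[] args) {
        byte[] b = { -5, -122, -28 };
        try {
            System.out.println("Decoded: " + decodeStrict(b));
        } catch (CharacterCodingException e) {
            // Reached for the byte array from the question, which is not valid UTF-8.
            System.out.println("Input is not valid UTF-8: " + e);
        }
    }
}
```

With CodingErrorAction.REPORT the invalid input is rejected on any Java version; CodingErrorAction.REPLACE or IGNORE can be chosen instead if substitution or dropping bad bytes is acceptable.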
