
Java 8 change in UTF-8 decoding

We recently migrated our application from JDK 7 to JDK 8. After the change, we ran into a problem with the following snippet of code.

String output = new String(byteArray, "UTF-8");

The byte array may contain invalid UTF-8 byte sequences. Decoding the same byte array as UTF-8 produces two different strings on Java 7 and Java 8.
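To see exactly where the two runtimes disagree, it can help to print the decoded string as a list of UTF-16 code units and diff the output of both JDKs. The following is a minimal sketch; the class name `Utf8DecodeDump`, its helper methods, and the sample bytes are illustrative assumptions, not part of the original application.

```java
import java.nio.charset.StandardCharsets;

final class Utf8DecodeDump {
    // Render each UTF-16 code unit as U+XXXX so two runs can be diffed as text.
    static String dump(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    // Same decoding path as new String(byteArray, "UTF-8"), without the checked exception.
    static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Illustrative input only: 0xFF never occurs in well-formed UTF-8.
        // Substitute the real byte array that decodes differently.
        byte[] sample = { 0x41, (byte) 0xFF, 0x42 };
        System.out.println(System.getProperty("java.version") + ": " + dump(decode(sample)));
    }
}
```

Running the same class under both JDKs with the problematic bytes should show which positions differ.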

According to the answer to this Stack Overflow post, Java 8 "fixes" an error in Java 7 and replaces invalid UTF-8 byte sequences with a replacement string, in accordance with the UTF-8 specification.

But we would like to stick with Java 7's version of the decoded string.

We have tried using CharsetDecoder on Java 8 with CodingErrorAction set to REPLACE, REPORT and IGNORE. Still, we were not able to generate the same string as Java 7.
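For reference, the three error actions can be exercised as in the sketch below. The class name `Utf8ErrorActions`, the `decode` helper, and the sample bytes are assumptions made for illustration; only the `java.nio.charset` calls are standard API.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class Utf8ErrorActions {
    // Decode with the given action; returns null when the decoder reports an error.
    static String decode(byte[] bytes, CodingErrorAction action) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(action)
                .onUnmappableCharacter(action);
        try {
            return decoder.decode(ByteBuffer.wrap(bytes)).toString();
        } catch (CharacterCodingException e) {
            // Only REPORT ends up here: malformed input is surfaced as an exception.
            return null;
        }
    }

    public static void main(String[] args) {
        // Illustrative input: 'A', an invalid byte, 'B'.
        byte[] sample = { 0x41, (byte) 0xFF, 0x42 };
        System.out.println("REPLACE length: " + decode(sample, CodingErrorAction.REPLACE).length());
        System.out.println("IGNORE  length: " + decode(sample, CodingErrorAction.IGNORE).length());
        System.out.println("REPORT  failed: " + (decode(sample, CodingErrorAction.REPORT) == null));
    }
}
```

REPLACE substitutes the decoder's replacement text, IGNORE drops the malformed bytes, and REPORT aborts with an exception. None of these options changes how the decoder decides where a malformed sequence begins and ends, which would explain why no setting reproduces the Java 7 output.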

Can we do this with a technique of reasonable complexity?

From the pointers provided by @Holger, it was clear that we had to write a custom CharsetDecoder.

I copied over OpenJDK's version of the sun.nio.cs.UTF_8 class, renamed it to CustomUTF_8, and used it to construct a string like so:

String output = new String(bytes, new CustomUTF_8());
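The plumbing for passing a user-defined charset to the String constructor can be sketched as follows. This is not the copied OpenJDK class: `DelegatingUtf8` and the charset name `X-Delegating-UTF-8` are hypothetical, and the sketch simply delegates to the running JDK's built-in UTF-8 coders, so it does not reproduce Java 7 behaviour. A real CustomUTF_8 would return its own decoder from `newDecoder()`.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in showing the shape of a custom Charset subclass.
final class DelegatingUtf8 extends Charset {
    DelegatingUtf8() {
        // Canonical name is made up for this sketch; null means "no aliases".
        super("X-Delegating-UTF-8", null);
    }

    @Override
    public boolean contains(Charset cs) {
        return StandardCharsets.UTF_8.contains(cs);
    }

    @Override
    public CharsetDecoder newDecoder() {
        // A Java 7-compatible decoder implementation would be returned here instead.
        return StandardCharsets.UTF_8.newDecoder();
    }

    @Override
    public CharsetEncoder newEncoder() {
        return StandardCharsets.UTF_8.newEncoder();
    }

    public static void main(String[] args) {
        byte[] bytes = { 0x41, (byte) 0xFF, 0x42 };
        String output = new String(bytes, new DelegatingUtf8());
        System.out.println(output.length());
    }
}
```

Because the String constructor asks the charset for a fresh decoder, swapping the decoder implementation is enough to change the decoding result without touching call sites beyond the charset argument.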

I plan to run extensive tests cross-verifying the outputs generated on Java 7 and Java 8. This is an interim solution while I fix the actual problem: the output of the HMAC is passed directly to String without being Base64-encoded first.

String output = new String(Base64.getEncoder().encode(bytes), Charset.forName("UTF-8"));
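A fuller sketch of the Base64 approach is shown below. Base64 text is plain ASCII, so it survives any charset round trip, and the original MAC bytes can be recovered exactly. The algorithm `HmacSHA256`, the class name `HmacBase64`, and the sample key and message are assumptions; the original post does not say which HMAC variant is used.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Arrays;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

final class HmacBase64 {
    // Assumed algorithm: HmacSHA256. Swap in whichever variant the application uses.
    static byte[] hmacSha256(byte[] key, byte[] message) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            return mac.doFinal(message);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Raw MAC bytes are arbitrary binary, so encode them instead of decoding as UTF-8.
    static String toText(byte[] raw) {
        return Base64.getEncoder().encodeToString(raw);
    }

    static byte[] fromText(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        byte[] key = "demo-key".getBytes(StandardCharsets.UTF_8);    // placeholder key
        byte[] msg = "demo-message".getBytes(StandardCharsets.UTF_8); // placeholder message
        byte[] raw = hmacSha256(key, msg);
        String text = toText(raw);
        System.out.println("round trip ok: " + Arrays.equals(raw, fromText(text)));
    }
}
```

`encodeToString` is equivalent to wrapping `encode(bytes)` in a String, and it avoids naming a charset at all.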

