Is it 100% safe (exception / error free) to convert a byte[] that includes random binary data to a String via the constructor:
new String(bytes);
// -- or --
new String(bytes,"UTF-8"); // Or other charset
My concern is whether invalid UTF-8 bytes will cause an exception or other failure instead of just a possibly partially garbled message.
I have tried some known bad byte values, and they appear to work as expected. E.g.:
byte[] bytes = new byte[] {'a','b','c',(byte)0xfe,(byte)0xfe,(byte)0xff,(byte)0xff,'d','e','f'};
String test = new String(bytes,"UTF-8");
System.out.println(test);
Prints "abc????def".
My concern is if certain other combinations can fail in other unexpected ways since I cannot guarantee that I can test every invalid combination.
This is covered in the docs:
"This method always replaces malformed-input and unmappable-character sequences with this charset's default replacement string."
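As a rough illustration of that guarantee, the sketch below decodes the byte array from the question and then a batch of pseudo-random arrays. The class and method names (`ReplacementDemo`, `decode`, `countReplacements`) are made up for this example. Each invalid byte should come back as the replacement character U+FFFD, which many consoles render as "?".

```java
import java.nio.charset.StandardCharsets;
import java.util.Random;

public class ReplacementDemo {
    // Decodes with the String constructor, which substitutes U+FFFD for malformed UTF-8.
    static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Counts U+FFFD replacement characters in a decoded string.
    static long countReplacements(String s) {
        return s.chars().filter(c -> c == 0xFFFD).count();
    }

    public static void main(String[] args) {
        byte[] bytes = {'a', 'b', 'c', (byte) 0xfe, (byte) 0xfe, (byte) 0xff, (byte) 0xff, 'd', 'e', 'f'};
        System.out.println(countReplacements(decode(bytes))); // prints 4

        // Fuzz with arbitrary bytes; the constructor is expected to return normally every time.
        Random random = new Random(42);
        for (int i = 0; i < 1000; i++) {
            byte[] noise = new byte[64];
            random.nextBytes(noise);
            decode(noise);
        }
    }
}
```

A fuzz loop like this cannot prove the absence of failures, but it matches what the documentation promises.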
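One thing that can still fail: the String(byte[], String) constructor throws the checked UnsupportedEncodingException when the JVM does not support the named charset. Every Java platform is required to support UTF-8, so this only matters for other charset names. Passing a Charset object such as StandardCharsets.UTF_8 instead of a name avoids the checked exception entirely.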
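To make the difference between the two constructor overloads concrete, here is a small sketch. The class name, the helper names, and the charset name "no-such-charset" are all invented for illustration; the only assumption is that no charset is registered under that name on your JVM.

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

public class CharsetOverloadDemo {
    // Charset-object overload: no checked exception to declare or catch.
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Name-based overload: returns true if the JVM rejects the charset name.
    static boolean nameIsRejected(byte[] bytes, String charsetName) {
        try {
            new String(bytes, charsetName);
            return false;
        } catch (UnsupportedEncodingException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        byte[] bytes = {'o', 'k'};
        System.out.println(decodeUtf8(bytes));                         // prints ok
        System.out.println(nameIsRejected(bytes, "no-such-charset"));  // prints true
        System.out.println(nameIsRejected(bytes, "UTF-8"));            // prints false
    }
}
```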
If you want to tune the decoding behavior on bad input, use a CharsetDecoder, something like
StandardCharsets.UTF_8
.newDecoder()
.onMalformedInput(CodingErrorAction.REPORT) // throw on malformed bytes
.onUnmappableCharacter(CodingErrorAction.REPLACE) // substitute unmappable characters
.replaceWith(replacementString)
.decode(ByteBuffer.wrap(byteArray)) // throws the checked CharacterCodingException
.toString();
which lets you twiddle all the various knobs involved.
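For a self-contained version of the strict variant, the following sketch reports both kinds of error and turns the resulting exception into a null return. `StrictDecodeDemo` and `decodeStrictOrNull` are hypothetical names, and returning null is just one possible way to signal invalid input.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class StrictDecodeDemo {
    // Returns the decoded text, or null when the bytes are not valid UTF-8.
    static String decodeStrictOrNull(byte[] bytes) {
        try {
            return StandardCharsets.UTF_8
                    .newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes))
                    .toString();
        } catch (CharacterCodingException e) {
            // With REPORT, malformed input surfaces here instead of being replaced.
            return null;
        }
    }

    public static void main(String[] args) {
        byte[] valid = "abc".getBytes(StandardCharsets.UTF_8);
        byte[] invalid = {'a', (byte) 0xff, 'b'};
        System.out.println(decodeStrictOrNull(valid));   // prints abc
        System.out.println(decodeStrictOrNull(invalid)); // prints null
    }
}
```

Unlike the String constructors, this path can throw on bad bytes, so it fits cases where you would rather reject invalid input than display garbled text.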