
Is specifying String encoding when parsing byte[] really necessary?

Supposedly, it is "best practice" to specify the encoding when creating a String from a byte[]:

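byte[] bytes; // obtained elsewhere
String a = new String(bytes, "UTF-8"); // explicit encoding: 100% safe (throws the checked UnsupportedEncodingException)
String b = new String(bytes); // platform default encoding: safe enough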

If I know my installation has default encoding of utf8, is it really necessary to specify the encoding to still be "best practice"?

Different use cases have to be distinguished here: If you get the bytes from an external source via some protocol with a specified encoding then always use the first form (with explicit encoding).

If the source of the bytes is the local machine, for example a local text file, the second form (without explicit encoding) is better.

Always keep in mind that your program may be used on a different machine with a different platform encoding. It should work there without any changes.
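To make the risk concrete, here is a minimal sketch of what happens when bytes written as UTF-8 are decoded with a different charset, as could happen on a machine whose platform default is not UTF-8. The class name DecodeMismatchDemo and the helper decode are made up for this illustration, and ISO-8859-1 merely stands in for "some other default encoding".

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DecodeMismatchDemo {
    // Hypothetical helper: decodes the bytes with the charset the caller names.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // U+00E9 (e with acute accent) is the two bytes 0xC3 0xA9 in UTF-8.
        byte[] utf8Bytes = "\u00e9".getBytes(StandardCharsets.UTF_8);

        // Decoding with the charset the bytes were written in gives back one character.
        String correct = decode(utf8Bytes, StandardCharsets.UTF_8);

        // Decoding with ISO-8859-1 (standing in for a different platform default)
        // maps each byte to its own character, so the text is silently garbled.
        String garbled = decode(utf8Bytes, StandardCharsets.ISO_8859_1);

        System.out.println("correct length: " + correct.length());
        System.out.println("garbled length: " + garbled.length());

        // new String(bytes) with no charset argument uses this value.
        System.out.println("platform default: " + Charset.defaultCharset());
    }
}
```

No exception is thrown in the mismatched case, which is why this kind of bug tends to show up only after the code has moved to another machine.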

If I know my installation has default encoding of utf8, is it really necessary to specify the encoding to still be "best practice"?

But do you know for sure that your installation will always have a default encoding of UTF-8? (Or at least, for as long as your code is used ...)

And do you know for sure that your code is never going to be used in a different installation that has a different default encoding?

If the answer to either of those is "No" (and unless you are prescient, it probably has to be "No") then I think that you should follow best practice ... and specify the encoding that your application's semantics require:

  • If the requirement is to always encode (or decode) in UTF-8, then use "UTF-8".

  • If the requirement is to always encode (or decode) using the platform default, then do that.

  • If the requirement is to support multiple encodings (or the requirement might change), then make the encoding name a configuration (or command line) parameter, resolve it to a Charset object, and use that.
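The three cases above might look roughly like this in code. This is only a sketch: the class name CharsetChoice and its method names are invented for illustration, and taking the encoding name from the first command line argument is just one assumed way of making it configurable.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetChoice {
    // Case 1: the requirement is always UTF-8.
    // StandardCharsets.UTF_8 avoids the checked exception of the String-name overload.
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Case 2: the requirement is the platform default.
    // Naming the default explicitly documents that the choice is deliberate.
    static String decodePlatformDefault(byte[] bytes) {
        return new String(bytes, Charset.defaultCharset());
    }

    // Case 3: the encoding name is supplied by configuration.
    // Charset.forName throws an unchecked exception if the name is
    // illegal or the charset is not supported on this JVM.
    static String decodeConfigured(byte[] bytes, String encodingName) {
        Charset charset = Charset.forName(encodingName);
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // Assumption for this sketch: the first argument carries the encoding name.
        String encodingName = args.length > 0 ? args[0] : "UTF-8";
        byte[] data = "na\u00efve".getBytes(StandardCharsets.UTF_8);
        System.out.println(decodeConfigured(data, encodingName));
    }
}
```

Resolving the name to a Charset once, at startup, means a misspelled or unsupported encoding fails early instead of at the first decode deep inside the application.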

The point of this "best practice" recommendation is to avoid a foreseeable problem that will arise if your platform's characteristics change. You don't think that is likely, but you probably can't be completely sure about it. But at the end of the day, it is your decision.

(The fact that you are actually thinking about whether "best practice" is appropriate to your situation is a GOOD THING ... in my opinion.)


 