Java's UTF-8 encoding

I have this code:

import com.google.common.base.Charsets;
import com.google.common.io.Files;

BufferedWriter w = Files.newWriter(file, Charsets.UTF_8);
w.newLine();
StringBuilder sb = new StringBuilder();
sb.append("\"").append("éééé").append("\";");
w.write(sb.toString());

But it doesn't work: the resulting file isn't UTF-8 encoded. I tried this when writing instead:

w.write(new String(sb.toString().getBytes(Charsets.US_ASCII), "UTF8"));

It made question marks appear everywhere in the file...

I found that there was a bug regarding the recognition of the initial BOM character ( http://bugs.java.com/view_bug.do?bug_id=4508058 ), so I tried using the BOMInputStream class. But bomIn.hasBOM() always returns false, so I guess my problem is not BOM-related?

Do you know how I can make my file encoded in UTF-8? Was the problem solved in Java 8?

You're writing UTF-8 correctly in your first example (although you're redundantly creating a String from a String).

The problem is that the viewer or tool you're using to view the file doesn't read the file as UTF-8.
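One way to check this without relying on any viewer is to look at the raw bytes on disk. The following is a minimal sketch, not the asker's code: it uses the JDK's own java.nio.file.Files and StandardCharsets rather than Guava, and the class name Utf8WriteCheck, its helper methods, and the temporary file are assumptions made for illustration. If the file is UTF-8, each é (U+00E9) should show up as the two bytes C3 A9.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class, for illustration only.
public class Utf8WriteCheck {
    // Writes the text as UTF-8 and returns the raw bytes that ended up on disk.
    public static byte[] writeAndReadBack(Path file, String text) throws IOException {
        try (BufferedWriter w = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            w.write(text);
        }
        return Files.readAllBytes(file);
    }

    // Formats bytes as space-separated uppercase hex pairs.
    public static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("utf8-check", ".txt");
        // The escapes spell U+00E9 (e with acute accent), so the example
        // does not depend on the encoding of this source file.
        byte[] bytes = writeAndReadBack(file, "\"\u00e9\u00e9\u00e9\u00e9\";");
        System.out.println(toHex(bytes)); // prints 22 C3 A9 C3 A9 C3 A9 C3 A9 22 3B
        Files.delete(file);
    }
}
```

If the hex dump shows C3 A9 pairs but the text still looks wrong in an editor, the editor is most likely decoding the file with a different charset.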

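Don't mix in ASCII: encoding the string as US-ASCII replaces every non-ASCII character with a question mark, and decoding those bytes as UTF-8 afterwards cannot bring the original characters back.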
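The effect of the second attempt in the question can be reproduced in isolation. This is a small sketch using only the JDK (StandardCharsets instead of Guava's Charsets); the class name AsciiRoundTrip and the method viaAscii are hypothetical names chosen for the example.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, for illustration only.
public class AsciiRoundTrip {
    // Mimics the second attempt in the question:
    // encode as US-ASCII, then decode the resulting bytes as UTF-8.
    public static String viaAscii(String text) {
        // Characters that US-ASCII cannot represent are replaced with '?' here.
        byte[] ascii = text.getBytes(StandardCharsets.US_ASCII);
        return new String(ascii, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The escapes spell U+00E9 (e with acute accent).
        System.out.println(viaAscii("\"\u00e9\u00e9\u00e9\u00e9\";")); // prints "????";
    }
}
```

The information is lost at the getBytes step, so the charset used to decode afterwards makes no difference.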
