
Using Java to write a UTF-8 file

I am trying to write a Java utility that writes out a UTF-8 file containing just the characters I explicitly write to it. I wrote the following code to do the trick.

import java.io.BufferedWriter;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;


public class FileGenerator {

    public static void main(String[] args) {
        char content = 0xb5; // µ (U+00B5, MICRO SIGN)

        String filename = "SPTestOutputFile.txt";

        // try-with-resources closes the writer even if write() throws
        try (BufferedWriter bw = new BufferedWriter(new OutputStreamWriter(
                new FileOutputStream(filename), StandardCharsets.UTF_8))) {

            bw.write(content);

            System.out.println("Done");

        } catch (IOException e) {
            e.printStackTrace();
        }
    }

}

I also pass -Dfile.encoding=UTF-8 as a VM argument.

The character that I am trying to write does get written to the file, but I also get a Â before it, so when I try to write out µ I actually get Âµ. Does anyone know how to correct this so that I always get just µ?

Thanks

The implementation works just fine: the UTF-8 representation of µ is c2 b5, and that is exactly what is written to the file.
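This is easy to confirm from code. A minimal sketch (the class name `Utf8Bytes` is my own, not from the question) that encodes µ and prints its UTF-8 bytes in hex:

```java
import java.nio.charset.StandardCharsets;

public class Utf8Bytes {
    public static void main(String[] args) {
        // U+00B5 MICRO SIGN, written as an escape so the source-file encoding doesn't matter
        byte[] bytes = "\u00B5".getBytes(StandardCharsets.UTF_8);
        StringBuilder hex = new StringBuilder();
        for (byte b : bytes) {
            if (hex.length() > 0) hex.append(' ');
            hex.append(String.format("%02x", b & 0xFF));
        }
        System.out.println(hex); // prints: c2 b5
    }
}
```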

Check the UTF-8 table here.

[Screenshot: the file in a hex editor]

Your txt file contains two "symbols":

  1. BOM (byte order mark)
  2. µ
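If the consuming application cannot handle a BOM, one option is to skip it when reading the bytes back. A hedged sketch (the class name `SkipBom` and the hard-coded sample bytes are my own illustration, not part of the original question):

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;

public class SkipBom {
    // Consume a UTF-8 BOM (ef bb bf) at the start of the stream, if present.
    static InputStream skipBom(InputStream in) throws IOException {
        PushbackInputStream pb = new PushbackInputStream(in, 3);
        byte[] head = new byte[3];
        int n = pb.read(head, 0, 3);
        if (n == 3 && (head[0] & 0xFF) == 0xEF
                   && (head[1] & 0xFF) == 0xBB
                   && (head[2] & 0xFF) == 0xBF) {
            return pb; // BOM consumed, stream now starts at the real content
        }
        if (n > 0) pb.unread(head, 0, n); // no BOM: push the bytes back
        return pb;
    }

    public static void main(String[] args) throws IOException {
        // BOM followed by the UTF-8 bytes for µ
        byte[] withBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, (byte) 0xC2, (byte) 0xB5};
        InputStream in = skipBom(new ByteArrayInputStream(withBom));
        System.out.printf("%02x%n", in.read()); // prints: c2
    }
}
```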

If your application (some reader) recognizes the encoding correctly, you see only µ. Otherwise the application interprets the BOM as ordinary characters and you can see ï»¿µ or something else.
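Decoding UTF-8 bytes with the wrong single-byte charset is exactly what produces these stray characters. A small sketch (the class name `MojibakeDemo` is mine) showing that the bytes c2 b5 decoded as Latin-1 come out as "Âµ" instead of "µ":

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    public static void main(String[] args) {
        // UTF-8 encodes µ (U+00B5) as the two bytes c2 b5
        byte[] utf8 = "\u00B5".getBytes(StandardCharsets.UTF_8);
        // Decoding those bytes as Latin-1 maps each byte to its own character:
        // 0xC2 -> Â (U+00C2), 0xB5 -> µ (U+00B5)
        String wrong = new String(utf8, StandardCharsets.ISO_8859_1);
        System.out.println(wrong); // prints: Âµ
    }
}
```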

So your text file is OK.

