Convert Windows-1252 file into UTF-8 file

Hello, I am having some issues with this simple conversion task. Here is my code below (rough, but not very complex):

    FileInputStream fis = new FileInputStream("file");
    BufferedReader reader = new BufferedReader(new InputStreamReader(fis,"CP1250"));

    try {

        StringBuilder sb = new StringBuilder();
        String line = null;
        try {
            line = reader.readLine();
        } catch (IOException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }

        while (line != null) {
            sb.append(line);
            if (line.contains(" "))
                sb.append(System.lineSeparator());
            try {
                line = reader.readLine();
            } catch (IOException e) {
                // TODO Auto-generated catch block
                e.printStackTrace();
            }
        }
        String everything = sb.toString();
        System.out.println(everything);

        PrintWriter writer = null;
        try {
            writer = new PrintWriter("clean", "UTF-8");
        } catch (FileNotFoundException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        } catch (UnsupportedEncodingException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
        writer.println(everything);
        writer.close();
    } 

    finally {
        try {
            reader.close();
        } catch (IOException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
    }

But the output I get is the same as the input, with the same encoding. Can anyone see what is wrong?

The docs for PrintWriter say:

1) public void println(String x): Prints a String and then terminates the line. This method behaves as though it invokes print(String) and then println().

2) public void print(String s): Prints a string. If the argument is null then the string "null" is printed. Otherwise, the string's characters are converted into bytes according to the platform's default character encoding, and these bytes are written in exactly the manner of the write(int) method.

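Note that this wording is misleading for a writer built with PrintWriter(String fileName, String csn), as in the question: that constructor already encodes with the named charset. If you want the output encoding to be explicit, you will probably get your conversion done with the following (be aware that the second argument true to FileOutputStream opens the file in append mode):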

    PrintWriter writer
        = new PrintWriter(new OutputStreamWriter(new FileOutputStream("clean", true),
            "UTF-8"));
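
It may also be worth checking the input side. The title says Windows-1252, but the code in the question decodes with "CP1250", which is a different (Central European) code page. Below is a minimal sketch of the whole conversion, assuming the input really is Windows-1252. The class name ConvertToUtf8 and the method convert are made up for illustration and are not from the question. It copies characters rather than lines, so the original line breaks are preserved instead of depending on the contains(" ") check.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper class, named for this example only.
public class ConvertToUtf8 {

    // Assumption: the input is Windows-1252, as stated in the question title
    // (the question's code passes "CP1250", which is a different code page).
    static final Charset SOURCE = Charset.forName("windows-1252");

    // Decodes 'in' as SOURCE and writes the same characters to 'out' as UTF-8.
    // An existing 'out' file is overwritten, not appended to.
    static void convert(Path in, Path out) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(in, SOURCE);
             BufferedWriter writer = Files.newBufferedWriter(out, StandardCharsets.UTF_8)) {
            char[] buffer = new char[8192];
            int n;
            // Copy raw characters so line terminators pass through unchanged.
            while ((n = reader.read(buffer)) != -1) {
                writer.write(buffer, 0, n);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java ConvertToUtf8 <input> <output>");
            return;
        }
        convert(Paths.get(args[0]), Paths.get(args[1]));
    }
}
```

With the file names from the question this would be run as: java ConvertToUtf8 file clean. If the input contains only ASCII characters, the output bytes will be identical to the input in either encoding, so a test file needs at least one accented character or a euro sign to show a difference.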
