简体   繁体   中英

reading text file with utf-8 encoding using java

I have problem in reading text file with utf-8 encoding I'm using java with netbeans 7.2.1 platform

I already configured the java project to handle UTF-8 javaproject==>right click==>properties==>source==>UTF-8

but still get the unknown character output:

the code:

File fileDirs = new File("C:\\file.txt");

BufferedReader in = new BufferedReader(
new InputStreamReader(new FileInputStream(fileDirs), "UTF-8"));

String str;

while ((str = in.readLine()) != null) {
    System.out.println(str);
}

any other ideas?

thanks

Use

    import java.io.BufferedReader;
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.io.UnsupportedEncodingException;     
    public class test {
    public static void main(String[] args){

    try {
        File fileDir = new File("PATH_TO_FILE");

        BufferedReader in = new BufferedReader(
           new InputStreamReader(new FileInputStream(fileDir), "UTF-8"));

        String str;

        while ((str = in.readLine()) != null) {
            System.out.println(str);
        }

                in.close();
        } 
        catch (UnsupportedEncodingException e) 
        {
            System.out.println(e.getMessage());
        } 
        catch (IOException e) 
        {
            System.out.println(e.getMessage());
        }
        catch (Exception e)
        {
            System.out.println(e.getMessage());
        }
    }
}

You need to put UTF-8 in quotes

You are reading the file right but the problem seems to be with the default encoding of System.out . Try this to print the UTF-8 string-

PrintStream out = new PrintStream(System.out, true, "UTF-8");
out.println(str);

You need to specify the encoding of the InputStreamReader using the Charset parameter.

Charset inputCharset = Charset.forName("ISO-8859-1");
InputStreamReader isr = new InputStreamReader(fis, inputCharset));

This is work for me. i hope to help you.

I ran into the same problem every time it finds a special character marks it as . to solve this, I tried using the encoding: ISO-8859-1

BufferedReader br = new BufferedReader(new InputStreamReader(new FileInputStream("txtPath"),"ISO-8859-1"));

while ((line = br.readLine()) != null) {

}

I hope this can help anyone who sees this post.

Ok, I am definitively late to the party but if you are still looking for an optimal solution I would use the following ( for Java 8 )

    Charset inputCharset = Charset.forName("ISO-8859-1");
    Path pathToFile = ....
    try (BufferedReader br = Files.newBufferedReader( pathToFile, inputCharset )) {
        ...
     }

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM