
Java - Can't read in special characters from a text file

I am writing a program that looks up the words of one text file (say B) in a dictionary text file (say A), in order to compare the efficiency of different sorting algorithms.

My problem arises when one of these source text files contains a special character such as "µ". First of all, to save a text file containing such a character on Windows, Notepad says I have to change the encoding from ANSI to something else, such as UTF-8.

My program crashes when it encounters a line with a special character, specifically at the point where that word is compared to a word from the dictionary text file using the compareTo method. It crashes with a NullPointerException.

I have printed out the special character and found that "µ" is shown as "Âµ", and that strange characters ("ï»¿") always appear at the start of the first line.

I am using a Scanner for file input:

inputStream = new Scanner (new FileInputStream(args[0]));

I have tried a FileReader as well.

In general, how would I read special characters, or words containing special characters? And would these characters be compatible with the built-in compareTo method, or would I have to find another way to order them?

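What Notepad calls "ANSI" is not one specific encoding; it is the Windows system code page (for example Windows-1252), which goes beyond 7-bit ASCII. Use an editor such as Notepad++ to save the files as UTF-8, then open them in Java with a reader that takes an explicit encoding.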

Do

inputStream = new Scanner(new FileInputStream(args[0]), "UTF-8");

or

BufferedReader in = new BufferedReader(
        new InputStreamReader(new FileInputStream(args[0]), "UTF-8"));

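InputStreams deal with raw bytes; Readers deal with characters and need an encoding to turn those bytes into text. If no encoding is given, Java falls back to the platform default charset, which on Windows has traditionally been the ANSI code page rather than UTF-8.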
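To make the byte/character distinction concrete, here is a minimal sketch that decodes the same two bytes with two different charsets. The class name DecodeDemo and the helper decode are made up for illustration, and ISO-8859-1 is used as a stand-in for the Windows code page, on the assumption that the two agree for these particular bytes.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original program.
final class DecodeDemo {

    // Turn raw bytes into characters using the given charset.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // The UTF-8 encoding of U+00B5 (micro sign) is the two bytes C2 B5.
        byte[] raw = {(byte) 0xC2, (byte) 0xB5};

        String right = decode(raw, StandardCharsets.UTF_8);      // one character
        String wrong = decode(raw, StandardCharsets.ISO_8859_1); // two characters

        System.out.println(right.length());            // prints 1
        System.out.println(wrong.length());            // prints 2
        System.out.println(right.compareTo("z") > 0);  // prints true
    }
}
```

The last line suggests that compareTo itself copes with such characters: it compares UTF-16 code units, so U+00B5 simply sorts after the ASCII letters. If a language-aware order is needed instead, java.text.Collator is the usual tool, but that is a separate choice from getting the decoding right.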

The text also appears to start with a BOM (byte order mark), the character U+FEFF (zero width no-break space), which Notepad writes to mark the file as UTF-8. It could be deleted from the file, but some Windows programs would then no longer detect the file as UTF-8. When reading the file in Java, you may want to skip it.
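One way to skip the BOM is to strip it from the first line after decoding. The following is only a sketch: the class Utf8Lines and its methods stripBom and readLines are hypothetical names, and it assumes the file is read line by line with a BufferedReader rather than with the Scanner from the question.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of the original program.
final class Utf8Lines {
    private static final char BOM = '\uFEFF';

    // Return the line without a leading BOM, or unchanged if there is none.
    static String stripBom(String line) {
        return (!line.isEmpty() && line.charAt(0) == BOM) ? line.substring(1) : line;
    }

    // Read all lines; only the first line can carry the BOM.
    static List<String> readLines(BufferedReader in) throws IOException {
        List<String> lines = new ArrayList<>();
        String line;
        boolean first = true;
        // readLine() returns null at end of input, so stop there
        // instead of passing null on to later code.
        while ((line = in.readLine()) != null) {
            lines.add(first ? stripBom(line) : line);
            first = false;
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        try (BufferedReader in = new BufferedReader(new InputStreamReader(
                new FileInputStream(args[0]), StandardCharsets.UTF_8))) {
            for (String word : readLines(in)) {
                System.out.println(word);
            }
        }
    }
}
```

Checking for null before using a line may also be relevant to the NullPointerException in the question, although the question does not show enough code to say where the null actually comes from.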
