
How to read a Unicode-encoded file in Java

I am trying to read a file that is encoded as Unicode (I used EditPlus to check its encoding).

I am using the following code:

InputStream inStream = new FileInputStream(logFile);
InputStreamReader streamReader = new InputStreamReader(inStream, "Unicode");
final BufferedReader reader = new BufferedReader(streamReader);

But it does not read the file correctly. When I tried "UTF-8" instead, it read the file, but the output contained what looked like a space after every character.

I need to read the file and display its contents in a JList. While searching, I found this explanation:

Unicode characters use 2 bytes. With ASCII text every other byte will be a binary 0 which will display as a ? or square with most text editors.

This is similar to what is happening to me. I do not know much about encodings.

Any help would be really appreciated.

I'm not sure what endianness "Unicode" gives, but you should try "UTF-16BE" and "UTF-16LE". BE is big endian and LE is little endian; the difference is just which byte comes first in each 16-bit code unit.
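To make the byte-order difference concrete, here is a small sketch that prints the bytes of a short string in both encodings, then decodes the little-endian bytes as UTF-8. The class name EndiannessDemo and the helper hex are made up for illustration and are not from the question.

```java
import java.nio.charset.StandardCharsets;

public class EndiannessDemo {

    // Hypothetical helper: formats bytes as space-separated two-digit hex.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "Hi";

        // Big endian: the high (zero) byte of each 16-bit unit comes first.
        System.out.println(hex(text.getBytes(StandardCharsets.UTF_16BE))); // 00 48 00 69

        // Little endian: the low byte comes first.
        System.out.println(hex(text.getBytes(StandardCharsets.UTF_16LE))); // 48 00 69 00

        // Decoding UTF-16 bytes as UTF-8 turns every zero byte into a
        // separate NUL character, so the string doubles in length.
        byte[] utf16le = text.getBytes(StandardCharsets.UTF_16LE);
        String wrong = new String(utf16le, StandardCharsets.UTF_8);
        System.out.println(wrong.length()); // 4
    }
}
```

The extra NUL characters in the last step are probably the "space after every character" described in the question, since many UI components render NUL as a blank or a box.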

(I've just read that "UTF-16" defaults to big endian, so I suspect "Unicode" does too... that would mean "UTF-16LE" is more likely to work.)
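A minimal sketch of reading the file with an explicit charset, assuming it really is UTF-16LE (worth confirming by looking at the first bytes of the file). The class name Utf16FileReader and the method readLines are hypothetical. Java's UTF-16LE decoder does not consume a byte order mark, so the sketch drops a leading U+FEFF if one is present.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class Utf16FileReader {

    // Reads all lines of the file, decoding its bytes with the given charset.
    static List<String> readLines(Path file, Charset charset) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(file, charset)) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        // With an explicit-endianness charset the BOM is decoded as the
        // character U+FEFF, so remove it from the first line if present.
        if (!lines.isEmpty() && lines.get(0).startsWith("\uFEFF")) {
            lines.set(0, lines.get(0).substring(1));
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Write a small UTF-16LE sample file so the example is self-contained.
        Path sample = Files.createTempFile("sample", ".log");
        Files.write(sample, "first line\nsecond line\n".getBytes(StandardCharsets.UTF_16LE));

        for (String line : readLines(sample, StandardCharsets.UTF_16LE)) {
            System.out.println(line);
        }
        Files.delete(sample);
    }
}
```

The resulting list can be passed to a JList through toArray(new String[0]). If the file turns out to be big endian, swapping in StandardCharsets.UTF_16BE is the only change needed.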
