
Problem reading a file using UTF-8 encoding in Java

When I read the entire XML file into a JEditorPane, everything works fine except for the BOM character. I get a BOM character (a dot) at the start of the file. If I remove the dot and save the file, it is saved as ANSI. In Notepad++ it shows (ANSI as UTF-8) encoding for the same file. If I don't remove the dot, the XML parser fails to parse the document. Can you help me with this? Thanks.

Keep using UTF-8, but without a BOM. In EditPlus, go to the menu Document -> File Encoding -> Change File Encoding, then choose UTF-8.

If your XML file contains only ASCII characters, it is valid ASCII/ANSI as well as valid UTF-8, so don't worry about Notepad++ recognizing the file as ANSI.
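To see why an ASCII-only file looks the same either way, here is a small sketch that compares the encoded bytes. It uses ISO-8859-1 as a stand-in for "ANSI" (an assumption; on Windows, "ANSI" usually means a code page such as Windows-1252, which agrees with ISO-8859-1 for ASCII characters). The class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class AsciiBytesDemo {
    // Returns true when the text encodes to identical bytes in UTF-8
    // and in ISO-8859-1 (used here as a stand-in for "ANSI").
    static boolean sameBytes(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);
        return Arrays.equals(utf8, latin1);
    }

    public static void main(String[] args) {
        // ASCII only: both encodings produce the same bytes.
        System.out.println(sameBytes("<root>hello</root>"));
        // U+00E9 (e with acute) takes two bytes in UTF-8 but one in ISO-8859-1.
        System.out.println(sameBytes("<root>h\u00E9llo</root>"));
    }
}
```

With identical bytes and no BOM, an editor has nothing to tell the two encodings apart, which is why Notepad++ reports ANSI.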

While you can use a BOM with UTF-8, it is discouraged because it breaks a lot of Unix programs, so you really shouldn't.

Use the -D option of the java command to set the file.encoding system property, as suggested in this answer:

java -Dfile.encoding=utf-8

Problem:

UTF-8 does not need a BOM, so most programs don't expect one and fail to parse or handle it. As far as I know, only some Microsoft programs insert it, to detect the UTF-8 encoding faster.

Solution:

  • Remove the BOM, nobody needs it.
  • Don't use buggy editors with non-standard encodings. (=> my opinion)
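If you can't fix the file itself, removing the BOM can also be done in code while reading. A minimal sketch, assuming the file really is UTF-8 and small enough to load into memory; the class and method names are hypothetical, not from the original question:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class BomStripper {
    // The UTF-8 BOM bytes EF BB BF decode to the single character U+FEFF.
    private static final char BOM = '\uFEFF';

    // Returns the text without a leading BOM character, if one is present.
    static String stripBom(String text) {
        if (!text.isEmpty() && text.charAt(0) == BOM) {
            return text.substring(1);
        }
        return text;
    }

    // Reads the whole file as UTF-8 and drops a leading BOM.
    static String readUtf8WithoutBom(Path path) throws IOException {
        byte[] bytes = Files.readAllBytes(path);
        return stripBom(new String(bytes, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        // Demo: write a tiny XML file that starts with a BOM, then read it back.
        Path tmp = Files.createTempFile("bom-demo", ".xml");
        byte[] withBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, '<', 'a', '/', '>'};
        Files.write(tmp, withBom);
        System.out.println(readUtf8WithoutBom(tmp));
        Files.delete(tmp);
    }
}
```

The returned string could then be handed to JEditorPane.setText or to the XML parser, so neither ever sees the dot.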
