
Java: UTF-8 and BOM

A page in Java's bug database, http://bugs.sun.com/view_bug.do?bug_id=4508058, states that Sun/Oracle will not fix the problem of Java not parsing the BOM of a UTF-8-encoded string. Since the most recent comment on that page dates back to 2010, I would like to know whether there is any more recent information. Is it still true that Java cannot handle the UTF-8 BOM?

Yes, it is still true that Java does not handle the BOM in UTF-8 encoded files: the UTF-8 decoder does not strip it, so it shows up as the character U+FEFF at the start of the decoded text. I came across this issue when parsing several XML files for data formatting purposes. Since you can't know in advance when a file will contain one, I would suggest checking for the BOM at runtime and stripping it if you find it, or following the advice that tchrist gave.
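To illustrate the "strip it at runtime" suggestion, here is a minimal sketch of one way to do it. The class and method names (`BomSkipper`, `skipUtf8Bom`, `readUtf8WithoutBom`, `stripBom`) are made up for this example and are not part of the JDK; it only assumes that the input is UTF-8 and that a BOM, if present, is the byte sequence EF BB BF at the very start of the stream.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PushbackInputStream;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not a JDK API.
final class BomSkipper {

    private BomSkipper() {}

    /** Returns a stream positioned after a leading UTF-8 BOM (EF BB BF), if there is one. */
    static InputStream skipUtf8Bom(InputStream in) throws IOException {
        PushbackInputStream pushback = new PushbackInputStream(in, 3);
        byte[] head = new byte[3];

        // Read up to three bytes; a single read() call may return fewer.
        int n = 0;
        while (n < 3) {
            int r = pushback.read(head, n, 3 - n);
            if (r < 0) {
                break; // end of stream before three bytes were available
            }
            n += r;
        }

        boolean hasBom = n == 3
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF;

        // No BOM: push the bytes back so the caller sees the stream unchanged.
        if (!hasBom && n > 0) {
            pushback.unread(head, 0, n);
        }
        return pushback;
    }

    /** Decodes the whole stream as UTF-8, dropping a leading BOM if present. */
    static String readUtf8WithoutBom(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(skipUtf8Bom(in), StandardCharsets.UTF_8)) {
            char[] buf = new char[4096];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    /** For text that is already decoded: drops a leading U+FEFF character. */
    static String stripBom(String s) {
        return (!s.isEmpty() && s.charAt(0) == '\uFEFF') ? s.substring(1) : s;
    }
}
```

For the XML case, the stream returned by `skipUtf8Bom(new FileInputStream(file))` could be handed to the parser in place of the raw file stream. Working on bytes rather than on the decoded text means the check happens before any decoder or parser sees the input.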
