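Parse error when parsing a document with an org.w3c.dom.Document object: An invalid XML character (Unicode: 0x1a) was found in the element content of the document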

Question

I am getting this error message: org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 14515; An invalid XML character (Unicode: 0x1a) was found in the element content of the document.

The content of the XML file that triggers the error:

 <Product>
          <Description>672577000 3M 4540 DISPOSABLE COVERALL → XL</Description>
 </Product>

The error occurs when parsing the document with an org.w3c.dom.Document object, and it is caused by the → in the input file. How can I resolve this?

Answer 1

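Not every character is permitted in an XML file. Here is a link where you can look up which characters are allowed, which are discouraged, and which are not permitted at all:

http://en.wikipedia.org/wiki/Valid_characters_in_XML

Your character (→, which the parser reports as Unicode 0x1a) is not permitted.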
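To make the rule concrete, here is a minimal sketch of a check based on the XML 1.0 character ranges (tab, line feed, carriage return, 0x20-0xD7FF, 0xE000-0xFFFD, 0x10000-0x10FFFF). The class and method names (XmlCharFilter, isValidXml10Char, stripInvalid) are made up for illustration and are not part of any library.

```java
public final class XmlCharFilter {
    private XmlCharFilter() {}

    /** True if the code point falls in one of the XML 1.0 character ranges. */
    public static boolean isValidXml10Char(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Returns a copy of the input with every disallowed code point dropped. */
    public static String stripInvalid(String input) {
        StringBuilder out = new StringBuilder(input.length());
        input.codePoints()
             .filter(XmlCharFilter::isValidXml10Char)
             .forEach(out::appendCodePoint);
        return out.toString();
    }

    public static void main(String[] args) {
        // Hypothetical sample: the 0x1A control character from the error message.
        String raw = "COVERALL " + (char) 0x1A + " XL";
        System.out.println(isValidXml10Char(0x1A)); // false
        System.out.println(stripInvalid(raw));
    }
}
```

Dropping the character silently loses whatever it was meant to represent, so replacing it with a space or a visible placeholder may suit some data better.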

Answer 2

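I resolved this by using the code below:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

String removedUnicodeChar = "DISPOSABLE COVERALL → XXL</Description></Order> ↔ ↕ ↑ ↓ → ABC";
// Match control characters and U+FFFD. No '|' inside [...]: there it would be a literal pipe.
// Note that \p{Cntrl} also covers tab, line feed and carriage return.
Pattern pattern = Pattern.compile("[\\p{Cntrl}\\uFFFD]");
Matcher m = pattern.matcher(removedUnicodeChar);
if (m.find()) {
    System.out.println("Control Characters found");
    removedUnicodeChar = m.replaceAll("");
}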
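For context, here is a rough sketch of how that cleanup could be wired in before the DOM parse. It assumes the XML is already available as a String; the class name SanitizedXmlParser and its methods are hypothetical. Unlike the snippet above, the pattern keeps tab, line feed and carriage return, since XML allows them. It does not cover every invalid character (for example U+FFFE or unpaired surrogates).

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public final class SanitizedXmlParser {
    private SanitizedXmlParser() {}

    // Control characters except tab, LF and CR, plus U+FFFD as in the answer's code.
    private static final String INVALID = "[\\p{Cntrl}&&[^\\t\\n\\r]]|\\uFFFD";

    /** Removes the characters matched by INVALID from the raw XML text. */
    public static String clean(String rawXml) {
        return rawXml.replaceAll(INVALID, "");
    }

    /** Cleans the text first, then parses it into an org.w3c.dom.Document. */
    public static Document parse(String rawXml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(clean(rawXml))));
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical input containing the 0x1A character reported by the parser.
        String raw = "<Product><Description>COVERALL " + (char) 0x1A
                + " XL</Description></Product>";
        Document doc = parse(raw);
        System.out.println(doc.getDocumentElement().getTextContent());
    }
}
```

If the file is read from disk, the same cleanup would have to run on the decoded text before it reaches the parser, because the parser rejects the character as soon as it encounters it.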

Answer 1
0 2014-04-16 06:06:07

Answer 2
0 Accepted 2014-04-17 07:37:09
