I am using a third-party application and would like to change one of its files. The file is stored as XML but has an invalid doctype.
When I try to read it with a DocumentBuilder, it errors out because the doctype contains "file:///ReportWiz.dtd" (as shown, with quotes) and I get a file-not-found exception. Is there a way to tell the DocumentBuilder to ignore this? I have tried setValidating(false) and setNamespaceAware(false) on the DocumentBuilderFactory.
The only solutions I can think of are
DocumentBuilderFactory docFactory = DocumentBuilderFactory.newInstance();
docFactory.setValidating(false);
DocumentBuilder docBuilder = docFactory.newDocumentBuilder();
Document doc = docBuilder.parse(file);
Tell your DocumentBuilderFactory to ignore the DTD declaration like this:
docFactory.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
See here for a list of available features.
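Putting the pieces together, a minimal self-contained sketch might look like the following. The class name `IgnoreDtdExample`, the method name `parseIgnoringDtd`, and the `report` root element in the sample XML are made up for illustration; only the doctype URL comes from the question. The feature URI is Xerces-specific, so this assumes a Xerces-derived parser (the JDK's built-in one is); other parsers may reject it with a ParserConfigurationException.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class IgnoreDtdExample {
    // Parses XML without attempting to fetch the external DTD named in the doctype.
    static Document parseIgnoringDtd(InputStream in) throws Exception {
        DocumentBuilderFactory docFactory = DocumentBuilderFactory.newInstance();
        docFactory.setValidating(false);
        // Xerces-specific feature: do not load the external DTD when not validating.
        docFactory.setFeature(
                "http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
        DocumentBuilder docBuilder = docFactory.newDocumentBuilder();
        return docBuilder.parse(in);
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample document; only the doctype URL is taken from the question.
        String xml = "<?xml version=\"1.0\"?>\n"
                + "<!DOCTYPE report SYSTEM \"file:///ReportWiz.dtd\">\n"
                + "<report><title>demo</title></report>";
        Document doc = parseIgnoringDtd(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        System.out.println(doc.getDocumentElement().getNodeName());
    }
}
```

Note that the doctype declaration itself is still parsed; only the fetch of the external file is skipped, so entities defined in that DTD would remain undefined.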
You also might find JDOM a lot easier to work with than org.w3c.dom:
org.jdom.input.SAXBuilder builder = new org.jdom.input.SAXBuilder();
builder.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
org.jdom.Document doc = builder.build(file);
Handle resolution of the DTD manually, either by returning a copy of the DTD file (loaded from the classpath) or by returning an empty one. You can do this by setting an entity resolver on your document builder:
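EntityResolver er = new EntityResolver() {
    @Override
    public InputSource resolveEntity(String publicId, String systemId)
            throws SAXException, IOException {
        if ("file:///ReportWiz.dtd".equals(systemId)) {
            // Debug output: shows which system ID was intercepted
            System.out.println(systemId);
            // Return an empty stream in place of the missing DTD
            InputStream zeroData = new ByteArrayInputStream(new byte[0]);
            return new InputSource(zeroData);
        }
        // Fall back to the parser's default resolution for everything else
        return null;
    }
};
// Register the resolver before parsing
docBuilder.setEntityResolver(er);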
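For completeness, here is a sketch of the resolver approach wired into a full parse. It differs from the snippet above in two ways that are my own choices, not part of the answer: it matches the system ID with `endsWith("ReportWiz.dtd")` in case the parser normalizes the URI, and it hands back an empty `StringReader` instead of an empty byte stream. The class and method names and the sample `report` element are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

class EmptyDtdResolverExample {
    // Parses XML, substituting an empty DTD for any reference to ReportWiz.dtd.
    static Document parseWithEmptyDtd(InputStream in) throws Exception {
        DocumentBuilder docBuilder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        docBuilder.setEntityResolver(new EntityResolver() {
            @Override
            public InputSource resolveEntity(String publicId, String systemId) {
                if (systemId != null && systemId.endsWith("ReportWiz.dtd")) {
                    // Empty content stands in for the DTD that cannot be found.
                    return new InputSource(new StringReader(""));
                }
                return null; // default resolution for any other entity
            }
        });
        return docBuilder.parse(in);
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample document; only the doctype URL is taken from the question.
        String xml = "<?xml version=\"1.0\"?>\n"
                + "<!DOCTYPE report SYSTEM \"file:///ReportWiz.dtd\">\n"
                + "<report><title>demo</title></report>";
        Document doc = parseWithEmptyDtd(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        System.out.println(doc.getDocumentElement().getNodeName());
    }
}
```

Unlike the feature-flag approach, this relies only on the standard SAX `EntityResolver` interface, so it should not depend on which parser implementation is on the classpath.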
My first thought was dealing with it as a stream. You could make a new adapter at some level and just copy input to output except for the offending text.
If the file is shortish (under half a gig or so) you could also read the entire thing into a byte array and make your modifications there, then create a new stream from the byte array into your builder.
That's the advantage of the amazingly bulky way Java handles streams: you actually have a lot of flexibility.
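A rough sketch of the byte-array variant follows. It assumes the file is UTF-8 encoded and that the doctype is a single simple declaration (at most one internal subset without nested brackets); the regular expression, class name, and method name are illustrative choices, not something from the answer. `InputStream.readAllBytes()` requires Java 9 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class StripDoctypeExample {
    // Matches a simple <!DOCTYPE ...> declaration, optionally with an internal subset.
    private static final Pattern DOCTYPE =
            Pattern.compile("<!DOCTYPE[^>\\[]*(\\[[^\\]]*\\])?\\s*>");

    // Reads the whole input into memory, removes the doctype, then parses the rest.
    static Document parseWithoutDoctype(InputStream in) throws Exception {
        byte[] raw = in.readAllBytes();
        // Assumption: the document is UTF-8 encoded.
        String text = new String(raw, StandardCharsets.UTF_8);
        String cleaned = DOCTYPE.matcher(text).replaceFirst("");
        DocumentBuilder docBuilder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return docBuilder.parse(
                new ByteArrayInputStream(cleaned.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample document; only the doctype URL is taken from the question.
        String xml = "<?xml version=\"1.0\"?>\n"
                + "<!DOCTYPE report SYSTEM \"file:///ReportWiz.dtd\">\n"
                + "<report><title>demo</title></report>";
        Document doc = parseWithoutDoctype(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        System.out.println(doc.getDocumentElement().getNodeName());
    }
}
```

Because the doctype is removed before the parser ever sees it, the resulting Document has no doctype node at all, which also means it will not be written back out if the document is serialized again.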
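If you don't want to depend on the Xerces parser and want a generic solution, see this
Another thing I considered is reading the whole file into a String, manipulating it there, and then writing the String back out to the file. None of these seem clean or easy, so what is the best way to do this?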