
Turn off DTD validation in XPathExpression evaluate()

I want to extract a small subtree from an XML file (100 MB) and need to turn off DTD validation, but I cannot find any way to do that.

XPath xpath = XPathFactory.newInstance().newXPath();  
XPathExpression expr = xpath.compile("//HEADER");  
Node node = (Node) expr.evaluate(new InputSource(new FileReader(file)), XPathConstants.NODE);

I tried to use DocumentBuilder with DTD validation turned off, but that is too slow.

Thanks,

Joo o

It is slow because your XPath criterion is too vague and forces a full scan of all the nodes: //HEADER means that the XPath engine will visit each and every node of your 100 MB file to select the ones whose name is HEADER. If you can make the XPath expression more specific, you should see dramatic improvements.
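To illustrate the difference between the two kinds of expression, here is a minimal sketch. The document shape and the root element name BILL are assumptions made up for the example (the question does not show the real file), and SpecificXPathDemo and selectFirst are hypothetical names.

```java
import java.io.StringReader;

import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class SpecificXPathDemo {

    // Evaluates an XPath expression against an XML string and
    // returns the first matching node, or null if nothing matches.
    static Node selectFirst(String xml, String expression) throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        InputSource source = new InputSource(new StringReader(xml));
        return (Node) xpath.evaluate(expression, source, XPathConstants.NODE);
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical document: the real root element name is not given in the question.
        String xml = "<BILL><HEADER><ID>42</ID></HEADER><BODY><ITEM/></BODY></BILL>";

        // Descendant search: any element anywhere in the tree is a candidate.
        Node vague = selectFirst(xml, "//HEADER");

        // Absolute path: only HEADER children of the BILL root are candidates.
        Node specific = selectFirst(xml, "/BILL/HEADER");

        System.out.println(vague.getNodeName() + " / " + specific.getTextContent());
    }
}
```

Both expressions select the same node here; the point is only that the second one tells the engine where to look. How much time this saves on a real 100 MB file depends on the XPath implementation and should be measured.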

Other than that, the code below is what I have used in the past to prevent DTD validation. It forces Xerces as the SAX parser and explicitly sets a number of Xerces-specific features. But again, this will probably not significantly affect the response time.

import java.io.File;
import java.io.StringReader;

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.apache.xerces.jaxp.SAXParserFactoryImpl;
import org.xml.sax.InputSource;

[...]

    private static SAXParserFactory spf ;

    private BillCooker() throws Exception {

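        // Force Xerces as the JAXP SAX parser implementation.
        System.setProperty("javax.xml.parsers.SAXParserFactory", "org.apache.xerces.jaxp.SAXParserFactoryImpl" ) ;

        // newInstance() is the static SAXParserFactory.newInstance(), so the
        // system property set above decides which implementation is returned.
        spf = SAXParserFactoryImpl.newInstance();
        spf.setNamespaceAware(true);
        // Turn validation off, both through JAXP and through the SAX feature.
        spf.setValidating(false);
        spf.setFeature("http://xml.org/sax/features/validation", false);
        // Xerces-specific: do not build the DTD grammar and do not load the external DTD.
        spf.setFeature("http://apache.org/xml/features/nonvalidating/load-dtd-grammar", false);
        spf.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);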

I trimmed it to leave only the lines relevant to validation.
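For the XPath case in the question, the same feature URIs can be applied to a DocumentBuilderFactory, with the expression then evaluated against the parsed Document instead of the raw InputSource. The following is a minimal sketch, not the code from my project: NoDtdXPath, newBuilder and select are hypothetical names, the sample XML is made up, and it assumes the parser on the classpath recognizes the Xerces feature URIs (the parser bundled with the JDK is derived from Xerces).

```java
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class NoDtdXPath {

    // Builds a non-validating DOM parser that also skips loading the external DTD.
    static DocumentBuilder newBuilder() throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        dbf.setValidating(false);
        // Xerces-specific features; a parser that does not know them
        // throws ParserConfigurationException here.
        dbf.setFeature("http://apache.org/xml/features/nonvalidating/load-dtd-grammar", false);
        dbf.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
        return dbf.newDocumentBuilder();
    }

    // Parses the source without touching its DTD, then returns the
    // first node matching the expression (or null if there is none).
    static Node select(InputSource source, String expression) throws Exception {
        Document doc = newBuilder().parse(source);
        return (Node) XPathFactory.newInstance().newXPath()
                .evaluate(expression, doc, XPathConstants.NODE);
    }

    public static void main(String[] args) throws Exception {
        // The DOCTYPE points at a file that does not exist; parsing
        // still succeeds because the external DTD is never loaded.
        String xml = "<!DOCTYPE BILL SYSTEM \"does-not-exist.dtd\">"
                + "<BILL><HEADER>ok</HEADER></BILL>";
        Node header = select(new InputSource(new StringReader(xml)), "/BILL/HEADER");
        System.out.println(header.getTextContent());
    }
}
```

Note that this still builds a DOM for the whole file, so it addresses the DTD problem and not the parsing cost of a 100 MB document.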
