
XPath on DeferredDocumentImpl takes extremely long time to evaluate

In Java, I load an XML file from disk like so, which returns a DeferredDocumentImpl:

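// Parses the XML file at the given path into a DOM Document.
// masterDocument is a Document field of the enclosing class.
private Document loadMasterFileXml(String path)
        throws ParserConfigurationException, SAXException, IOException
{
    File masterFilePath = new File(path);
    DocumentBuilderFactory masterDocBuilderFactory = DocumentBuilderFactory.newInstance();
    // Build from the factory declared above (the original referenced an undeclared variable).
    DocumentBuilder masterDocBuilder = masterDocBuilderFactory.newDocumentBuilder();
    masterDocument = masterDocBuilder.parse(masterFilePath);
    return masterDocument;
}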

The XML file contains around 1000 elements like this:

<com.something.something.Collection xmi:version="2.0" xmlns:xmi="http://www.omg.org/XMI" xmlns:com.something.something.model="http://www.something.com/something.1.0.0" xmi:id="_HklwsJnWEeeaddrVFPWCMg" name="SOME_THING">
  <signals xmi:id="_N0ir0ZnWEeeaddrVFPWCMg" id="10000">
    <signal href="#_6M0edJhNEeeNvfntr9AQ8g"/>
  </signals>
  <signals xmi:id="_N0jS4JnWEeeaddrVFPWCMg" id="10001">
    <signal href="#_6M1FgJhNEeeNvfntr9AQ8g"/>
  </signals>
  ...

The first XPath operation executed on this document is as follows:

public long getMaximumSignalIdFromMasterDocument()
{
    Integer errorCode=-1;
    try 
    {
        XPathFactory xPathfactory = XPathFactory.newInstance();
        XPath xPath = xPathfactory.newXPath();
        String expression = "//signals[not(@id < //signals/@id)]";
        Node node = (Node) xPath.evaluate(expression, masterCbCollDocument, XPathConstants.NODE);
        return Long.parseLong(node.getAttributes().getNamedItem("id").getNodeValue());
    }
    catch(Exception e) 
    {
        return errorCode;
    }        
}

In debug mode, the following line takes over 1 hour to execute:

Node node = (Node) xPath.evaluate(expression, masterCbCollDocument, XPathConstants.NODE);

Why is this?

Is it a problem with the XPath expression (the use of //)? Or is it because the concrete Document implementation is deferred, so that too much file I/O is going on?

Can anyone suggest an alternative approach?

An alternative approach would be to avoid XPath altogether. Although the hour-long computation is more likely a problem with the debugger/IDE you are using, the XPath expression is not very efficient either: for every signals element it compares @id against every //signals/@id in the document, which is O(n^2), and this cannot be significantly optimized in XPath 1.0. Using Java directly seems more appropriate in this case. One approach could be:

NodeList signals = masterCbCollDocument.getElementsByTagName("signals");
long result = IntStream.range(0, signals.getLength())
        .mapToLong(i -> Long.parseLong(((Element) signals.item(i)).getAttribute("id")))
        .max().orElse(-1);

In this case, masterCbCollDocument.getElementsByTagName("signals") selects the same elements as the //signals XPath expression. The resulting signals elements in the NodeList are then mapped to their respective IDs, and the maximum is returned (or -1 if there are no such elements).
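
To try this outside the debugger, here is a minimal self-contained sketch that runs both the single-pass DOM approach and the XPath expression from the question on a tiny in-memory document. The class name MaxSignalId, the helper method names, and the sample XML are made up for illustration; only the XPath expression and the stream pipeline come from the posts above.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.stream.IntStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

final class MaxSignalId {

    // Hypothetical sample data, shaped like the XML in the question.
    static final String SAMPLE_XML =
        "<Collection name=\"SOME_THING\">"
      + "<signals id=\"10000\"><signal href=\"#a\"/></signals>"
      + "<signals id=\"10002\"><signal href=\"#b\"/></signals>"
      + "<signals id=\"10001\"><signal href=\"#c\"/></signals>"
      + "</Collection>";

    // Parses an XML string with the default (non-namespace-aware) JAXP parser.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    // Single pass over all <signals> elements: O(n). Returns -1 if there are none.
    static long maxIdWithDom(Document doc) {
        NodeList signals = doc.getElementsByTagName("signals");
        return IntStream.range(0, signals.getLength())
                .mapToLong(i -> Long.parseLong(((Element) signals.item(i)).getAttribute("id")))
                .max()
                .orElse(-1);
    }

    // The XPath 1.0 expression from the question: each element is checked
    // against every id in the document, so the work grows as O(n^2).
    static long maxIdWithXPath(Document doc) throws Exception {
        XPath xPath = XPathFactory.newInstance().newXPath();
        Node node = (Node) xPath.evaluate(
                "//signals[not(@id < //signals/@id)]", doc, XPathConstants.NODE);
        return Long.parseLong(node.getAttributes().getNamedItem("id").getNodeValue());
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(SAMPLE_XML);
        System.out.println(maxIdWithDom(doc));   // prints 10002
        System.out.println(maxIdWithXPath(doc)); // prints 10002
    }
}
```

Both methods agree on the sample data. Timing them on the real file outside the debugger should show how much of the delay comes from the expression itself and how much from the debugging session.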
