XPath on DeferredDocumentImpl takes a very long time to evaluate

Question

In Java, I load an XML file from disk like so, which returns a DeferredDocumentImpl:

private Document loadMasterFileXml(String path)
{
    File masterFilePath = new File(path);
    DocumentBuilderFactory masterDocBuilderFactory = DocumentBuilderFactory.newInstance();
    DocumentBuilder masterDocBuilder = masterDocBuilderFactory.newDocumentBuilder();
    masterDocument = masterDocBuilder.parse(masterFilePath);
    return masterDocument;
}

The XML file contains around 1000 elements like this:

<com.something.something.Collection xmi:version="2.0" xmlns:xmi="http://www.omg.org/XMI" xmlns:com.something.something.model="http://www.something.com/something.1.0.0" xmi:id="_HklwsJnWEeeaddrVFPWCMg" name="SOME_THING">
  <signals xmi:id="_N0ir0ZnWEeeaddrVFPWCMg" id="10000">
    <signal href="#_6M0edJhNEeeNvfntr9AQ8g"/>
  </signals>
  <signals xmi:id="_N0jS4JnWEeeaddrVFPWCMg" id="10001">
    <signal href="#_6M1FgJhNEeeNvfntr9AQ8g"/>
  </signals>
  ...

The first XPath operation executed on this document is as follows:

public long getMaximumSignalIdFromMasterDocument()
{
    Integer errorCode=-1;
    try 
    {
        XPathFactory xPathfactory = XPathFactory.newInstance();
        XPath xPath = xPathfactory.newXPath();
        String expression = "//signals[not(@id < //signals/@id)]";
        Node node = (Node) xPath.evaluate(expression, masterCbCollDocument, XPathConstants.NODE);
        return Long.parseLong(node.getAttributes().getNamedItem("id").getNodeValue());
    }
    catch(Exception e) 
    {
        return errorCode;
    }        
}

In debug mode, the following line takes over 1 hour to execute:

Node node = (Node) xPath.evaluate(expression, masterCbCollDocument, XPathConstants.NODE);

Why is this?

Is it a problem with the XPath expression (the use of //)? Or is it because the concrete Document implementation is deferred, so that too much file IO is going on?

Can anyone suggest an alternative approach?

Answer 1

One alternative is to avoid XPath altogether. The hour-long run time is more likely a problem with the debugger/IDE you use, but the XPath expression is not very efficient either: it compares every signals element against every other one, so it is O(n^2), and it cannot be significantly optimized in XPath 1.0. Using Java directly seems more appropriate in this case. One approach could be:

NodeList signals = masterCbCollDocument.getElementsByTagName("signals");
long result = IntStream.range(0, signals.getLength())
        .mapToLong(i -> Long.parseLong(((Element) signals.item(i)).getAttribute("id")))
        .max().orElse(-1);

In this case, masterCbCollDocument.getElementsByTagName("signals") selects the same elements as the XPath expression //signals. Each signals element in the resulting NodeList is then mapped to its id, and the maximum of those values is returned (or -1 if there are no such elements).
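
For anyone who wants to try this outside the original project, here is a minimal self-contained sketch of the same single-pass idea. The class name MaxSignalId, the parse helper, and the shortened sample XML are assumptions made for illustration only; they are not from the question's code base.

```java
import java.io.StringReader;
import java.util.stream.IntStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class MaxSignalId {

    // Hypothetical helper: parses an XML string so the example needs no file on disk.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Visits each <signals> element once and returns the largest id, or -1 if there are none.
    static long maxSignalId(Document doc) {
        NodeList signals = doc.getElementsByTagName("signals");
        return IntStream.range(0, signals.getLength())
                .mapToLong(i -> Long.parseLong(((Element) signals.item(i)).getAttribute("id")))
                .max()
                .orElse(-1);
    }

    public static void main(String[] args) {
        // Shortened stand-in for the real file (assumed structure, no namespaces).
        String xml = "<Collection>"
                + "<signals id=\"10000\"><signal href=\"#a\"/></signals>"
                + "<signals id=\"10001\"><signal href=\"#b\"/></signals>"
                + "</Collection>";
        System.out.println(maxSignalId(parse(xml)));
    }
}
```

Because every element is visited exactly once, the work grows linearly with the number of signals elements, rather than quadratically as with the original predicate.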

Solution 1
0 2018-07-25 09:44:32
