How can I improve MarkLogic 7 performance on the following: /*[fn:name()="something"]

I have a basic query:

/*[fn:name()="something"]

(1) MarkLogic 7 is taking multiple seconds to run it. Is there an index that I can add to make this query faster?

(2) Which in-memory limits should be increased to improve performance?

(3) Are there other ways to improve performance with a different query that returns exactly the same result?

Try using fn:node-name instead. I believe that's optimized. You'll need to handle namespaces properly, and that's part of why it can be optimized while fn:name can't.

 /*[fn:node-name()=fn:QName("","something")] 
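To illustrate the namespace handling mentioned above, here is a minimal sketch that runs against an in-memory document instead of the database. The namespace URI and the sample document are made up for illustration; the point is only that the first argument to fn:QName has to match the element's namespace, which is the empty string when the element has none.

```xquery
xquery version "1.0-ml";

(: The namespace URI and the sample document are hypothetical. :)
let $doc := document {
  <something xmlns="http://example.com/ns"><child/></something>
}
(: Compare the full QName (namespace + local name), not just a string. :)
return $doc/*[fn:node-name(.) = fn:QName("http://example.com/ns", "something")]
```

Passing fn:QName("", "something") against this sample would match nothing, because the element is in a namespace.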

The following two XPath expressions should be exactly the same:

/theNameOfMyElement

and

/*[fn:name()="theNameOfMyElement"]

The latter adds an unnecessary and costly qualifier. First off, the * matches every root element regardless of its name, so the predicate then has to be evaluated against each candidate. Several other problems exist with that approach.

If the first query above is still taking a long time, use cts:search, which is much faster because it searches against indexes. The queries above can be written like this:

cts:search(/theNameOfMyElement, ())

where the second parameter (an empty sequence here) can be a qualifying cts:query.
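As a rough sketch of what a qualifying query in that second parameter could look like, the search below passes a cts:word-query instead of the empty sequence. The search term is a placeholder and not part of the original question; how much of the work is resolved from indexes depends on how the database is configured.

```xquery
xquery version "1.0-ml";

(: "example" is a hypothetical search term, used only for illustration. :)
cts:search(/theNameOfMyElement, cts:word-query("example"))
```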

If namespaces are giving you fits, you can just do:

/*:theNameOfMyElement

/*[fn:name()="something"] seems like very bad practice to me. Use /something instead.

EDIT

After seeing that the other answer got accepted, I've been trying to think of what scenario you must be trying to solve if his solution worked and mine didn't. I'm still very certain that there is a faster way by just using XPath the way it was designed to work.

After some thought, I've decided your "real" scenario must either involve a dynamic element name, or you may be trying to see if the element name matches one of a sequence of names.

I have drawn up a sample, with its output provided below, that demonstrates how you could still handle both cases without using the qualifier based on fn:node-name:

let $xml as element(sample) := <sample>
    <wrapper>
      <product>
        <entry>
          <red>$1.00</red>
          <yellow>$3.00</yellow>
          <blue>$4.50</blue>
        </entry>
      </product>
    </wrapper>
  </sample>
let $type as xs:string := "product"
return $xml/wrapper/xdmp:unpath($type)/entry/(red|yellow)

(: Returns 
  <red>$1.00</red>
  <yellow>$3.00</yellow> 
:)

In addition to the other good suggestions, consider applying pagination. MarkLogic can identify interesting content from indexes fast, but pulling up actual content from disk is relatively slow. And sending over all results at once could mean trying to hold all results (potentially billions) in memory before sending a reply over the wire. Pagination allows pulling up results in batches, which keeps memory usage low, and potentially allows parallelization as well.
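A minimal sketch of that idea, assuming the cts:search form from the earlier answer and two hypothetical paging parameters ($page and $page-size, neither of which is in the original question). Only one batch of results is pulled per request:

```xquery
xquery version "1.0-ml";

(: $page and $page-size are hypothetical, e.g. read from request parameters. :)
let $page as xs:integer := 1
let $page-size as xs:integer := 10
(: Position of the first result on the requested page (1-based). :)
let $start as xs:integer := ($page - 1) * $page-size + 1
return fn:subsequence(
  (: cts:and-query(()) matches every fragment, so no extra filtering is applied. :)
  cts:search(/theNameOfMyElement, cts:and-query(())),
  $start,
  $page-size
)
```

Requesting the next batch is then just a matter of calling the same query with $page incremented.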

HTH!
