Which is better in cts:search in MarkLogic: collection() or the root element?

In one of my projects, a MarkLogic consultant advised me to use collection() in cts:search, and in another project, MarkLogic consultants advised using the root element in cts:search. Both projects had the same volume of documents. Which one is better with respect to performance?

Let's say we have a document (I am using a small one just to explain the scenario). It is in a collection named "demo":

<root>
<child1>ABC</child1>
<child2>DEF</child2>
<child3>GHI</child3>
<child4>JKL</child4>
</root>

Which case is better or more efficient:

cts:search(/root, cts:and-query((....some cts:queries..)))

cts:search(collection("demo"), cts:and-query((....some cts:queries..)))

Please explain why one is better than the other.

As far as search execution goes, they are both single term lookups, so performance should be the same.
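One way to check this on your own data is to compare the query plans for the two forms. The following is a minimal sketch, not a definitive test: the `cts:element-value-query` on `child1` is a hypothetical stand-in for the queries elided in the question, while `/root` and the "demo" collection come from the sample document.

```xquery
xquery version "1.0-ml";

(: Hypothetical query standing in for the elided cts:queries in the question. :)
let $query := cts:element-value-query(xs:QName("child1"), "ABC")
return (
  (: Plan for the root-element form. :)
  xdmp:plan(cts:search(/root, $query)),
  (: Plan for the collection form. :)
  xdmp:plan(cts:search(fn:collection("demo"), $query))
)
```

Each call returns an XML description of how the search would be resolved against the indexes, so the two plans can be compared side by side.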

The real distinction is in how you want to manage your content. You can have more than one collection on the same document, so you can slice the same content in multiple ways, but a document can only have one root element. Collections also let you abstract away from the details of document structure: you could have multiple different root elements within the same collection.
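As an illustration of slicing by more than one collection, here is a minimal sketch. It assumes the document is also in a second, hypothetical collection named "archive"; the `child1` value query is likewise only an example.

```xquery
xquery version "1.0-ml";

(: "archive" is a hypothetical second collection; "demo" is from the question. :)
cts:search(
  fn:doc(),
  cts:and-query((
    (: The document must be in both collections. :)
    cts:collection-query("demo"),
    cts:collection-query("archive"),
    (: Example content constraint, independent of the root element name. :)
    cts:element-value-query(xs:QName("child1"), "ABC")
  ))
)
```

Because the collection constraints say nothing about document structure, the same search would also match documents with a different root element, as long as they are in both collections and contain a matching `child1`.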

According to the MarkLogic documentation, "MarkLogic's implementation of collections is designed to optimize query performance against large volumes of documents." This suggests that any difference would only show up on a very large database.

I tried to verify this in practice, so I created two XQuery queries, one using a collection and one using an element, as you suggested. I put xdmp:query-trace(fn:true()) at the top of both queries, ran them one at a time, and analysed my MarkLogic log file.
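The queries themselves were not posted. Judging from the paths and constraints in the trace output, they may have looked roughly like the sketch below. This is a reconstruction under assumptions, not the original code: the two searches are combined into one module here, and `fn:count` is used only to avoid returning every matching document.

```xquery
xquery version "1.0-ml";
declare namespace sem = "http://marklogic.com/semantics";

(: Constraint as it appears in the trace output. :)
let $query := cts:element-value-query(xs:QName("sem:object"), "taxonomy")
return (
  (: Write path analysis and constraint details to the server log. :)
  xdmp:query-trace(fn:true()),
  (: Element variant: constrain the search by root element. :)
  fn:count(cts:search(/sem:triples, $query)),
  (: Collection variant: constrain the search by collection. :)
  fn:count(cts:search(fn:collection("/triples"), $query))
)
```

To reproduce the separate log entries shown in this answer, run each search in its own module.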

For the element query:

2018-11-12 15:16:58.448 Info: App-Services: at 5:12: xdmp:eval("declare namespace sem = &quot;http://marklogic.com/semantics&quo...", (), <options xmlns="xdmp:eval"><database>5310618057872024096</database>...</options>)
2018-11-12 15:16:58.448 Info: App-Services: at 5:12: Analyzing path for search: fn:collection()/sem:triples
2018-11-12 15:16:58.448 Info: App-Services: at 5:12: Step 1 is searchable: fn:collection()
2018-11-12 15:16:58.448 Info: App-Services: at 5:12: Step 2 is searchable: sem:triples
2018-11-12 15:16:58.448 Info: App-Services: at 5:12: Path is fully searchable.
2018-11-12 15:16:58.448 Info: App-Services: at 5:12: Gathering constraints.
2018-11-12 15:16:58.448 Info: App-Services: at 5:12: Step 2 contributed 1 constraint: sem:triples
2018-11-12 15:16:58.449 Info: App-Services: at 5:12: Search query contributed 1 constraint: cts:element-value-query(xs:QName("sem:object"), "taxonomy", ("lang=en"), 1)
2018-11-12 15:16:58.449 Info: App-Services: at 5:12: Executing search.
2018-11-12 15:16:58.464 Info: App-Services: at 5:12: Selected 65964 fragments to filter

And for the collection query:

2018-11-12 15:20:07.871 Info: App-Services: at 5:11: xdmp:eval("declare namespace sem = &quot;http://marklogic.com/semantics&quo...", (), <options xmlns="xdmp:eval"><database>5310618057872024096</database>...</options>)
2018-11-12 15:20:07.871 Info: App-Services: at 5:11: Analyzing path for search: fn:collection("/triples")
2018-11-12 15:20:07.871 Info: App-Services: at 5:11: Step 1 is searchable: fn:collection("/triples")
2018-11-12 15:20:07.871 Info: App-Services: at 5:11: Path is fully searchable.
2018-11-12 15:20:07.871 Info: App-Services: at 5:11: Gathering constraints.
2018-11-12 15:20:07.871 Info: App-Services: at 5:11: Step 1 contributed 1 constraint: fn:collection("/triples")
2018-11-12 15:20:07.875 Info: App-Services: at 5:11: Search query contributed 1 constraint: cts:element-value-query(xs:QName("sem:object"), "taxonomy", ("lang=en"), 1)
2018-11-12 15:20:07.875 Info: App-Services: at 5:11: Executing search.
2018-11-12 15:20:07.891 Info: App-Services: at 5:11: Selected 65964 fragments to filter

The difference is clearly noticeable in the path analysis. With the collection query, MarkLogic resolves the path in a single step (Step 1), but with the element query it goes through a two-step process.
