MarkLogic REST API - Extract data from documents

I am new to MarkLogic, and I'm trying to extract information from documents using the Search API. My documents are in the format below.

<nitf>
<head>
<title>ABC</title>
</head>
...
...
</nitf>

I would like to show only the titles of the documents that match the search query, i.e. the Search API should return only the titles of matching documents. I have gone through the documentation and tried a few different things, such as the query options suggested by @ehennum, but to no effect. Any help on this would be great. Thanks!

Krishna, it sounds like you don't want snippets at all, so you should turn off snippeting:

<search:transform-results apply="empty-snippet"/>

Then, to get the title, use extract-metadata:

<search:extract-metadata>
  <search:qname elem-ns="" elem-name="title"/>
</search:extract-metadata>
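For context, the two fragments above belong inside a single search options node. The following is a minimal sketch of how they might be combined, assuming the standard Search API namespace binding for the `search` prefix and that the NITF documents are not namespaced:

```xml
<search:options xmlns:search="http://marklogic.com/appservices/search">
  <!-- Suppress snippets so each result carries no highlighted text -->
  <search:transform-results apply="empty-snippet"/>
  <!-- Return the un-namespaced <title> element as metadata for each match -->
  <search:extract-metadata>
    <search:qname elem-ns="" elem-name="title"/>
  </search:extract-metadata>
</search:options>
```

With the REST API, options like these are typically stored under a name of your choosing (for example a hypothetical `titles-only`) and then referenced through the `options` parameter of a `/v1/search` request.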

As a footnote to Dave's good suggestion, MarkLogic 7 provides Query By Example as a simple interface to search. Please see:

http://docs.marklogic.com/REST/POST/v1/qbe

http://docs.marklogic.com/guide/search-dev/qbe#id_54044

The particular query would look something like the following:

<q:qbe xmlns:q="http://marklogic.com/appservices/querybyexample">
  <q:query>
    ... your query by example ...
  </q:query>
  <q:response>
    <q:snippet><q:none/></q:snippet>
    <q:extract><title/></q:extract>
  </q:response>
</q:qbe>

If I recall correctly, NITF doesn't use a namespace, but if it did, you'd have to qualify title with the prefix.

To expand on the fine answer from @dave-cassel: since MarkLogic version 8, the <search:extract-metadata> option is deprecated and you should use search:extract-document-data instead. This example is lifted directly from the API docs:

<search:extract-document-data selected="include">
  <search:extract-path xmlns="">/userName</search:extract-path>
</search:extract-document-data>
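Adapted to the documents in the question, the options might look like the sketch below. The path `/nitf/head/title` is an assumption based on the sample document shown in the question and on NITF not using a namespace. If your documents are namespaced, the path steps would need prefixes bound on the extract-path element.

```xml
<search:options xmlns:search="http://marklogic.com/appservices/search">
  <!-- Suppress snippets, as in the earlier answer -->
  <search:transform-results apply="empty-snippet"/>
  <!-- Include only the nodes selected by the path in each result -->
  <search:extract-document-data selected="include">
    <!-- Assumed path, taken from the sample <nitf> document in the question -->
    <search:extract-path>/nitf/head/title</search:extract-path>
  </search:extract-document-data>
</search:options>
```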

More: https://docs.marklogic.com/search:search#opt-extract-document-data
