
Constrain a search:search query to a specific element

I'm having trouble specifying search options so that the search only looks inside a specific XML element within my files. Here is the file I am searching:

<file>
  <title>red</title>
  <info>
    <section>blurbs</section>
    <section>words</section>
  </info>
  <info>
    <section>first</section>
    <section>this</section>
  </info>
  <info>
    <section>blue</section>
    <section>green</section>
  </info>
  <info>
    <section>red</section>
    <section>yellow</section>
  </info>
</file>

The search:search query I'm using is:

xquery version "1.0-ml";
import module namespace search = "http://marklogic.com/appservices/search"
    at "/MarkLogic/appservices/search/search.xqy";
let $options :=
  <options xmlns="http://marklogic.com/appservices/search">
    <additional-query>
      <cts:document-query depth="infinity" xmlns:cts="http://marklogic.com/cts">
        <cts:uri>/test_data/test_search.xml</cts:uri>
      </cts:document-query>
    </additional-query>
    <extract-document-data selected="include">
      <extract-path>/file/info</extract-path>
    </extract-document-data>
    <constraint>
      <word>
        <element name="info"/>
      </word>
    </constraint> 
    <search-option>filtered</search-option>
  </options>
let $results := search:search("red", $options)
return $results

the $results variable contains:

<search:response snippet-format="snippet" total="1" start="1" page-length="10" selected="include" xmlns:search="http://marklogic.com/appservices/search">
  <search:result index="1" uri="/test_data/test_search.xml" path="fn:doc(&quot;/test_data/test_search.xml&quot;)" score="8448" confidence="0.4065818" fitness="0.8925228">
    <search:snippet>
      <search:match path="fn:doc(&quot;/test_data/test_search.xml&quot;)/file">
        <search:highlight>red
        </search:highlight>
      </search:match>
      <search:match path="fn:doc(&quot;/test_data/test_search.xml&quot;)/file/info[4]">
        <search:highlight>red
        </search:highlight>
      </search:match>
    </search:snippet>
    <search:extracted kind="element">
      <info>
        <section>blurbs
        </section>
        <section>words
        </section>
      </info>
      <info>
        <section>first
        </section>
        <section>this
        </section>
      </info>
      <info>
        <section>blue
        </section>
        <section>green
        </section>
      </info>
      <info>
        <section>red
        </section>
        <section>yellow
        </section>
      </info>
    </search:extracted>
  </search:result>
  <search:qtext>red
  </search:qtext>
  <search:metrics>
    <search:query-resolution-time>PT0.00166S
    </search:query-resolution-time>
    <search:snippet-resolution-time>PT0.000992S
    </search:snippet-resolution-time>
    <search:extract-resolution-time>PT0.00049S
    </search:extract-resolution-time>
    <search:total-time>PT0.003748S
    </search:total-time>
  </search:metrics>
</search:response>

As you can see, "red" matches in both the title and info elements, but I only want to search within the info element. What am I doing wrong here?

EDIT: I have a basic understanding of constraint searching, e.g. search:search("title:red"), but what happens when the constraint value is multiple words?

When creating a constraint, you should assign a name to it, as in:

<constraint name="inf">

The name is what makes it possible to tag terms in the query text, as in inf:red.
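As a minimal sketch of the named constraint in use, the options below reuse the word constraint from the question and give it the name inf. The ns="" attribute is an assumption that the info element is in no namespace, as in the sample file. The quoted phrase in the second call is a placeholder to show how a multi-word value can be passed; whether it matches depends on the document content and on phrase settings.

```xquery
xquery version "1.0-ml";
import module namespace search = "http://marklogic.com/appservices/search"
    at "/MarkLogic/appservices/search/search.xqy";

(: "inf" is the prefix that tags a term in the query text, e.g. inf:red :)
let $options :=
  <options xmlns="http://marklogic.com/appservices/search">
    <constraint name="inf">
      <word>
        <element ns="" name="info"/>
      </word>
    </constraint>
  </options>
return (
  (: single word, scoped to the info element :)
  search:search("inf:red", $options),
  (: multi-word value: quote it so it is parsed as one phrase;
     "two words" is a placeholder, not text from the sample file :)
  search:search('inf:"two words"', $options)
)
```

With the default grammar, an untagged term such as plain red is still searched across the whole document, which is why the title match appeared in the original results.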

For more detail see:

You can also specify a default treatment for untagged search terms using the search:term element:
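A sketch of what such a default could look like, assuming the goal is for an untagged term like red to be treated as a word query on the info element. The element name and the empty namespace come from the question's sample file; treat the exact options layout as something to check against the Search API documentation for your MarkLogic version.

```xquery
xquery version "1.0-ml";
import module namespace search = "http://marklogic.com/appservices/search"
    at "/MarkLogic/appservices/search/search.xqy";

(: untagged terms fall through to the word query defined under term/default :)
let $options :=
  <options xmlns="http://marklogic.com/appservices/search">
    <term>
      <default>
        <word>
          <element ns="" name="info"/>
        </word>
      </default>
    </term>
  </options>
return search:search("red", $options)
```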

To understand the query generated from query text, it can be helpful to set the debug or return-query options to true:
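For illustration, the sketch below turns on both options and also calls search:parse, which returns the query built from the query text without running the search. The inf constraint is the same assumed definition as in the question's options, with a name added.

```xquery
xquery version "1.0-ml";
import module namespace search = "http://marklogic.com/appservices/search"
    at "/MarkLogic/appservices/search/search.xqy";

let $options :=
  <options xmlns="http://marklogic.com/appservices/search">
    <constraint name="inf">
      <word>
        <element ns="" name="info"/>
      </word>
    </constraint>
    <return-query>true</return-query>
    <debug>true</debug>
  </options>
return (
  (: the response now carries the generated query and debug output :)
  search:search("inf:red", $options),
  (: inspect the parsed query directly, without executing a search :)
  search:parse("inf:red", $options)
)
```

Comparing the parsed query for red and for inf:red shows whether the term is being scoped to the element or applied to the whole document.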

By the way, you can use fn:doc() to retrieve any document and use XPath or search:snippet() to extract nodes from a document. The search:search() function isn't designed for retrieving a document by URI.
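A minimal sketch of that alternative, assuming the URI and structure from the question: fetch the document directly and use an XPath predicate to keep only the info elements that contain the word.

```xquery
xquery version "1.0-ml";

(: fetch by URI, then keep only info elements containing the word "red" :)
fn:doc("/test_data/test_search.xml")
  /file/info[cts:contains(., cts:word-query("red"))]
```

This returns the matching info elements themselves, so no extract-document-data or additional-query options are needed.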

Finally, if possible, you might want to modify the document model. MarkLogic can provide more useful indexing for documents where:

  • The documents are granular with a focus on a single entity instead of a list of entities
  • The element names reflect the semantics of the data (instead of using generic element names)

Hoping that helps,
