
Numbers in cts:word-query in MarkLogic

I have a cts:word-query whose text value is a number: cts:search(fn:doc(),cts:word-query("226"))

This query returns only the documents that contain exactly 226, but I also need the documents that contain 00226.

Example: This is abc.xml

<a>
<b>00226</b>
</a>

This is abc1.xml

<a>
<b>226</b>
</a>

If I run the query cts:search(fn:doc(),cts:word-query("226")), it returns only abc1.xml, and if I run cts:search(fn:doc(),cts:word-query("00226")), it returns only abc.xml.

But I need to get both the documents, irrespective of leading zeros.

The simplest way would be to use a wildcard character (*) and add the "wildcarded" option:

cts:search(fn:doc(),cts:word-query("*226", ('wildcarded')))

EDIT:

Although this matches the example documents, as Kishan points out in the comments, the wildcard also matches unwanted documents (e.g. those containing "226226").

Since range indexes are not an option in this case because the data is mixed, here is an alternative hack:

cts:search(
    fn:doc(),
    cts:word-query(
        for $lead in ('', '0', '00', '000') 
        return $lead || "226"))

Obviously, this depends on how many leading zeros there can be, and it only works if that number is known and limited.
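If the maximum number of leading zeros needs to be configurable, the same idea can be wrapped in a small helper. This is only a sketch: local:zero-padded-variants and its $max-zeros parameter are hypothetical names introduced here for illustration, not part of the MarkLogic API, and the limit of 3 is an assumed upper bound.

```xquery
xquery version "1.0-ml";

(: Hypothetical helper (not a MarkLogic built-in): returns $number
   prefixed with 0, 1, ..., $max-zeros leading zeros. :)
declare function local:zero-padded-variants(
  $number as xs:string,
  $max-zeros as xs:integer
) as xs:string*
{
  for $n in (0 to $max-zeros)
  return fn:string-join(((for $i in (1 to $n) return "0"), $number), "")
};

(: cts:word-query accepts a sequence of strings and matches any of them.
   The value 3 is an assumed maximum number of leading zeros. :)
cts:search(
  fn:doc(),
  cts:word-query(local:zero-padded-variants("226", 3)))
```

With a limit of 3, the helper produces the same four variants as the inline loop above ("226", "0226", "00226", "000226"), so the limitation still applies: values with more leading zeros than the limit are not matched.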

Alternatively, you can add an element range index on the element <b> in the database with scalar type int or long. Both values are then indexed as the same number, so the following query should return both documents:

let $query := cts:element-range-query(xs:QName("b"),"=",00226)
return cts:search(fn:doc(),$query)
