
Option "wildcarded" for "jsonPropertyValueQuery" ctsquery gives different results via the REST API and in MarkLogic Query Console

We recently pulled a local instance of MarkLogic from Docker Hub using the following bash command:

docker run --name marklogic-test -d -it -p 8000:8000 -p 8001:8001 -p 8002:8002 \
 -e MARKLOGIC_INIT=true \
 -e MARKLOGIC_ADMIN_USERNAME=admin \
 -e MARKLOGIC_ADMIN_PASSWORD='Areally!PowerfulPassword1337' \
 marklogicdb/marklogic-db:10.0-9.4-centos-1.0.0-ea4

The "Documents" database contains only two documents:

sample1.json

{
   "v1": "1234",
   "v2": "ABCD",
   "v3": "0123456789" 
}

sample2.json

{
   "v1": "5678", 
   "v2": "EFGH", 
   "v3": "9876543210"
}

If we run the following XQuery in Query Console:

xquery version "1.0-ml";

let $query := cts:and-query((
  cts:directory-query("/", "infinity"),
  cts:json-property-value-query("v3", "01*", ("wildcarded", "whitespace-sensitive", "punctuation-sensitive"))
))

return (xdmp:to-json($query), cts:search(/,$query))

the result is as expected: the serialized query, followed by only one document:

{
    "andQuery": {
        "queries": [
            {
                "directoryQuery": {
                    "uris": [
                        "/"
                    ],
                    "depth": "infinity"
                }
            },
            {
                "jsonPropertyValueQuery": {
                    "property": [
                        "v3"
                    ],
                    "value": [
                        "01*"
                    ],
                    "options": [
                        "punctuation-sensitive",
                        "whitespace-sensitive",
                        "wildcarded",
                        "lang=en"
                    ]
                }
            }
        ]
    }
}

Matching document:

{
    "v1": "1234",
    "v2": "ABCD",
    "v3": "0123456789"
}

But if we make a REST API request:

curl --location --request POST 'http://localhost:18000/LATEST/search?format=json' --user 'admin:Areally!PowerfulPassword1337' --header 'Content-Type: application/json' --data-binary "@test_api.json"

where test_api.json has the following content:

{
    "search": {
        "ctsquery": {
            "andQuery": {
                "queries": [
                    {
                        "directoryQuery": {
                            "uris": [
                                "/"
                            ],
                            "depth": "1"
                        }
                    },
                    {
                        "jsonPropertyValueQuery": {
                            "property": [
                                "v3"
                            ],
                            "value": [
                                "01*"
                            ],
                            "options": [
                                "punctuation-sensitive",
                                "wildcarded",
                                "whitespace-sensitive",
                                "lang=en"
                            ]
                        }
                    }
                ]
            }
        },
        "options": {
            "return-plan": false,
            "return-metrics": true,
            "return-facets": true,
            "return-query": false,
            "transform-results": {
                "apply": "raw"
            },
            "page-length": 10
        }
    }
}

the response looks like this:

{
    "snippet-format": "snippet",
    "total": 2,
    "start": 1,
    "page-length": 10,
    "results": [
        {
            "index": 1,
            "uri": "/sample1.json",
            "path": "fn:doc(\"/sample1.json\")",
            "score": 0,
            "confidence": 0,
            "fitness": 0,
            "href": "/v1/documents?uri=%2Fsample1.json",
            "mimetype": "application/json",
            "format": "json",
            "matches": [
                {
                    "path": "fn:doc(\"/sample1.json\")/object-node()",
                    "match-text": [
                        "1234 ABCD 0123456789"
                    ]
                }
            ]
        },
        {
            "index": 2,
            "uri": "/sample2.json",
            "path": "fn:doc(\"/sample2.json\")",
            "score": 0,
            "confidence": 0,
            "fitness": 0,
            "href": "/v1/documents?uri=%2Fsample2.json",
            "mimetype": "application/json",
            "format": "json",
            "matches": [
                {
                    "path": "fn:doc(\"/sample2.json\")/object-node()",
                    "match-text": [
                        "5678 EFGH 9876543210"
                    ]
                }
            ]
        }
    ],
    "metrics": {
        "query-resolution-time": "PT0.000624S",
        "snippet-resolution-time": "PT0.005684S",
        "total-time": "PT0.00713S"
    }
}

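For some reason, both documents are returned, even though "confidence" is 0. How should we understand this behavior of the MarkLogic search engine? Is it a bug in the MarkLogic REST API, or are we missing something?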

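In Query Console, compare the results you get when you apply the "filtered" and "unfiltered" options to the search.

  • The MarkLogic REST API runs "unfiltered" queries by default.

  • cts:search() runs "filtered" queries by default.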

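Filtered searches (the default). Filtered searches eliminate any false-positive matches and properly resolve cases where there are multiple candidate matches within the same fragment. Filtered search results fully satisfy the specified cts:query.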

https://docs.marklogic.com/guide/performance/unfiltered#id_89797

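An unfiltered search omits the filtering step, which validates that each candidate fragment result actually meets the search criteria. Unfiltered searches are therefore guaranteed to be fast, while filtered searches are guaranteed to be accurate. By default, searches are filtered; you must pass the "unfiltered" option to cts:search to run an unfiltered search.

In this case both documents come back as candidates from index resolution, and only the filtering step removes /sample2.json. That is why Query Console returns one document while the REST API reports a total of 2.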
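
The comparison suggested above can be sketched in Query Console roughly as follows. This is a minimal sketch, not a definitive diagnosis: it reuses the query from the question and assumes it runs against the same "Documents" database with its index settings unchanged. The labels "filtered" and "unfiltered" in the output strings are only for readability.

```xquery
xquery version "1.0-ml";

(: Same query as in the question. :)
let $query := cts:and-query((
  cts:directory-query("/", "infinity"),
  cts:json-property-value-query("v3", "01*",
    ("wildcarded", "whitespace-sensitive", "punctuation-sensitive"))
))

return (
  (: Filtered (the cts:search default): each candidate fragment from index
     resolution is checked against the query, so false positives are dropped. :)
  fn:concat("filtered: ",
    fn:count(cts:search(fn:doc(), $query, "filtered"))),

  (: Unfiltered: results come straight from index resolution, so any
     false-positive candidates stay in the result. :)
  fn:concat("unfiltered: ",
    fn:count(cts:search(fn:doc(), $query, "unfiltered")))
)
```

If the unfiltered count is higher than the filtered one, the extra documents are false positives that the indexes alone could not rule out, which would be consistent with the REST API response shown in the question.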
