
How to get document from elastic search with partial query string?

I have three documents indexed with title "manage", "manager", and "management".

I am searching with the following query:

{
  "query": {
    "query_string": {
      "query": "manage*",
      "fields": ["title"]
    }
  }
}

I get the same score for all three documents. I want the document with "title": "manage" first, then "manager", then "management".

The query below matches the documents whose name starts with manage, and because a boost is applied to the exact term manage, the document containing exactly manage gets a higher score than the other documents.

To learn more about the Query String Query, refer to the Elasticsearch documentation.

Index Data

{ "name":"manage" }
{ "name":"manager"}
{ "name":"management"}

Search Query

{
  "query": {
    "query_string": {
      "fields": [
        "name"
      ],
      "query": "manage* manage^2"
    }
  }
}

Search Result:

"hits": [
        {
            "_index": "my_index",
            "_type": "_doc",
            "_id": "1",
            "_score": 3.3263016,
            "_source": {
                "name": "manage"
            }
        },
        {
            "_index": "my_index",
            "_type": "_doc",
            "_id": "2",
            "_score": 1.0,
            "_source": {
                "name": "manager"
            }
        },
        {
            "_index": "my_index",
            "_type": "_doc",
            "_id": "3",
            "_score": 1.0,
            "_source": {
                "name": "management"
            }
        }
    ]

Edit 1:

If one more document is indexed:

{ "name":"managers" }

Search Query:

{
  "query": {
    "query_string": {
      "query": "manage~"
    }
  }
}

Search Result:

"hits": [
            {
                "_index": "my_index",
                "_type": "_doc",
                "_id": "1",
                "_score": 0.87546873,
                "_source": {
                    "name": "manage"   
                }
            },
            {
                "_index": "my_index",
                "_type": "_doc",
                "_id": "2",
                "_score": 0.7295572,  -->score is different 
                "_source": {
                    "name": "manager"
                }
            },
            {
                "_index": "my_index",
                "_type": "_doc",
                "_id": "4",
                "_score": 0.58364576,
                "_source": {
                    "name": "managers"
                }
            }
        ]

In your case, management is at an edit distance of 4 from manage, which is more than 2 (manage -> managem -> manageme -> managemen -> management). A fuzzy query allows at most two edits.

Therefore management does not match the search query above. All the other words (edit distance <= 2) match, each with a different score.
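The edit-distance reasoning above can be checked with a small Python sketch. It uses plain Levenshtein distance (insertions, deletions, substitutions); Elasticsearch's fuzzy matching also counts transpositions by default, but that makes no difference for these particular words. The function name edit_distance and the constant MAX_EDITS are illustrative only, not part of any Elasticsearch API.

```python
def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute, or keep if equal
            ))
        prev = curr
    return prev[-1]


MAX_EDITS = 2  # assumption: the two-edit limit cited in the answer

for word in ["manage", "manager", "managers", "management"]:
    d = edit_distance("manage", word)
    print(word, d, "match" if d <= MAX_EDITS else "no match")
```

Only management exceeds the two-edit limit, which is consistent with it being absent from the search result above.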

There are two ways to achieve what you want. The easiest one to try out is script-based sorting, where the script returns a sort value equal to the length of the title:

GET test/_search
{
  "sort": {
    "_script": {
      "type": "number",
      "script": {
        "lang": "painless",
        "source": "doc['title.keyword'].value.length()"
      },
      "order": "asc"
    }
  },
  "query": {
    "query_string": {
      "query": "manage*",
      "fields": [
        "title"
      ]
    }
  }
}

Note: if you don't have the title.keyword field, you can change your script to work from the source directly:

params._source['title'].length()

You'll get manage (with a sort value of 6), then manager (with a sort value of 7), and then management (with a sort value of 10).

The other way to achieve this is to index an additional integer field (e.g. titleLength) holding the length of the title field, and to sort by titleLength.
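As a rough sketch of that second approach, the Python below adds a titleLength field to each document before indexing and builds a search body that sorts on it. No Elasticsearch client is used here; the helper with_title_length and the local preview sort are illustrative assumptions, and the dictionaries would still need to be sent to the cluster with whatever client you use.

```python
import json

titles = ["management", "manage", "manager"]


def with_title_length(title: str) -> dict:
    """Build the document to index, storing the title length alongside it."""
    return {"title": title, "titleLength": len(title)}


docs = [with_title_length(t) for t in titles]

# Search body: same query_string as in the question, sorted by the stored length.
search_body = {
    "query": {"query_string": {"query": "manage*", "fields": ["title"]}},
    "sort": [{"titleLength": {"order": "asc"}}],
}

# Local preview of the ordering an ascending sort on titleLength would give.
preview = sorted(docs, key=lambda d: d["titleLength"])

print(json.dumps(search_body, indent=2))
print([d["title"] for d in preview])
```

Compared with the script-based sort, this moves the length computation to index time, so nothing has to be evaluated per document at query time.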


 