
How to use msearch() with "q" in ElasticSearch?

I've been using the standard Python ElasticSearch client to make single requests in the following format:

es.search(index='my_index', q=query, size=5, search_type='dfs_query_then_fetch')

I now want to run these queries in batch for multiple query strings q.

I've seen this question explaining how to use the msearch() functionality to do queries in batch. However, msearch requires the full JSON-formatted request body for each request. I'm not sure which fields in the request body correspond to the q, size, and search_type parameters of search(), which seem to be API shortcuts specific to the single-request search().

How can I use msearch but specify q, size, and search_type?

I read through the API and figured out how to batch simple search queries:

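import json
from typing import List

from elasticsearch import Elasticsearch


def msearch(
        es: Elasticsearch,
        max_hits: int,
        query_strings: List[str],
        index: str
    ):
    search_arr = []

    for q in query_strings:
        # Header line: which index this search targets.
        search_arr.append({'index': index})
        # Body line: the search request itself.
        search_arr.append(
            {
                # Equivalent of the `q` parameter (Lucene query string syntax).
                "query": {
                    "query_string": {
                        "query": q
                    }
                },
                # Equivalent of the `size` parameter.
                'size': max_hits
            }
        )

    # msearch expects newline-delimited JSON: one object per line,
    # with a newline after the last line as well.
    request = '\n'.join(json.dumps(x) for x in search_arr) + '\n'
    resp = es.msearch(body=request)
    return resp


# `es` is an existing Elasticsearch client instance.
msearch(es, query_strings=['query 1', 'query 2'], max_hits=1, index='my_index')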
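
The function above covers q and size but not search_type. As far as I can tell from the multi search API docs, search_type can be set per search in the header line next to index. The following is a minimal sketch under that assumption; build_msearch_body is a hypothetical helper name, not part of the client, and it only builds the request without contacting a cluster:

```python
from typing import Any, Dict, List


def build_msearch_body(
    query_strings: List[str],
    index: str,
    max_hits: int,
    search_type: str = "dfs_query_then_fetch",
) -> List[Dict[str, Any]]:
    """Build alternating header/body entries for a multi-search request."""
    body: List[Dict[str, Any]] = []
    for q in query_strings:
        # Header line: per-search options (index and, by assumption, search_type).
        body.append({"index": index, "search_type": search_type})
        # Body line: the query string plus the maximum number of hits.
        body.append({"query": {"query_string": {"query": q}}, "size": max_hits})
    return body


searches = build_msearch_body(["query 1", "query 2"], index="my_index", max_hits=5)
# With a live cluster this would be sent as: es.msearch(body=searches)
```

The client versions I have used accept a list of dicts for body and serialize it to newline-delimited JSON themselves, so the manual json.dumps join should not be needed in that case; check this against the client version you run.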

EDIT: For my use case, I made one more improvement: I didn't want the entire document returned in the result. For my purpose, I just needed the document ID and its score.

So the final search request object part looked like this, including the '_source': False bit:

        search_arr.append(
            {
                # Queries `q` using Lucene syntax.
                "query": {
                    "query_string": {
                        "query": q
                    },
                },
                # Don't return the full profile string, etc. with the result.
                # We just want the ID and the score.
                '_source': False,
                # Only return `max_hits` documents.
                'size': max_hits 
            }
        )
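
To read the IDs and scores back out, something like the sketch below should work. It assumes the usual msearch response shape (a top-level "responses" list, in request order, each entry with hits.hits or an "error" key); ids_and_scores and the sample response are made up for illustration:

```python
from typing import Any, Dict, List, Tuple


def ids_and_scores(resp: Dict[str, Any]) -> List[List[Tuple[str, float]]]:
    """Return one list of (document ID, score) pairs per search."""
    results: List[List[Tuple[str, float]]] = []
    for item in resp["responses"]:
        # A failed search carries an "error" key instead of hits.
        if "error" in item:
            results.append([])
            continue
        results.append([(hit["_id"], hit["_score"]) for hit in item["hits"]["hits"]])
    return results


# Hand-written sample shaped like an msearch response (not real output).
sample = {
    "responses": [
        {"hits": {"hits": [{"_id": "a", "_score": 1.5}, {"_id": "b", "_score": 0.7}]}},
        {"error": {"type": "index_not_found_exception"}, "status": 404},
    ]
}
pairs = ids_and_scores(sample)
```

Failed searches are mapped to an empty list here so the output stays aligned with the order of the query strings; raising instead may suit other use cases better.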
