
Batch search for documents in Elasticsearch

I am using this hopelessly inefficient code to establish if a document is already indexed:

foreach (var entry in dic)
{
    var response = client.Search<Document>(s => s.Query(q => q.QueryString(d => 
    d.Query(string.Format("{0}", entry.Key)))));

    if (response.Documents.Count == 0)
    {
        not_found++;
    }
    else
    {
        found++;
    }
}

Is it possible to send several entry.Key values in one batch, rather than hitting the endpoint once for every ID (entry.Key)? Thanks.

Sure!

You can use a terms query, which matches any document whose field contains one of the supplied values:

client.Search<Document>(s => s.Query(
  q => q.Terms(
    c => c
      .Field(doc => doc.Id)
      .Terms(keys))));

If you are specifically looking for IDs, you can use the ids query:

client.Search<Document>(s => s.Query(
  q => q.Ids(c => c.Values(keys)))
);

If you are only interested in whether or not the documents have been indexed, consider limiting the returned fields to only the ID field so you don't waste bandwidth returning the full documents. Keep in mind that a search returns at most 10 hits by default, so set Size to the number of keys if you need every match back:

var response = client.Search<Document>(s => s
  .Query(q => q.Ids(c => c.Values(keys)))  // look for these IDs
  .StoredFields(sf => sf.Fields(doc => doc.Id))  // return only the Id field
);
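
To work out which keys are missing, rather than just how many, the IDs of the returned hits can be compared against the requested keys. The following is a minimal sketch under a few assumptions: the keys are strings, the hit IDs have already been collected from the response (with NEST, something like response.Hits.Select(h => h.Id)), and the query was sent with a Size large enough to return every match. IdLookup and Partition are hypothetical names, not part of NEST.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class IdLookup
{
    // Splits the requested keys into those that appear among the returned
    // hit IDs and those that do not. Hypothetical helper, not a NEST API.
    public static (List<string> Found, List<string> Missing) Partition(
        IEnumerable<string> requestedKeys, IEnumerable<string> returnedIds)
    {
        var returned = new HashSet<string>(returnedIds);
        var found = new List<string>();
        var missing = new List<string>();

        // Distinct() keeps a key that was requested twice from being counted twice.
        foreach (var key in requestedKeys.Distinct())
        {
            if (returned.Contains(key)) found.Add(key);
            else missing.Add(key);
        }
        return (found, missing);
    }
}
```

The hash set keeps the comparison linear in the number of keys, which matters once the batch grows beyond a handful of IDs.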

Lastly, if you're only interested in the number of matching documents, then you can ask Elasticsearch to not return any results, and only use the response metadata to count how many documents matched:

var response = client.Search<Document>(s => s
  .Query(q => q.Ids(c => c.Values(keys)))  // look for these IDs
  .Size(0)  // return 0 hits
);
found += (int)response.Total; // Total is a long: the number of matching documents
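
If dic holds a very large number of keys, sending them all in one request may run into request-size or term-count limits, so it can help to split the keys into fixed-size batches and sum the per-batch totals. The sketch below assumes string keys; BatchCounter, Chunk, CountFound and the countMatches delegate are hypothetical names introduced for illustration. The delegate stands in for one Size(0) search and is expected to return that response's Total.

```csharp
using System;
using System.Collections.Generic;

public static class BatchCounter
{
    // Yields consecutive slices of at most batchSize keys.
    public static IEnumerable<List<string>> Chunk(IEnumerable<string> keys, int batchSize)
    {
        if (batchSize <= 0) throw new ArgumentOutOfRangeException(nameof(batchSize));

        var batch = new List<string>(batchSize);
        foreach (var key in keys)
        {
            batch.Add(key);
            if (batch.Count == batchSize)
            {
                yield return batch;
                batch = new List<string>(batchSize);
            }
        }
        // Emit the final, possibly shorter, batch.
        if (batch.Count > 0) yield return batch;
    }

    // Sums the match counts over all batches. countMatches is an assumption:
    // it should issue one search for the given batch and return its total.
    public static long CountFound(
        IEnumerable<string> keys,
        int batchSize,
        Func<IReadOnlyCollection<string>, long> countMatches)
    {
        long found = 0;
        foreach (var batch in Chunk(keys, batchSize))
        {
            found += countMatches(batch);
        }
        return found;
    }
}
```

With NEST, countMatches would wrap the Size(0) query shown above and return response.Total; not_found is then the number of distinct keys minus the returned sum. The batch size is a tuning choice, not a value taken from the original answer.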
