Fetching documents under some condition in the Elasticsearch Java API

Question

As far as I know, we can parse documents in Elasticsearch, and when we search for a keyword it returns the matching documents using this Java API code:

  org.elasticsearch.action.search.SearchResponse searchHits =  node.client()
            .prepareSearch()
            .setIndices("indices")
            .setQuery(qb)
            .setFrom(0).setSize(1000)
            .addHighlightedField("file.filename")
            .addHighlightedField("content")
            .addHighlightedField("meta.title")
            .setHighlighterPreTags("<span class='badge badge-info'>")
            .setHighlighterPostTags("</span>")
            .addFields("*", "_source")
            .execute().actionGet();

Now my question is: suppose some documents contain strings like these:

Jun 2010 to Sep 2011                First Document          
Jun 2009 to Aug 2011                Second Document             
Nov 2011 – Sep 2012                 Third Document   
Nov  2012- Sep 2013                 Forth Document   
Nov 2013 – Current                  First Document   
June 2014 – Feb 2015                Third Document   
Jan 2013 – Jan 2014                 Second Document   
July 2008 – Oct 2012                First Document   
May 2007 – Current                  Forth Document

Now I want the documents whose periods fall within these ranges:

1 to 12 months
13-24 months
26-48 months

How can I do this?

Answer 1

When documents are indexed in this form, Elasticsearch will not be able to parse those strings as dates correctly. Even if you transform those strings into correctly formatted timestamps, the only way to perform the query you propose is to index the documents in this format

{
  "start": "2010-09",
  "end": "2011-10",
  // rest of the document
}

and subsequently run a script-filtered query over them, with a script, written in one of the scripting languages Elasticsearch provides, that calculates the difference between those two dates. Bear in mind that script filtering and scoring are always much slower than a simple index lookup.
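The transformation into start and end values is not spelled out above. As a rough sketch of that step, assuming the period text has already been separated from the rest of each line, something like the following could normalize the strings from the question before indexing. `PeriodParser`, `parse` and the `current` parameter are names made up for this illustration; the code uses only `java.time` and `java.util.regex`, not any Elasticsearch API.

```java
import java.time.YearMonth;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Elasticsearch: turns free-text periods such as
// "Jun 2010 to Sep 2011" into "yyyy-MM" start/end values before indexing.
class PeriodParser {
    private static final List<String> MONTHS = Arrays.asList(
            "jan", "feb", "mar", "apr", "may", "jun",
            "jul", "aug", "sep", "oct", "nov", "dec");

    // <month> <year>, a separator ("to", hyphen or en dash), then <month> <year> or "Current".
    private static final Pattern PERIOD = Pattern.compile(
            "\\s*([A-Za-z]+)\\s+(\\d{4})\\s*(?:to|[-\\u2013])\\s*"
                    + "(?:([A-Za-z]+)\\s+(\\d{4})|current)\\s*",
            Pattern.CASE_INSENSITIVE);

    // Returns {start, end} as "yyyy-MM"; an end of "Current" resolves to the supplied month.
    static String[] parse(String period, YearMonth current) {
        Matcher m = PERIOD.matcher(period);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognized period: " + period);
        }
        YearMonth start = toYearMonth(m.group(1), m.group(2));
        YearMonth end = (m.group(3) == null) ? current : toYearMonth(m.group(3), m.group(4));
        return new String[] { start.toString(), end.toString() };
    }

    // Accepts abbreviated ("Jun") and full ("June") English month names.
    private static YearMonth toYearMonth(String monthName, String year) {
        String key = monthName.toLowerCase(Locale.ROOT);
        int index = key.length() < 3 ? -1 : MONTHS.indexOf(key.substring(0, 3));
        if (index < 0) {
            throw new IllegalArgumentException("Unknown month: " + monthName);
        }
        return YearMonth.of(Integer.parseInt(year), index + 1);
    }

    public static void main(String[] args) {
        String[] range = parse("Nov  2012- Sep 2013", YearMonth.now());
        System.out.println(range[0] + " .. " + range[1]); // 2012-11 .. 2013-09
    }
}
```

Passing the current month in explicitly, rather than calling `YearMonth.now()` inside `parse`, keeps the open-ended "Current" periods reproducible when documents are re-indexed.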

A much faster and cleaner way to do this is to index the duration of the period alongside the start and end dates, like so:

{
  "start": "2010-09",
  "end": "2011-10",
  "duration": 13,
  // the rest of the document
}
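The `duration` value has to be computed on the client before the document is sent. A minimal sketch of that calculation, assuming `start` and `end` are already in the `yyyy-MM` form shown above (`DurationCalculator` and `monthsBetween` are illustrative names, not an Elasticsearch API):

```java
import java.time.YearMonth;
import java.time.temporal.ChronoUnit;

// Hypothetical helper: derives the "duration" field from the start and end months.
class DurationCalculator {
    // Whole months from start to end, both given as "yyyy-MM" (the end month is not counted).
    static long monthsBetween(String start, String end) {
        return ChronoUnit.MONTHS.between(YearMonth.parse(start), YearMonth.parse(end));
    }

    public static void main(String[] args) {
        System.out.println(monthsBetween("2010-09", "2011-10")); // 13
    }
}
```

This reproduces the 13 in the example document. Whether the end month should count as a full month (which would add 1) is a modelling choice the answer does not settle.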

If you index your documents in this form, you can simply perform a filtered query on the duration field:

{
   "query":{
      "filtered":{
         "filter":{
            "and":[
               {
                  "range":{
                     "duration":{
                        "gte":1
                     }
                  }
               },
               {
                  "range":{
                     "duration":{
                        "lte":12
                     }
                  }
               }
            ]
         }
      }
   }
}
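For the three ranges in the question, the same body can be generated once per range. The sketch below is plain string building in Java with made-up names (`DurationQuery`, `body`); it folds the two `and`-ed range clauses into a single `range` with both bounds, which should match the same documents. How the body is then sent to the cluster is left out. Note that the `filtered` query belongs to the Elasticsearch 1.x line this answer targets; later major versions replaced it with a `bool` query and a `filter` clause, so the generated body is tied to that era.

```java
// Hypothetical helper: builds the request body for one duration range.
class DurationQuery {
    // Filtered query matching minMonths <= duration <= maxMonths (Elasticsearch 1.x syntax).
    static String body(int minMonths, int maxMonths) {
        return "{\"query\":{\"filtered\":{\"filter\":{\"range\":{\"duration\":"
                + "{\"gte\":" + minMonths + ",\"lte\":" + maxMonths + "}}}}}}";
    }

    public static void main(String[] args) {
        // The ranges exactly as listed in the question.
        int[][] ranges = { { 1, 12 }, { 13, 24 }, { 26, 48 } };
        for (int[] range : ranges) {
            System.out.println(body(range[0], range[1]));
        }
    }
}
```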
