
How to set up date and fuzzy title search on Elasticsearch

I am building a Rails 5 app with an Angular 7 frontend. In this app I am using Searchkick (an Elasticsearch gem), and I have indexed a model called Event that has the attributes title (string) and starts_at (datetime).

I want to build a query in the search controller that lets me do the following:

  1. Search the title with a fuzzy search, meaning it does not have to match 100% (which it currently requires).
  2. Search with a date range matching starts_at for the indexed Events.

This is my controller index method:

def index
    args = {}
    args[:eventable_id] = params[:id]
    args[:eventable_type] = params[:type]
    args[:title] = params[:title] if params[:title].present?
    if params[:starts_at].present?
        args[:starts_at] = {}
        args[:starts_at][:gte] = params[:starts_at].to_date.beginning_of_day
        args[:starts_at][:lte] = params[:ends_at].to_date.end_of_day
    end
    @events = Event.search where: args, page: params[:page], per_page: params[:per_page]
end

I have added this line to my Event model:

searchkick text_middle: [:title]

This is the actual query that is run:

{
    "query": {
        "bool": {
            "must": {
                "match_all": {}
            },
            "filter": [{
                "term": {
                    "eventable_id": "2"
                }
            }, {
                "term": {
                    "eventable_type": "Space"
                }
            }, {
                "term": {
                    "title": "nice event"
                }
            }, {
                "range": {
                    "starts_at": {
                        "from": "2020-02-01T00:00:00.000Z",
                        "include_lower": true,
                        "to": "2020-02-29T23:59:59.999Z",
                        "include_upper": true
                    }
                }
            }]
        }
    },
    "timeout": "11s",
    "_source": false,
    "size": 10000
}

The date search does not work (although I get no errors), and the title search must match 100%, including letter case.

Thankful for all help!

Rather than using fuzzy queries, I would recommend an ngram analyzer.

Here is an example of an ngram analyzer:

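analyzer: {
    ngram_analyzer: {
        type: "custom",
        tokenizer: "standard",
        filter: ["lowercase", "ngram_filter"],
        # "replace_dots" is a custom char filter; it must be defined
        # under char_filter in the same analysis settings, or removed.
        char_filter: [
            "replace_dots"
        ]
    }
},
filter: {
    ngram_filter: {
        type: "ngram",
        min_gram: 3,
        max_gram: 20
    }
}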

You will also have to add this to your index settings, because the difference between max_gram and min_gram above is 17:

 max_ngram_diff: 17
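
To show how the two pieces fit together, here is a minimal sketch that builds the index settings as a Ruby hash and derives max_ngram_diff from the gram sizes. The helper name ngram_index_settings is hypothetical, and the replace_dots char filter is left out so the sketch stays self-contained:

```ruby
# Sketch only: ngram_index_settings is a hypothetical helper, not a Searchkick
# or Elasticsearch API. It returns a hash shaped like an index settings body.
def ngram_index_settings(min_gram: 3, max_gram: 20)
  {
    index: {
      # Must be at least max_gram - min_gram for the ngram filter below.
      max_ngram_diff: max_gram - min_gram,
      analysis: {
        analyzer: {
          ngram_analyzer: {
            type: "custom",
            tokenizer: "standard",
            filter: ["lowercase", "ngram_filter"]
          }
        },
        filter: {
          ngram_filter: { type: "ngram", min_gram: min_gram, max_gram: max_gram }
        }
      }
    }
  }
end

settings = ngram_index_settings
puts settings[:index][:max_ngram_diff]
```

With the defaults this gives a max_ngram_diff of 17, matching the value above. If you change min_gram or max_gram, the setting follows automatically.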

Then, in your mapping, make sure you create two fields: one mapping for your regular field, such as name, and another mapping for your ngram field, such as name.ngram.
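
As a rough illustration of the two-field idea, the sketch below expresses such a mapping as a Ruby hash using a multi-field. It assumes the field is the question's title rather than name, that the analyzer is registered as ngram_analyzer, and that searching with the standard analyzer is acceptable; the helper name event_mapping is hypothetical:

```ruby
# Sketch only: event_mapping is a hypothetical helper that returns a hash
# shaped like an Elasticsearch mapping body.
def event_mapping
  {
    properties: {
      title: {
        type: "text", # regular analyzed field, queried as "title"
        fields: {
          # sub-field indexed with ngrams, queried as "title.ngram"
          ngram: {
            type: "text",
            analyzer: "ngram_analyzer",
            # Assumption: analyze the search text normally so the query
            # itself is not split into ngrams.
            search_analyzer: "standard"
          }
        }
      },
      starts_at: { type: "date" }
    }
  }
end

puts event_mapping[:properties][:title][:fields].keys.inspect
```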

In my query, I like to give my name field a boost of 10 and my name.ngram field a boost of 5 so that exact matches are ranked first. You will have to experiment with these values, though.
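
One way to express those boosts is a multi_match query with per-field boost suffixes. This is a sketch under the assumption that the fields are called title and title.ngram; the helper name boosted_title_query is hypothetical:

```ruby
# Sketch only: boosted_title_query is a hypothetical helper that returns a
# hash shaped like an Elasticsearch search body.
def boosted_title_query(text, exact_boost: 10, ngram_boost: 5)
  {
    query: {
      multi_match: {
        query: text,
        # "field^N" multiplies that field's score by N.
        fields: ["title^#{exact_boost}", "title.ngram^#{ngram_boost}"]
      }
    }
  }
end

puts boosted_title_query("nice event")[:query][:multi_match][:fields].inspect
```

The boost values are parameters here so they are easy to tune while you test against real data.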

Regarding your range query, I am using gte and lte. Here is an example:

query:{
   bool: {
      must: {
          range: {date: {gte: params[:date], lte: params[:date], boost: 10}}
      }
   }
}
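
Applied to the question's starts_at field, a range filter could be sketched as follows. This is plain Ruby without ActiveSupport, it assumes dates arrive as ISO 8601 strings in UTC, and it falls back to a single-day range when no end date is given. It uses gte with an exclusive lt on the following day instead of lte with an end-of-day timestamp. The helper names are hypothetical:

```ruby
require "date"

# Sketch only: hypothetical helpers that return hashes shaped like
# Elasticsearch query clauses. Assumes "YYYY-MM-DD" input and UTC.
def starts_at_filter(from, to = nil)
  first = Date.iso8601(from)
  # Fall back to a one-day range when no end date is supplied.
  last = (to.nil? || to.empty?) ? first : Date.iso8601(to)
  {
    range: {
      starts_at: {
        gte: "#{first.iso8601}T00:00:00Z",     # start of the first day
        lt: "#{(last + 1).iso8601}T00:00:00Z"  # start of the day after the last day
      }
    }
  }
end

def events_in_range_query(from, to = nil)
  { query: { bool: { filter: [starts_at_filter(from, to)] } } }
end

puts starts_at_filter("2020-02-01", "2020-02-29")[:range][:starts_at].inspect
```

If you stay with Searchkick's where option rather than a raw query, the same gte and lt keys can be used in the hash for starts_at.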

I hope this helps.
