
How to set up date and fuzzy title search on Elasticsearch

I am building a Rails 5 app with an Angular 7 frontend. In this app I am using Searchkick (an Elasticsearch gem), and I have indexed a model called Event that has the attributes title (string) and starts_at (datetime).

I want to be able to build a query in the search controller where I am able to do the following:

  1. Search the title with a fuzzy search, meaning it does not have to match 100% (which it currently requires).
  2. Search with a date range matching starts_at for the indexed Events.

This is my controller index method

def index
    args = {}
    args[:eventable_id] = params[:id]
    args[:eventable_type] = params[:type]
    args[:title] = params[:title] if params[:title].present?
    if params[:starts_at].present?
        args[:starts_at] = {}
        args[:starts_at][:gte] = params[:starts_at].to_date.beginning_of_day
        args[:starts_at][:lte] = params[:ends_at].to_date.end_of_day
    end
    @events = Event.search where: args, page: params[:page], per_page: params[:per_page]
end

I have added this line to my Event model

searchkick text_middle: [:title]

This is the actual query that is run

{
    "query": {
        "bool": {
            "must": {
                "match_all": {}
            },
            "filter": [{
                "term": {
                    "eventable_id": "2"
                }
            }, {
                "term": {
                    "eventable_type": "Space"
                }
            }, {
                "term": {
                    "title": "nice event"
                }
            }, {
                "range": {
                    "starts_at": {
                        "from": "2020-02-01T00:00:00.000Z",
                        "include_lower": true,
                        "to": "2020-02-29T23:59:59.999Z",
                        "include_upper": true
                    }
                }
            }]
        }
    },
    "timeout": "11s",
    "_source": false,
    "size": 10000
}

The date search does not work (although I get no errors), and the title search only matches when the title is identical, including case.

Thankful for all help!

Rather than using fuzzy queries, I would recommend an ngram analyzer.

Here is an example of an ngram analyzer:

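analyzer: {
    ngram_analyzer: {
        type: "custom",
        tokenizer: "standard",
        filter: ["lowercase", "ngram_filter"],
        # "replace_dots" is a custom char filter; define it under char_filter: in your analysis settings, or remove this key.
        char_filter: [
            "replace_dots"
        ]
    }
},
filter: {
    ngram_filter: {
        type: "ngram",
        min_gram: 3,
        max_gram: 20
    }
}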

You will also have to add this to your index settings, because the gap between max_gram and min_gram (20 - 3 = 17) is larger than the default limit:

 max_ngram_diff: 17
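
To see why this helps with partial matches, here is a rough plain-Ruby sketch of what a lowercase filter followed by an ngram filter emits for a single token. The `ngrams` method is purely illustrative (it is not part of Elasticsearch or Searchkick) and only approximates the real token filter:

```ruby
# Illustrative only: approximates the tokens a lowercase + ngram filter
# chain would index for one input token.
def ngrams(token, min_gram: 3, max_gram: 20)
  chars = token.downcase.chars
  grams = []
  (0...chars.length).each do |start|
    (min_gram..max_gram).each do |len|
      # Stop once the gram would run past the end of the token.
      break if start + len > chars.length
      grams << chars[start, len].join
    end
  end
  grams
end

MIN_GRAM = 3
MAX_GRAM = 20
# The index setting max_ngram_diff must be at least this large.
MAX_NGRAM_DIFF = MAX_GRAM - MIN_GRAM
```

A title token such as "Nice" is indexed as "nic", "nice" and "ice", so a lowercase or partial search term can still hit it. Tokens shorter than min_gram produce no grams at all in this sketch.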

Then, in your mapping, make sure you create two fields: one for the regular field, such as name (title in your case), and another for its ngram version, such as name.ngram .
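
Put together, the settings and mapping could look roughly like the hash below. This is a sketch under assumptions, not a tested configuration: it assumes Elasticsearch 7 or later (typeless mappings), uses your title field, models the ngram version as a multi-field named title.ngram, and leaves out the replace_dots char filter. `INDEX_BODY` is just an illustrative constant name.

```ruby
# Hypothetical index body: analysis settings plus a mapping in which
# "title" is analyzed normally and "title.ngram" uses the ngram analyzer.
INDEX_BODY = {
  settings: {
    index: {
      max_ngram_diff: 17, # max_gram - min_gram
      analysis: {
        analyzer: {
          ngram_analyzer: {
            type: "custom",
            tokenizer: "standard",
            filter: ["lowercase", "ngram_filter"]
          }
        },
        filter: {
          ngram_filter: { type: "ngram", min_gram: 3, max_gram: 20 }
        }
      }
    }
  },
  mappings: {
    properties: {
      title: {
        type: "text",
        fields: {
          # Sub-field addressed as "title.ngram" in queries.
          ngram: { type: "text", analyzer: "ngram_analyzer" }
        }
      }
    }
  }
}.freeze
```

Because no search_analyzer is set on title.ngram, the query text is split into ngrams as well, which makes matching looser; whether that is what you want is something to test against your data.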

In my query, I like to give my name field a boost of 10 and my name.ngram field a boost of 5 so that exact matches are ranked first. You will have to experiment with these values, though.
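
As a sketch of that boosting idea, the helper below builds a bool/should query with one match clause per field. The method name `title_query`, its keyword arguments, and the default field name title are all assumptions for illustration; only the boost values 10 and 5 come from the paragraph above.

```ruby
# Hypothetical helper: builds a query body that scores matches on the
# regular field higher than matches on its ngram sub-field.
def title_query(text, field: "title", exact_boost: 10, ngram_boost: 5)
  {
    query: {
      bool: {
        should: [
          { match: { field => { query: text, boost: exact_boost } } },
          { match: { "#{field}.ngram" => { query: text, boost: ngram_boost } } }
        ],
        # Require at least one of the two clauses to match.
        minimum_should_match: 1
      }
    }
  }
end
```

Calling `title_query("nice event")` returns a plain Ruby hash that could be sent as the search body by whichever client you use.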

As for your range query, I use gte and lte . Here is an example:

query:{
   bool: {
      must: {
          range: {date: {gte: params[:date], lte: params[:date], boost: 10}}
      }
   }
}
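
For a real date range you would normally pass different lower and upper bounds. The sketch below builds such a range clause from two date strings using only Ruby's standard library. The method name `starts_at_range`, the fallback to a single day when no end date is given, and the assumption that dates arrive as ISO 8601 strings and are interpreted as UTC are mine, not part of the example above.

```ruby
require "date"

# Hypothetical helper: turns "YYYY-MM-DD" strings into a range clause
# covering whole days, from the start of the first day to the end of
# the last. Times are assumed to be UTC.
def starts_at_range(starts_at, ends_at = nil, field: "starts_at")
  from = Date.iso8601(starts_at)
  # Fall back to a single-day range when no end date is supplied.
  to = ends_at.to_s.empty? ? from : Date.iso8601(ends_at)
  raise ArgumentError, "ends_at is before starts_at" if to < from

  {
    range: {
      field => {
        gte: "#{from.iso8601}T00:00:00.000Z",
        lte: "#{to.iso8601}T23:59:59.999Z"
      }
    }
  }
end
```

The returned hash can be placed in the must or filter part of a bool query. Handling the missing end date explicitly also avoids calling a method on nil when only a start date is sent.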

I hope this helps.
