Wildcard search or partial matching in Elasticsearch

I am trying to provide end users with search-as-you-type, much like what I can do in SQL Server. I was able to implement an ES query for this SQL scenario:

select * from table where name like '%pete%' and type != 'xyz' and type != 'abc'

But the ES query does not work for this SQL query:

select * from table where name like '%peter tom%' and type != 'xyz' and type != 'abc'

In my Elasticsearch query, along with the wildcard query, I also need to apply some boolean filters:

{
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "should": [
            {
              "query": {
                "wildcard": {
                  "name": { "value": "*pete*" }
                }
              }
            }
          ],
          "must_not": [
            {
              "match": { "type": "xyz" }
            },
            {
              "match": { "type": "abc" }
            }
          ]
        }
      }
    }
  }
}

The above Elasticsearch query with the wildcard search works fine and gets me all the documents that match pete and are not of type xyz or abc. But when I try to perform the wildcard search with two separate words separated by a space, the same query returns no results, as shown below. For example:

{
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "should": [
            {
              "query": {
                "wildcard": {
                  "name": { "value": "*peter tom*" }
                }
              }
            }
          ],
          "must_not": [
            {
              "match": { "type": "xyz" }
            },
            {
              "match": { "type": "abc" }
            }
          ]
        }
      }
    }
  }
}

My mapping is as follows:

{
  "properties": {
    "name": {
      "type": "string"
    },
    "type": {
      "type": "string"
    }
  }
}

What query should I use in order to make wildcard search possible for words separated by spaces?

The reason the second query returns nothing is that with the default string mapping the standard analyzer splits peter tomson into the separate terms peter and tomson, and a wildcard query is matched against individual terms, so no single indexed term contains the space in *peter tom*.

The most efficient solution involves leveraging an ngram tokenizer in order to tokenize portions of your name field. For instance, if you have a name like peter tomson, the ngram tokenizer will tokenize and index it like this:

  • pe
  • pet
  • pete
  • peter
  • peter t
  • peter to
  • peter tom
  • peter toms
  • peter tomso
  • eter tomson
  • ter tomson
  • er tomson
  • r tomson
  • tomson
  • tomson
  • omson
  • mson
  • son
  • on

So, once this has been indexed, searching for any of those tokens will retrieve your document with peter tomson in it.

Let's create the index:

PUT likequery
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_ngram_analyzer": {
          "tokenizer": "my_ngram_tokenizer"
        }
      },
      "tokenizer": {
        "my_ngram_tokenizer": {
          "type": "nGram",
          "min_gram": "2",
          "max_gram": "15"
        }
      }
    }
  },
  "mappings": {
    "typename": {
      "properties": {
        "name": {
          "type": "string",
          "fields": {
            "search": {
              "type": "string",
              "analyzer": "my_ngram_analyzer"
            }
          }
        },
        "type": {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    }
  }
}
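Before wiring up the search, it can help to double-check which tokens the analyzer actually emits. A minimal sketch using the _analyze API (not part of the original answer; on some older versions the analyzer and text may need to be passed as query-string parameters instead of a JSON body):

POST likequery/_analyze
{
  "analyzer": "my_ngram_analyzer",
  "text": "peter tomson"
}

The response should list tokens such as pe, pet, ... and, importantly, peter tom, confirming that the multi-word fragment is indexed as a single term.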

You'll then be able to search like this with a simple and very efficient term query:

POST likequery/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "term": {
            "name.search": "peter tom"
          }
        }
      ],
      "must_not": [
        {
          "match": {
            "type": "xyz"
          }
        },
        {
          "match": {
            "type": "abc"
          }
        }
      ]
    }
  }
}
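To try this end-to-end, you could index a sample document and re-run the query above (the document below and its type value are made up purely for illustration):

PUT likequery/typename/1
{
  "name": "peter tomson",
  "type": "something-else"
}

Since peter tom is one of the ngrams produced for peter tomson, the term query on name.search should return this document, while documents whose type is xyz or abc are still excluded by the must_not clauses.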

Well, my solution is not perfect and I am not sure about its performance, so you should try it at your own risk :)

This is the ES 5 version:

PUT likequery
{
  "mappings": {
    "typename": {
      "properties": {
        "name": {
          "type": "string",
          "fields": {
            "raw": {
              "type": "keyword"
            }
          }
        },
        "type": {
          "type": "string"
        }
      }
    }
  }
}

In ES 2.1, change "type": "keyword" to "type": "string", "index": "not_analyzed".
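In other words, the full ES 2.1 mapping would look roughly like this (same structure as above, with only the raw sub-field changed as described):

PUT likequery
{
  "mappings": {
    "typename": {
      "properties": {
        "name": {
          "type": "string",
          "fields": {
            "raw": {
              "type": "string",
              "index": "not_analyzed"
            }
          }
        },
        "type": {
          "type": "string"
        }
      }
    }
  }
}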

PUT likequery/typename/1
{
  "name": "peter tomson"
}

PUT likequery/typename/2
{
  "name": "igor tkachenko"
}

PUT likequery/typename/3
{
  "name": "taras shevchenko"
}

The query is case-sensitive:

POST likequery/_search
{
  "query": {
    "regexp": {
      "name.raw": ".*taras shev.*"
    }
  }
}

Response:

{
  "took": 5,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1,
    "hits": [
      {
        "_index": "likequery",
        "_type": "typename",
        "_id": "3",
        "_score": 1,
        "fields": {
          "raw": [
            "taras shevchenko"
          ]
        }
      }
    ]
  }
}
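If case-insensitive matching is needed, one possible workaround (not from the original answer, just a sketch) is to spell out both cases with character classes in the pattern, since as far as I know the regexp query in these versions does not offer a case-insensitivity flag:

POST likequery/_search
{
  "query": {
    "regexp": {
      "name.raw": ".*[Tt]aras [Ss]hev.*"
    }
  }
}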

PS: Once again, I am not sure about the performance of this query, since it will use a scan rather than the index.
