How to perform cosine-similarity-based semantic search in an Elasticsearch query on a text field?

I am performing a match on a text field (Skills). I don't want an exact match; instead, I want a cosine-similarity-based search on the field.

GET 2/_search
{
  "_source": ["Skills"],
  "query": {
    "function_score": {
      "query": {
        "match": {
          "Job_Group": "sales"
        }
      },
      "functions": [
        {
          "filter": {
            "match": {
              "Skills": "Designation"
            }
          },
          "weight": 15
        }
      ]
    }
  }
}

The above query does an exact match. How do I include some sort of semantic search (cosine-similarity-based) in the query on the Skills field? Skills is a free-text field, so I also want matching to happen based on semantic meaning. Example: the skills "Communication" and "talking" should register some similarity and boost the score.

Very simple: Elasticsearch wraps Lucene, and Lucene has More-Like-This, which implements TF-IDF (BM25, to be more precise) plus some additional heuristics. Try it; it should give you good similarity results. An explanation can be found in this link and various others.
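
As a rough sketch of what that could look like against the index from the question, the query below keeps the `Job_Group` match as a required clause and adds a `more_like_this` clause on `Skills` as an optional booster. The `like` text and the lowered `min_term_freq` / `min_doc_freq` thresholds are assumptions chosen for short free-text fields, not values from the original post, and would need tuning on real data.

```
GET 2/_search
{
  "_source": ["Skills"],
  "query": {
    "bool": {
      "must": {
        "match": { "Job_Group": "sales" }
      },
      "should": {
        "more_like_this": {
          "fields": ["Skills"],
          "like": "communication talking",
          "min_term_freq": 1,
          "min_doc_freq": 1
        }
      }
    }
  }
}
```

Putting the clause under `should` means documents that do not resemble the `like` text are still returned, just with a lower score. Note that More-Like-This picks representative terms from the `like` text and scores on those terms, so "Communication" and "talking" will only boost each other if the analyzed field actually shares terms with the input.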


 