Elasticsearch: "fuzzy_like_this_field" filter query not working

Question

I am facing the following issue with an Elasticsearch filter:

When I apply "fuzzy_like_this_field" to a String value, it works fine.

But when I apply the "fuzzy_like_this_field" filter to a data type other than String (e.g. double, Date), it does not work.

It gives:

ElasticsearchIllegalArgumentException[fuzzy_like_this_field doesn't support binary/numeric fields.

Please see the Elasticsearch query below:

{"query": {"bool": {"must": [{"fuzzy_like_this_field": {"Receipts.retailerId": {"like_text": "55f5878916c042cc8731a39e4e05b7a0","fuzziness":0.3}}},{"fuzzy_like_this_field": {"Receipts.totalCost": {"like_text": "10","fuzziness":0.3}}}],"must_not": [],"should": []}},"from": 0,"size": 1000,"sort": [],"facets": {}}

Where retailerId is a String and totalCost is a double.

If I change the totalCost data type from double to string, it works.

Could you please suggest a solution?

Answer 1

Fuzzy queries expand text search results to include terms within a certain Levenshtein distance of the query term (the number of characters that need to be changed or transposed to match). For numeric values they instead expand the match by a margin: value - fuzziness <= field value <= value + fuzziness. However, fuzzy_like_this and fuzzy_like_this_field only seem to support string matching (via Levenshtein distance).
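The numeric margin described above can also be spelled out as an explicit range clause, which avoids fuzzy_like_this_field entirely. This is a minimal sketch, not the answerer's method: the helper name numeric_margin_clause and the margin of 0.5 are hypothetical, and it only builds the query body as a Python dict without contacting a cluster.

```python
import json


def numeric_margin_clause(field, value, margin):
    """Build a range clause matching value - margin <= field <= value + margin.

    The helper name and the margin used below are illustrative assumptions.
    """
    return {"range": {field: {"gte": value - margin, "lte": value + margin}}}


# Approximate match on the numeric field from the question.
clause = numeric_margin_clause("Receipts.totalCost", 10.0, 0.5)
print(json.dumps(clause, indent=2))
```

A clause like this could sit inside the "must" array of the bool query in place of the numeric fuzzy_like_this_field entry.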

fuzzy_like_this and fuzzy_like_this_field queries are deprecated in ES 1.6+, and they both suffer from performance issues. You should find another method for accomplishing your goal.

There are a number of ways to apply fuzzy matching, but I'm not sure fuzzy matching is what you're after.

By specifying:

"fuzzy_like_this_field":{  
                  "Receipts.retailerId":{  
                     "like_text":"55f5878916c042cc8731a39e4e05b7a0",
                     "fuzziness":0.3
                  }
               }

You're asking for every retailerId that matches the like_text within up to 22 edits. Edit distance = length(term) * (1.0 - fuzziness) = 32 * 0.7 = 22.4

So in this case 55ddddddd6c0ddddddd1a3dddddddda0 would qualify as a fuzzy match for 55f5878916c042cc8731a39e4e05b7a0, because 10 of the 32 characters share the same position and only the remaining 22 need to be changed.
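The arithmetic above can be checked with a short sketch. It assumes the edit budget is the formula given in this answer, truncated to a whole number, and uses a plain textbook Levenshtein distance (no transpositions). The actual Lucene implementation may differ in its details, so treat this as an illustration only.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


like_text = "55f5878916c042cc8731a39e4e05b7a0"
candidate = "55ddddddd6c0ddddddd1a3dddddddda0"
fuzziness = 0.3

# Edit budget per the formula in the answer: length(term) * (1.0 - fuzziness).
max_edits = int(len(like_text) * (1.0 - fuzziness))
distance = levenshtein(like_text, candidate)

print(max_edits, distance, distance <= max_edits)
```

With a fuzziness this low, nearly any 32-character ID that shares a handful of positions with the like_text falls inside the budget, which is why fuzzy matching is a poor fit for identifiers.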

If, instead, you're merely looking for duplicate transactions, why not simply use a match query or filter to match your retailerId and totalCost exactly?

"query":{  
      "bool":{  
         "must":[  
            {  
               "match":{  
                  "Receipts.retailerId": "55f5878916c042cc8731a39e4e05b7a0" 
               }
            },
            {  
               "match":{  
                  "Receipts.totalCost": 10
               }
            }
         ]
      }
   }

Answer 1 posted 2015-12-19 03:07:14