
Elastic Search: "fuzzy_like_this_field" filter query is not working

I am facing the following issue with an Elasticsearch filter:

When I apply "fuzzy_like_this_field" to a String value, it works fine.

But when I apply the "fuzzy_like_this_field" filter to a data type other than String (e.g. double, Date), it does not work.

It gives:

ElasticsearchIllegalArgumentException[fuzzy_like_this_field doesn't support binary/numeric fields.

Please see the Elasticsearch query below:

{"query": {"bool": {"must": [{"fuzzy_like_this_field": {"Receipts.retailerId": {"like_text": "55f5878916c042cc8731a39e4e05b7a0","fuzziness":0.3}}},{"fuzzy_like_this_field": {"Receipts.totalCost": {"like_text": "10","fuzziness":0.3}}}],"must_not": [],"should": []}},"from": 0,"size": 1000,"sort": [],"facets": {}}

Where retailerId is a String and totalCost is a double.

If I change the totalCost data type from double to String, it works.

Can anyone suggest a solution?

Fuzzy queries expand text search results to include terms within a certain Levenshtein distance of the query term (the number of characters that need to be changed or transposed to match). For numeric values, they expand the match by a margin: -fuzziness <= value <= +fuzziness. However, fuzzy_like_this and fuzzy_like_this_field only seem to support string matching (via Levenshtein distance).
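
To make the two notions of fuzziness concrete, here is a minimal Python sketch. It is not Elasticsearch's implementation: `levenshtein` is the textbook dynamic-programming edit distance (insertions, deletions and substitutions only, no transpositions), and `numeric_fuzzy_match` is a hypothetical helper that reads the margin above as "within fuzziness of the target value".

```python
def levenshtein(a: str, b: str) -> int:
    """Textbook edit distance: insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute (or keep) the character
            ))
        prev = curr
    return prev[-1]


def numeric_fuzzy_match(value: float, target: float, fuzziness: float) -> bool:
    """Assumed reading of the numeric margin: target +/- fuzziness."""
    return target - fuzziness <= value <= target + fuzziness


print(levenshtein("kitten", "sitting"))
print(numeric_fuzzy_match(10.2, 10, 0.3))
```

The string version counts character edits, while the numeric version is just a range check, which is why the two cases are handled so differently.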


fuzzy_like_this and fuzzy_like_this_field queries are deprecated in ES 1.6+, and they both suffer from performance issues. You should find another method for accomplishing your goal.

There are a number of ways to apply fuzzy matching, but I'm not sure fuzzy matching is what you're after.

By specifying:

"fuzzy_like_this_field":{  
                  "Receipts.retailerId":{  
                     "like_text":"55f5878916c042cc8731a39e4e05b7a0",
                     "fuzziness":0.3
                  }
               }

You're asking to match all retailerIds that match the like_text with up to 22 edits:

Edit distance = length(term) * (1.0 - fuzziness) = 32 * 0.7 = 22.4

So in this case 55ddddddd6c0ddddddd1a3dddddddda0 would qualify as a fuzzy match to 55f5878916c042cc8731a39e4e05b7a0, because 10 of the 32 characters share the same position and only the other 22 need to be changed.
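
The arithmetic can be checked with a few lines of Python. This is only a sketch of the formula quoted in this answer, not of Elasticsearch internals; counting mismatched positions gives an upper bound on the edit distance, since each mismatch can be fixed with one substitution.

```python
like_text = "55f5878916c042cc8731a39e4e05b7a0"
candidate = "55ddddddd6c0ddddddd1a3dddddddda0"
fuzziness = 0.3

# Formula from the answer: length(term) * (1.0 - fuzziness), rounded down.
max_edits = int(len(like_text) * (1.0 - fuzziness))

# Characters that are identical at the same index in both strings.
same_position = sum(a == b for a, b in zip(like_text, candidate))

# Substituting every mismatched character turns one string into the other.
substitutions = len(like_text) - same_position

print(max_edits, same_position, substitutions)
print("fuzzy match:", substitutions <= max_edits)
```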


If, instead, you're merely looking for duplicate transactions, why not simply use a match query or filter to match your retailerId and totalCost exactly?

"query":{  
      "bool":{  
         "must":[  
            {  
               "match":{  
                  "Receipts.retailerId": "55f5878916c042cc8731a39e4e05b7a0" 
               }
            },
            {  
               "match":{  
                  "Receipts.totalCost": 10
               }
            }
         ]
      }
   }
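
If you build the request in code, the same body can be assembled as a plain dictionary. The sketch below is only an illustration: `build_receipt_query` and its `tolerance` parameter are hypothetical names, not part of any Elasticsearch client. With `tolerance=0` it reproduces the exact-match query above; with a non-zero tolerance it swaps the totalCost clause for a `range` query, which is one way to get a numeric margin without fuzzy_like_this_field. Whether a margin makes sense for your data is an assumption you would need to check.

```python
import json


def build_receipt_query(retailer_id: str, total_cost: float, tolerance: float = 0.0) -> dict:
    """Build the bool/must query body; all names here are illustrative."""
    if tolerance == 0:
        # Exact match on the numeric field, as in the query above.
        cost_clause = {"match": {"Receipts.totalCost": total_cost}}
    else:
        # Numeric margin expressed as an inclusive range.
        cost_clause = {"range": {"Receipts.totalCost": {
            "gte": total_cost - tolerance,
            "lte": total_cost + tolerance,
        }}}
    return {"query": {"bool": {"must": [
        {"match": {"Receipts.retailerId": retailer_id}},
        cost_clause,
    ]}}}


body = build_receipt_query("55f5878916c042cc8731a39e4e05b7a0", 10)
print(json.dumps(body, indent=2))
```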

