
How do I get rid of an AttributeError when running fuzzywuzzy?

I'm trying to compare two lists and get a distance ratio for each item on the list. My code below raised an attribute error: 'Series' object has no attribute 'fuzz'. How do I fix this?

'differences' is the result of my earlier code, a list of companies from an exact-match comparison, and df['Company'] is the dataframe column I'm trying to compare it with.

from fuzzywuzzy import fuzz
from fuzzywuzzy import process
str1 = ['differences']
str2 = df['Company']
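print ("distance {} -> {}: {}".format(str1,str2.fuzz.ratio(str1,str2)))  # raises the AttributeError

from fuzzywuzzy import fuzz

str1 = ['differences']
str2 = ['abcd','differ']
# fuzz.ratio compares two strings, so score each pair individually
for x in str1:
    for y in str2:
        print("distance {} -> {}: {}".format(x, y, fuzz.ratio(x, y)))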

To compare against your dataframe, replace str2 with df['Company'].
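The error comes from calling fuzz as if it were an attribute of the Series; fuzz.ratio is a plain function that takes two strings. A minimal sketch of the pairwise approach follows. To keep it runnable without extra packages, it assumes a stand-in scorer built on the standard library's difflib (the ratio and pairwise_scores helpers are hypothetical names, not part of fuzzywuzzy), so exact scores may differ from what fuzz.ratio reports.

```python
from difflib import SequenceMatcher


def ratio(a, b):
    """Stand-in for fuzz.ratio (assumption): rounded 0-100 similarity of two strings."""
    return round(100 * SequenceMatcher(None, a, b).ratio())


def pairwise_scores(queries, column, scorer=ratio):
    """Score every query against every value in the column."""
    return [(x, y, scorer(x, y)) for x in queries for y in column]


str1 = ['differences']
str2 = ['abcd', 'differ']  # stand-in for df['Company']; any iterable of strings works

for x, y, score in pairwise_scores(str1, str2):
    print("distance {} -> {}: {}".format(x, y, score))
```

Passing scorer=fuzz.ratio and column=df['Company'] should give the same loop over the real data, since iterating a pandas Series yields its values.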

According to your comments, you want to iterate over a list and find the closest match for each item in a pandas Series. This answer uses RapidFuzz, since it is faster than fuzzywuzzy, but it works pretty much the same way with fuzzywuzzy. To find the closest match in an iterable you can use process.extractOne, which returns a tuple (match, score) for a normal list, or a tuple (match, score, key) for objects that provide an .items() method, such as a dict or a pandas.Series. Recent RapidFuzz releases appear to return the three-element form for plain lists as well, with the list position as the third element.

from rapidfuzz import process, fuzz

short_list = ['differences']
companies = df['Company']
for x in short_list:
  match = process.extractOne(x, companies, scorer=fuzz.ratio, processor=None)
  print("best match for {} is {} with a score of {} at the index {}"
    .format(x, match[0], match[1], match[2]))
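To make the (match, score, key) return shape concrete, here is a rough, dependency-free sketch of what a best-match lookup does conceptually. It is not RapidFuzz's implementation: extract_one and ratio are hypothetical helpers, the scorer is a difflib-based stand-in, and the dict of company names is made-up sample data playing the role of df['Company'].

```python
from difflib import SequenceMatcher


def ratio(a, b):
    """Stand-in scorer (assumption): similarity of two strings on a 0-100 scale."""
    return 100 * SequenceMatcher(None, a, b).ratio()


def extract_one(query, choices, scorer=ratio):
    """Return (match, score, key) for the best-scoring choice, or None if empty.

    `choices` may be a plain sequence (keys are positions) or any object
    with an .items() method, such as a dict or a pandas Series.
    """
    items = choices.items() if hasattr(choices, "items") else enumerate(choices)
    best = None
    for key, choice in items:
        score = scorer(query, choice)
        # Keep the first choice that reaches the highest score seen so far.
        if best is None or score > best[1]:
            best = (choice, score, key)
    return best


# Hypothetical company names keyed by row label, standing in for df['Company'].
companies = {10: "abcd", 11: "differ", 12: "difference"}
match, score, key = extract_one("differences", companies)
print("best match for {} is {} with a score of {:.1f} at the index {}"
      .format("differences", match, score, key))
```

The third element is what lets you go back to the original row, for example with df.loc[key] when the choices come from a Series.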
