
TypeError: 'float' object is not iterable on a list in built-in max function

I am trying to find the closest match to an approximate movie title among a list of actual movie titles, using the max function and its key argument. If I define a sample list and test the function, it works:

from difflib import SequenceMatcher as SM
movies = ['fake movie title', 'faker movie title', 'shaun died']
approx_title = 'Shaun of the Dead.'
max(movies, key = lambda title: SM(None, approx_title, title).ratio())
'shaun died'

But I am trying to match against an entire column in a separate DataFrame, so I tried converting that pandas Series to a list and running the same function. Instead I get a TypeError, even though I've checked that both movies and movie_lst are lists.

Old id   New id       Title                                  Year    Critics Score  Audience Score  Rating
NaN      21736.0      Peter Pan                              1999.0  NaN            70.0            PG nothing objectionable
NaN      771471359.0  Dragonheart Battle for the Heartfire   2017.0  NaN            50.0            PG13
NaN      770725090.0  The Nude Vampire Vampire nue, La       1974.0  NaN            24.0            NR
2281.0   19887.0      Beyond the Clouds                      1995.0  65.0           67.0            NR
10913.0  11286.0      Wild America                           1997.0  27.0           59.0            PG violence

movie_lst = rt_info['Title'].tolist()
 ['Peter Pan',
 'Dragonheart Battle for the Heartfire',
 'The Nude Vampire Vampire nue, La',
 'Beyond the Clouds',
 'Wild America',
 'Sexual Dependency',
 'Body Slam',
 'Hatchet II',
 'Lion of the Desert Omar Mukhtar',
 'Imagine That',
 'Harold',
 'A United Kingdom',
 'Violent City The FamilyCitt violenta',
 'Ratchet  Clank',
 'Wes Craven Presents Carnival of Souls',
 'The Adventures of Ociee Nash',
 'Blackfish',
 'For Petes Sake',
 'Daybreakers',
 'The Big One',
 'Godzilla vs Megaguirus',
 'In a Lonely Place',
 'Case 39', ...
]

max(movie_lst, key = lambda title: SM(None, approx_title, title).ratio())

TypeError                                 Traceback (most recent call last)
<ipython-input-88-0022a3c1bdb9> in <module>()
----> 1 max(movie_lst, key = lambda title: SM(None, approx_title, title).ratio())

<ipython-input-88-0022a3c1bdb9> in <lambda>(title)
----> 1 max(movie_lst, key = lambda title: SM(None, approx_title, title).ratio())

/usr/lib/python3.4/difflib.py in __init__(self, isjunk, a, b, autojunk)
    211         self.a = self.b = None
    212         self.autojunk = autojunk
--> 213         self.set_seqs(a, b)
    214 
    215     def set_seqs(self, a, b):

/usr/lib/python3.4/difflib.py in set_seqs(self, a, b)
    223 
    224         self.set_seq1(a)
--> 225         self.set_seq2(b)
    226 
    227     def set_seq1(self, a):

/usr/lib/python3.4/difflib.py in set_seq2(self, b)
    277         self.matching_blocks = self.opcodes = None
    278         self.fullbcount = None
--> 279         self.__chain_b()
    280 
    281     # For each element x in b, set b2j[x] to a list of the indices in

/usr/lib/python3.4/difflib.py in __chain_b(self)
    309         self.b2j = b2j = {}
    310 
--> 311         for i, elt in enumerate(b):
    312             indices = b2j.setdefault(elt, [])
    313             indices.append(i)

TypeError: 'float' object is not iterable

I'm stumped as to why. Any help would be appreciated!

I'm not a pandas expert and cannot reproduce this, but depending on how the file is read, some values may be floats instead of strings, since there are titles (the French movie 11.6, for instance) that parse as a float. Your issue proves that it is possible :)
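As a rough illustration of that diagnosis, the error can be reproduced without pandas by putting a float into an otherwise normal list of titles. The list `mixed_lst` and the helper `best_match` below are hypothetical names made up for this sketch. The `float('nan')` entry stands in for a missing title, since pandas represents missing values as float NaN.

```python
from difflib import SequenceMatcher as SM

approx_title = 'Shaun of the Dead.'

# Hypothetical data: two real strings plus two floats (a numeric-looking
# title and a NaN standing in for a missing value).
mixed_lst = ['Peter Pan', 11.6, 'Wild America', float('nan')]


def best_match(titles):
    """Return the title most similar to approx_title."""
    return max(titles, key=lambda title: SM(None, approx_title, title).ratio())


try:
    best_match(mixed_lst)
    error_message = None
except TypeError as exc:
    # SequenceMatcher tries to iterate over its second sequence,
    # which fails as soon as it is handed a float.
    error_message = str(exc)

print(error_message)
```

The list itself is still a list, which is why checking `type(movie_lst)` does not reveal the problem: the failure comes from the type of an individual element.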

A good workaround is to force the data to strings, like this:

movie_lst = [str(x) for x in movie_lst]

This doesn't create copies of the strings if they are already strings (see: Should I avoid converting to a string if a value is already a string?), so it's efficient, and you are sure to get only strings.
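A small sketch of that claim, assuming CPython (where calling `str()` on an exact `str` instance returns the same object) and a made-up three-element list:

```python
# Hypothetical sample: two strings and one float.
movie_lst = ['Peter Pan', 11.6, 'Wild America']

cleaned = [str(x) for x in movie_lst]

# Identity check: existing strings are passed through, only the float
# is turned into a new string object.
same_objects = [a is b for a, b in zip(movie_lst, cleaned)]

print(cleaned)       # ['Peter Pan', '11.6', 'Wild America']
print(same_objects)  # [True, False, True]
```

One caveat: a NaN value becomes the literal string 'nan' after conversion, so it will still take part in the matching unless it is filtered out.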

Note that you can find the offenders by printing:

[x for x in movie_lst if not isinstance(x,str)]
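Putting the pieces together, here is one possible sketch that reports the offenders with their positions, drops NaN entries, converts the rest to strings, and then runs the original `max` call. The sample list is made up, and dropping NaN rather than converting it is an assumption about what is wanted, not part of the workaround above.

```python
import math
from difflib import SequenceMatcher as SM

approx_title = 'Shaun of the Dead.'

# Hypothetical data mixing strings, a missing value (NaN) and a numeric title.
movie_lst = ['fake movie title', float('nan'), 'shaun died', 11.6]

# Index and value of every element that is not a string.
offenders = [(i, x) for i, x in enumerate(movie_lst) if not isinstance(x, str)]
print(offenders)

# Assumption: NaN means "no title", so skip it; stringify everything else.
titles = [
    str(x)
    for x in movie_lst
    if not (isinstance(x, float) and math.isnan(x))
]

best = max(titles, key=lambda title: SM(None, approx_title, title).ratio())
print(best)
```

If the data is still in a DataFrame, filtering out the missing titles before calling `tolist()` should have a similar effect.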
