Using a lambda function for pandas.DataFrame boolean indexing reports TypeError
I have a list of words called sowpods, and I need to find which combinations of letters exist either as a word or within a word. For example, if my letters are ['r', 't', 'e', 'f'], one of the possible combinations is 're', which is within 'red', so the word 'red' should be kept.
I already have some code that finds all of the possible combinations, but now I want to add all of the words that fit the requirement to a list.
I have done the following:
import pandas as pd
sowpods = pd.read_csv('sowpods.csv', names=['Word'])
possible_combination = 'RE'
possible_words = pd.DataFrame([], columns=['Word'])
comb_in_word = lambda _: True if (possible_combination in _) else False # ------ line 8
sowpods_bool = sowpods['Word'].apply(comb_in_word) # --------------------------- line 10
possible_words.append(sowpods.loc[sowpods_bool, 'Word'])
But then I get:
File "c:\tests.py", line 10, in <module>
sowpods_bool = sowpods['Word'].apply(comb_in_word)
File "C:\Python38-32\lib\site-packages\pandas\core\series.py", line 3848, in apply
mapped = lib.map_infer(values, f, convert=convert_dtype)
File "pandas\_libs\lib.pyx", line 2329, in pandas._libs.lib.map_infer
File "c:\Users\lenovo\OneDrive\Prog\Projects\Scrabble\tests.py", line 8, in <lambda>
comb_in_word = lambda _: True if possible_combination in _ else False
TypeError: argument of type 'float' is not iterable
I tested my lambda function in a more controlled environment and it worked fine, so I'm confident the error isn't coming from there.
I don't understand why I get this error when I'm not iterating through anything myself. I get that pandas is iterating through the DataFrame's column, but it shouldn't raise an error about using floats instead of integers.
Edit:
[In]
print(sowpods.head())
[Out]
Word
0 AA
1 AAH
2 AAHED
3 AAHING
4 AAHS
[In]
print(sowpods.dtypes)
[Out]
Word object
dtype: object
In the list of words there were 'NA' and 'NULL', which pandas parsed as NaN. NaN is a float, so the `in` test inside the lambda failed on those rows. I had to specify keep_default_na=False:
sowpods = pd.read_csv('projects/scrabble/sowpods_en.csv', names=['Word'], keep_default_na=False)
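To see the effect in isolation, here is a minimal sketch under some assumptions: the five-word list in csv_text is a made-up stand-in for the real file, and the vectorized str.contains call is swapped in for the lambda as one possible way to do the substring test, not necessarily what the original code ended up using.

```python
import io

import pandas as pd

# Hypothetical stand-in for the word list; the real data lives in the CSV file.
csv_text = "AA\nNA\nNULL\nRED\nTREE\n"

# Default parsing: the strings 'NA' and 'NULL' are read as missing values (float NaN).
default = pd.read_csv(io.StringIO(csv_text), names=["Word"])
n_missing = int(default["Word"].isna().sum())

# With keep_default_na=False, every entry stays a string.
words = pd.read_csv(io.StringIO(csv_text), names=["Word"], keep_default_na=False)

possible_combination = "RE"
# Vectorized substring test; regex=False treats the pattern as literal text.
mask = words["Word"].str.contains(possible_combination, regex=False)
possible_words = words.loc[mask, "Word"].tolist()

print(n_missing)       # number of entries lost to NaN under default parsing
print(possible_words)  # words containing the combination
```

Building the result directly from the boolean mask also avoids DataFrame.append, which was deprecated in pandas 1.4 and removed in pandas 2.0.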