Using a lambda function for pandas.DataFrame boolean indexing reports TypeError

Question

I have a list of words called sowpods, and I need to check which combinations of letters exist either as a word or within a word.

For example, if my letters are ['r', 't', 'e', 'f'], one of the possible combinations is 're', which is within 'red', so the word 'red' should be kept.

I already have some code that works out all of the possible combinations, but now I want to add all of the words that fit the requirements to a list.

I have done the following:

import pandas as pd

sowpods = pd.read_csv('sowpods.csv', names=['Word'])

possible_combination = 'RE'
possible_words = pd.DataFrame([], columns=['Word'])

comb_in_word = lambda _: True if (possible_combination in _) else False # ------ line 8

sowpods_bool = sowpods['Word'].apply(comb_in_word) # --------------------------- line 10
possible_words = pd.concat([possible_words, sowpods.loc[sowpods_bool, ['Word']]])  # keep the result; nothing is modified in place

But then I get:

  File "c:\tests.py", line 10, in <module>
    sowpods_bool = sowpods['Word'].apply(comb_in_word)
  File "C:\Python38-32\lib\site-packages\pandas\core\series.py", line 3848, in apply
    mapped = lib.map_infer(values, f, convert=convert_dtype)
  File "pandas\_libs\lib.pyx", line 2329, in pandas._libs.lib.map_infer
  File "c:\Users\lenovo\OneDrive\Prog\Projects\Scrabble\tests.py", line 8, in <lambda>
    comb_in_word = lambda _: True if possible_combination in _ else False
TypeError: argument of type 'float' is not iterable

I tested my lambda function in a more controlled environment and it worked fine, so I'm confident that the error isn't coming from there.

I don't understand why I get this error when I'm not iterating through anything myself. I understand that pandas iterates over the DataFrame's column, but I don't see why that would raise an error about a float.

Edit:

[In]
print(sowpods.head())
[Out]
      Word
0      AA
1     AAH
2   AAHED
3  AAHING
4    AAHS

[In]
print(sowpods.dtypes)
[Out]
Word    object
dtype: object

Answer 1

The word list contains 'NA' and 'NULL', which pandas parses as missing values (NaN) by default. NaN is a float, so the in check inside the lambda raised the TypeError. I had to specify keep_default_na=False:

sowpods = pd.read_csv('projects/scrabble/sowpods_en.csv', names=['Word'], keep_default_na=False)
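To see the effect in isolation, here is a minimal sketch. It assumes a small in-memory word list (csv_text is a made-up stand-in for the real file), and it swaps the lambda for Series.str.contains, a vectorized alternative rather than what the original code used.

```python
import io

import pandas as pd

# Hypothetical stand-in for the word list file; 'NA' and 'NULL' are real words here.
csv_text = "MA\nNA\nNULL\nRED\nTREE\n"

# Default parsing: 'NA' and 'NULL' are read as missing values (NaN, a float).
default = pd.read_csv(io.StringIO(csv_text), names=['Word'])
missing = default['Word'].isna()

# With keep_default_na=False those entries stay ordinary strings.
sowpods = pd.read_csv(io.StringIO(csv_text), names=['Word'], keep_default_na=False)

possible_combination = 'RE'
# Vectorized substring test; regex=False treats the combination as literal text.
sowpods_bool = sowpods['Word'].str.contains(possible_combination, regex=False)
possible_words = sowpods.loc[sowpods_bool, 'Word'].tolist()
```

Checking missing.sum() on the default read is a quick way to confirm whether any words were turned into NaN before changing the read_csv call.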

Answer 1: score 0, accepted, 2020-05-28 14:10:57
