How to check if entries in a Pandas DataFrame are in a list using pandas.apply

I have a DataFrame with a column called name that holds string values. I want to check whether each entry in this column exists in a reference list. I tried pandas.apply, but it doesn't work.

Sample data:

import pandas as pd

data = [('A', '10'),
        ('B', '10'),
        ('C', '10'),
        ('D', '10'),
        ('E', '20'),
        ('F', '20'),
        ('G', '25') ]

data_df = pd.DataFrame(data, columns = ['name', 'value'])

Sample code:

reference = ['A', 'B', 'Z']


def is_in_reference(x, reference):
    if x in reference:
        return 'Yes'
    else:
        return 'No'
    

data_df['is_in_reference'] = data_df['name'].apply(is_in_reference, args=(reference))

But I get this error:

TypeError: is_in_reference() takes 2 positional arguments but 4 were given

I would appreciate any help with this.

You can use the built-in Series.isin method instead, as in:

data_df['is_in_reference'] = data_df['name'].isin(reference)
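Note that isin returns True/False rather than the 'Yes'/'No' strings the question's function produces. If the string labels are wanted, one possible way (a minimal sketch, not the only option) is to feed the boolean mask to numpy.where. The three-row DataFrame below is a shortened stand-in for the question's sample data, and the variable name mask is only illustrative.

```python
import numpy as np
import pandas as pd

# Shortened stand-in for the sample data in the question
data_df = pd.DataFrame({'name': ['A', 'B', 'C'], 'value': ['10', '10', '10']})
reference = ['A', 'B', 'Z']

# isin returns a boolean Series: True where the name appears in reference
mask = data_df['name'].isin(reference)

# np.where picks 'Yes' where mask is True and 'No' everywhere else
data_df['is_in_reference'] = np.where(mask, 'Yes', 'No')

print(data_df)
```

This stays vectorized, so it avoids calling a Python function once per row the way apply does.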

But since you asked about apply: the fix is a small but easily missed Python syntax issue. You must add a trailing comma so that args is a tuple:

data_df['is_in_reference'] = data_df['name'].apply(is_in_reference, args=(reference,))

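Note the comma in (reference,). Without it, (reference) is just the list itself rather than a one-element tuple, so apply unpacks the list's three items as separate arguments. That is why the function received 4 arguments instead of 2.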
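To make the difference concrete, here is a small sketch under the assumption that apply forwards the contents of args roughly like func(x, *args), which is consistent with the error message in the question. The names no_comma, with_comma, via_args and via_kwargs are illustrative only. It also shows that apply forwards extra keyword arguments to the function, which sidesteps the tuple issue entirely.

```python
import pandas as pd

reference = ['A', 'B', 'Z']


def is_in_reference(x, reference):
    return 'Yes' if x in reference else 'No'


# Shortened stand-in for the name column in the question
names = pd.Series(['A', 'B', 'C'], name='name')

# Parentheses alone do not create a tuple; the comma does
no_comma = (reference)     # still the list ['A', 'B', 'Z']
with_comma = (reference,)  # a one-element tuple holding the list

# Unpacking the bare list passes its three items as separate arguments,
# so the function sees 1 + 3 = 4 positional arguments
try:
    is_in_reference('A', *no_comma)
    raised = False
except TypeError:
    raised = True

# With the one-element tuple, the whole list arrives as a single argument
via_args = names.apply(is_in_reference, args=with_comma)

# Extra keyword arguments are forwarded to the function as well
via_kwargs = names.apply(is_in_reference, reference=reference)

print(type(no_comma).__name__, type(with_comma).__name__, raised)
print(via_args.tolist(), via_kwargs.tolist())
```

Passing reference by keyword is arguably the more readable of the two, since there is no tuple to get wrong.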

 