How to check if entries in Pandas DataFrame are in a List using pandas.apply
I have a DataFrame with a column, name, that contains string data. I want to check whether the entries of this column exist in a reference list. I tried pandas.apply, but it doesn't work.
Sample data:
import pandas as pd
data = [('A', '10'),
        ('B', '10'),
        ('C', '10'),
        ('D', '10'),
        ('E', '20'),
        ('F', '20'),
        ('G', '25')]
data_df = pd.DataFrame(data, columns = ['name', 'value'])
Sample code:
reference = ['A', 'B', 'Z']
def is_in_reference(x, reference):
    if x in reference:
        return 'Yes'
    else:
        return 'No'
data_df['is_in_reference'] = data_df['name'].apply(is_in_reference, args=(reference))
But I get the error:
TypeError: is_in_reference() takes 2 positional arguments but 4 were given
I would appreciate any help with this.
You can actually use the built-in Series.isin function, as in:
data_df['is_in_reference'] = data_df['name'].isin(reference)
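Note that isin returns booleans rather than the 'Yes'/'No' strings from the question. If the string labels are needed, one possible sketch is to map the boolean result afterwards. The use of Series.map here is an illustrative choice and not part of the original answer, and the DataFrame is rebuilt with only the name column to keep the example short:

```python
import pandas as pd

# Reduced version of the question's data: only the 'name' column is needed
data_df = pd.DataFrame({'name': ['A', 'B', 'C', 'D', 'E', 'F', 'G']})
reference = ['A', 'B', 'Z']

# isin yields a boolean Series: True where the name appears in reference
mask = data_df['name'].isin(reference)

# Map the booleans to the 'Yes'/'No' labels used in the question
data_df['is_in_reference'] = mask.map({True: 'Yes', False: 'No'})
print(data_df)
```

Only 'A' and 'B' are labelled 'Yes'; 'Z' is in the reference list but not in the column, so it has no effect.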
But since you asked about apply: the fix is a small yet nefarious Python syntax issue. You must add a trailing comma in the args tuple:
data_df['is_in_reference'] = data_df['name'].apply(is_in_reference, args=(reference,))
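Note the , in (reference,). Without it, Python does not create a tuple: (reference) is just the list itself wrapped in parentheses, so apply unpacks its three elements as separate arguments, which is why the error reports 4 arguments instead of 2.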
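The difference can be checked in plain Python without pandas. The sketch below assumes that apply passes args on as func(x, *args), which is consistent with the error message in the question; the names not_a_tuple and one_tuple are made up for illustration:

```python
reference = ['A', 'B', 'Z']

# Parentheses alone only group an expression; the value is still the list
not_a_tuple = (reference)
# The trailing comma is what creates a one-element tuple
one_tuple = (reference,)

print(type(not_a_tuple).__name__, len(not_a_tuple))
print(type(one_tuple).__name__, len(one_tuple))

def is_in_reference(x, reference):
    return 'Yes' if x in reference else 'No'

# Unpacking the tuple passes the whole list as the second argument
print(is_in_reference('A', *one_tuple))

# Unpacking the bare list passes 'A', 'B', 'Z' as three extra arguments
try:
    is_in_reference('A', *not_a_tuple)
except TypeError as err:
    print(err)
```

The first call receives the list as a single argument, while the second receives four positional arguments in total and raises the same kind of TypeError as in the question.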