Python pandas DataFrame: create a column with the number of occurrences of a string in other columns
I have a DataFrame and I want to count how many times a string (say 'Yes') occurs across all the other columns in each row. I want to store that count in a new column called 'Yes-Count'.
I have it working using lambda, following the example in "Creating a new column based on if-elif-else condition".
I am curious whether this can be done in one line.
Here is the sample data and code.
import pandas as pd

def finalCount(row):
    # Count how many of the four columns hold 'Yes' in this row.
    count = 0
    if row['Col1'] == 'Yes':
        count = count + 1
    if row['Col2'] == 'Yes':
        count = count + 1
    if row['Col3'] == 'Yes':
        count = count + 1
    if row['Col4'] == 'Yes':
        count = count + 1
    return count

data = {
    'Col1': ['Yes', 1, 'No', 'Yes'],
    'Col2': ['Yes', 2, 'No', 'Yes'],
    'Col3': ['No', 3, 'Yes', 'Yes'],
    'Col4': ['Yes', 4, 'No', 'Yes'],
}
dfData = pd.DataFrame(data, columns=['Col1', 'Col2', 'Col3', 'Col4'])
# Apply the function to each row (axis=1).
dfData['Yes-Count'] = dfData.apply(finalCount, axis=1)
I get the result as expected.
Is there a way to get rid of the finalCount function and do this in one line?
Here's one way using a boolean mask and sum:
dfData["Yes-Count"] = dfData.eq('Yes').sum(axis=1)
print(dfData)
#   Col1 Col2 Col3 Col4  Yes-Count
# 0  Yes  Yes   No  Yes          3
# 1    1    2    3    4          0
# 2   No   No  Yes   No          1
# 3  Yes  Yes  Yes  Yes          4
Explanation
dfData.eq("Yes") returns a DataFrame of the same shape, with boolean values indicating whether the value at each location equals "Yes". Summing along axis=1 then counts the True values in each row.
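To make the two steps visible, here is a minimal sketch that splits the one-liner into its intermediate mask and the row-wise sum. It reuses the sample data from the question; the names mask and yes_count are only illustrative and do not appear in the original answer.

```python
import pandas as pd

# Same sample data as in the question.
dfData = pd.DataFrame({
    'Col1': ['Yes', 1, 'No', 'Yes'],
    'Col2': ['Yes', 2, 'No', 'Yes'],
    'Col3': ['No', 3, 'Yes', 'Yes'],
    'Col4': ['Yes', 4, 'No', 'Yes'],
})

# Step 1: element-wise comparison gives a boolean DataFrame of the same shape.
mask = dfData.eq('Yes')

# Step 2: True counts as 1 when summed, so a row-wise sum is the per-row count.
yes_count = mask.sum(axis=1)

print(mask)
print(yes_count.tolist())  # [3, 0, 1, 4]
```

Non-string cells such as the integers in row 1 simply compare as False, so they contribute nothing to the count.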
Here is another approach using the isin() function:
list_of_words = ['Yes']
dfData["Yes-Count"] = dfData.isin(list_of_words).sum(axis='columns')
Using this approach you can compare your DataFrame elements with multiple values. The isin() function returns a boolean DataFrame that shows whether each element matches any of the words in list_of_words.
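As a rough illustration of the multiple-value case, the sketch below counts cells that are either 'Yes' or 'No'. The two-word list and the 'Word-Count' column name are assumptions made for this example, not part of the original answer.

```python
import pandas as pd

# Same sample data as in the question.
dfData = pd.DataFrame({
    'Col1': ['Yes', 1, 'No', 'Yes'],
    'Col2': ['Yes', 2, 'No', 'Yes'],
    'Col3': ['No', 3, 'Yes', 'Yes'],
    'Col4': ['Yes', 4, 'No', 'Yes'],
})

# Hypothetical list: a cell matches if it equals any of these values.
list_of_words = ['Yes', 'No']

# isin() builds the boolean mask; summing across columns counts matches per row.
dfData['Word-Count'] = dfData.isin(list_of_words).sum(axis='columns')

print(dfData['Word-Count'].tolist())  # [4, 0, 4, 4]
```

With a single-element list this gives the same result as eq('Yes'), so isin() is mainly worth reaching for when more than one value should count as a match.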