Python pandas DataFrame: create a column counting occurrences of a string in the other columns

Question

I have a DataFrame and I want to count, for each row, how many times a string (say 'Yes') occurs across all the other columns. I want to store that count in a new column called 'Yes-Count'.

I have it working with apply and a helper function, following the example in "Creating a new column based on if-elif-else condition".

I am curious if this can be done in one line.

Here is the sample data and code.

import pandas as pd

# Count how many of the four columns hold 'Yes' in a given row.
def finalCount(row):
    count = 0
    if row['Col1'] == 'Yes':
        count = count + 1
    if row['Col2'] == 'Yes':
        count = count + 1
    if row['Col3'] == 'Yes':
        count = count + 1
    if row['Col4'] == 'Yes':
        count = count + 1
    return count

data = {
    'Col1': ['Yes', 1, 'No', 'Yes'],
    'Col2': ['Yes', 2, 'No', 'Yes'],
    'Col3': ['No', 3, 'Yes', 'Yes'],
    'Col4': ['Yes', 4, 'No', 'Yes'],
}
dfData = pd.DataFrame(data, columns=['Col1', 'Col2', 'Col3', 'Col4'])
# Apply the function to each row (axis=1) and store the result.
dfData['Yes-Count'] = dfData.apply(finalCount, axis=1)

I get the expected result.

Is there a way to get rid of the finalCount function and do this in one line?

Answer 1

Here's one way using a boolean mask and sum:

dfData["Yes-Count"] = dfData.eq('Yes').sum(axis=1)
print(dfData)
#  Col1 Col2 Col3 Col4  Yes-Count
#0  Yes  Yes   No  Yes          3
#1    1    2    3    4          0
#2   No   No  Yes   No          1
#3  Yes  Yes  Yes  Yes          4

Explanation

dfData.eq("Yes") returns a DataFrame of the same shape, with boolean values indicating whether the value at each location equals "Yes".
Sum these across the columns (axis=1); each True counts as 1 and each False as 0.
Assign the result back as a new column.
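
The three steps above can be sketched separately to make the intermediate boolean mask visible. This is only an illustration built on the question's sample data; the names mask and yes_count are not from the original answer.

```python
import pandas as pd

# Sample data copied from the question.
dfData = pd.DataFrame({
    'Col1': ['Yes', 1, 'No', 'Yes'],
    'Col2': ['Yes', 2, 'No', 'Yes'],
    'Col3': ['No', 3, 'Yes', 'Yes'],
    'Col4': ['Yes', 4, 'No', 'Yes'],
})

# Step 1: boolean mask with the same shape as dfData,
# True wherever the cell equals 'Yes'.
mask = dfData.eq('Yes')  # 'mask' is an illustrative name

# Step 2: sum along each row (axis=1); True counts as 1, False as 0.
yes_count = mask.sum(axis=1)  # 'yes_count' is an illustrative name

# Step 3: assign the per-row counts back as a new column.
dfData['Yes-Count'] = yes_count
print(dfData)
```

Chaining the three steps gives the one-liner shown in the answer.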

Answer 2

Here is another approach using the isin() function:

list_of_words = ['Yes']
dfData["Yes-Count"] = dfData.isin(list_of_words).sum(axis='columns')

With this approach you can compare the DataFrame's elements against multiple values. The isin() function returns a boolean DataFrame showing whether each element matches any of the words in list_of_words.
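
As a rough sketch of the multiple-value case, the list can hold more than one word, and the check can be limited to chosen columns. The column name 'Word-Count' and the cols list are assumptions made for illustration and do not appear in the original answer.

```python
import pandas as pd

# Sample data copied from the question.
dfData = pd.DataFrame({
    'Col1': ['Yes', 1, 'No', 'Yes'],
    'Col2': ['Yes', 2, 'No', 'Yes'],
    'Col3': ['No', 3, 'Yes', 'Yes'],
    'Col4': ['Yes', 4, 'No', 'Yes'],
})

# Several words to match at once (illustrative choice).
list_of_words = ['Yes', 'No']

# Hypothetical: restrict the check to these columns, so that any
# other columns in the DataFrame are left out of the count.
cols = ['Col1', 'Col2', 'Col3', 'Col4']

# isin() gives a boolean DataFrame; summing per row counts the matches.
dfData['Word-Count'] = dfData[cols].isin(list_of_words).sum(axis='columns')
print(dfData)
```

Selecting the columns explicitly is a design choice: it keeps the count stable if further columns are added to the DataFrame later.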

solution1
3 ACCEPTED 2018-04-10 18:50:08

solution2
1 2018-04-10 19:40:28
