How can I get each of the individual names in the following script to have both 'YES' and 'NO' counts beside their names? I need to have some value for each even if it's zero.
import pandas as pd
import numpy as np
df = pd.DataFrame({'names': ['Charlie', 'Charlie', 'Charlie', 'Charlie', 'Bryan',
'Bryan', 'Bryan', 'Bryan', 'Jaimie', 'Jaimie',
'Jaimie', 'Jaimie'],
'passed': ['YES', 'YES', 'YES', 'YES', 'NO', 'NO', 'NO', 'NO',
'YES', 'NO', 'YES', 'NO']})
df2 = pd.DataFrame(df.groupby([df['names'], df['passed']]).size())
df2.columns = ['Count']
print(df2)
Count
names passed
Bryan NO 4
Charlie YES 4
Jaimie NO 2
YES 2
You can use reindex:
df2
Out:
Count
names passed
Bryan NO 4
Charlie YES 4
Jaimie NO 2
YES 2
idx = pd.MultiIndex.from_product([df['names'].unique(), df['passed'].unique()])
df2.reindex(idx, fill_value=0)
Out:
Count
Charlie YES 4
NO 0
Bryan YES 0
NO 4
Jaimie YES 2
NO 2
For this example, crosstab with unstack can also be an option:
pd.crosstab(df['passed'], df['names']).unstack()
Out:
names passed
Bryan NO 4
YES 0
Charlie NO 0
YES 4
Jaimie NO 2
YES 2
dtype: int64
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.