pandas groupby columns missing
How can I get each of the individual names in the following script to have both 'YES' and 'NO' counts beside their names? I need a value for each combination, even if it's zero.
import pandas as pd
import numpy as np
df = pd.DataFrame({'names': ['Charlie', 'Charlie', 'Charlie', 'Charlie', 'Bryan',
                             'Bryan', 'Bryan', 'Bryan', 'Jaimie', 'Jaimie',
                             'Jaimie', 'Jaimie'],
                   'passed': ['YES', 'YES', 'YES', 'YES', 'NO', 'NO', 'NO', 'NO',
                              'YES', 'NO', 'YES', 'NO']})
df2 = pd.DataFrame(df.groupby([df['names'], df['passed']]).size())
df2.columns = ['Count']
print(df2)
                Count
names   passed
Bryan   NO          4
Charlie YES         4
Jaimie  NO          2
        YES         2
You can use reindex with a MultiIndex that contains every names/passed combination, filling the missing ones with zero:
df2
Out:
                Count
names   passed
Bryan   NO          4
Charlie YES         4
Jaimie  NO          2
        YES         2
idx = pd.MultiIndex.from_product([df['names'].unique(), df['passed'].unique()])
df2.reindex(idx, fill_value=0)
Out:
             Count
Charlie YES      4
        NO       0
Bryan   YES      0
        NO       4
Jaimie  YES      2
        NO       2
For this example, crosstab followed by unstack is another option, since crosstab already reports zero for combinations that never occur:
pd.crosstab(df['passed'], df['names']).unstack()
Out:
names    passed
Bryan    NO        4
         YES       0
Charlie  NO        0
         YES       4
Jaimie   NO        2
         YES       2
dtype: int64
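If a DataFrame with a Count column is wanted (matching df2 in the question) rather than a Series, the crosstab result can be reshaped. This is a rough sketch, not the only way to do it: it puts names on the rows and uses stack instead of unstack, and the variable names wide and long_counts are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'names': ['Charlie'] * 4 + ['Bryan'] * 4 + ['Jaimie'] * 4,
    'passed': ['YES', 'YES', 'YES', 'YES', 'NO', 'NO', 'NO', 'NO',
               'YES', 'NO', 'YES', 'NO'],
})

# Wide table: one row per name, one column per passed value.
# Combinations that never occur are reported as 0.
wide = pd.crosstab(df['names'], df['passed'])
print(wide)

# Long table: move the passed columns into the index, giving one row
# per (names, passed) pair with a single 'Count' column.
long_counts = wide.stack().to_frame('Count')
print(long_counts)
```

Keep in mind that both this and the reindex approach only know about values that appear at least once somewhere in the data. A category that is completely absent from df will not show up in either result.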