pandas groupby columns missing

How can I get both a 'YES' and a 'NO' count beside each individual name in the following script? I need a value for every combination, even if it's zero.

import pandas as pd
import numpy as np

df = pd.DataFrame({'names': ['Charlie', 'Charlie', 'Charlie', 'Charlie', 'Bryan', 
                             'Bryan', 'Bryan', 'Bryan', 'Jaimie', 'Jaimie',
                             'Jaimie', 'Jaimie'], 
                   'passed': ['YES', 'YES', 'YES', 'YES', 'NO', 'NO', 'NO', 'NO', 
                              'YES', 'NO', 'YES', 'NO']})

df2 = pd.DataFrame(df.groupby([df['names'], df['passed']]).size())
df2.columns = ['Count']

print(df2)

                Count
names   passed       
Bryan   NO          4
Charlie YES         4
Jaimie  NO          2
        YES         2

You can use reindex with the full set of name/passed combinations, filling the missing ones with 0:

df2
Out: 
                Count
names   passed       
Bryan   NO          4
Charlie YES         4
Jaimie  NO          2
        YES         2

idx = pd.MultiIndex.from_product([df['names'].unique(), df['passed'].unique()])

df2.reindex(idx, fill_value=0)
Out: 
             Count
Charlie YES      4
        NO       0
Bryan   YES      0
        NO       4
Jaimie  YES      2
        NO       2
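
Note that the reindexed output above no longer shows the index level names. A minimal self-contained sketch of the same approach, assuming you want to keep them: pass `names=` to `MultiIndex.from_product`. The variable names `counts`, `full_index` and `filled` are illustrative only and not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    'names': ['Charlie'] * 4 + ['Bryan'] * 4 + ['Jaimie'] * 4,
    'passed': ['YES'] * 4 + ['NO'] * 4 + ['YES', 'NO', 'YES', 'NO'],
})

# Count rows per (names, passed) pair; unobserved pairs are simply absent here.
counts = df.groupby(['names', 'passed']).size().to_frame('Count')

# Build every (name, passed) combination and label the levels explicitly
# so the names survive the reindex.
full_index = pd.MultiIndex.from_product(
    [df['names'].unique(), df['passed'].unique()],
    names=['names', 'passed'],
)

# Pairs missing from `counts` are added with a count of 0.
filled = counts.reindex(full_index, fill_value=0)
print(filled)
```

The row order follows the order of appearance in `df` because `unique()` is used; call `sort_index()` on the result if a sorted index is preferred.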

For this example, crosstab with unstack can also be an option:

pd.crosstab(df['passed'], df['names']).unstack()
Out: 
names    passed
Bryan    NO        4
         YES       0
Charlie  NO        0
         YES       4
Jaimie   NO        2
         YES       2
dtype: int64
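
If a wide table (one row per name, one column per outcome) is acceptable, another possible route is to unstack the groupby result with `fill_value=0`. This is a sketch of an alternative, not something taken from the answer above; the variable name `wide` is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'names': ['Charlie'] * 4 + ['Bryan'] * 4 + ['Jaimie'] * 4,
    'passed': ['YES'] * 4 + ['NO'] * 4 + ['YES', 'NO', 'YES', 'NO'],
})

# size() counts rows per (names, passed) pair; unstack() moves 'passed'
# into the columns, and fill_value=0 fills the pairs that never occurred.
wide = df.groupby(['names', 'passed']).size().unstack(fill_value=0)
print(wide)
```

Each name then has both a 'NO' and a 'YES' column, with 0 where no rows were observed.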
