繁体   English   中英

如何在 Pandas Groupby 中仅显示带有值的列

[英]How to show only column with Values in Pandas Groupby

你好数据科学家和熊猫专家,

我需要一些帮助,因为我无法正确组织我的数据。 这是我的数据框:

df_dict = [ {'Date': Timestamp('2014-01-03 00:00:00'), 'Store': 'store1', 'employee': 'emp1', 'duties': 'opening'}, \
            {'Date': Timestamp('2014-01-03 00:00:00'), 'Store': 'store1', 'employee': 'emp2', 'duties': 'deli'}, \
            {'Date': Timestamp('2014-01-03 00:00:00'), 'Store': 'store1', 'employee': 'emp3', 'duties': 'cashier'},\
            {'Date': Timestamp('2014-01-03 00:00:00'), 'Store': 'store1', 'employee': 'emp2', 'duties': 'closing'},\
            {'Date': Timestamp('2014-01-03 00:00:00'), 'Store': 'store2', 'employee': 'emp1', 'duties': 'closing'},\
            {'Date': Timestamp('2014-01-03 00:00:00'), 'Store': 'store2', 'employee': 'emp4', 'duties': 'opening'},\
            {'Date': Timestamp('2014-01-03 00:00:00'), 'Store': 'store2', 'employee': 'emp4', 'duties': 'cashier'},\
            {'Date': Timestamp('2014-01-03 00:00:00'), 'Store': 'store2', 'employee': 'emp5', 'duties': 'deli'},\
            {'Date': Timestamp('2014-01-03 00:00:00'), 'Store': 'store3', 'employee': 'emp2', 'duties': 'closing'},\
            {'Date': Timestamp('2014-01-03 00:00:00'), 'Store': 'store3', 'employee': 'emp6', 'duties': 'opening'},\
            {'Date': Timestamp('2014-01-03 00:00:00'), 'Store': 'store3', 'employee': 'emp7', 'duties': 'cashier'},\
            {'Date': Timestamp('2014-01-03 00:00:00'), 'Store': 'store3', 'employee': 'emp6', 'duties': 'deli'},\
            {'Date': Timestamp('2014-01-04 00:00:00'), 'Store': 'store1', 'employee': 'emp1', 'duties': 'opening'},\
            {'Date': Timestamp('2014-01-04 00:00:00'), 'Store': 'store1', 'employee': 'emp2', 'duties': 'deli'},\
            {'Date': Timestamp('2014-01-04 00:00:00'), 'Store': 'store1', 'employee': 'emp3', 'duties': 'cashier'},\
            {'Date': Timestamp('2014-01-04 00:00:00'), 'Store': 'store1', 'employee': 'emp2', 'duties': 'closing'},\
            {'Date': Timestamp('2014-01-04 00:00:00'), 'Store': 'store2', 'employee': 'emp1', 'duties': 'closing'},\
            {'Date': Timestamp('2014-01-04 00:00:00'), 'Store': 'store2', 'employee': 'emp4', 'duties': 'opening'},\
            {'Date': Timestamp('2014-01-04 00:00:00'), 'Store': 'store2', 'employee': 'emp4', 'duties': 'cashier'},\
            {'Date': Timestamp('2014-01-04 00:00:00'), 'Store': 'store2', 'employee': 'emp5', 'duties': 'deli'},\
            {'Date': Timestamp('2014-01-04 00:00:00'), 'Store': 'store3', 'employee': 'emp2', 'duties': 'closing'},\
            {'Date': Timestamp('2014-01-04 00:00:00'), 'Store': 'store3', 'employee': 'emp6', 'duties': 'opening'},\
            {'Date': Timestamp('2014-01-04 00:00:00'), 'Store': 'store3', 'employee': 'emp7', 'duties': 'cashier'},\
            {'Date': Timestamp('2014-01-04 00:00:00'), 'Store': 'store3', 'employee': 'emp6', 'duties': 'deli'},\
            {'Date': Timestamp('2014-01-10 00:00:00'), 'Store': 'store1', 'employee': 'emp1', 'duties': 'opening'},\
            {'Date': Timestamp('2014-01-10 00:00:00'), 'Store': 'store1', 'employee': 'emp2', 'duties': 'deli'},\
            {'Date': Timestamp('2014-01-10 00:00:00'), 'Store': 'store1', 'employee': 'emp3', 'duties': 'cashier'},\
            {'Date': Timestamp('2014-01-10 00:00:00'), 'Store': 'store1', 'employee': 'emp2', 'duties': 'closing'},\
            {'Date': Timestamp('2014-01-10 00:00:00'), 'Store': 'store2', 'employee': 'emp1', 'duties': 'closing'},\
            {'Date': Timestamp('2014-01-10 00:00:00'), 'Store': 'store2', 'employee': 'emp4', 'duties': 'opening'},\
            {'Date': Timestamp('2014-01-10 00:00:00'), 'Store': 'store2', 'employee': 'emp4', 'duties': 'cashier'},\
            {'Date': Timestamp('2014-01-10 00:00:00'), 'Store': 'store2', 'employee': 'emp5', 'duties': 'deli'},\
            {'Date': Timestamp('2014-01-10 00:00:00'), 'Store': 'store3', 'employee': 'emp2', 'duties': 'closing'},\
            {'Date': Timestamp('2014-01-10 00:00:00'), 'Store': 'store3', 'employee': 'emp6', 'duties': 'opening'},\
            {'Date': Timestamp('2014-01-10 00:00:00'), 'Store': 'store3', 'employee': 'emp7', 'duties': 'cashier'},\
            {'Date': Timestamp('2014-01-10 00:00:00'), 'Store': 'store3', 'employee': 'emp6', 'duties': 'deli'}]

我想按如下方式组织我的输出:

                     Store 1               Store 2          store3      
    Week          emp1  emp2  emp3     emp1 emp4 emp5   emp2 emp6 emp7
    2013-12-30     2    4       2        2    4   2      2    4    2
    2014-01-06     1    1       1        1    1   1      2    1    1

所以我尝试通过表达式遵循 Group:

df_group = dict_df.groupby([pd.Grouper(key='Date', freq='W-MON'), 'Store', 'employee'])\
                            ['duties'].count().unstack(level=1).unstack(level=1).reset_index()

但是,它显示了所有员工,而不是显示员工在该特定商店中的工作示例:

                      Store 1                            
Week          emp1  emp2  emp3 emp4 emp5 emp6  emp7 
2013-12-30     2    4       2   NaN NaN  NaN   NaN 
2014-01-06     1    1       1   NaN NaN  NaN   NaN

那么我怎样才能得到我想要的结果。 基本上我想过滤掉不在该商店工作的员工。

为了这个需要使用 Groupby 更好还是我应该考虑其他方法?

预先感谢您的帮助和考虑。

尝试取消堆叠多个级别[1, 2]

df_out = (df.groupby([pd.Grouper(key='Date', freq='W-MON'), 'Store', 'employee'])['duties']
            .count()
            .unstack(level=[1, 2])
        )
print(df_out)

印刷:

Store      store1           store2           store3          
employee     emp1 emp2 emp3   emp1 emp4 emp5   emp2 emp6 emp7
Date                                                         
2014-01-06      2    4    2      2    4    2      2    4    2
2014-01-13      1    2    1      1    2    1      1    2    1

您可以同时取消堆叠两个级别:

(df.groupby([pd.Grouper(key='Date', freq='W-MON'), 'Store','employee'])
   .size().unstack(['Store','employee'])
)

输出:

Store      store1           store2           store3          
employee     emp1 emp2 emp3   emp1 emp4 emp5   emp2 emp6 emp7
Date                                                         
2014-01-06      2    4    2      2    4    2      2    4    2
2014-01-13      1    2    1      1    2    1      1    2    1

暂无
暂无

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM