How to show only columns with values in Pandas Groupby

Hello data scientists and pandas experts,

I need some help because I can't get my data organized properly. Here is my dataframe:

import pandas as pd
from pandas import Timestamp

df_dict = [ {'Date': Timestamp('2014-01-03 00:00:00'), 'Store': 'store1', 'employee': 'emp1', 'duties': 'opening'}, \
            {'Date': Timestamp('2014-01-03 00:00:00'), 'Store': 'store1', 'employee': 'emp2', 'duties': 'deli'}, \
            {'Date': Timestamp('2014-01-03 00:00:00'), 'Store': 'store1', 'employee': 'emp3', 'duties': 'cashier'},\
            {'Date': Timestamp('2014-01-03 00:00:00'), 'Store': 'store1', 'employee': 'emp2', 'duties': 'closing'},\
            {'Date': Timestamp('2014-01-03 00:00:00'), 'Store': 'store2', 'employee': 'emp1', 'duties': 'closing'},\
            {'Date': Timestamp('2014-01-03 00:00:00'), 'Store': 'store2', 'employee': 'emp4', 'duties': 'opening'},\
            {'Date': Timestamp('2014-01-03 00:00:00'), 'Store': 'store2', 'employee': 'emp4', 'duties': 'cashier'},\
            {'Date': Timestamp('2014-01-03 00:00:00'), 'Store': 'store2', 'employee': 'emp5', 'duties': 'deli'},\
            {'Date': Timestamp('2014-01-03 00:00:00'), 'Store': 'store3', 'employee': 'emp2', 'duties': 'closing'},\
            {'Date': Timestamp('2014-01-03 00:00:00'), 'Store': 'store3', 'employee': 'emp6', 'duties': 'opening'},\
            {'Date': Timestamp('2014-01-03 00:00:00'), 'Store': 'store3', 'employee': 'emp7', 'duties': 'cashier'},\
            {'Date': Timestamp('2014-01-03 00:00:00'), 'Store': 'store3', 'employee': 'emp6', 'duties': 'deli'},\
            {'Date': Timestamp('2014-01-04 00:00:00'), 'Store': 'store1', 'employee': 'emp1', 'duties': 'opening'},\
            {'Date': Timestamp('2014-01-04 00:00:00'), 'Store': 'store1', 'employee': 'emp2', 'duties': 'deli'},\
            {'Date': Timestamp('2014-01-04 00:00:00'), 'Store': 'store1', 'employee': 'emp3', 'duties': 'cashier'},\
            {'Date': Timestamp('2014-01-04 00:00:00'), 'Store': 'store1', 'employee': 'emp2', 'duties': 'closing'},\
            {'Date': Timestamp('2014-01-04 00:00:00'), 'Store': 'store2', 'employee': 'emp1', 'duties': 'closing'},\
            {'Date': Timestamp('2014-01-04 00:00:00'), 'Store': 'store2', 'employee': 'emp4', 'duties': 'opening'},\
            {'Date': Timestamp('2014-01-04 00:00:00'), 'Store': 'store2', 'employee': 'emp4', 'duties': 'cashier'},\
            {'Date': Timestamp('2014-01-04 00:00:00'), 'Store': 'store2', 'employee': 'emp5', 'duties': 'deli'},\
            {'Date': Timestamp('2014-01-04 00:00:00'), 'Store': 'store3', 'employee': 'emp2', 'duties': 'closing'},\
            {'Date': Timestamp('2014-01-04 00:00:00'), 'Store': 'store3', 'employee': 'emp6', 'duties': 'opening'},\
            {'Date': Timestamp('2014-01-04 00:00:00'), 'Store': 'store3', 'employee': 'emp7', 'duties': 'cashier'},\
            {'Date': Timestamp('2014-01-04 00:00:00'), 'Store': 'store3', 'employee': 'emp6', 'duties': 'deli'},\
            {'Date': Timestamp('2014-01-10 00:00:00'), 'Store': 'store1', 'employee': 'emp1', 'duties': 'opening'},\
            {'Date': Timestamp('2014-01-10 00:00:00'), 'Store': 'store1', 'employee': 'emp2', 'duties': 'deli'},\
            {'Date': Timestamp('2014-01-10 00:00:00'), 'Store': 'store1', 'employee': 'emp3', 'duties': 'cashier'},\
            {'Date': Timestamp('2014-01-10 00:00:00'), 'Store': 'store1', 'employee': 'emp2', 'duties': 'closing'},\
            {'Date': Timestamp('2014-01-10 00:00:00'), 'Store': 'store2', 'employee': 'emp1', 'duties': 'closing'},\
            {'Date': Timestamp('2014-01-10 00:00:00'), 'Store': 'store2', 'employee': 'emp4', 'duties': 'opening'},\
            {'Date': Timestamp('2014-01-10 00:00:00'), 'Store': 'store2', 'employee': 'emp4', 'duties': 'cashier'},\
            {'Date': Timestamp('2014-01-10 00:00:00'), 'Store': 'store2', 'employee': 'emp5', 'duties': 'deli'},\
            {'Date': Timestamp('2014-01-10 00:00:00'), 'Store': 'store3', 'employee': 'emp2', 'duties': 'closing'},\
            {'Date': Timestamp('2014-01-10 00:00:00'), 'Store': 'store3', 'employee': 'emp6', 'duties': 'opening'},\
            {'Date': Timestamp('2014-01-10 00:00:00'), 'Store': 'store3', 'employee': 'emp7', 'duties': 'cashier'},\
            {'Date': Timestamp('2014-01-10 00:00:00'), 'Store': 'store3', 'employee': 'emp6', 'duties': 'deli'}]

# Build the DataFrame that the groupby below operates on
dict_df = pd.DataFrame(df_dict)

I would like to organize my output as follows:

                     store1             store2             store3
    Week          emp1  emp2  emp3   emp1  emp4  emp5   emp2  emp6  emp7
    2013-12-30       2     4     2      2     4     2      2     4     2
    2014-01-06       1     1     1      1     1     1      2     1     1

So I tried the following groupby expression:

df_group = dict_df.groupby([pd.Grouper(key='Date', freq='W-MON'), 'Store', 'employee'])\
                            ['duties'].count().unstack(level=1).unstack(level=1).reset_index()

However, it shows every employee under each store instead of only the employees who work in that particular store. For example:

                      Store 1                            
Week          emp1  emp2  emp3 emp4 emp5 emp6  emp7 
2013-12-30     2    4       2   NaN NaN  NaN   NaN 
2014-01-06     1    1       1   NaN NaN  NaN   NaN

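So how can I get the result I want? Basically, I want to filter out the employees who do not work in a given store.

Is groupby the right tool for this, or should I consider another approach?

Thanks in advance for your help and consideration.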

Try unstacking multiple levels at once, with level=[1, 2]:

df_out = (df.groupby([pd.Grouper(key='Date', freq='W-MON'), 'Store', 'employee'])['duties']
            .count()
            .unstack(level=[1, 2])
        )
print(df_out)

Prints:

Store      store1           store2           store3          
employee     emp1 emp2 emp3   emp1 emp4 emp5   emp2 emp6 emp7
Date                                                         
2014-01-06      2    4    2      2    4    2      2    4    2
2014-01-13      1    2    1      1    2    1      1    2    1
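
The difference from the question's chained .unstack(level=1).unstack(level=1) can be checked directly. The sketch below assumes a compact rebuild of the question's data (the roster dict and the variable names are illustrative, not from the original post). Chaining unstacks Store first and employee second, so the columns become the full Store x employee product and pairs that never occur are padded with NaN. Unstacking both levels in one call keeps only the pairs that appear in the data. Dropping all-NaN columns from the chained result should be another way to reach the same columns, though it leaves the counts as floats.

```python
import pandas as pd

# Compact rebuild of the question's data: the same roster repeats on each date.
# (The roster dict is an illustrative shortcut, not part of the original post.)
roster = {
    "store1": [("emp1", "opening"), ("emp2", "deli"), ("emp3", "cashier"), ("emp2", "closing")],
    "store2": [("emp1", "closing"), ("emp4", "opening"), ("emp4", "cashier"), ("emp5", "deli")],
    "store3": [("emp2", "closing"), ("emp6", "opening"), ("emp7", "cashier"), ("emp6", "deli")],
}
dates = pd.to_datetime(["2014-01-03", "2014-01-04", "2014-01-10"])
df = pd.DataFrame(
    [
        {"Date": d, "Store": store, "employee": emp, "duties": duty}
        for d in dates
        for store, pairs in roster.items()
        for emp, duty in pairs
    ]
)

counts = df.groupby(
    [pd.Grouper(key="Date", freq="W-MON"), "Store", "employee"]
)["duties"].count()

# Chained: the second unstack pairs every store with every employee.
chained = counts.unstack(level=1).unstack(level=1)

# One step: only (Store, employee) pairs present in the index become columns.
one_step = counts.unstack(level=[1, 2])

# Alternative: keep the chained form, then drop columns that are entirely NaN.
trimmed = chained.dropna(axis=1, how="all")

# Number of columns in each result
print(chained.shape[1], one_step.shape[1], trimmed.shape[1])
```

With 3 stores and 7 distinct employees, the chained form has 21 columns, while the other two keep the 9 pairs that actually occur.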

You can unstack both levels at once:

(df.groupby([pd.Grouper(key='Date', freq='W-MON'), 'Store','employee'])
   .size().unstack(['Store','employee'])
)

Output:

Store      store1           store2           store3          
employee     emp1 emp2 emp3   emp1 emp4 emp5   emp2 emp6 emp7
Date                                                         
2014-01-06      2    4    2      2    4    2      2    4    2
2014-01-13      1    2    1      1    2    1      1    2    1
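
Both outputs above label each week by the Monday that closes it (2014-01-06, 2014-01-13), while the layout requested in the question labels weeks by the Monday that opens them (2013-12-30, 2014-01-06). A minimal sketch of one way to get week-start labels, assuming the closed and label arguments of pd.Grouper and a hypothetical reduced sample (one store, one employee, the three dates from the question):

```python
import pandas as pd

# Hypothetical reduced sample: one store and one employee on the question's dates.
df = pd.DataFrame(
    {
        "Date": pd.to_datetime(["2014-01-03", "2014-01-04", "2014-01-10"]),
        "Store": ["store1"] * 3,
        "employee": ["emp1"] * 3,
        "duties": ["opening"] * 3,
    }
)

# closed='left' makes each bin run from a Monday up to (not including) the next
# Monday; label='left' names the bin after the Monday it starts on.
week = pd.Grouper(key="Date", freq="W-MON", closed="left", label="left")

weekly = (
    df.groupby([week, "Store", "employee"])
      .size()
      .unstack(["Store", "employee"])
      .rename_axis(index="Week")   # match the "Week" header used in the question
)
print(weekly)
```

The default for weekly frequencies is closed='right' and label='right', which is why the answers show the closing Monday. Changing only the label would put a date that falls exactly on a Monday under the previous week's label, so the sketch changes both arguments together.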
