How to use pd.Grouper along with groupby in pandas

This is my dataframe:

    S2PName-Category    S2BillDate  totSale
0   Food               2019-05-18   2150.0
1   Beverages          2019-05-19   403.0
2   Food               2019-05-19   7254.0
3   Others             2019-05-19   200.0
4   Juice              2019-05-19   125.0
5   Snacks             2019-05-19   70.0
6   Food               2019-06-21   11932.0

I want to group by S2PName-Category, group S2BillDate by a frequency (monthly, weekly, or daily), and aggregate totSale.

That is, if I group S2BillDate with a monthly frequency, the resulting df should have 'Food' for the months May and June, with the total sale summed for each month.

I managed to write some code, shown below:

basic_df = basic_df.groupby(['S2PName-Category','S2BillDate'], sort=False)['S2PGTotal'].agg([('totSale','sum')]).reset_index()

Expected DF output:

  S2PName-Category    S2BillDate  totSale
0   Food               2019-05-31   9404.0
1   Beverages          2019-05-31   403.0
3   Others             2019-05-31   200.0
4   Juice              2019-05-31   125.0
5   Snacks             2019-05-31   70.0
6   Food               2019-06-30   11932.0

In my expected output df, S2BillDate is set to the last day of the month and totSale is aggregated for that month. How can I achieve this?

You can do something like this:

In [706]: df
Out[706]: 
    Category    BillDate  totSale
0       Food  2019-05-18   2150.0
1  Beverages  2019-05-19    403.0
2       Food  2019-05-19   7254.0
3     Others  2019-05-19    200.0
4      Juice  2019-05-19    125.0
5     Snacks  2019-05-19     70.0
6       Food  2019-06-21  11932.0

In [710]: df.groupby([df['BillDate'].dt.strftime('%B'), 'Category'])['totSale'].sum()
Out[710]: 
BillDate  Category 
June      Food         11932.0
May       Beverages      403.0
          Food          9404.0
          Juice          125.0
          Others         200.0
          Snacks          70.0
Name: totSale, dtype: float64

I believe this is what you wanted.
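The session above relies on BillDate already being a datetime column; if it was read in as strings, the .dt accessor raises an AttributeError. Here is a minimal, self-contained sketch of the same grouping, assuming the sample data from the question and an explicit pd.to_datetime conversion (the conversion step is my assumption, not part of the original answer):

```python
import pandas as pd

# Sample data copied from the session above.
df = pd.DataFrame({
    "Category": ["Food", "Beverages", "Food", "Others", "Juice", "Snacks", "Food"],
    "BillDate": ["2019-05-18", "2019-05-19", "2019-05-19", "2019-05-19",
                 "2019-05-19", "2019-05-19", "2019-06-21"],
    "totSale": [2150.0, 403.0, 7254.0, 200.0, 125.0, 70.0, 11932.0],
})

# Assumption: the dates arrive as strings, so convert them before using .dt.
df["BillDate"] = pd.to_datetime(df["BillDate"])

# Group by month name and category, then sum the sales.
by_month_name = df.groupby(
    [df["BillDate"].dt.strftime("%B"), "Category"]
)["totSale"].sum()
print(by_month_name)
```

Note that grouping on the month name alone merges the same month from different years and sorts the months alphabetically, so this fits best when the data covers a single year.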

basic_df_2 = basic_df.groupby(['S2PName-Category',basic_df['S2BillDate'].dt.to_period('M')], sort=False)['S2PGTotal'].agg([('totSale','sum')]).reset_index()

dt.to_period takes a frequency argument ('M' here, for monthly), so the grouping frequency can be changed by passing a different alias.
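Since the question asks about pd.Grouper and expects month-end dates in S2BillDate, the following sketch may be closer to the expected output. to_period('M') yields Period values such as 2019-05, whereas pd.Grouper with a month-end frequency labels each group with the last day of the month. The sample frame below is rebuilt from the question and aggregates the totSale column shown there (an assumption, since the original code aggregates S2PGTotal). The offset object pd.offsets.MonthEnd() is used instead of a string alias because the alias for month-end has changed between pandas versions.

```python
import pandas as pd

# Sample data copied from the question.
basic_df = pd.DataFrame({
    "S2PName-Category": ["Food", "Beverages", "Food", "Others", "Juice", "Snacks", "Food"],
    "S2BillDate": ["2019-05-18", "2019-05-19", "2019-05-19", "2019-05-19",
                   "2019-05-19", "2019-05-19", "2019-06-21"],
    "totSale": [2150.0, 403.0, 7254.0, 200.0, 125.0, 70.0, 11932.0],
})

# pd.Grouper with a frequency needs a datetime column.
basic_df["S2BillDate"] = pd.to_datetime(basic_df["S2BillDate"])

# Group by category and by month; each group is labeled with its month-end date.
monthly = (
    basic_df
    .groupby(["S2PName-Category",
              pd.Grouper(key="S2BillDate", freq=pd.offsets.MonthEnd())])["totSale"]
    .sum()
    .reset_index()
)
print(monthly)
```

For weekly or daily grouping, swapping in a different offset such as pd.offsets.Week() or pd.offsets.Day() should work the same way.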
