
Calculate difference between rows in Pandas dataframe

I have a data frame with data about affiliates, offers, clicks, etc. I want to calculate the difference in conversions (column name "Appr") between yesterday and today for each offer + affiliate combination.

D;H;AfID;Affil_name;M;OfID;Offer_name;Clicks;Revenue;Earnings;Appr;Decl;CR;Tr-back
28;11;10;elephant;Ella;1132;App_Aweepstakes;2100;0;0;100;0;1:10;1
28;11;1828;a.kalen;Ella;2675;Cash App_Sweepstakes/CPA_US;3;0;0;200;0;1:50;0
29;11;1828;a.kalen;Ella;2675;Cash App_Sweepstakes/CPA_US;11;0;0;350;0;1:50;0

To do this, I use groupby and diff():

final_df['DifAppr'] = final_df.groupby(['H', 'AfID', 'Affil_name', 'M', 'OfID', 'Offer_name'])['Appr'].diff().fillna(0)

But if the dataframe has no data for the previous day for a given offer + affiliate, then that line is ignored and no difference is calculated:

D;H;AfID;Affil_name;M;OfID;Offer_name;Clicks;Revenue;Earnings;Appr;Decl;CR;Tr-back;DiffAppr
29;11;1828;a.kalen;Ella;2675;Cash App_Sweepstakes/CPA_US;11;0;0;350;0;1:50;0;150
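
To see what groupby + diff() returns on the sample rows above, here is a minimal reproduction. It assumes the data is loaded from semicolon-separated text; the names raw, keys and raw_diff are only for illustration. The first row of every group has no earlier row to subtract, so diff() gives NaN there, and fillna(0) then turns that into 0.

```python
import io
import pandas as pd

# Sample rows copied from the question (semicolon-separated).
raw = """D;H;AfID;Affil_name;M;OfID;Offer_name;Clicks;Revenue;Earnings;Appr;Decl;CR;Tr-back
28;11;10;elephant;Ella;1132;App_Aweepstakes;2100;0;0;100;0;1:10;1
28;11;1828;a.kalen;Ella;2675;Cash App_Sweepstakes/CPA_US;3;0;0;200;0;1:50;0
29;11;1828;a.kalen;Ella;2675;Cash App_Sweepstakes/CPA_US;11;0;0;350;0;1:50;0
"""
final_df = pd.read_csv(io.StringIO(raw), sep=";")

keys = ["H", "AfID", "Affil_name", "M", "OfID", "Offer_name"]

# diff() within each group: the first row of a group has nothing before it.
raw_diff = final_df.groupby(keys)["Appr"].diff()
print(raw_diff.tolist())              # [nan, nan, 150.0]

# fillna(0) replaces those NaN values with 0.
final_df["DifAppr"] = raw_diff.fillna(0)
print(final_df["DifAppr"].tolist())   # [0.0, 0.0, 150.0]
```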

I want this line to remain the same in this case. That is, the conversions for the previous day are treated as 0, so the difference in conversions between today and yesterday is simply today's value. For example, if the affiliate had 0 conversions yesterday (so that line is not in the dataframe) and 68 conversions today, then the "DiffAppr" column should be 68.
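
One possible reading of this requirement is "where there is no earlier row in the group, use today's Appr as the difference". A minimal sketch of that reading follows. It assumes the rows are already sorted by day (D) and that the previous row in a group really is the previous day; the frame is reduced to a few columns for brevity.

```python
import pandas as pd

# Reduced version of the sample data: only the columns needed here.
final_df = pd.DataFrame({
    "D":    [28, 28, 29],
    "AfID": [10, 1828, 1828],
    "OfID": [1132, 2675, 2675],
    "Appr": [100, 200, 350],
})

keys = ["AfID", "OfID"]

# Where diff() has no previous row (NaN), fall back to the row's own Appr,
# which is the same as treating the missing previous day as 0 conversions.
final_df["DiffAppr"] = (
    final_df.groupby(keys)["Appr"].diff().fillna(final_df["Appr"])
)
print(final_df["DiffAppr"].tolist())  # [100.0, 200.0, 150.0]
```

Note that this also fills the first row of groups that do have later rows (200 for a.kalen on day 28), which may or may not be what is wanted.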

I haven't tested this on groups with more than two rows, but the idea is: if a group contains only one row, I leave the value as it is; if it contains two or more rows, I apply diff().

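# group_keys=False keeps the original row index, so the result aligns with final_df
final_df['DifAppr'] = (final_df.groupby(['H', 'AfID', 'Affil_name', 'M', 'OfID', 'Offer_name'], group_keys=False)['Appr']
                        .apply(lambda x: x.diff() if len(x) >= 2 else x)).fillna(0)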
final_df

    D   H  AfID  Affil_name     M  OfID                   Offer_name  Clicks  Revenue  Earnings  Appr  Decl    CR  Tr-back  DifAppr
0  28  11    10    elephant  Ella  1132              App_Aweepstakes    2100        0         0   100     0  1:10        1    100.0
1  28  11  1828     a.kalen  Ella  2675  Cash App_Sweepstakes/CPA_US       3        0         0   200     0  1:50        0      0.0
2  29  11  1828     a.kalen  Ella  2675  Cash App_Sweepstakes/CPA_US      11        0         0   350     0  1:50        0    150.0
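
The same result can probably be obtained without apply(), which avoids any index alignment concerns. This is only a sketch under the same rule as above (single-row groups keep their value, larger groups use diff()); the frame is reduced to a few columns and the names keys, grouped and size are illustrative.

```python
import pandas as pd

# Reduced version of the sample data: only the columns needed here.
final_df = pd.DataFrame({
    "D":    [28, 28, 29],
    "AfID": [10, 1828, 1828],
    "OfID": [1132, 2675, 2675],
    "Appr": [100, 200, 350],
})

keys = ["AfID", "OfID"]
grouped = final_df.groupby(keys)["Appr"]

# Number of rows in each row's group, aligned to the original index.
size = grouped.transform("size")

# Keep diff() where the group has 2+ rows, otherwise keep the row's own Appr;
# the remaining NaN (first row of a multi-row group) becomes 0.
final_df["DifAppr"] = (
    grouped.diff().where(size >= 2, final_df["Appr"]).fillna(0)
)
print(final_df["DifAppr"].tolist())  # [100.0, 0.0, 150.0]
```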

