Find percent increase and diff regardless of the date order (in Python)

I have a dataset, df, where I wish to calculate the percent increase of the sum of a particular group over a time period. Here is the dataset:

   date      size  type
   1/1/2020  3     a
   1/1/2020  13    b
   1/1/2020  1     c
   2/1/2020  51    a
   2/1/2019  10    b

Desired output

For each type, find the difference and the percent difference between the earliest and the latest date:

date       diff     percentdiff    type
2/1/2020   48       1600           a
1/1/2020   3        30             b
1/1/2020   0        0              c

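We see that group 'a' went from 3 to 51 (from 1/1/2020 to 2/1/2020), which gives a difference of 48 and a percent difference of 1600%. Group 'b' went from 10 on 2/1/2019 to 13 on 1/1/2020, a difference of 3 and a percent difference of 30%. Group 'c' is 0 because there is no change.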

Percent increase/change is (final - initial) / initial * 100
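
As a quick check of that formula against the numbers above, here is a minimal sketch in plain Python. The helper name percent_change is made up for illustration and is not a pandas function.

```python
def percent_change(initial, final):
    """Return the percent change going from initial to final."""
    return (final - initial) / initial * 100

# Earliest and latest sizes per group, read off the dataset by hand.
change_a = percent_change(3, 51)    # group a: 1/1/2020 -> 2/1/2020
change_b = percent_change(10, 13)   # group b: 2/1/2019 -> 1/1/2020
change_c = percent_change(1, 1)     # group c: only one row, so no change

print(change_a, change_b, change_c)
```

The three values should line up with the percentdiff column in the desired output, up to floating-point rounding.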

This is what I have tried:

  df1 = df.groupby(['type','date'])['size'].agg(
      lambda x: (x.iloc[-1]/x.iloc[0]-1)*100).to_frame('increase')
  df1['diff'] = df.groupby(['type','date']).agg(lambda x:x.iloc[-1]-x.iloc[0])

I am still researching this. Any suggestion is appreciated.

There is probably a more concise solution, but this works:

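import pandas as pd

# Parse the date strings so that sorting is chronological, not alphabetical
df['date'] = pd.to_datetime(df['date'])
grouped = df.sort_values('date').groupby(['type'])

# Keep the last (latest) row of each group. diff() and pct_change() compare it
# with the row just before it; fillna(0) covers groups that have a single row.
output = pd.DataFrame({
  'date': grouped['date'].agg(lambda x: x.iloc[-1]).values,
  'diff': grouped['size'].agg(lambda x: x.diff().fillna(0).iloc[-1]).values,
  'percentdiff': grouped['size'].agg(lambda x: x.pct_change().fillna(0).iloc[-1] * 100).values,
  'type': grouped['type'].agg(lambda x: x.iloc[0]).values
})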
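
For reference, here is a self-contained sketch that rebuilds the sample data from the question and runs end to end. It is a variant rather than the code above: it compares the earliest row of each type with the latest one, which matters when a type has more than two dates, since diff() and pct_change() only look at the previous row. The month/day/year date format and the intermediate column names first_size and last_size are assumptions made for this example.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "date": ["1/1/2020", "1/1/2020", "1/1/2020", "2/1/2020", "2/1/2019"],
    "size": [3, 13, 1, 51, 10],
    "type": ["a", "b", "c", "a", "b"],
})

# Assumption: dates are month/day/year.
df["date"] = pd.to_datetime(df["date"], format="%m/%d/%Y")

# Sort chronologically so "first"/"last" mean earliest/latest within each type.
grouped = df.sort_values("date").groupby("type")
summary = grouped.agg(
    date=("date", "last"),          # latest date per type
    first_size=("size", "first"),   # size on the earliest date
    last_size=("size", "last"),     # size on the latest date
)

summary["diff"] = summary["last_size"] - summary["first_size"]
summary["percentdiff"] = summary["diff"] / summary["first_size"] * 100

# Match the column order of the desired output.
output = summary.reset_index()[["date", "diff", "percentdiff", "type"]]
print(output)
```

Because the rows are sorted before grouping, the order in which they appear in df does not affect the result.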
