How to groupby and aggregate in pandas

Question

I have the following pandas DataFrame:

  index    key                                   start   end     nozzle  tank
  0        2018-01-01 02:00:01 - 02:30:00_1_1    2000    2003    1       1 
  1        2018-01-01 02:00:01 - 02:30:00_1_1    2003    2006    1       1 
  2        2018-01-01 02:00:01 - 02:30:00_1_1    2006    2008    1       1
  3        2018-01-01 02:00:01 - 02:30:00_1_1    2008    2010    1       1
  4        2018-01-01 02:00:01 - 02:30:00_1_1    2010    2012    1       1 
  5        2018-01-01 02:00:01 - 02:30:00_1_2    2002    2009    2       1 
  6        2018-01-01 02:00:01 - 02:30:00_1_2    2009    2011    2       1
  7        2018-01-01 02:00:01 - 02:30:00_1_2    2011    2013    2       1
  8        2018-01-01 02:00:01 - 02:30:00_1_2    2013    2015    2       1
  9        2018-01-01 03:30:01 - 04:00:00_1_3    2020    2022    3       1

For every key, I want to take the first and last observation and compute the difference between the end of the last observation and the start of the first one. Where a key has only one observation, the difference should be end - start of that same observation.

For example, the calculation for nozzle 1 is 2012 - 2000 = 12, and for nozzle 2 it is 2015 - 2002 = 13.

My desired DataFrame would be:

  index   key                                   nozzle_1  nozzle_2  nozzle_3
  0       2018-01-01 02:00:01 - 02:30:00_1_1    12        0         0 
  1       2018-01-01 02:00:01 - 02:30:00_1_2    0         13        0 
  2       2018-01-01 03:30:01 - 04:00:00_1_3    0         0         2

Answer 1

Use:

df1 = (df.groupby(['key','nozzle'])
        .agg({'start':'first','end':'last'})
        .assign(dif = lambda x: x['end'] - x['start'])['dif']
        .unstack(fill_value=0)
        .add_prefix('nozzle_')
        .reset_index()
        .rename_axis(None, axis=1))
print(df1)
                                  key  nozzle_1  nozzle_2  nozzle_3
0  2018-01-01 02:00:01 - 02:30:00_1_1        12         0         0
1  2018-01-01 02:00:01 - 02:30:00_1_2         0        13         0
2  2018-01-01 03:30:01 - 04:00:00_1_3         0         0         2

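Explanation:

First, group by key and nozzle and aggregate with agg, taking the first start and the last end of each group
Create a new column dif with assign by subtracting start from end, then select that column
Reshape with unstack so that each nozzle becomes its own column, filling missing combinations with 0
Change the column names with add_prefix
Finally, clean up with reset_index and rename_axis, which turns key back into a regular column and removes the leftover columns name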
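
To try the steps above end to end, here is a minimal, self-contained sketch. It rebuilds the sample data from the question by hand (the variable names k1, k2, k3, agg, dif and wide are only for illustration and are not part of the original answer) and splits the chain into intermediate results so each step can be inspected. It assumes the rows of each key are already in chronological order, because 'first' and 'last' follow the row order of the DataFrame.

```python
import pandas as pd

# Keys copied from the question; the short names are illustrative only.
k1 = "2018-01-01 02:00:01 - 02:30:00_1_1"
k2 = "2018-01-01 02:00:01 - 02:30:00_1_2"
k3 = "2018-01-01 03:30:01 - 04:00:00_1_3"

df = pd.DataFrame({
    "key": [k1] * 5 + [k2] * 4 + [k3],
    "start": [2000, 2003, 2006, 2008, 2010, 2002, 2009, 2011, 2013, 2020],
    "end": [2003, 2006, 2008, 2010, 2012, 2009, 2011, 2013, 2015, 2022],
    "nozzle": [1] * 5 + [2] * 4 + [3],
    "tank": [1] * 10,
})

# Step 1: first start and last end per (key, nozzle) group.
agg = df.groupby(["key", "nozzle"]).agg({"start": "first", "end": "last"})

# Step 2: difference between last end and first start (a Series).
dif = agg["end"] - agg["start"]

# Step 3: move the nozzle level into columns; absent pairs become 0.
wide = dif.unstack(fill_value=0)

# Steps 4 and 5: prefix the column names, restore key as a column,
# and drop the leftover "nozzle" name on the columns axis.
df1 = wide.add_prefix("nozzle_").reset_index().rename_axis(None, axis=1)
print(df1)
```

If the rows might not be sorted, one option is to call df.sort_values(['key', 'start']) before grouping, so that 'first' and 'last' pick the intended observations.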

solution1
2 ACCEPTED 2018-10-09 12:13:03