Alternative to pandas groupby with lambda and diff

Question

Assume I have the df below:

      ID  V
    0  A  1
    1  A  2
    2  B  4
    3  B  3

And the desired output is:

    V
0   NaN
1   1.0
2   NaN
3   -1.0

This can be done using groupby and a lambda with diff:

df.groupby('ID').apply(lambda x: x.diff())

I am trying to come up with a solution that doesn't rely on a lambda, as this quickly becomes very slow. Any ideas?

UPDATE

Performance comparison between (1) using groupby, lambda and diff, and (2) using only groupby and diff:

(1)

3.67 ms ± 238 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

(2)

2.42 ms ± 20.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
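
The two timed statements themselves are not shown in the update. A minimal sketch of how such a comparison could be reproduced with the standard timeit module is given here. The frame size, the number of groups, and the function names are assumptions made for illustration. The explicit ['V'] selection and group_keys=False are also additions, so that the lambda version returns the original row index across pandas versions. Absolute timings will differ from the figures above.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical benchmark data: 5,000 rows spread over up to 200 groups.
rng = np.random.default_rng(0)
bench = pd.DataFrame({
    "ID": rng.integers(0, 200, size=5_000),
    "V": rng.normal(size=5_000),
})


def diff_with_lambda(frame):
    # (1) groupby + apply + lambda: the lambda is called once per group.
    return frame.groupby("ID", group_keys=False)["V"].apply(lambda s: s.diff())


def diff_direct(frame):
    # (2) groupby + diff: calls the GroupBy diff method directly.
    return frame.groupby("ID")["V"].diff()


runs = 5
t_lambda = timeit.timeit(lambda: diff_with_lambda(bench), number=runs)
t_direct = timeit.timeit(lambda: diff_direct(bench), number=runs)
print(f"lambda: {t_lambda / runs:.4f} s per call")
print(f"direct: {t_direct / runs:.4f} s per call")
```

Measuring on a frame with many groups matters here, since the per-group overhead of the lambda only shows up once the number of groups grows.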

Answer 1

Use .agg and pass 'diff':

df.groupby('ID')['V'].agg('diff')

0    NaN
1    1.0
2    NaN
3   -1.0
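
Because the result keeps the original row index, it can be assigned straight back to the frame. A small sketch on a four-row frame (IDs A, A, B, B) that matches the output above; the column name V_diff is a hypothetical choice, and the sketch calls the diff method on the selected column directly rather than going through .agg:

```python
import pandas as pd

# Sample frame consistent with the output shown above.
df = pd.DataFrame({"ID": ["A", "A", "B", "B"], "V": [1, 2, 4, 3]})

# Select the column first so only 'V' is differenced, then store the result.
# 'V_diff' is a hypothetical column name chosen for this example.
df["V_diff"] = df.groupby("ID")["V"].diff()
print(df)
```

Selecting ['V'] before calling diff returns a Series, which is what makes the single-column assignment work.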

Answer 2

Well, in this case, groupby objects directly support diff:

>>> df
  ID  V
0  A  1
1  A  2
2  B  4
3  B  3
>>> df.groupby('ID').diff()
     V
0  NaN
1  1.0
2  NaN
3 -1.0
>>>

But I'm not sure this will actually improve your performance. Using .apply on columns, i.e. across the first axis, shouldn't be slower than the above; it is basically equivalent (unlike using .apply on the rows).
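
To make concrete what the per-group diff computes, the sketch below rebuilds it from GroupBy.shift: each value minus the previous value in the same group. It is an illustration only, not a claim about how pandas implements diff internally, and the variable names are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"ID": ["A", "A", "B", "B"], "V": [1, 2, 4, 3]})

# Built-in per-group difference.
built_in = df.groupby("ID")["V"].diff()

# Same quantity by hand: subtract the previous row of the same group.
# The first row of each group has no predecessor, so it becomes NaN.
by_hand = df["V"] - df.groupby("ID")["V"].shift()

print(pd.DataFrame({"built_in": built_in, "by_hand": by_hand}))
```

Neither version needs a lambda, so both avoid calling Python code once per group.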

2 answers

solution1
2 2020-07-22 13:21:22

solution2
2 ACCEPTED 2020-07-22 13:21:30
