Subtract two columns using Groupby in Pandas

Question

I have a dataframe and would like to subtract two columns of the previous row, provided that the previous row has the same Names value. If it does not, I would like the result to be NaN, filled with - . My groupby expression raises TypeError: 'Series' objects are mutable, thus they cannot be hashed , which is not very informative. What am I missing?

import pandas as pd
df = pd.DataFrame(data=[['Person A', 5, 8], ['Person A', 13, 11], ['Person B', 11, 32], ['Person B', 15, 20]], columns=['Names', 'Value', 'Value1'])
df['diff'] = df.groupby('Names').apply(df['Value'].shift(1) - df['Value1'].shift(1)).fillna('-')
print df

Desired output:

      Names  Value  Value1  diff
0  Person A      5       8     -
1  Person A     13      11    -3
2  Person B     11      32     -
3  Person B     15      20   -21

Answer 1

You can add lambda x and change df['Value'] to x['Value'] (and likewise for Value1), then finish with reset_index :

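# Parentheses let the method chain span several lines.
# reset_index(drop=True) relies on df having the default 0..n-1 index
# with rows already ordered by Names.
df['diff'] = (df.groupby('Names')
                .apply(lambda x: x['Value'].shift(1) - x['Value1'].shift(1))
                .fillna('-')
                .reset_index(drop=True))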
print (df)
      Names  Value  Value1 diff
0  Person A      5       8    -
1  Person A     13      11   -3
2  Person B     11      32    -
3  Person B     15      20  -21
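
The lambda matters because apply expects something it can call once per group, whereas the expression in the question is evaluated immediately on the whole frame and handed to apply as a finished Series. A minimal sketch of that difference, using the question's data; the names eager, per_group and person_a are illustrative only:

```python
import pandas as pd

df = pd.DataFrame(
    data=[['Person A', 5, 8], ['Person A', 13, 11],
          ['Person B', 11, 32], ['Person B', 15, 20]],
    columns=['Names', 'Value', 'Value1'],
)

# Evaluated right away on the whole frame: this is already a Series,
# not a function, so there is nothing for apply to call per group.
eager = df['Value'].shift(1) - df['Value1'].shift(1)
print(callable(eager))       # False

# A lambda defers the work; apply calls it with each group's sub-frame as x.
per_group = lambda x: x['Value'].shift(1) - x['Value1'].shift(1)
print(callable(per_group))   # True

# Calling it by hand on one group shows what apply computes for that group.
person_a = df[df['Names'] == 'Person A']
print(per_group(person_a).tolist())   # [nan, -3.0]
```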

Another solution with DataFrameGroupBy.shift :

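# Select several columns with a list (double brackets); tuple-style
# selection is not accepted by recent pandas versions.
df1 = df.groupby('Names')[['Value', 'Value1']].shift()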
print (df1)
   Value  Value1
0    NaN     NaN
1    5.0     8.0
2    NaN     NaN
3   11.0    32.0
df['diff'] = (df1.Value - df1.Value1).fillna('-')

print (df)
      Names  Value  Value1 diff
0  Person A      5       8    -
1  Person A     13      11   -3
2  Person B     11      32    -
3  Person B     15      20  -21
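
Because the two columns are shifted by the same amount, the subtraction can also be done first and the result shifted within each group. This is a sketch of that variant, not part of the original answer, assuming the same data as in the question:

```python
import pandas as pd

df = pd.DataFrame(
    data=[['Person A', 5, 8], ['Person A', 13, 11],
          ['Person B', 11, 32], ['Person B', 15, 20]],
    columns=['Names', 'Value', 'Value1'],
)

# Row-wise difference first, then move it one row down within each Names group.
diff = (df['Value'] - df['Value1']).groupby(df['Names']).shift()
print(diff.tolist())   # [nan, -3.0, nan, -21.0]
```

The result keeps the original index of df, so it can be assigned to df['diff'] directly, with fillna('-') applied afterwards if the dash is wanted.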

Answer 2

You can also do it this way (note that it fills the missing values with 0 rather than - ):

In [76]: df['diff'] = (-df.groupby('Names')[['Value1','Value']].shift(1).diff(axis=1)['Value1']).fillna(0)

In [77]: df
Out[77]:
      Names  Value  Value1  diff
0  Person A      5       8   0.0
1  Person A     13      11  -3.0
2  Person B     11      32   0.0
3  Person B     15      20 -21.0
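
Filling with 0 keeps the column numeric but makes a missing previous row look like a genuine difference of zero, while filling with '-' mixes strings into the column. One possible compromise, sketched here under the assumption that the dash is only needed for display, is to keep the NaN column for computation and derive a separate display version; the names diff and diff_display are illustrative:

```python
import pandas as pd

df = pd.DataFrame(
    data=[['Person A', 5, 8], ['Person A', 13, 11],
          ['Person B', 11, 32], ['Person B', 15, 20]],
    columns=['Names', 'Value', 'Value1'],
)

# Previous row's values within each Names group (NaN for a group's first row).
shifted = df.groupby('Names')[['Value', 'Value1']].shift()
diff = shifted['Value'] - shifted['Value1']
print(diff.dtype)   # float64, still usable for arithmetic

# Display-only copy: keep the numbers, put '-' where there is no previous row.
diff_display = diff.astype(object).where(diff.notna(), '-')
print(diff_display.tolist())   # ['-', -3.0, '-', -21.0]
```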

Answer 1: score 4, accepted, 2016-05-31 18:08:40
Answer 2: score 1, 2016-05-31 18:11:39
