
How to calculate the difference between rows compared to another specific row?

I have a dataframe:

full_name  x
q          1.5
q_1        1.3
q_2        1.2
q_3        1.3
r          1.5
r_1        1.3
r_2        1.2
r_3        1.3

and I'd like to create a new column holding the difference between each suffixed full name's x and the x of its base name, like the following:

full_name  x    x_diff
q          1.5  0
q_1        1.3  -0.2
q_2        1.2  -0.3
q_3        1.3  -0.2
r          1.5  0
r_1        1.3  -0.2
r_2        1.2  -0.3
r_3        1.3  -0.2

So that is q - q, q_1 - q, q_2 - q, q_3 - q, and the same for r.

I've tried something like df['x_diff'] = df.res - df[df.main_name == df.full_name].x, but that doesn't work. Any advice on what to do?

Create a Series of the x values from the rows where main_name matches full_name, indexed by main_name with DataFrame.set_index, and then subtract main_name mapped through that Series with Series.map:

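# x values of the base rows (full_name == main_name), indexed by main_name
s = df.loc[df.main_name == df.full_name].set_index('main_name')['x']

# look up each row's base value via main_name and subtract it
df['x_diff'] = df.x - df.main_name.map(s)
print(df)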
  full_name main_name    x  x_diff
0         q         q  1.5     0.0
1       q_1         q  1.3    -0.2
2       q_2         q  1.2    -0.3
3       q_3         q  1.3    -0.2
4         r         r  1.5     0.0
5       r_1         r  1.3    -0.2
6       r_2         r  1.2    -0.3
7       r_3         r  1.3    -0.2
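
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The table in the question does not show a main_name column, so the line that derives it (everything before the first underscore) is an assumption about how the names are built, not something the question states.

```python
import pandas as pd

df = pd.DataFrame({
    "full_name": ["q", "q_1", "q_2", "q_3", "r", "r_1", "r_2", "r_3"],
    "x": [1.5, 1.3, 1.2, 1.3, 1.5, 1.3, 1.2, 1.3],
})

# Assumption: the base name is the part of full_name before the first underscore.
df["main_name"] = df["full_name"].str.split("_").str[0]

# x values of the base rows, indexed by main_name.
base = df.loc[df["main_name"] == df["full_name"]].set_index("main_name")["x"]

# Map every row to its base value and subtract it.
df["x_diff"] = df["x"] - df["main_name"].map(base)
print(df)
```

Because the values are floats, the differences may print with tiny rounding noise unless you round them. If a group has no base row, map returns NaN for it and x_diff is NaN there.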

If the first row of each main_name group is always the base row (the one where main_name equals full_name), subtract the Series of per-group first values, created by GroupBy.first through GroupBy.transform:

df['x_diff'] = df.x - df.groupby('main_name')['x'].transform('first')
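
A small runnable sketch of this variant, assuming the sample data from the question with main_name already present. It only gives the right answer when the base row really comes first in each group, so sort accordingly if that is not guaranteed.

```python
import pandas as pd

df = pd.DataFrame({
    "full_name": ["q", "q_1", "q_2", "q_3", "r", "r_1", "r_2", "r_3"],
    "main_name": ["q", "q", "q", "q", "r", "r", "r", "r"],
    "x": [1.5, 1.3, 1.2, 1.3, 1.5, 1.3, 1.2, 1.3],
})

# transform('first') repeats each group's first x on every row of that group,
# so the result lines up with df and can be subtracted directly.
first_x = df.groupby("main_name")["x"].transform("first")
df["x_diff"] = df["x"] - first_x
print(df)
```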

You can do it in 3 steps:

  1. Group by main_name.
  2. For each group, create a new column (for example x_shifted) that contains the previous row's value. For that, you can use df.shift(1) (https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.shift.html).
  3. For each group, create the column x_diff as the difference between x_shifted and x.
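
The three steps above can be sketched as follows, assuming the sample data with a main_name column. Be aware that shifting within a group subtracts the previous row, so this yields row-to-row differences rather than differences from the base row q or r, and the first row of each group comes out as NaN. The sign convention x - x_shifted is an assumption, since the steps do not say which way to subtract.

```python
import pandas as pd

df = pd.DataFrame({
    "full_name": ["q", "q_1", "q_2", "q_3", "r", "r_1", "r_2", "r_3"],
    "main_name": ["q", "q", "q", "q", "r", "r", "r", "r"],
    "x": [1.5, 1.3, 1.2, 1.3, 1.5, 1.3, 1.2, 1.3],
})

# Steps 1 and 2: previous x within each main_name group (NaN on the first row).
df["x_shifted"] = df.groupby("main_name")["x"].shift(1)

# Step 3: difference to the previous row (sign convention assumed here).
df["x_diff"] = df["x"] - df["x_shifted"]
print(df)
```

For q_2 this gives roughly -0.1 (1.2 - 1.3) instead of the -0.3 asked for in the question, so use one of the other approaches if you need the difference from the base row.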


 