How to calculate the difference between rows compared to another specific row?
I have a dataframe:
full_name x
q 1.5
q_1 1.3
q_2 1.2
q_3 1.3
r 1.5
r_1 1.3
r_2 1.2
r_3 1.3
and I'd like to create a new column which is the difference between each suffixed full name and its base, such as the following:
full_name x x_diff
q 1.5 0
q_1 1.3 -0.2
q_2 1.2 -0.3
q_3 1.3 -0.2
r 1.5 0
r_1 1.3 -0.2
r_2 1.2 -0.3
r_3 1.3 -0.2
So: q - q, q_1 - q, q_2 - q, q_3 - q, and the same for r.
I've tried something like
df['x_diff'] = df.res - df[df.main_name == df.full_name].x
but that doesn't work. Any advice on what to do?
Create a Series from the rows where main_name matches full_name, indexed by main_name with DataFrame.set_index, and then subtract the main_name column mapped through it with Series.map:
s = df.loc[df.main_name == df.full_name].set_index('main_name')['x']
df['x_diff'] = df.x - df.main_name.map(s)
print (df)
full_name main_name x x_diff
0 q q 1.5 0.0
1 q_1 q 1.3 -0.2
2 q_2 q 1.2 -0.3
3 q_3 q 1.3 -0.2
4 r r 1.5 0.0
5 r_1 r 1.3 -0.2
6 r_2 r 1.2 -0.3
7 r_3 r 1.3 -0.2
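For reference, here is a self-contained sketch of the approach above. The question's table does not show a main_name column, so this sketch assumes it can be derived as the text before the first underscore in full_name; that derivation is an assumption, not something stated in the question. It also shows one likely reason a plain subtraction of the filtered rows does not work: pandas aligns the two Series on their index, so every non-base row ends up as NaN.

```python
import pandas as pd

df = pd.DataFrame({
    "full_name": ["q", "q_1", "q_2", "q_3", "r", "r_1", "r_2", "r_3"],
    "x": [1.5, 1.3, 1.2, 1.3, 1.5, 1.3, 1.2, 1.3],
})

# Assumption: the base name is everything before the first underscore.
df["main_name"] = df["full_name"].str.split("_").str[0]

# Rows whose full name is the base name itself.
is_base = df["main_name"] == df["full_name"]

# Subtracting the filtered Series directly aligns on the row index,
# so only the base rows (index 0 and 4) get a value; the rest are NaN.
naive = df["x"] - df.loc[is_base, "x"]

# Lookup Series: main_name -> x of the base row.
base_x = df.loc[is_base].set_index("main_name")["x"]

# Map every row's main_name to its base value, then subtract.
df["x_diff"] = df["x"] - df["main_name"].map(base_x)
print(df)
```

The lookup Series needs a unique index, so this relies on each main_name having exactly one base row.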
If the first row of each main_name group is always the one where main_name equals full_name, subtract the Series of per-group first values, produced by GroupBy.first through GroupBy.transform:
df['x_diff'] = df.x - df.groupby('main_name')['x'].transform('first')
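A runnable sketch of this variant follows. As before, deriving main_name from the text before the first underscore is an assumption made for illustration. The check on the first row of each group is an optional guard, added here because the shortcut is only valid when the base row really comes first.

```python
import pandas as pd

df = pd.DataFrame({
    "full_name": ["q", "q_1", "q_2", "q_3", "r", "r_1", "r_2", "r_3"],
    "x": [1.5, 1.3, 1.2, 1.3, 1.5, 1.3, 1.2, 1.3],
})

# Assumption: the base name is everything before the first underscore.
df["main_name"] = df["full_name"].str.split("_").str[0]

grouped = df.groupby("main_name")

# Optional guard: the first row of every group must be the base row.
first_full_name = grouped["full_name"].transform("first")
if not (first_full_name == df["main_name"]).all():
    raise ValueError("a group does not start with its base row")

# transform('first') broadcasts each group's first x back to all its rows.
df["x_diff"] = df["x"] - grouped["x"].transform("first")
print(df)
```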
You can do it in 3 steps:
1. Group the rows by main_name.
2. For each group: create the column x_shifted. For this, you can use df.shift(1) ( https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.shift.html ).
3. For each group: create the column x_diff, which is the difference between x_shifted and x.
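The shift-based steps above could be sketched as follows. This is one reading of that answer, under two assumptions: main_name is derived from the text before the first underscore, and the difference is taken as x minus x_shifted (the answer does not state the direction). Because shift(1) pairs each row with the previous row in its group, the result is a row-to-row difference (for example q_2 minus q_1), which is not the same as subtracting the base row for groups with more than one suffixed row.

```python
import pandas as pd

df = pd.DataFrame({
    "full_name": ["q", "q_1", "q_2", "q_3", "r", "r_1", "r_2", "r_3"],
    "x": [1.5, 1.3, 1.2, 1.3, 1.5, 1.3, 1.2, 1.3],
})

# Assumption: the base name is everything before the first underscore.
df["main_name"] = df["full_name"].str.split("_").str[0]

# Steps 1 and 2: within each main_name group, move x down by one row.
# The first row of every group has no predecessor and becomes NaN.
df["x_shifted"] = df.groupby("main_name")["x"].shift(1)

# Step 3: difference between each row and the previous row in its group.
# Assumption: direction is x - x_shifted.
df["x_diff"] = df["x"] - df["x_shifted"]
print(df)
```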