Groupby one column and find max (absolute) value of difference of other two columns in pandas
I have a dataframe with about 20 columns, but I need to group by one column called ID and, within each group, find the maximum absolute difference between two other columns, let's call them value1 and value2. Example df:
ID value1 value2
1 3 2
1 6 2
2 6 1
3 5 8
4 7 2
4 3 2
Desired output:
ID value1 value2 maxabs
1 3 2 4
1 6 2 4
2 6 1 5
3 5 8 3
4 7 2 5
4 3 2 5
I tried simply this:
df['maxabs'] = df.groupby(['ID'])[(df['value1'] - df['value2'])].abs().idxmax()
The error said that the columns were not found and printed a lot of 'nan'. The columns are definitely there. Maybe I need to handle the case where both values are 'nan' so that the result is 'nan' too, but I'm not sure I'm even heading in the right direction.
Switch the order of your calculation: first calculate the difference between value1 and value2, then group by ID and calculate the max with transform:
df['maxabs'] = df.value1.sub(df.value2).abs().groupby(df.ID).transform('max')
df
# ID value1 value2 maxabs
#0 1 3 2 4
#1 1 6 2 4
#2 2 6 1 5
#3 3 5 8 3
#4 4 7 2 5
#5 4 3 2 5
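On the 'nan' concern from the question, here is a minimal self-contained sketch. It assumes the example frame above plus one hypothetical extra ID (5) whose two values are both missing; that row is not in the original data. It also hints at why the original attempt failed: the brackets after groupby select columns by label, so passing a computed Series makes pandas look for columns named after its values.

```python
import numpy as np
import pandas as pd

# Example data from the question, plus a hypothetical ID 5 with missing values
df = pd.DataFrame({
    "ID":     [1, 1, 2, 3, 4, 4, 5],
    "value1": [3, 6, 6, 5, 7, 3, np.nan],
    "value2": [2, 2, 1, 8, 2, 2, np.nan],
})

# Row-wise absolute difference first; NaN - NaN stays NaN
absdiff = (df["value1"] - df["value2"]).abs()

# Then broadcast each group's maximum back onto its rows.
# max skips missing values, so a group yields NaN only when
# every difference in it is NaN.
df["maxabs"] = absdiff.groupby(df["ID"]).transform("max")
print(df)
```

With missing values present the new column is float, and the all-missing group keeps NaN without any special handling.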
Try this. P.S. You can also use merge or join; I'm just used to map.
df['maxabs']=df.ID.map(df.groupby(['ID']).apply(lambda x: max(abs(x.value1-x.value2))))
ID value1 value2 maxabs
0 1 3 2 4
1 1 6 2 4
2 2 6 1 5
3 3 5 8 3
4 4 7 2 5
5 4 3 2 5
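The merge and join alternatives mentioned above might look like the following sketch. The names absdiff, per_id, merged and joined are chosen for illustration and do not come from the answer.

```python
import pandas as pd

# Example data from the question
df = pd.DataFrame({
    "ID":     [1, 1, 2, 3, 4, 4],
    "value1": [3, 6, 6, 5, 7, 3],
    "value2": [2, 2, 1, 8, 2, 2],
})

# One maximum absolute difference per ID, as a Series indexed by ID
absdiff = (df["value1"] - df["value2"]).abs()
per_id = absdiff.groupby(df["ID"]).max().rename("maxabs")

# Option 1: merge on the ID column (reset_index turns the index into a column)
merged = df.merge(per_id.reset_index(), on="ID", how="left")

# Option 2: join the named Series against df's ID column
joined = df.join(per_id, on="ID")

print(merged)
print(joined)
```

Both variants leave df itself unchanged and return a new frame with the maxabs column attached, whereas the map version assigns the column in place.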