Select Pandas dataframe row where two or more columns have their maximum value together
Suppose you have a pandas.DataFrame like so:
| Institution | Feat1 | Feat2 | Feat3 | ... |
|---|---|---|---|---|
| ID1 | 14.5 | 0 | 0.32 | ... |
| ID2 | 322.12 | 1 | 0.94 | ... |
| ID3 | 27.08 | 0 | 1.47 | ... |
My question is simple: how would one select rows from this DataFrame based on the maximum combined value of two or more columns? For example, the row where Feat1 and Feat3 have their maximum value together, returning:
| Institution | Feat1 | Feat2 | Feat3 | ... |
|---|---|---|---|---|
| ID2 | 322.12 | 1 | 0.94 | ... |
I am certain a good old for loop can take care of the problem given a little time, but I believe there must be a pandas function for this. I hope someone can point me in the right direction.
You can play around with:
df.sum(axis=1, numeric_only=True)
df['row_sum'] = df.sum(axis=1, numeric_only=True)
or
df['sum'] = df['Feat1'] + df['Feat3']
And then:
df.sort_values('sum', ascending=False)  # df.sort() no longer exists in current pandas
df.sort_index()
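Putting those pieces together, here is a minimal runnable sketch. It assumes the three sample rows from the question (the elided "..." columns are left out) and a helper column named `sum`. The `nlargest` call is an alternative to sorting that I swapped in; it is not part of the suggestion above.

```python
import pandas as pd

# Sample data copied from the question; the elided "..." columns are omitted.
df = pd.DataFrame({
    "Institution": ["ID1", "ID2", "ID3"],
    "Feat1": [14.5, 322.12, 27.08],
    "Feat2": [0, 1, 0],
    "Feat3": [0.32, 0.94, 1.47],
})

# Combined value of the two columns of interest.
df["sum"] = df["Feat1"] + df["Feat3"]

# Sort so the row with the largest combined value comes first.
ranked = df.sort_values("sum", ascending=False)
top_row = ranked.head(1)

# Alternative (an assumption, not from the answer): ask for the largest row directly.
top_row_alt = df.nlargest(1, "sum")

print(top_row)
```

Sorting is handy if you also want the runner-up rows; if you only need the top row, `nlargest` avoids sorting the whole frame by hand.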
You can do it with slicing:
output = df.loc[[(df['Feat1'] + df['Feat3']).idxmax()]]
This outputs:
Institution Feat1 Feat2 Feat3
1 ID2 322.12 1 0.94
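The same idea extends to any number of columns. The sketch below assumes the question's sample data; `row_with_max_combined` is a hypothetical helper name I made up for illustration, not a pandas function.

```python
import pandas as pd

# Sample data copied from the question; the elided "..." columns are omitted.
df = pd.DataFrame({
    "Institution": ["ID1", "ID2", "ID3"],
    "Feat1": [14.5, 322.12, 27.08],
    "Feat2": [0, 1, 0],
    "Feat3": [0.32, 0.94, 1.47],
})


def row_with_max_combined(frame, columns):
    """Hypothetical helper: return the first row whose summed `columns` is largest."""
    combined = frame[columns].sum(axis=1)   # row-wise sum of the chosen columns
    # idxmax gives the index label of the first maximum; the list keeps a DataFrame.
    return frame.loc[[combined.idxmax()]]


output = row_with_max_combined(df, ["Feat1", "Feat3"])
print(output)
```

Note that a plain sum is dominated by whichever column has the larger scale (here Feat1), so whether "maximum together" should mean a raw sum is a modelling choice left to you.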
Alternatively, you can create a helper column and filter on it, at the cost of one extra step. Unlike idxmax, which returns only the first match, this keeps every row tied for the maximum.
df['filter'] = df['Feat1'] + df['Feat3']
output = df[df['filter'] == df['filter'].max()]
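To see how the comparison against `max()` differs from `idxmax()` when several rows share the maximum, here is a small sketch. The `ID4` row is made up purely to force a tie; it is not in the question's data.

```python
import pandas as pd

df = pd.DataFrame({
    "Institution": ["ID1", "ID2", "ID3", "ID4"],  # ID4 is a hypothetical duplicate of ID2
    "Feat1": [14.5, 322.12, 27.08, 322.12],
    "Feat3": [0.32, 0.94, 1.47, 0.94],
})

combined = df["Feat1"] + df["Feat3"]

# Boolean mask: keeps every row whose combined value equals the maximum.
all_max = df[combined == combined.max()]

# idxmax: keeps only the first row that reaches the maximum.
first_max = df.loc[[combined.idxmax()]]

print(all_max)
print(first_max)
```

Using a temporary Series instead of a new column also leaves the original DataFrame untouched.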