How to select the pandas row with the maximum value in one column, from a group of rows that share two common columns?
The following Pandas DataFrame df has 5 columns, shown in color, while the index numbers are on the far left in black.
Notice that the last two columns (let's call them col4 and col5) hold static numbers that denote a segment, group or chunk of the data. Other groups (whose static numbers in these two columns differ) have been hidden from the screenshot.
How can I single out the row, or the index of the row, that has the largest value in the third column (called col3), circled in black: 1.90977, given that the last 2 columns are static? In other words, single out the best row in the group.
I am looking for something like this, which doesn't work:
df.loc[(df['col3'] == 0.999141) & (df['col4'] == 0.000861559)]
If the last 2 columns do not hold exactly the same values, use numpy.isclose to select rows within some tolerance. For performance, it is also better to select with DataFrame.loc using a mask and the column name:
df.loc[np.isclose(df['col4'], 0.999141) & np.isclose(df['col5'], 0.000861559), 'col3'].max()
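The difference between an exact comparison and numpy.isclose can be sketched as follows. The data here is hypothetical (the original screenshot is not available): col4 and col5 are assumed to carry a few extra digits beyond what is displayed, which is a common reason an == filter returns nothing.

```python
import numpy as np
import pandas as pd

# Hypothetical data: col4/col5 would print as 0.999141 / 0.000861559,
# but the stored floats carry extra digits.
df = pd.DataFrame({
    "col3": [1.2, 1.90977, 0.7, 3.5],
    "col4": [0.9991410001, 0.9991410001, 0.9991410001, 0.5],
    "col5": [0.0008615590002, 0.0008615590002, 0.0008615590002, 0.25],
})

# Exact equality misses every row because of the hidden digits.
exact = (df["col4"] == 0.999141) & (df["col5"] == 0.000861559)

# np.isclose compares within a tolerance (defaults: rtol=1e-05, atol=1e-08).
close = np.isclose(df["col4"], 0.999141) & np.isclose(df["col5"], 0.000861559)

# Largest col3 value inside the selected group.
group_max = df.loc[close, "col3"].max()
print(exact.sum(), close.sum(), group_max)
```

If the default tolerance is too loose or too tight for the data, the rtol and atol arguments of numpy.isclose can be adjusted.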
For the index of the maximum value, use Series.idxmax:
df.loc[np.isclose(df['col4'], 0.999141) & np.isclose(df['col5'], 0.000861559), 'col3'].idxmax()
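Since Series.idxmax returns an index label, that label can be passed back to DataFrame.loc to get the whole row. A minimal sketch, assuming made-up data and a non-default index (100 to 103) chosen only to show that a label, not a position, comes back:

```python
import numpy as np
import pandas as pd

# Hypothetical 5-column frame; values and index labels are illustrative only.
df = pd.DataFrame(
    {
        "col1": [10, 11, 12, 13],
        "col2": [5.0, 6.0, 7.0, 8.0],
        "col3": [1.2, 1.90977, 0.7, 3.5],
        "col4": [0.999141, 0.999141, 0.999141, 0.5],
        "col5": [0.000861559, 0.000861559, 0.000861559, 0.25],
    },
    index=[100, 101, 102, 103],
)

# Boolean mask for the group of interest.
mask = np.isclose(df["col4"], 0.999141) & np.isclose(df["col5"], 0.000861559)

# Index label of the row with the largest col3 inside that group.
best_label = df.loc[mask, "col3"].idxmax()

# Full row for that label.
best_row = df.loc[best_label]
print(best_label)
print(best_row)
```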
To select by the maximum of col4 and the minimum of col5, use:
df.loc[df['col4'].eq(df['col4'].max()) & df['col5'].eq(df['col5'].min()), 'col3'].max()
df.loc[df['col4'].eq(df['col4'].max()) & df['col5'].eq(df['col5'].min()), 'col3'].idxmax()
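If the best row is wanted for every group at once, and not just one, a groupby can be used instead of a mask. This is a different technique from the answer above, sketched on hypothetical data, and it assumes the col4 and col5 values within a group are exactly equal, because groupby matches keys exactly and not within a tolerance.

```python
import pandas as pd

# Hypothetical data with two groups defined by (col4, col5).
df = pd.DataFrame({
    "col3": [1.2, 1.90977, 0.7, 3.5, 2.0],
    "col4": [0.999141, 0.999141, 0.999141, 0.5, 0.5],
    "col5": [0.000861559, 0.000861559, 0.000861559, 0.25, 0.25],
})

# For each (col4, col5) group, the index label of the row with the largest col3.
best_labels = df.groupby(["col4", "col5"])["col3"].idxmax()

# The corresponding full rows, one per group.
best_rows = df.loc[best_labels]
print(best_rows)
```

When the group values differ slightly, the mask built with numpy.isclose remains the safer choice.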