
How to select pandas row with maximum value in one column, from a group of rows that share two common columns?

The following Pandas DataFrame df has 5 columns, colored, while the index numbers are on the very left in black.

(Screenshot of df; image not included.)

Notice that the last two columns (let's call them col4 and col5) hold constant values, denoting a segment, group or chunk of the data. Other groups (which have different constant values in these two columns) have been hidden from the screenshot.

How can I single out the row, or the index of the row, that has the largest value in the third column (called col3), circled in black: 1.90977, given that the last 2 columns are constant? In other words, how do I single out the best row in the group?

I am looking for something like this, which doesn't work:

df.loc[(df['col3'] == 0.999141) & (df['col4'] == 0.000861559)]

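If the values in the last 2 columns do not match the given numbers exactly, use numpy.isclose (written np.isclose below, with numpy imported as np) to select rows within some precision. For performance, it is also better to select with DataFrame.loc using a boolean mask and a column name: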

df.loc[np.isclose(df['col4'], 0.999141) & np.isclose(df['col5'], 0.000861559), 'col3'].max()

For the index of the maximum value, use Series.idxmax:

df.loc[np.isclose(df['col4'], 0.999141) & np.isclose(df['col5'], 0.000861559), 'col3'].idxmax()
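A minimal, self-contained sketch of the np.isclose approach follows. The data is hypothetical (the real df is only visible in the screenshot), and it assumes the stored floats carry more digits than the values displayed, which would explain why an exact == comparison finds nothing.

```python
import numpy as np
import pandas as pd

# Hypothetical data: two groups, each identified by its (col4, col5) pair.
# The stored values have more digits than the rounded values shown on screen.
df = pd.DataFrame({
    "col3": [0.5, 1.90977, 1.2, 3.0, 2.5],
    "col4": [0.99914123] * 3 + [0.5] * 2,
    "col5": [0.00086155912] * 3 + [0.25] * 2,
})

# Exact comparison against the rounded, displayed values matches no row.
exact = (df["col4"] == 0.999141) & (df["col5"] == 0.000861559)

# np.isclose compares within a tolerance (defaults: rtol=1e-05, atol=1e-08).
mask = np.isclose(df["col4"], 0.999141) & np.isclose(df["col5"], 0.000861559)

best_value = df.loc[mask, "col3"].max()     # largest col3 inside the group
best_index = df.loc[mask, "col3"].idxmax()  # index label of that row
best_row = df.loc[best_index]               # the full row, as a Series
```

If the default tolerances are too loose or too tight for the data, the rtol and atol arguments of np.isclose can be set explicitly.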

To select the rows with the maximum col4 and the minimum col5, use:

df.loc[df['col4'].eq(df['col4'].max()) & df['col5'].eq(df['col5'].min()), 'col3'].max()

df.loc[df['col4'].eq(df['col4'].max()) & df['col5'].eq(df['col5'].min()), 'col3'].idxmax()
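The max/min selection can be tried on a small hypothetical frame, as sketched below. The sketch also shows a groupby-based alternative that is not part of the answer above: it returns the best row of every (col4, col5) group at once, assuming the group values are exactly equal within each group.

```python
import pandas as pd

# Hypothetical data: the first three rows form the group with the
# largest col4 and the smallest col5.
df = pd.DataFrame({
    "col3": [0.5, 1.90977, 1.2, 3.0, 2.5],
    "col4": [0.9, 0.9, 0.9, 0.5, 0.5],
    "col5": [0.1, 0.1, 0.1, 0.25, 0.25],
})

# Rows where col4 equals its maximum and col5 equals its minimum.
mask = df["col4"].eq(df["col4"].max()) & df["col5"].eq(df["col5"].min())
group_max = df.loc[mask, "col3"].max()     # largest col3 in that group
group_idx = df.loc[mask, "col3"].idxmax()  # index label of that row

# Alternative (an assumption, not from the answer): index of the best row
# in every (col4, col5) group, then the rows themselves.
best_idx_per_group = df.groupby(["col4", "col5"])["col3"].idxmax()
best_rows = df.loc[best_idx_per_group]
```

Note that the mask is empty if no single group has both the largest col4 and the smallest col5, in which case idxmax raises an error; the groupby variant does not depend on that condition.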


 