Find the row which has the maximum difference between two columns

I have a DataFrame with columns Gold and Gold.1. I want to find the row where the difference between these two columns (Gold minus Gold.1) is largest.

For the following DataFrame, this should return row 6.

df
Out: 
   Gold  Gold.1
0     2       1
1     1       4
2     6       9
3     4       4
4     4       8
5     5       5
6     5       2 ---> The difference is maximum (3)
7     5       9
8     5       3
9     5       6

I tried using the following:

df.where(max(df['Gold']-df['Gold.1']))

However that raised a ValueError:

df.where(max(df['Gold']-df['Gold.1']))
Traceback (most recent call last):

  File "", line 1, in 
    df.where(max(df['Gold']-df['Gold.1']))

  File "../python3.5/site-packages/pandas/core/generic.py", line 5195, in where
    raise_on_error)

  File "../python3.5/site-packages/pandas/core/generic.py", line 4936, in _where
    raise ValueError('Array conditional must be same shape as '

ValueError: Array conditional must be same shape as self

How can I find the row that satisfies this condition?

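.where expects a boolean condition with the same shape as the DataFrame, but max(...) returns a single number, which is why the ValueError is raised. Instead of .where, you can use .idxmax: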

(df['Gold'] - df['Gold.1']).idxmax()
Out: 6

This returns the index label of the row where the difference is largest. Pass that label to df.loc to get the row itself.
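
For reference, here is a minimal self-contained sketch of that approach. It rebuilds the DataFrame from the question; the variable names diff, best_label and best_row are only illustrative and do not come from the original answer.

```python
import pandas as pd

# Rebuild the example DataFrame from the question.
df = pd.DataFrame({
    "Gold":   [2, 1, 6, 4, 4, 5, 5, 5, 5, 5],
    "Gold.1": [1, 4, 9, 4, 8, 5, 2, 9, 3, 6],
})

# Signed difference between the two columns, one value per row.
diff = df["Gold"] - df["Gold.1"]

# idxmax() gives the index label of the first row holding the maximum.
best_label = diff.idxmax()

# Use the label with .loc to retrieve the full row.
best_row = df.loc[best_label]

print(best_label)
print(best_row)
```

Here best_label is 6, and best_row holds Gold = 5 and Gold.1 = 2.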

If you want to find the row with the maximum absolute difference, then you can call .abs() first.

(df['Gold'] - df['Gold.1']).abs().idxmax()
Out: 4
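
Note that idxmax() only reports the first row that reaches the maximum. In this data, rows 4 and 7 both have an absolute difference of 4. If you need every tied row, one possible sketch is to compare against the maximum with a boolean mask (the names abs_diff and tied_rows are illustrative):

```python
import pandas as pd

# Rebuild the example DataFrame from the question.
df = pd.DataFrame({
    "Gold":   [2, 1, 6, 4, 4, 5, 5, 5, 5, 5],
    "Gold.1": [1, 4, 9, 4, 8, 5, 2, 9, 3, 6],
})

# Absolute difference between the two columns.
abs_diff = (df["Gold"] - df["Gold.1"]).abs()

# idxmax() returns only the first label that reaches the maximum.
first_label = abs_diff.idxmax()

# A boolean mask keeps every row whose difference equals the maximum.
tied_rows = df[abs_diff == abs_diff.max()]

print(first_label)
print(tied_rows)
```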

Though my method is longer than the one above, people who are comfortable working with lists may find it useful. It finds the position of the maximum absolute difference:

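x = list((df['Gold'] - df['Gold.1']).abs())
x.index(max(x))

Another option is argmax(), either from pandas:

pd.Series(df['Gold']-df['Gold.1']).argmax()

or from the NumPy library:

numpy.argmax(df['Gold']-df['Gold.1'])

In current pandas versions, both return the integer position of the maximum rather than its index label. The two coincide here because the DataFrame uses the default 0, 1, 2, ... index.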
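
The difference between a label and a position only shows up when the index is not the default range. The sketch below uses a small made-up DataFrame with string labels (not the data from the question) and calls numpy.argmax on the underlying array, so the result is a position regardless of pandas version:

```python
import numpy as np
import pandas as pd

# Hypothetical data with a non-default index, for illustration only.
df = pd.DataFrame(
    {"Gold": [2, 1, 5], "Gold.1": [1, 4, 2]},
    index=["a", "b", "c"],
)

diff = df["Gold"] - df["Gold.1"]

# idxmax() returns the index label of the maximum.
label = diff.idxmax()

# np.argmax on the raw array returns the integer position.
pos = int(np.argmax(diff.to_numpy()))

# Labels go with .loc, positions go with .iloc.
row_by_label = df.loc[label]
row_by_pos = df.iloc[pos]

print(label, pos)
```

Here label is 'c' while pos is 2, and both lookups return the same row.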
