The problem is simple and so must be solution but I am not able to find it.
I want to find which row and column in Pandas DataFrame has minimum value and how much is it.
I have tried following code (in addition to various combinations):
df = pd.DataFrame(data=[[4,5,6],[2,1,3],[7,0,5],[2,5,3]],
index = ['R1','R2','R3','R4'],
columns=['C1','C2','C3'])
print(df)
print(df.loc[df.idxmin(axis=0), df.idxmin(axis=1)])
The dataframe (df) being searched is:
C1 C2 C3
R1 4 5 6
R2 2 1 3
R3 7 0 5
R4 2 5 3
Output for the loc
command:
C1 C2 C2 C1
R2 2 1 1 2
R3 7 0 0 7
R2 2 1 1 2
What I need is:
C2
R3 0
How can I get this simple result?
Use:
a, b = df.stack().idxmin()
print(df.loc[[a], [b]])
C2
R3 0
Another @John Zwinck solution working with missing values - use numpy.nanargmin
:
df = pd.DataFrame(data=[[4,5,6],[2,np.nan,3],[7,0,5],[2,5,3]],
index = ['R1','R2','R3','R4'],
columns=['C1','C2','C3'])
print(df)
C1 C2 C3
R1 4 5.0 6
R2 2 NaN 3
R3 7 0.0 5
R4 2 5.0 3
#https://stackoverflow.com/a/3230123
ri, ci = np.unravel_index(np.nanargmin(df.values), df.shape)
print(df.iloc[[ri], [ci]])
C2
R3 0.0
I'd get the index this way:
np.unravel_index(np.argmin(df.values), df.shape)
This is much faster than df.stack().idxmin()
.
It gives you a tuple such as (2, 1)
in your example. Pass that to df.iloc[] to get the value.
Or min
+ min
+ dropna
+ T
+ dropna
+ T
:
>>> df[df==df.min(axis=1).min()].dropna(how='all').T.dropna().T
C2
R3 0.0
>>>
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.