
Python Pandas extract value from dataframe based on minimum index

I've got a df:

import pandas as pd
import numpy as np

df = pd.DataFrame({"price":[1.1,66.3,11,15.2,1.1], 
                   "qty":[14,2,1,10,1],
                   "c_h":['cheese','ham','ham','cheese','ham'],
                   "qual":['good','good','bad','good','bad']})

The df looks like this when printed:

      c_h   price  qty  qual
0  cheese     1.1   14  good
1     ham    66.3    2  good
2     ham    11.0    1   bad
3  cheese    15.2   10  good
4     ham     1.1    1   bad

I'm trying to return the price where c_h == 'ham' and qual == 'bad', taken at the minimum index value in the df. The minimum index is the lowest numerical index value (currently [0, 1, 2, ...]) at which those criteria are met.

In this example, the minimum index sought would be 2 and the returned price would be 11.0.

Note: I'm working mostly with pandas, but I could also use numpy.

I thought it'd be something like

df[df['c_h']=='ham' and 'qual'=='bad'].min()[index]

but that's not working.

You want something like this:

>>> df[(df.c_h == 'ham') & (df.qual == 'bad')].index.min()
2

But if you don't just want the index, you can use the indexers:

>>> df.loc[(df.c_h == 'ham') & (df.qual == 'bad'), 'price'].iloc[0]
11.0
>>> df.loc[(df.c_h == 'ham') & (df.qual == 'bad'), 'price'].iloc[[0]]
2    11.0
Name: price, dtype: float64
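
As a side note on why the attempt in the question fails: Python's "and" asks a whole Series for a single truth value, which pandas refuses with a ValueError, and 'qual'=='bad' compares two string literals rather than the column. A minimal sketch that rebuilds the question's df (the names mask, raised and price are illustrative only):

```python
import pandas as pd

df = pd.DataFrame({"price": [1.1, 66.3, 11, 15.2, 1.1],
                   "qty": [14, 2, 1, 10, 1],
                   "c_h": ["cheese", "ham", "ham", "cheese", "ham"],
                   "qual": ["good", "good", "bad", "good", "bad"]})

# The original attempt: `and` forces bool(Series), which pandas rejects.
raised = False
try:
    df[df["c_h"] == "ham" and "qual" == "bad"]
except ValueError:
    raised = True

# Element-wise version: compare the columns themselves, combine with &,
# and parenthesise each comparison because & binds tighter than ==.
mask = (df["c_h"] == "ham") & (df["qual"] == "bad")
price = df.loc[mask, "price"].iloc[0]
```

Here raised ends up True, and price is the value from the first matching row.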

Note that the above takes the first matching row, not the row with the lowest index label. If your index is a normal integer range index, the two are equivalent.

However, if it isn't:

>>> df.index = [3,1,2,4,0]
>>> df
      c_h  price  qty  qual
3  cheese    1.1   14  good
1     ham   66.3    2  good
2     ham   11.0    1   bad
4  cheese   15.2   10  good
0     ham    1.1    1   bad
>>> df.loc[(df.c_h == 'ham') & (df.qual == 'bad'), 'price']
2    11.0
0     1.1
Name: price, dtype: float64

Then you would get the first match the same way:

>>> df.loc[(df.c_h == 'ham') & (df.qual == 'bad'), 'price'].iloc[[0]]
2    11.0
Name: price, dtype: float64

But the lowest would require something to the effect of:

>>> df.loc[(df.c_h == 'ham') & (df.qual == 'bad'), 'price'].sort_index().iloc[[0]]
0    1.1
Name: price, dtype: float64
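
If sorting the whole selection feels heavy, one possible alternative is to look up the smallest matching label directly. This is only a sketch under the assumption that the index labels are unique (with duplicate labels, .loc would return a Series instead of a scalar); the names matches, lowest_label and lowest_price are made up for illustration:

```python
import pandas as pd

# Same data as above, with the shuffled index from the example.
df = pd.DataFrame({"price": [1.1, 66.3, 11, 15.2, 1.1],
                   "qty": [14, 2, 1, 10, 1],
                   "c_h": ["cheese", "ham", "ham", "cheese", "ham"],
                   "qual": ["good", "good", "bad", "good", "bad"]},
                  index=[3, 1, 2, 4, 0])

mask = (df["c_h"] == "ham") & (df["qual"] == "bad")
matches = df.loc[mask, "price"]          # prices of all matching rows

lowest_label = matches.index.min()       # smallest index label among matches
lowest_price = matches.loc[lowest_label] # scalar, given unique labels
```

This picks the row labelled 0 rather than the first matching row in positional order.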

You can use the 'pandas.DataFrame.query()' method:

df.query("col1 == value1 and col2==value2").sort_index()["target_column"].iloc[0]
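
Applied to the df from the question, the template above might look like the following sketch. The column names and values come from the question; the variable name result is just for illustration:

```python
import pandas as pd

df = pd.DataFrame({"price": [1.1, 66.3, 11, 15.2, 1.1],
                   "qty": [14, 2, 1, 10, 1],
                   "c_h": ["cheese", "ham", "ham", "cheese", "ham"],
                   "qual": ["good", "good", "bad", "good", "bad"]})

# String literals inside the query expression need their own quotes.
# sort_index() makes .iloc[0] pick the lowest index label, not just the first match.
result = (
    df.query("c_h == 'ham' and qual == 'bad'")
      .sort_index()["price"]
      .iloc[0]
)
```

Inside a query string, "and" is evaluated element-wise, so it does not run into the problem the plain Python "and" has.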
