Pandas Dataframe: groupby id to find max column value and return corresponding value of another column
I have a large dataframe with different food entries. Each food has one nutrient (A, B, C, D) with a corresponding value for that nutrient in another column. I want to define a function which takes a specific nutrient as an argument and returns the name of the food with the highest value for that nutrient. If the requested nutrient does not exist, it should return 'Sorry, {requested nutrient} not found'.
import pandas as pd

df = pd.DataFrame([[0.99, 0.87, 0.58, 0.66, 0.62, 0.81, 0.63, 0.71, 0.77, 0.73, 0.69, 0.61, 0.92, 0.49],
                   list('DAABBBBABCBDDD'),
                   ['apple', 'banana', 'kiwi', 'lemon', 'grape', 'cheese', 'eggs', 'spam', 'fish', 'bread',
                    'salad', 'milk', 'soda', 'juice'],
                   ['***', '**', '****', '*', '***', '*', '**', '***', '*', '*', '****', '**', '**', '****']]).T
df.columns = ['value', 'nutrient', 'food', 'price']
I have tried the following:
def food_for_nutrient(lookup_nutrient, dataframe=df):
    max_values = dataframe.groupby(['nutrient'])['value'].max()
    result = max_values[lookup_nutrient]
    return print(result)
It seems to identify the max values of the nutrients correctly, but it returns only the nutrient value. I need the corresponding string from the food column. For instance, if I pass the following argument:
food_for_nutrient('A')
My desired output is:
banana
My second problem is that my if statement doesn't work: it always takes the else branch.
def food_for_nutrient(lookup_nutrient, dataframe=df):
    max_values = dataframe.groupby(['nutrient'])['value'].max()
    if lookup_nutrient in dataframe['nutrient']:
        result = max_values[lookup_nutrient]
        return print(result)
    else:
        return print(f'Sorry, {lookup_nutrient} not found.')

food_for_nutrient('A')
Thanks a lot for your help!
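On the second problem: a likely cause is that the in operator on a pandas Series tests the index labels (here 0 to 13), not the column values, so 'A' is never found. A minimal sketch of the difference, rebuilding the question's data column by column (the variable names in_index and in_values are only for illustration):

```python
import pandas as pd

# Same data as in the question, built from a dict for readability.
df = pd.DataFrame({
    'value': [0.99, 0.87, 0.58, 0.66, 0.62, 0.81, 0.63,
              0.71, 0.77, 0.73, 0.69, 0.61, 0.92, 0.49],
    'nutrient': list('DAABBBBABCBDDD'),
    'food': ['apple', 'banana', 'kiwi', 'lemon', 'grape', 'cheese', 'eggs',
             'spam', 'fish', 'bread', 'salad', 'milk', 'soda', 'juice'],
})

# `in` on a Series looks at the index labels (0..13), not the values.
in_index = 'A' in df['nutrient']

# Comparing the values and reducing with any() tests the column contents.
in_values = bool((df['nutrient'] == 'A').any())

print(in_index, in_values)  # False True
```

Replacing the condition with a value-based test like the second one should make the if branch reachable.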
Try this:
def food_for_nutrient(lookup_nutrient):
    try:
        return df[df['nutrient'] == lookup_nutrient].set_index('food')['value'].astype(float).idxmax()
    except ValueError:
        return f'Sorry, {lookup_nutrient} not found.'
Output:
>>> food_for_nutrient('A')
'banana'
>>> food_for_nutrient('B')
'cheese'
>>> food_for_nutrient('C')
'bread'
>>> food_for_nutrient('D')
'apple'
>>> food_for_nutrient('E')
'Sorry, E not found.'
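If you prefer to keep the groupby from the question, one possible variant is to compute the row of the maximum for every nutrient once and then look foods up in the result. This is only a sketch under the same data; best_rows, best_food and food_for_nutrient_grouped are made-up names, and the values are cast to float because the transposed frame in the question stores them as objects.

```python
import pandas as pd

# Same data as in the question, built from a dict for readability.
df = pd.DataFrame({
    'value': [0.99, 0.87, 0.58, 0.66, 0.62, 0.81, 0.63,
              0.71, 0.77, 0.73, 0.69, 0.61, 0.92, 0.49],
    'nutrient': list('DAABBBBABCBDDD'),
    'food': ['apple', 'banana', 'kiwi', 'lemon', 'grape', 'cheese', 'eggs',
             'spam', 'fish', 'bread', 'salad', 'milk', 'soda', 'juice'],
})

# For each nutrient, the row label where 'value' is largest.
best_rows = df['value'].astype(float).groupby(df['nutrient']).idxmax()

# Pull those rows and index the food names by nutrient.
best_food = df.loc[best_rows].set_index('nutrient')['food']


def food_for_nutrient_grouped(lookup_nutrient, lookup=best_food):
    """Return the food with the highest value for the given nutrient."""
    # Series.get returns the default when the label is missing.
    return lookup.get(lookup_nutrient, f'Sorry, {lookup_nutrient} not found.')


print(food_for_nutrient_grouped('A'))  # banana
print(food_for_nutrient_grouped('E'))  # Sorry, E not found.
```

The lookup table is built once, so repeated calls avoid filtering the dataframe each time; the trade-off is that it must be rebuilt if the dataframe changes.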