Find max value based on row condition in Python
I have a DataFrame that looks something like this, but with more data. For example:
|index | year | drinks | sold |
|------|------|--------|------|
|0 | 2010 | pepsi | 3456 |
|1 | 2010 | spirit | 32755|
|2 | 2010 | cola | 7854 |
|3 | 2011 | pepsi | 6787 |
|4 | 2011 | spirit | 7899 |
|5 | 2011 | cola | 4657 |
I want to get **the drinks that sold more than the yearly average in each year**. For example,
in 2010 spirit sold 32755, which is more than the 2010 average of about 14,688 sales,
and the same goes for the other years. I know I have to get the average for every year first and then compare it to the sold
column, but I don't know how to do it.
You can try:
import pandas as pd

data = {'year': [2010, 2010, 2010, 2011, 2011, 2011],
        'drinks': ['pepsi', 'spirit', 'cola', 'pepsi', 'spirit', 'cola'],
        'sold': [3456, 32755, 7854, 6787, 7899, 4657]}
df = pd.DataFrame(data)

for year_searched in df['year'].unique():
    # Rows for the current year and that year's average of 'sold'
    df_year_searched = df[df['year'] == year_searched]
    sold_avg = df_year_searched['sold'].mean()
    # Keep only the drinks that sold more than the average, highest first
    df_year_more_than_avg = df_year_searched[df_year_searched['sold'] > sold_avg].sort_values(by='sold', ascending=False)
    for index, row in df_year_more_than_avg.iterrows():
        drink = row['drinks']
        sold = row['sold']
        print(f"in {year_searched} {drink} sold {sold} and it is more the {year_searched} average {round(sold_avg,2)} sells")
Result:
in 2010 spirit sold 32755 and it is more the 2010 average 14688.33 sells
in 2011 spirit sold 7899 and it is more the 2011 average 6447.67 sells
in 2011 pepsi sold 6787 and it is more the 2011 average 6447.67 sells
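
The same filter can probably be written without an explicit loop. The sketch below is an alternative to the loop, not part of the original answer: it uses `groupby(...).transform("mean")` to put each year's average next to every row and then compares it with `sold`. The names `year_avg` and `above_avg` are made up for this illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "year": [2010, 2010, 2010, 2011, 2011, 2011],
    "drinks": ["pepsi", "spirit", "cola", "pepsi", "spirit", "cola"],
    "sold": [3456, 32755, 7854, 6787, 7899, 4657],
})

# transform("mean") returns a Series with the same index as df,
# holding the average of 'sold' for the year of each row
year_avg = df.groupby("year")["sold"].transform("mean")

# Keep the rows that sold more than their own year's average and
# attach the rounded average as an extra column for reference
above_avg = df[df["sold"] > year_avg].assign(year_avg=year_avg.round(2))

print(above_avg)
```

With the sample data this keeps spirit for 2010 and pepsi and spirit for 2011, the same drinks the loop prints, but in the original row order rather than sorted by `sold`.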