I have this DataFrame:
Name Year Publisher Global_Sales
0 Wii Sports 2006.0 Nintendo 82.74
1 Super Mario Bros. 1985.0 Nintendo 40.24
2 Mario Kart Wii 2008.0 Nintendo 35.82
3 Wii Sports Resort 2009.0 Nintendo 33.00
4 Pokemon Red/Pokemon Blue 1996.0 Nintendo 31.37
I want to group it by Year and get the maximum Global_Sales per Year:
comp_group=df_comparation.groupby('Year')['Global_Sales'].max()
I obtain:
Year
1980.0 4.31
1981.0 4.50
1982.0 7.81
1983.0 3.20
1984.0 28.31
1985.0 40.24
1986.0 6.51
1987.0 4.38
1988.0 17.28
1989.0 30.26
1990.0 20.61
Now I want to know which Publisher had the max Global_Sales in each Year and add it as a column:
Year Global_Sales Publisher
1980.0 4.31 Nintendo
1981.0 4.50 EA Sports
1982.0 7.81 ...
1983.0 3.20 ...
1984.0 28.31 ...
1985.0 40.24 ...
1986.0 6.51 ...
1987.0 4.38 ...
1988.0 17.28 ...
1989.0 30.26 ...
1990.0 20.61 ...
Thanks!
You can aggregate with .idxmax() instead of .max() to get the index label of the row with the maximal sales in each year, and then index the original DataFrame with those labels to get the result:
indexes = df.groupby("Year")["Global_Sales"].idxmax()
result = df.loc[indexes, ["Year", "Global_Sales", "Publisher"]]
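For anyone who wants to try this end to end, here is a minimal self-contained sketch. It assumes the DataFrame is called df, uses the five rows shown in the question, and adds two made-up rows (labelled "Hypothetical") so that some years contain more than one game. The reset_index(drop=True) at the end is an optional extra, not part of the answer above.

```python
import pandas as pd

# The five rows from the question, plus two made-up rows (marked "Hypothetical")
# so that 2006 and 1985 each contain more than one game.
df = pd.DataFrame({
    "Name": ["Wii Sports", "Super Mario Bros.", "Mario Kart Wii",
             "Wii Sports Resort", "Pokemon Red/Pokemon Blue",
             "Hypothetical Game A", "Hypothetical Game B"],
    "Year": [2006.0, 1985.0, 2008.0, 2009.0, 1996.0, 2006.0, 1985.0],
    "Publisher": ["Nintendo", "Nintendo", "Nintendo", "Nintendo", "Nintendo",
                  "Hypothetical Publisher X", "Hypothetical Publisher Y"],
    "Global_Sales": [82.74, 40.24, 35.82, 33.00, 31.37, 10.00, 5.00],
})

# For each year, the index label of the row with the highest Global_Sales.
indexes = df.groupby("Year")["Global_Sales"].idxmax()

# Select those rows, keep only the wanted columns, and renumber the index.
result = (
    df.loc[indexes, ["Year", "Global_Sales", "Publisher"]]
    .reset_index(drop=True)
)
print(result)
```

Two things to keep in mind: if several games tie for the maximum in a year, idxmax returns only the first one, and rows whose Year is NaN are dropped by groupby by default.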
Group the DataFrame by Year, then apply a function that returns the Global_Sales and Publisher of the row with the maximum Global_Sales in each group:
(df
.groupby('Year')
.apply(lambda x: x.loc[x['Global_Sales'].idxmax(), ['Global_Sales', 'Publisher']])
)
Global_Sales Publisher
Year
1985.0 40.24 Nintendo
1996.0 31.37 Nintendo
2006.0 82.74 Nintendo
2008.0 35.82 Nintendo
2009.0 33.00 Nintendo
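The result above has Year as the index, while the layout asked for in the question has Year as a regular column. A possible variant, sketched below, selects the two columns before calling apply and then calls reset_index(). It assumes the five rows from the question plus one made-up row (labelled "Hypothetical") to give 2006 a second game. Selecting the columns first is my own adjustment rather than part of the answer; it keeps the grouping column out of the function passed to apply.

```python
import pandas as pd

# Rows from the question, plus one made-up row so 2006 has two candidates.
df = pd.DataFrame({
    "Name": ["Wii Sports", "Super Mario Bros.", "Mario Kart Wii",
             "Wii Sports Resort", "Pokemon Red/Pokemon Blue",
             "Hypothetical Game A"],
    "Year": [2006.0, 1985.0, 2008.0, 2009.0, 1996.0, 2006.0],
    "Publisher": ["Nintendo", "Nintendo", "Nintendo", "Nintendo", "Nintendo",
                  "Hypothetical Publisher X"],
    "Global_Sales": [82.74, 40.24, 35.82, 33.00, 31.37, 10.00],
})

result = (
    df
    .groupby("Year")[["Global_Sales", "Publisher"]]
    # Within each year, return the row with the highest Global_Sales.
    .apply(lambda x: x.loc[x["Global_Sales"].idxmax()])
    # Turn the Year index back into an ordinary column.
    .reset_index()
)
print(result)
```

On large DataFrames the idxmax-then-loc approach from the other answer is likely to be faster, since apply calls the Python function once per group.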