
Trying to groupby two columns in pandas and return a max value based on criteria

I need some help with a dataset on which I am trying to perform a .groupby() to find the .max() consumption value based on Entity and Year.


The consumption column on which I am taking the max sometimes has the same value for different Years. When this occurs, I would like to return the max Year for that occurrence.

df.groupby(['Entity','Year']).consumption.max().reset_index()

returns one row for every (Entity, Year) pair instead of a single row per Entity (the output was posted as an image).
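Since the original output is only available as an image, here is a minimal sketch of the same behaviour on a hypothetical toy DataFrame (the values are made up, not taken from the real dataset). It assumes each (Entity, Year) pair occurs once, in which case grouping by both columns simply hands every row back:

```python
import pandas as pd

# Hypothetical toy data; values are invented for illustration only.
df = pd.DataFrame({
    "Entity": ["A", "A", "A", "B", "B"],
    "Year": [2000, 2001, 2002, 2000, 2001],
    "consumption": [5.0, 7.0, 7.0, 3.0, 2.0],
})

# Each (Entity, Year) pair forms its own group of one row,
# so max() has nothing to reduce.
per_pair = df.groupby(["Entity", "Year"]).consumption.max().reset_index()

print(per_pair)
print(len(per_pair) == len(df))
```

The row count is unchanged, which is why this call cannot pick out one maximum per Entity.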

In the end I would like a DataFrame with ['Entity','Year','consumption'] as the columns, holding the maximum consumption for each Entity. When that maximum occurs in more than one Year for an Entity, the row should carry the highest of those Years.
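One possible way to get that shape, sketched on hypothetical toy data, is to sort so that the wanted row comes last within each Entity and then keep only that last row. This is an alternative to a groupby-based approach, not something taken from the original post, and it assumes consumption has no missing values (sort_values places NaN last by default, so a NaN row would be selected).

```python
import pandas as pd

# Hypothetical toy data; values are invented for illustration only.
df = pd.DataFrame({
    "Entity": ["A", "A", "A", "B", "B"],
    "Year": [2000, 2001, 2002, 2000, 2001],
    "consumption": [5.0, 7.0, 7.0, 3.0, 2.0],
})

result = (
    # Within each Entity, order by consumption, then by Year to break ties.
    df.sort_values(["Entity", "consumption", "Year"])
      # The last row per Entity is the max consumption with the latest Year.
      .drop_duplicates("Entity", keep="last")
      .reset_index(drop=True)
)

print(result[["Entity", "Year", "consumption"]])
```

For Entity A the maximum of 7.0 occurs in both 2001 and 2002, and the sketch keeps 2002.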

I worked out a solution:

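# Maximum consumption per Entity
df4 = df.groupby('Entity').consumption.max().reset_index()
# Recover every row (and so every Year) that attains that maximum
df5 = df4.merge(df, on=['Entity', 'consumption'])
# Where the maximum occurs in several Years, keep the latest Year
df5 = df5.groupby(['Entity', 'consumption']).max().reset_index()
df5.head(10)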

This gives one row per country with its maximum consumption and, where that maximum occurs in more than one year, the latest of those years.
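To check the merge-based steps on a small case, here is the same sequence run on hypothetical toy data (the values are made up, and the real dataset may have extra columns, which the final max() would also reduce column by column):

```python
import pandas as pd

# Hypothetical toy data; values are invented for illustration only.
df = pd.DataFrame({
    "Entity": ["A", "A", "A", "B", "B"],
    "Year": [2000, 2001, 2002, 2000, 2001],
    "consumption": [5.0, 7.0, 7.0, 3.0, 2.0],
})

# Step 1: maximum consumption per Entity.
df4 = df.groupby("Entity").consumption.max().reset_index()

# Step 2: inner merge keeps only the rows that attain the maximum
# (two rows for A, one for B).
df5 = df4.merge(df, on=["Entity", "consumption"])

# Step 3: collapse ties to the latest Year.
df5 = df5.groupby(["Entity", "consumption"]).max().reset_index()

print(df5[["Entity", "Year", "consumption"]])
```

Entity A ends up with Year 2002 and Entity B with Year 2000, matching the requirement.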


 