pandas groupby latest observation for each group

Question

I have a panel dataframe (keyed by ID and time) and want to keep only the most recent row for each ID. Here is the table:

import pandas as pd
df = pd.DataFrame({'ID': [1,1,2,3], 'Year': [2018,2019,2019,2020], 'Var1':list("abcd"), 'Var2': list("efgh")})

and the end result should contain only the latest row for each ID (2019 for IDs 1 and 2, 2020 for ID 3).

Answer 1

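Use tail, which keeps the last row of each group in the frame's current order (so the rows must already be sorted by Year, as they are here):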

df.groupby("ID").tail(1)

The output is:

   ID  Year Var1 Var2
1   1  2019    b    f
2   2  2019    c    g
3   3  2020    d    h
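If the rows are not already in chronological order, tail(1) alone can return an older row. A minimal sketch of sorting first, assuming a shuffled copy of the question's data (the name `shuffled` is made up for illustration):

```python
import pandas as pd

# Same data as in the question, but with the rows deliberately out of order
# (hypothetical ordering, for illustration only).
shuffled = pd.DataFrame({
    'ID': [1, 3, 1, 2],
    'Year': [2019, 2020, 2018, 2019],
    'Var1': list("bdac"),
    'Var2': list("fheg"),
})

# Without sorting, the last row for ID 1 is the 2018 one.
naive = shuffled.groupby('ID').tail(1)

# Sort by Year first so that the last row within each ID is the latest one.
latest = shuffled.sort_values('Year').groupby('ID').tail(1)
```

Here `naive` picks 2018 for ID 1, while `latest` picks 2019.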

Another alternative is to use last, which moves ID into the index and returns the last non-null value of each column:

df.groupby("ID").last()
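The two are not always interchangeable. Because last works column by column and skips missing values, a result row can combine values from different years. A small sketch, assuming one missing value is added to the question's data (the NaN is hypothetical, for illustration only):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'ID': [1, 1, 2, 3],
    'Year': [2018, 2019, 2019, 2020],
    'Var1': ['a', np.nan, 'c', 'd'],  # hypothetical missing value in the 2019 row of ID 1
    'Var2': list("efgh"),
})

by_tail = df.groupby('ID').tail(1)  # keeps the original row, NaN included
by_last = df.groupby('ID').last()   # ID becomes the index; NaN is skipped per column

# For ID 1, tail(1) keeps Var1 = NaN, while last() falls back to 'a' from 2018.
```

If the whole latest row is wanted exactly as it appears in the data, tail(1) is the safer of the two.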

Answer 2

Use drop_duplicates:

df.sort_values('Year').drop_duplicates('ID', keep='last')

Output:

   ID  Year Var1 Var2
1   1  2019    b    f
2   2  2019    c    g
3   3  2020    d    h
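Since this sorts by Year before dropping duplicates, it also works on unsorted input, but the result comes back in Year order rather than in the original row order. A sketch, assuming a hypothetical shuffled copy of the question's data, with sort_index() added to restore the original order:

```python
import pandas as pd

# Question's data with the rows out of order (hypothetical ordering).
shuffled = pd.DataFrame({
    'ID': [1, 3, 1, 2],
    'Year': [2019, 2020, 2018, 2019],
    'Var1': list("bdac"),
    'Var2': list("fheg"),
})

latest = (
    shuffled.sort_values('Year')                  # oldest first, newest last
            .drop_duplicates('ID', keep='last')   # keep the newest row per ID
            .sort_index()                         # optional: back to the original row order
)
```

Unlike groupby(...).last(), this keeps ID as a regular column and leaves missing values in place.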