简体   繁体   中英

groupby, count past occurences of events, and show the most recent event

How can I group by a unique identifier and count the number of past delinquencies('Bad') and past non delinquencies ('Good') before the most recent event.

For example, given the following dataframe:

ID    Date         Class    
112   2018-02-12    Good
112   2019-01-20    Bad
113   2018-10-11    Bad
113   2019-01-01    Good
113   2020-02-03    Good

This should be the end goal:

ID    Past_deliq  Past_non_deliq  Class   Date
112      0           1             Bad    2019-01-20
113      1           1             Good   2020-02-03

I can get the most recent event by doing the following, df.loc[df.groupby('ID').Date.idxmax()] , but I cant find a way to count past occurrences.

Any help is greatly appreciated.

Just some basic reshaping and crosstab .

The idea is to filter your dataframe by values that aren't the max, do a values count aggregation and re-join your dataframe with the max dates.

max_date = df.groupby('ID')['Date'].max()
s1 = df.loc[~df.index.isin(df.groupby("ID")["Date"].idxmax())]

df1 = pd.crosstab(s1.ID, s1.Class).join(max_date).rename(
    columns={"Bad": "Past_deliq", "Good": "Past_non_deliq"}
)



     Past_deliq  Past_non_deliq       Date
ID                                        
112           0               1 2019-01-20
113           1               1 2020-02-03
b=df.groupby(["ID","Class"])["Class"].count().unstack()

You groupby both the ID and the Class which means you will get the count of each class for each ID. Than you call unstack which takes the left most labels from the index and inserts them as columns.

After you make another groupby by which you determine the last occurence(this solution has the assumption your data is ordered by date, if not use function max).

c=df.groupby("ID").agg({"Date":"last","Class":"last"})

After you merger the two dataframes.

b.merge(c, on="ID")

And you get what you requested.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM