
Pandas Groupby Grouper, how do I groupby dates that don't exist for every group?

I'm trying to plot a 2-year transaction history for every single person using pandas.

The problem is that not everyone has 2 years of data, often much less.

In a dataset of all transactions by all people, I'm doing a groupby on dates, but groupby(grouperObj).count() with a pd.Grouper produces no rows for the periods in which a person has no transaction history.

For example, person A's transaction history runs from 10/1/2017 to 10/1/2018, while person B's history spans 10/1/2016 to 8/1/2018. I'm trying to plot from 1/1/2015 to 10/1/2018 for all people.

How can I normalize for this?

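You can convert the date column to a categorical dtype whose categories cover the full date range you want to plot. Dates on which a person has no transactions then still show up, with a count of 0.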

Data input

import pandas as pd

df = pd.DataFrame({'person': ['A', 'B'], 'date': ['2018-09-23', '2017-10-02']})

df['date'] = pd.to_datetime(df['date'])

Solution

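# Categories span the full window to plot; adjust start/end to the range you need.
df['date'] = pd.Categorical(df['date'], categories=pd.date_range(start='10/1/2017', end='10/1/2018', freq='D'))
# dropna=False keeps categories with no data, so unobserved dates appear as 0.
target = pd.crosstab(df['person'], df['date'], dropna=False).stack()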
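
Since the question uses pd.Grouper, here is a minimal sketch of an alternative that keeps the Grouper and fills the gaps afterwards with reindex. It is not part of the original answer. The sample rows, the monthly frequency ('MS') and the names full_range and counts are assumptions made for illustration; only the 1/1/2015 to 10/1/2018 window comes from the question.

```python
import pandas as pd

# Hypothetical sample data: two people whose histories cover different spans.
df = pd.DataFrame({
    "person": ["A", "A", "B", "B", "B"],
    "date": pd.to_datetime(
        ["2017-10-05", "2018-09-23", "2016-10-02", "2017-03-15", "2017-03-20"]
    ),
})

# Full plotting window from the question, one bin per month start (assumed frequency).
full_range = pd.date_range(start="2015-01-01", end="2018-10-01", freq="MS")

# Count transactions per person per month, then pivot people into columns.
counts = (
    df.groupby(["person", pd.Grouper(key="date", freq="MS")])
      .size()
      .unstack("person", fill_value=0)
)

# Reindex onto the full window so months without transactions become explicit zeros.
counts = counts.reindex(full_range, fill_value=0)
```

Every person then shares the same date index, so counts.plot() should draw all people over the same window. Swapping 'MS' for 'D' or 'W' changes the bin size, as long as full_range uses the same frequency as the Grouper.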

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 