
Get the number of days between two datetimes at the month level in pandas

Here is a dataset that records when each person was assigned to a role. It has their start and end dates, plus one row per year-month that they were in the role:

  | ID | Name | strt_dt | end_dt | yearmo | 
  | 1  | Jay  | 4-22-19 | 7-30-19| 201904 | 
  | 1  | Jay  | 4-22-19 | 7-30-19| 201905 |  
  | 1  | Jay  | 4-22-19 | 7-30-19| 201906 |   
  | 1  | Jay  | 4-22-19 | 7-30-19| 201907 |  
  | 2  | Fao  | 7-14-19 |10-14-19| 201907 |    
  | 2  | Fao  | 7-14-19 |10-14-19| 201908 |   
  | 2  | Fao  | 7-14-19 |10-14-19| 201909 |   
  | 2  | Fao  | 7-14-19 |10-14-19| 201910 |    

I want to calculate, for each year-month the person was in the role, how many days of that month they were in the role. The output should look like this:

  | ID | Name | strt_dt | end_dt | yearmo | no_of days|
  | 1  | Jay  | 4-22-19 | 7-30-19| 201904 |  9 |
  | 1  | Jay  | 4-22-19 | 7-30-19| 201905 |  31|  
  | 1  | Jay  | 4-22-19 | 7-30-19| 201906 |  30|  
  | 1  | Jay  | 4-22-19 | 7-30-19| 201907 |  30| 
  | 2  | Fao  | 7-14-19 |10-14-19| 201907 |  18|  
  | 2  | Fao  | 7-14-19 |10-14-19| 201908 |  31|  
  | 2  | Fao  | 7-14-19 |10-14-19| 201909 |  30|  
  | 2  | Fao  | 7-14-19 |10-14-19| 201910 |  14|  

I tried to extract the day from the start date (subtracting it from 30 to get the number of days) and from the end date, and to put the result in a separate column. But I am stuck on how to proceed from there. Any ideas or suggestions are welcome.

df['strt_yearmo'] = df['strt_dt'].dt.year * 100 +df['strt_dt'].dt.month
df['end_yearmo'] = df['end_dt'].dt.year * 100 +df['end_dt'].dt.month


  | ID | Name | strt_dt | end_dt | yearmo | strt_yearmo|end_yearmo|
  | 1  | Jay  | 4-22-19 | 7-30-19| 201904 |  201904    |201907|
  | 1  | Jay  | 4-22-19 | 7-30-19| 201905 |  201904    |201907|
  | 1  | Jay  | 4-22-19 | 7-30-19| 201906 |  201904    |201907|  
  | 1  | Jay  | 4-22-19 | 7-30-19| 201907 |  201904    |201907 |
  | 2  | Fao  | 7-14-19 |10-14-19| 201907 |  201907    |201910 |
  | 2  | Fao  | 7-14-19 |10-14-19| 201908 |  201907    |201910 | 
  | 2  | Fao  | 7-14-19 |10-14-19| 201909 |  201907    |201910 |
  | 2  | Fao  | 7-14-19 |10-14-19| 201910 |  201907    |201910 | 

Use np.select(conditions, choices, default) after coercing the dates to datetime and deriving the month-end date from yearmo (with numpy imported as np and pandas as pd).

Extract the month-end date from yearmo (despite its name, the startmo column holds the last day of the month)

df['startmo']=pd.to_datetime(df['yearmo'].astype(str), format='%Y%m')+ pd.offsets.MonthEnd(0)
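As a small illustration of what that line does, here is a minimal sketch on a toy Series. The names month_start and month_end are made up for this example and are not part of the answer above. Parsing with '%Y%m' gives the first day of each month, and adding pd.offsets.MonthEnd(0) rolls that date forward to the last day of the same month.

```python
import pandas as pd

# Toy year-month values in the same YYYYMM form as the yearmo column
yearmo = pd.Series([201904, 201905, 201906])

# '%Y%m' parsing lands on the first day of each month
month_start = pd.to_datetime(yearmo.astype(str), format='%Y%m')

# MonthEnd(0) rolls each date forward to the end of its own month
month_end = month_start + pd.offsets.MonthEnd(0)

print(month_end.dt.strftime('%Y-%m-%d').tolist())
# ['2019-04-30', '2019-05-31', '2019-06-30']
```

Because the result is the month end, its .dt.day is also the number of days in that month.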

Coerce strt_dt and end_dt to datetime

df['strt_dt'], df['end_dt'] = pd.to_datetime(df['strt_dt']), pd.to_datetime(df['end_dt'])

Come up with conditions

conditions=[df.startmo.dt.month==df.strt_dt.dt.month, df.startmo.dt.month==df.end_dt.dt.month]

# If the month in yearmo matches the month of strt_dt, subtract strt_dt from the month end.
# If the month in yearmo matches the month of end_dt, take the day of the month from end_dt.

Come up with choices corresponding to each condition above

choices=[df.startmo.sub(df.strt_dt).dt.days+1,df.end_dt.dt.day]
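For readers unfamiliar with np.select, here is a minimal sketch of how it pairs conditions with choices, using a toy array that is not part of the dataset. For each element it takes the value from the first choice whose condition is true, and falls back to the default when no condition matches.

```python
import numpy as np

x = np.array([1, 5, 10, 15])

# Conditions are checked in order; the first match wins
conditions = [x < 5, x < 12]
choices = [x * 10, x * 100]

# Elements matching no condition get the default value
result = np.select(conditions, choices, default=-1)

print(result.tolist())
# [10, 500, 1000, -1]
```

Note that the order of the conditions matters: the value 1 satisfies both conditions but only the first choice is used.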

Calculate the days by matching each condition to its choice, and supply a default as well. The default applies where the month in yearmo matches neither the start month nor the end month. That means the month lies in the middle of the range, so simply take the day of the month-end date, which is the number of days in that month.

df['no_of days']=np.select(conditions,choices,df.startmo.dt.day)




   ID Name    strt_dt     end_dt  yearmo    startmo  no_of days
0   1  Jay 2019-04-22 2019-07-30  201904 2019-04-30           9
1   1  Jay 2019-04-22 2019-07-30  201905 2019-05-31          31
2   1  Jay 2019-04-22 2019-07-30  201906 2019-06-30          30
3   1  Jay 2019-04-22 2019-07-30  201907 2019-07-31          30
4   2  Fao 2019-07-14 2019-10-14  201907 2019-07-31          18
5   2  Fao 2019-07-14 2019-10-14  201908 2019-08-31          31
6   2  Fao 2019-07-14 2019-10-14  201909 2019-09-30          30
7   2  Fao 2019-07-14 2019-10-14  201910 2019-10-31          14
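The conditions above compare only the month number, so they would likely misbehave if a role starts and ends in the same month or spans more than one year. As a possible alternative (not the method used in the answer above), the role interval can be clipped to each month's first and last day and the inclusive day count taken from that. This is a sketch under assumptions: the date strings are taken to be month-day-year, and month_start, month_end, first_day and last_day are names invented for this example.

```python
import pandas as pd

# Rebuild the sample data from the question
df = pd.DataFrame({
    'ID': [1, 1, 1, 1, 2, 2, 2, 2],
    'Name': ['Jay'] * 4 + ['Fao'] * 4,
    'strt_dt': ['4-22-19'] * 4 + ['7-14-19'] * 4,
    'end_dt': ['7-30-19'] * 4 + ['10-14-19'] * 4,
    'yearmo': [201904, 201905, 201906, 201907,
               201907, 201908, 201909, 201910],
})

# Assumption: the strings are month-day-two-digit-year
df['strt_dt'] = pd.to_datetime(df['strt_dt'], format='%m-%d-%y')
df['end_dt'] = pd.to_datetime(df['end_dt'], format='%m-%d-%y')

# First and last calendar day of each yearmo
month_start = pd.to_datetime(df['yearmo'].astype(str), format='%Y%m')
month_end = month_start + pd.offsets.MonthEnd(0)

# Clip the role interval to the month: later of the two starts,
# earlier of the two ends
first_day = df['strt_dt'].where(df['strt_dt'] > month_start, month_start)
last_day = df['end_dt'].where(df['end_dt'] < month_end, month_end)

# Inclusive day count; months outside the role interval become 0
df['no_of days'] = ((last_day - first_day).dt.days + 1).clip(lower=0)

print(df['no_of days'].tolist())
# [9, 31, 30, 30, 18, 31, 30, 14]
```

On the sample data this gives the same counts as the table above. The clipping compares full dates rather than month numbers, which is why it should not depend on the year.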


 