简体   繁体   中英

How create field weeks in between two dates

How to calculate using pandas weeks between two dates such 2019-12-15 and 2019-12-28

Data:

cw = pd.DataFrame({ "lead_date" : ["2019-12-28" , "2019-12-23"] ,
                    "Received_date" : ["2019-12-15" , "2019-12-21"  ]  })

So I could do something like

cw["weeks_between"]= ( cw["lead_date"] - cw["Received_date"]) / 7

The problem is..

  • For row 1:
    it will return 1.85, but is wrong value because one day starts in on beginning of week Vs End of week

  • For row 2: It will return 0.28, but also wrong because one day starts end of week Vs beginning of week.

-

So how can I get the number of weeks in between this two dates?

Method 1: Using list comprehension , dt.period & getattr

provided by Jon Clements in comments

This method will work when years change between the compared dates:

cw['weeks_diff'] = (
    [getattr(el, 'n', 0) 
     for el in cw['lead_date'].dt.to_period('W') - cw['Received_date'].dt.to_period('W')]
)

Method 2: using weeknumbers with dt.strftime('%W')

We can use pd.to_datetime to convert your dates to datetime. Then we use the dt.strftime accessor to get the weeknumbers with %W .

Finally we substract both weeknumbers:

weeks = (cw[['lead_date', 'Received_date']]
         .apply(lambda x: pd.to_datetime(x).dt.strftime('%W'))
         .replace('NaT', 0)
         .astype(int)
        )

cw['weeks_diff'] = weeks['lead_date'] - weeks['Received_date']

    lead_date Received_date  weeks_diff
0  2019-12-28    2019-12-15           2
1  2019-12-23    2019-12-21           1

You need to use convert to datetime using pandas

import pandas as pd
import numpy as np

df = pd.DataFrame({ "lead_date" : ["2019-12-28" , "2019-12-23"] ,
                "Received_date" : ["2019-12-15" , "2019-12-21"  ]  })

df['lead_date']=pd.to_datetime(df['lead_date'])
df['Received_date']=pd.to_datetime(df['Received_date'])

Here is the difference in days between "lead_date" and "Received_date"

df['time_between'] =df['lead_date']-df['Received_date']

print(df.head())

 lead_date Received_date time_between
0 2019-12-28    2019-12-15      13 days
1 2019-12-23    2019-12-21       2 days

Update: edits below to get number of weeks. Also added import pandas and numpy.

To get 'time_between' column in weeks:

df['time_between']= df['time_between']/np.timedelta64(1,'W')

will yield

 lead_date Received_date  time_between
0 2019-12-28    2019-12-15      1.857143
1 2019-12-23    2019-12-21      0.285714

Update 2: If you want week number subtractions and not days between then use:

df['lead_date']=pd.to_datetime(df['lead_date']).dt.week 
df['Received_date']=pd.to_datetime(df['Received_date']).dt.week
df['time_between'] =df['lead_date']-df['Received_date']

yields,

    lead_date  Received_date  time_between
0         52             50             2
1         52             51             1

.dt.week returns week number in the year.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM