I have a column 'datedif' in my dataframe:
exposuredate min_exposure_date datedif
2014-10-08 2014-09-27 11 days
2014-10-09 2014-09-27 12 days
2014-09-27 2014-09-27 0 days
2014-09-27 2014-09-27 0 days
2014-10-22 2014-09-27 25 days
data.exposuredate = pd.to_datetime(data.exposuredate)
data.min_exposure_date = pd.to_datetime(data.min_exposure_date)
data['datedif'] = ((data.exposuredate)-(data.min_exposure_date))
The dtype of the date columns is datetime64[ns]. I want to extract the number of days from the 'datedif' column, but I cannot find anything that gives me the difference as a number of days.
I tried:
data['datedif_day'] = data['datedif'].dt.days
Error:
AttributeError: 'Series' object has no attribute 'dt'
The pandas docs cover the kind of conversion you are looking for under "Frequency conversion".
The two options are 1) division by Timedelta or 2) type conversion. There is a subtle difference between the two as stated in the docs:
"Note that division by the numpy scalar is true division, while astyping is equivalent of floor division."
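import pandas as pd

data = pd.DataFrame([("2014-10-08", "2014-09-27"),
                     ("2014-10-09", "2014-09-27"),
                     ("2014-09-27", "2014-09-27"),
                     ("2014-10-22", "2014-09-27")],
                    columns=["exposuredate", "min_exposure_date"])
# Parentheses keep the subtraction in one expression across the line break
data['datediff'] = (pd.to_datetime(data.exposuredate)
                    - pd.to_datetime(data.min_exposure_date))
# Option 1: division by a Timedelta (true division)
data['datediff'] / pd.Timedelta(1, unit='d')
# Option 2: type conversion (floor division); recent pandas versions may reject the 'D' resolution
data['datediff'].astype('timedelta64[D]')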
Both operations yield:
0 11.0
1 12.0
2 0.0
3 25.0
Name: datediff, dtype: float64
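The two outputs above agree only because every difference is a whole number of days. As a minimal sketch of where true division and floor division diverge, the example below uses made-up timedelta values (not from the question) and the `//` operator as a stand-in for the floor-division behaviour the docs describe:

```python
import pandas as pd

# Hypothetical values for illustration: one and a half days, and exactly two days
deltas = pd.Series(pd.to_timedelta(["1 days 12:00:00", "2 days 00:00:00"]))

one_day = pd.Timedelta(days=1)

true_div = deltas / one_day    # true division keeps the fractional part
floor_div = deltas // one_day  # floor division drops it

print(true_div.tolist())
print(floor_div.tolist())
```

If partial days matter for your use case, division by a Timedelta is probably the safer choice.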
If you are using the date difference as a feature for training a machine learning algorithm, it does not matter which form it is represented in, as it should be normalised anyway; timedelta64[ns] is perfectly fine for that. For visualisation purposes, see this post.
The 'datedif' column is displayed in days, but it is actually stored in nanoseconds. To get the number of days for further use, add the following lines to the code:
# requires: import numpy as np
data['datedif'] = data['datedif'].astype(np.int64)  # nanoseconds as integers
data['datedif_day'] = (data['datedif']/86400000000000)  # nanoseconds per day
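The divisor 86400000000000 is the number of nanoseconds in one day (24 * 60 * 60 * 10**9), since a timedelta64[ns] value is a nanosecond count. Here is a self-contained sketch of the same conversion, assuming the day counts from the question and pinning the resolution to nanoseconds explicitly so the integer view has a known unit:

```python
import numpy as np
import pandas as pd

# Nanoseconds in one day
NS_PER_DAY = 24 * 60 * 60 * 10**9

# Day counts taken from the question's sample output
datedif = pd.Series(pd.to_timedelta([11, 12, 0, 25], unit="D"))

# Pin nanosecond resolution, then view the values as integer nanoseconds
as_ns = datedif.astype("timedelta64[ns]").astype(np.int64)

datedif_day = as_ns / NS_PER_DAY
print(datedif_day.tolist())
```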
Came across this same question today, and I think the following solution is the easiest:
Setup:
df = pd.DataFrame([("2014-10-08", "2014-09-27"),
("2014-10-09", "2014-09-27"),
("2014-09-27", "2014-09-27"),
("2014-10-22", "2014-09-27")],
columns=["exposuredate", "min_exposure_date"])
df['datediff'] = pd.to_datetime(df.exposuredate) - pd.to_datetime(df.min_exposure_date)
exposuredate min_exposure_date datediff
0 2014-10-08 2014-09-27 11 days
1 2014-10-09 2014-09-27 12 days
2 2014-09-27 2014-09-27 0 days
3 2014-10-22 2014-09-27 25 days
Solution:
df.datediff.apply(lambda x: x.days)
0 11
1 12
2 0
3 25
Name: datediff, dtype: int64
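On a pandas version that provides the `.dt` accessor for timedelta Series (the AttributeError in the question suggests the asker's version did not), the same integers should be available without `apply`. A sketch using the setup above:

```python
import pandas as pd

df = pd.DataFrame([("2014-10-08", "2014-09-27"),
                   ("2014-10-09", "2014-09-27"),
                   ("2014-09-27", "2014-09-27"),
                   ("2014-10-22", "2014-09-27")],
                  columns=["exposuredate", "min_exposure_date"])
df['datediff'] = pd.to_datetime(df.exposuredate) - pd.to_datetime(df.min_exposure_date)

# Vectorised access to the whole-day component of each timedelta
days = df['datediff'].dt.days
print(days.tolist())
```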