
Python & Pandas: series to timedelta

M is a column in the DataFrame df that holds a number of months.

M
1
0
15

I am trying to find the number of days between 2015-01-01 and 2015-01-01 shifted by df.M months. This is the column I want to get:

daynum
31
0
456

I know how to do it with a loop and a list:

from datetime import datetime
from dateutil.relativedelta import relativedelta
int((datetime.strptime("2015-01-01", "%Y-%m-%d") + relativedelta(months=df.M[i])
     - datetime.strptime("2015-01-01", "%Y-%m-%d")).days)

Is there a built-in function in pandas that solves this problem easily?

You can use the same approach as in the question, but rely on pandas' automatic vectorized operations instead of looping.
First convert the series of integers to relativedelta objects:

In [76]: M = pd.Series([1, 0, 15])

In [77]: M2 = M.apply(lambda x: dateutil.relativedelta.relativedelta(months=x))

In [78]: M2
Out[78]:
0              relativedelta(months=+1)
1                       relativedelta()
2    relativedelta(years=+1, months=+3)
dtype: object

Then you can do the same calculation:

In [80]: (pd.Timestamp('2015-01-01') + M2) - pd.Timestamp('2015-01-01')
Out[80]:
0    31 days
1     0 days
2   456 days
dtype: timedelta64[ns]

If you want integer values instead of the timedeltas above, use .dt.days:

In [81]: days = (pd.Timestamp('2015-01-01') + M2) - pd.Timestamp('2015-01-01')

In [82]: days.dt.days
Out[82]:
0     31
1      0
2    456
dtype: int64
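
Put together as a plain script, the steps above might look like the sketch below. It assumes a DataFrame df with the integer column M from the question; the names start, end and daynum are illustrative. The start date is added inside apply, and the result is passed through pd.to_datetime so that the subtraction works on a datetime column.

```python
import pandas as pd
from dateutil.relativedelta import relativedelta

df = pd.DataFrame({"M": [1, 0, 15]})
start = pd.Timestamp("2015-01-01")

# Shift the start date by M calendar months for each row.
end = pd.to_datetime(
    df["M"].apply(lambda m: start + relativedelta(months=int(m)))
)

# Subtracting the start gives timedeltas; .dt.days extracts whole days.
df["daynum"] = (end - start).dt.days
print(df)
```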

Reason to not use Timedelta

A plain timedelta does not work here: it does not shift the date by a whole number of calendar months, but appears to use a fixed mean month length instead:

In [83]: pd.to_timedelta(1, unit='M')
Out[83]: Timedelta('30 days 10:29:06')

In [84]: (pd.Timestamp('2015-01-01') + pd.to_timedelta(M, unit='M')) - pd.Timestamp('2015-01-01')
Out[84]:
0    30 days 10:29:06
1     0 days 00:00:00
2   456 days 13:16:30
dtype: timedelta64[ns]

So this gives slightly different answers: here the first element comes out as 30 days instead of 31. Recent pandas versions no longer accept unit='M' in to_timedelta at all, since a month is not a fixed duration.
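
The 30 days 10:29:06 figure is consistent with one twelfth of the mean Gregorian year (365.2425 days). That pandas derives it this way is an assumption, but the arithmetic reproduces both non-zero values shown above. The names MEAN_YEAR_SECONDS and mean_month are illustrative.

```python
from datetime import timedelta

# Mean Gregorian year: 365.2425 days = 31,556,952 seconds.
MEAN_YEAR_SECONDS = 31_556_952
mean_month = timedelta(seconds=MEAN_YEAR_SECONDS // 12)

print(mean_month)       # 30 days, 10:29:06
print(15 * mean_month)  # 456 days, 13:16:30
```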

The pandas equivalent of relativedelta is DateOffset, in this case e.g. pd.DateOffset(months=1).
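
A minimal sketch of the DateOffset variant, assuming the same M series as above. It builds one offset per row with apply, so it is no faster than the relativedelta version; it only removes the dateutil import. The names start, end and daynum are illustrative.

```python
import pandas as pd

M = pd.Series([1, 0, 15])
start = pd.Timestamp("2015-01-01")

# One DateOffset per row: shift the start date by m calendar months.
end = pd.to_datetime(M.apply(lambda m: start + pd.DateOffset(months=int(m))))

# Gap between the shifted date and the start date, in whole days.
daynum = (end - start).dt.days
print(daynum.tolist())  # [31, 0, 456]
```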


 