
Convert fractional day of year to Pandas Datetime

I have a column of a Pandas DataFrame that is fractional day of year (DOY). This column appears as:

               DOY
0       200.749967
1       200.791667
2       200.833367
3       200.874967
4       200.916667
5       200.958367
6       200.999967
7       201.041667
       ...    
3491    627.166667
3492    627.333367
3493    627.499967
3494    627.666667
3495    627.833367
3496    627.999967
3497    628.166667
3498    628.333367
Name: DOY, Length: 3499, dtype: float64

The starting year is 2011, however the DOY data continues with increasing values through 2012 without resetting to zero on the new year.

How do I convert this to a Pandas DatetimeIndex with format 'YYYY-MM-DD HH:MM:SS'?

One way to do this is to convert the column to a Timedelta and add it to the base date (2011-01-01).

df.DOY = pd.to_datetime('2011-1-1') + pd.to_timedelta(df.DOY, unit='D')
print(df.DOY)
0      2011-07-20 17:59:57.148800
1      2011-07-20 19:00:00.028800
2      2011-07-20 20:00:02.908800
3      2011-07-20 20:59:57.148800
4      2011-07-20 22:00:00.028800
5      2011-07-20 23:00:02.908800
6      2011-07-20 23:59:57.148800 
7      2011-07-21 01:00:00.028800
       ... 
3491   2012-09-19 04:00:00.028800
3492   2012-09-19 08:00:02.908800
3493   2012-09-19 11:59:57.148800
3494   2012-09-19 16:00:00.028800
3495   2012-09-19 20:00:02.908800
3496   2012-09-19 23:59:57.148800
3497   2012-09-20 04:00:00.028800
3498   2012-09-20 08:00:02.908800
Name: DOY, dtype: datetime64[ns]

Another method is to call pd.to_datetime with the origin parameter set, as agtoever shows in their answer.
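As a rough, self-contained sketch of the two approaches side by side (the three-row DataFrame is a hypothetical sample built from values in the question, and the variable names are illustrative only):

```python
import pandas as pd

# Hypothetical sample taken from the first and last rows shown in the question.
df = pd.DataFrame({"DOY": [200.749967, 200.791667, 627.166667]})

base = pd.Timestamp("2011-01-01")

# Approach 1: treat DOY as days elapsed since the base date.
via_timedelta = base + pd.to_timedelta(df["DOY"], unit="D")

# Approach 2: let to_datetime apply the same offset through `origin`.
via_origin = pd.to_datetime(df["DOY"], unit="D", origin=base)

print(via_timedelta)
print(via_origin)
```

Both should give the same timestamps up to floating-point rounding, and because the values are simply days since the base date, values above 365 roll over into 2012 without any special handling.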

Although the accepted answer is correct in the conversion of DOY to Datetime, there is a slight mistake that has been overlooked.

By convention, midnight at the start of January 1 of any year is DOY 1.0. Continuing with fractional DOY, Jan 1 12:00 is DOY 1.5, Jan 2 00:00 is DOY 2.0, and so on.

If you add the DOY value to a base date, as suggested in the other answers, the resulting time is shifted forward by one day. For example, pd.to_datetime('2011-01-01') + pd.to_timedelta(df.DOY, unit='D'), with a DOY series that starts at 1.0, gives a starting date of '2011-01-02', which is incorrect. This follows from the convention that DOY starts at 1 rather than 0. See here for more info.

Therefore, the correct answer (producing correct Datetime results) is:

df.DOY = pd.to_datetime('2011-1-1') + pd.to_timedelta(df.DOY, unit='D') - pd.Timedelta(days=1)
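A small check of the off-by-one, assuming the data really does follow the 1-based convention (the series below is made up for illustration and is not the asker's data):

```python
import pandas as pd

# Illustrative 1-based DOY values: Jan 1 00:00, Jan 1 12:00, Jan 2 00:00,
# and DOY 366.0, which in a non-leap year such as 2011 is the next Jan 1.
doy = pd.Series([1.0, 1.5, 2.0, 366.0])
base = pd.Timestamp("2011-01-01")

# Without the correction, every timestamp lands one day late.
uncorrected = base + pd.to_timedelta(doy, unit="D")

# Subtracting one day before adding maps DOY 1.0 onto the base date itself.
corrected = base + pd.to_timedelta(doy - 1, unit="D")

print(uncorrected.iloc[0])  # 2011-01-02 00:00:00
print(corrected.tolist())
```

Whether the correction applies depends on how the DOY column was generated; if the source already counts from 0, the uncorrected form is the right one.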

Just use to_datetime with the appropriate parameters (read the manual):

>>> pandas.to_datetime([0,0.1,200,400,800], unit='D', origin=pandas.Timestamp('01-01-2011'))

DatetimeIndex(['2011-01-01 00:00:00', '2011-01-01 02:24:00', '2011-07-20 00:00:00', '2012-02-05 00:00:00', '2013-03-11 00:00:00'], dtype='datetime64[ns]', freq=None)
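To get closer to the 'YYYY-MM-DD HH:MM:SS' form asked for in the question, one possible sketch is to round the resulting DatetimeIndex to whole seconds and, if strings are needed, format it with strftime. The sample values come from the question, the variable names are illustrative, and the example keeps the zero-based interpretation used in this answer:

```python
import pandas as pd

# First three DOY values from the question.
doy = [200.749967, 200.791667, 200.833367]

# A list input yields a DatetimeIndex directly.
idx = pd.to_datetime(doy, unit="D", origin=pd.Timestamp("2011-01-01"))

# Drop the sub-second noise left over from the fractional days.
idx = idx.round("s")

# Optional: string labels in 'YYYY-MM-DD HH:MM:SS' form.
labels = idx.strftime("%Y-%m-%d %H:%M:%S")

print(idx)
print(list(labels))
```

Rounding keeps the result as a DatetimeIndex, which is usually what is wanted for indexing and resampling; strftime returns plain strings and is only needed for display or export.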
