
Convert column 'day' to datetime with year specification

I have a dataframe that includes a column of day numbers for which the year is known:

print (df)
        year  day  time  
0       2012  227   800
15      2012  227   815
30      2012  227   830
...     ...   ...   ...
194250  2013  226  1645
194265  2013  226  1700

I have attempted to convert the day numbers to datetime %m-%d using:

import pandas as pd    
df['day'] = pd.to_datetime(df['day'], format='%j').dt.strftime('%m-%d')

which gives:

        year    day  time
0       2012  08-15   800
15      2012  08-15   815
30      2012  08-15   830
...     ...   ...     ...
194250  2013  08-14  1645
194265  2013  08-14  1700

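but this conversion is incorrect because the 227th day of 2012 is August 14th (08-14). I believe the error comes from the missing year: with no year given, the parser falls back to 1900, which is not a leap year, so every day number after February lands one day late for a leap year such as 2012.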
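A quick check supports this explanation. The snippet below is only a sketch on a single made-up value (day 227), not the full dataframe: it parses the day number once without a year and once with 2012 prepended, so the one-day shift becomes visible.

```python
import pandas as pd

# With only '%j' the parser has no year to work with and uses 1900,
# which is not a leap year.
no_year = pd.to_datetime(pd.Series(['227']), format='%j')

# With the year supplied, day 227 is resolved inside leap year 2012.
with_year = pd.to_datetime(pd.Series(['2012 227']), format='%Y %j')

print(no_year.dt.strftime('%Y-%m-%d').iloc[0])    # 1900-08-15
print(with_year.dt.strftime('%Y-%m-%d').iloc[0])  # 2012-08-14
```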

How can I specify the year in the conversion to get a) %Y-%m-%d ; b) %m-%d ; c) %Y-%m-%dT%H:%M from the dataframe I have?

Thank you

You can convert the columns to strings, join them, and pass the result to pd.to_datetime together with a format string that matches the joined layout:

import pandas as pd

df = pd.DataFrame({'year': [2012, 2012], 'day' : [227, 228], 'time': [800, 0]})

df['datetime'] = pd.to_datetime(df.year.astype(str) + ' ' +
                                df.day.astype(str) + ' ' +
                                df.time.astype(str).str.zfill(4), 
                                format='%Y %j %H%M')

df['datetime']

0   2012-08-14 08:00:00
1   2012-08-15 00:00:00
Name: datetime, dtype: datetime64[ns]

Formatting to a string is then just a call to strftime via the dt accessor, e.g.

df['datetime'].dt.strftime('%Y-%m-%dT%H:%M')

0    2012-08-14T08:00
1    2012-08-15T00:00
Name: datetime, dtype: object
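To cover all three formats asked for in the question, the same datetime column can be formatted three times. This is a minimal sketch on a two-row sample taken from the question's data; the output column names (date, month_day, iso_minute) are made up for illustration.

```python
import pandas as pd

# Two sample rows from the question: one in leap year 2012, one in 2013.
df = pd.DataFrame({'year': [2012, 2013], 'day': [227, 226], 'time': [800, 1645]})

# Build "YYYY DDD HHMM" strings and parse them in one pass.
stamps = pd.to_datetime(
    df['year'].astype(str) + ' '
    + df['day'].astype(str) + ' '
    + df['time'].astype(str).str.zfill(4),  # pad 800 to '0800'
    format='%Y %j %H%M',
)

df['date'] = stamps.dt.strftime('%Y-%m-%d')              # a) 2012-08-14
df['month_day'] = stamps.dt.strftime('%m-%d')            # b) 08-14
df['iso_minute'] = stamps.dt.strftime('%Y-%m-%dT%H:%M')  # c) 2012-08-14T08:00

print(df)
```

The strftime results are plain strings (dtype object), so keep the parsed datetime column if you still need date arithmetic afterwards.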

Alternatively, convert year to a datetime (January 1st of that year) and day to a timedelta, then add the two. Remember to subtract one from the day number, because day 1 is January 1st itself:

dates = pd.to_datetime(df['year'], format='%Y') + \
        pd.to_timedelta(df['day'] -1, unit='D')  

Output:

0        2012-08-14
15       2012-08-14
30       2012-08-14
194250   2013-08-14
194265   2013-08-14
dtype: datetime64[ns]

Then extract the month and day with strftime (lowercase %m and %d; uppercase %M means minutes and %D is not a month-day directive):

df['day'] = dates.dt.strftime('%m-%d')
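The timedelta approach can also take in the time column, which gives format c) from the question. The sketch below assumes time is always an integer in HHMM form, as in the sample data; the names dates and stamps are only illustrative.

```python
import pandas as pd

# Two sample rows from the question.
df = pd.DataFrame({'year': [2012, 2013], 'day': [227, 226], 'time': [800, 1645]})

# January 1st of each year, plus (day - 1) days since day 1 is January 1st.
dates = (pd.to_datetime(df['year'].astype(str), format='%Y')
         + pd.to_timedelta(df['day'] - 1, unit='D'))

# Split the HHMM integer: integer division gives hours, remainder gives minutes.
stamps = (dates
          + pd.to_timedelta(df['time'] // 100, unit='h')
          + pd.to_timedelta(df['time'] % 100, unit='min'))

print(dates.dt.strftime('%m-%d').tolist())             # ['08-14', '08-14']
print(stamps.dt.strftime('%Y-%m-%dT%H:%M').tolist())   # ['2012-08-14T08:00', '2013-08-14T16:45']
```

This stays in numeric and datetime types throughout, which avoids building intermediate strings on a large dataframe.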
