Hi, I am trying to import a CSV file and set its index as datetime objects. This is a sample of the CSV:
date,wind_force,wind_dir,cloud_cover,temp
2019-01-01 04:00:00+01:00,13.9,234.0,100.0,3.8
2019-01-01 05:00:00+01:00,14.333333,239.33333,100.0,4.5333333
I import the file and try to use pd.to_datetime directly on the index:
import pandas as pd
dfw = pd.read_csv(r'C:\Path\weather.csv', index_col='date')
dfw.index = pd.to_datetime(dfw.index)
Then dfw.index returns:
Index([2019-01-01 04:00:00+01:00, 2019-01-01 05:00:00+01:00,
......
2020-01-01 00:00:00+01:00, 2020-01-01 01:00:00+01:00],
dtype='object', name='date', length=8750)
If I try dfw.index.hour, I get an error:
AttributeError: 'Index' object has no attribute 'hour'
When I pass utc=True while converting the index to datetime, it converts properly:
dfw.index = pd.to_datetime(dfw.index, utc = True)
But it returns the datetimes in UTC, and I want them to stay in the original timezone:
DatetimeIndex(['2019-01-01 03:00:00+00:00', '2019-01-01 04:00:00+00:00',
...
'2019-12-31 23:00:00+00:00', '2020-01-01 00:00:00+00:00'],
dtype='datetime64[ns, UTC]', name='date', length=8750, freq=None)
What's also strange is that when I access a single index element by position, like dfw.index[33], it returns:
datetime.datetime(2019, 1, 2, 13, 0, tzinfo=tzoffset(None, 3600))
And then I can call dfw.index[33].hour etc.
So where's the gotcha here?
What about:
dfw.index = pd.to_datetime(dfw.index, format='%Y-%m-%d %H:%M:%S+01:00')
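Giving an explicit format lets you keep the local times you are interested in: the +01:00 is matched as literal text, so the result is a timezone-naive DatetimeIndex with the same wall-clock hours as in the file. The format letters are described in the Python strftime/strptime documentation.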
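To make the effect of the fixed-offset format concrete, here is a minimal sketch using the two sample rows from the question. The `raw` variable is only a stand-in for `dfw.index`. The last step, reattaching the offset with `tz_localize`, is an optional extra and assumes that every row really is at +01:00.

```python
from datetime import timedelta, timezone

import pandas as pd

# Stand-in for dfw.index: the two sample rows from the question.
raw = pd.Index(
    ["2019-01-01 04:00:00+01:00", "2019-01-01 05:00:00+01:00"], name="date"
)

# "+01:00" is matched as literal text, so no offset is parsed and the
# result is a timezone-naive DatetimeIndex of local wall-clock times.
naive = pd.to_datetime(raw, format="%Y-%m-%d %H:%M:%S+01:00")
hours = list(naive.hour)

# Optionally reattach the fixed offset (assumes every row is at +01:00).
aware = naive.tz_localize(timezone(timedelta(hours=1)))
```

Since `naive` is a real DatetimeIndex, accessors such as `.hour` work on the whole index rather than only on individual elements.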
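Edit: If you want to deal with summer/winter time, you can replace +01 with +%f. Note that %f is the fractional-seconds directive, so the offset digits are absorbed as a fraction of a second rather than read as a UTC offset; the hours stay as written in the file: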
dfw.index = pd.to_datetime(dfw.index, format='%Y-%m-%d %H:%M:%S+%f:00')
dfw.index[0].hour # returns 4
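If the offsets in the file do change between winter and summer time, another option is to keep the utc=True parse from the question, which already produces a DatetimeIndex, and then convert it back to a named timezone with `tz_convert`. This is only a sketch and not part of the original answer. The zone name `Europe/Warsaw` is an assumption (any zone that is at +01:00 in winter and +02:00 in summer would behave the same here), and the second row is a made-up summer timestamp.

```python
import pandas as pd

# First row is from the question; the summer row is hypothetical.
raw = pd.Index(
    ["2019-01-01 04:00:00+01:00", "2019-07-01 04:00:00+02:00"], name="date"
)

# utc=True copes with mixed offsets and always yields a DatetimeIndex.
utc_index = pd.to_datetime(raw, utc=True)

# Convert back to local time; "Europe/Warsaw" is an assumed zone name.
local_index = utc_index.tz_convert("Europe/Warsaw")

utc_hours = list(utc_index.hour)
local_hours = list(local_index.hour)
```

With this approach the index stays timezone-aware, and `local_index.hour` reports the local hour on both sides of the daylight-saving switch.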