Parsing csv with changing timezone (due to daylight saving time) with pandas
I'm trying to parse a csv that looks like this
time val
28.10.2007 00:00:00.000 GMT+0100 1
28.10.2007 00:01:00.000 GMT+0100 2
28.10.2007 01:00:00.000 GMT-0000 3
28.10.2007 01:01:00.000 GMT-0000 4
To do so, I use
pd.read_csv(f,
parse_dates=[0],
dayfirst=True,
index_col=0)
However, the result looks like this
val
time
2007-10-28 00:00:00-01:00 1
2007-10-28 00:01:00-01:00 2
2007-10-28 00:00:00-01:00 3
2007-10-28 00:01:00-01:00 4
This causes the 3rd and 4th rows to have duplicated index values. Is there a way to ask pandas to convert these times to UTC and understand the change in TZ?
I tried this and it seems to work, though I'm not sure it is exactly what you want.
import pandas as pd

df = pd.read_csv('data.csv')
# Match the literal "GMT", read the numeric offset with %z, and convert every row to UTC
df['time_'] = pd.to_datetime(df['time'], format='%d.%m.%Y %H:%M:%S.%f GMT%z', utc=True)
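To check the approach without the original file, here is a minimal self-contained sketch that feeds the four sample rows from the question through the same format string. The tab delimiter and the in-memory `raw` string are assumptions made for illustration, since the question does not show how the columns are separated. Note that `%z` reads `GMT+0100` as one hour ahead of UTC, which is the opposite sign from the `-01:00` shown in the question's output.

```python
import io

import pandas as pd

# Sample rows from the question; the tab delimiter is an assumption.
raw = (
    "time\tval\n"
    "28.10.2007 00:00:00.000 GMT+0100\t1\n"
    "28.10.2007 00:01:00.000 GMT+0100\t2\n"
    "28.10.2007 01:00:00.000 GMT-0000\t3\n"
    "28.10.2007 01:01:00.000 GMT-0000\t4\n"
)

df = pd.read_csv(io.StringIO(raw), sep="\t")

# "GMT" is matched literally, %z reads the numeric offset,
# and utc=True normalises rows with different offsets to UTC.
df["time"] = pd.to_datetime(
    df["time"], format="%d.%m.%Y %H:%M:%S.%f GMT%z", utc=True
)
df = df.set_index("time")

print(df)
print(df.index.is_unique)
```

With the offsets applied per row, the four timestamps should no longer collide in the index. If local wall-clock times are needed afterwards, `tz_convert` with an IANA zone name can be applied to the index; the question does not state which zone the data came from, so that choice is left open here.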