
Parsing csv with changing timezone (due to daylight saving time) with pandas

I'm trying to parse a CSV that looks like this:

time                                val
28.10.2007 00:00:00.000 GMT+0100    1
28.10.2007 00:01:00.000 GMT+0100    2
28.10.2007 01:00:00.000 GMT-0000    3
28.10.2007 01:01:00.000 GMT-0000    4

To do so, I use:

pd.read_csv(f,
            parse_dates=[0],
            dayfirst=True,
            index_col=0)

However, the result looks like this:

                           val
time                          
2007-10-28 00:00:00-01:00    1
2007-10-28 00:01:00-01:00    2
2007-10-28 00:00:00-01:00    3
2007-10-28 00:01:00-01:00    4

This causes the 3rd and 4th rows to duplicate the timestamps of the first two. Is there a way to ask pandas to convert these times to UTC and account for the change in TZ?

I tried this and it seems to work, though I'm not sure it is exactly what you want.

import pandas as pd

df = pd.read_csv('data.csv')

# %z reads the offset that follows the literal "GMT"; utc=True converts every row to UTC
df['time_'] = pd.to_datetime(df['time'], format='%d.%m.%Y %H:%M:%S.%f GMT%z', utc=True)
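To check the idea without a file on disk, here is a minimal, self-contained sketch that feeds the four sample rows from the question through an in-memory buffer. The tab separator and the variable name `raw` are assumptions made for illustration, since the real file's delimiter isn't shown. It passes an explicit format so that `%z` reads the offset exactly as written, and `utc=True` so that rows with different offsets end up in a single UTC index.

```python
import io

import pandas as pd

# Sample rows copied from the question; the tab separator is an assumption.
raw = (
    "time\tval\n"
    "28.10.2007 00:00:00.000 GMT+0100\t1\n"
    "28.10.2007 00:01:00.000 GMT+0100\t2\n"
    "28.10.2007 01:00:00.000 GMT-0000\t3\n"
    "28.10.2007 01:01:00.000 GMT-0000\t4\n"
)

df = pd.read_csv(io.StringIO(raw), sep="\t")

# "GMT" is matched literally and %z consumes the +0100 / -0000 offset.
# utc=True converts every row to UTC, so mixed offsets share one dtype.
df["time"] = pd.to_datetime(
    df["time"], format="%d.%m.%Y %H:%M:%S.%f GMT%z", utc=True
)
df = df.set_index("time")

print(df)
print(df.index.is_unique)  # no duplicated timestamps if the offsets were honored
```

If local wall-clock time is needed afterwards, the UTC index can be converted with `df.index.tz_convert(...)`, passing whichever zone the data was recorded in; the question does not name that zone.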
