简体   繁体   中英

How to convert multiple time zone column to UTC in Python

I have a dataset where the date_time column contains a mixture of BST and GMT date and times, in the following format 'Sun 27 Mar 2022 12:59:03 AM GMT'. I would like to convert this whole column into the following format '2022-03-27 00:59:03', and into one standard time zone, such as UTC. Where am I going wrong??

I tried using strptime() to convert the a tester date, but that only works with GMT times.

date_time_str = 'Sun 27 Mar 2022 12:59:03 AM GMT'
date_time_obj = datetime.datetime.strptime(date_time_str, '%a %d %b %Y %I:%M:%S %p %Z')
print('Date-time:', date_time_obj)

Date-time: 2022-03-27 00:59:03

Then tried with BST

date_time_str_bst = 'Sun 27 Mar 2022 02:00:02 AM BST'
date_time_obj_bst = datetime.datetime.strptime(date_time_str_bst, '%a %d %b %Y %I:%M:%S %p %Z')

ValueError: time data 'Sun 27 Mar 2022 02:00:02 AM BST' does not match format '%a %d %b %Y %I:%M:%S %p %Z'

Date and time temp
Sun 27 Mar 2022 12:57:03 AM GMT 33.9
Sun 27 Mar 2022 12:58:02 AM GMT 33.6
Sun 27 Mar 2022 12:59:03 AM GMT 33.6
Sun 27 Mar 2022 02:00:02 AM BST 33.9
Sun 27 Mar 2022 02:01:03 AM BST 33.6
Sun 27 Mar 2022 02:02:02 AM BST 33.6

在此处输入图像描述

Here's an option how you can handle this. As commented, abbreviated tz names such as "BST" are ambiguous. You will have to define to which tz the abbreviations map; eg "Europe/London" for BST/GMT.

define timezone mapping {abbreviation --> IANA tz name}, apply parser, then convert to UTC

given

df
                     Date and time  temp
0  Sun 27 Mar 2022 12:57:03 AM GMT  33.9
1  Sun 27 Mar 2022 12:58:02 AM GMT  33.6
2  Sun 27 Mar 2022 12:59:03 AM GMT  33.6
3  Sun 27 Mar 2022 02:00:02 AM BST  33.9
4  Sun 27 Mar 2022 02:01:03 AM BST  33.6
5  Sun 27 Mar 2022 02:02:02 AM BST  33.6

running

import pandas as pd
import dateutil

tzmapping = {"GMT": dateutil.tz.gettz("Europe/London"),
             "BST": dateutil.tz.gettz("Europe/London")}

df["dt_UTC"] = df["Date and time"].apply(dateutil.parser.parse, tzinfos=tzmapping).dt.tz_convert("UTC")

gives

df
                     Date and time  temp                    dt_UTC
0  Sun 27 Mar 2022 12:57:03 AM GMT  33.9 2022-03-27 00:57:03+00:00
1  Sun 27 Mar 2022 12:58:02 AM GMT  33.6 2022-03-27 00:58:02+00:00
2  Sun 27 Mar 2022 12:59:03 AM GMT  33.6 2022-03-27 00:59:03+00:00
3  Sun 27 Mar 2022 02:00:02 AM BST  33.9 2022-03-27 01:00:02+00:00
4  Sun 27 Mar 2022 02:01:03 AM BST  33.6 2022-03-27 01:01:03+00:00
5  Sun 27 Mar 2022 02:02:02 AM BST  33.6 2022-03-27 01:02:02+00:00

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM