
How to convert strings with UTC-# at the end to a DateTimeField in Python?

I have a list of strings formatted as follows: '2/24/2021 3:37:04 PM UTC-6'

How would I convert this?

I have tried

datetime.strptime(my_date_object, '%m/%d/%Y %I:%M:%S %p %Z')

but I get an error saying "unconverted data remains: -6"

Is this because of the UTC-6 at the end?

The approach that @MrFuppes mentioned in their comment is the easiest way to do this:

"OK, it seems like you need to split the string on 'UTC' and parse the offset separately. You can then set the tzinfo from a timedelta."

import datetime

input_string = '2/24/2021 3:37:04 PM UTC-6'

try:
    dtm_string, utc_offset = input_string.split("UTC", maxsplit=1)
except ValueError:
    # Couldn't split the string, so no "UTC" in the string
    print("Warning! No timezone!")
    
    dtm_string = input_string
    utc_offset = "0"


dtm_string = dtm_string.strip() # Remove leading/trailing whitespace '2/24/2021 3:37:04 PM'
utc_offset = int(utc_offset)    # Convert utc offset to integer -6

tzinfo = datetime.timezone(datetime.timedelta(hours=utc_offset))

result_datetime = datetime.datetime.strptime(dtm_string, '%m/%d/%Y %I:%M:%S %p').replace(tzinfo=tzinfo)

print(result_datetime)
# prints 2021-02-24 15:37:04-06:00
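
Since the question mentions a list of such strings, the split approach above can be wrapped in a small function and applied to each element. This is only a minimal sketch: the function name parse_utc_offset_string and the sample list are hypothetical, and it assumes whole-hour offsets and treats a missing offset as UTC.

```python
import datetime


def parse_utc_offset_string(value):
    """Parse strings such as '2/24/2021 3:37:04 PM UTC-6' (hypothetical helper)."""
    # partition() never raises; offset_part is '' when "UTC" is absent
    dtm_part, _, offset_part = value.partition("UTC")
    # Assumption: a missing or empty offset means plain UTC (offset 0)
    hours = int(offset_part) if offset_part.strip() else 0
    tz = datetime.timezone(datetime.timedelta(hours=hours))
    naive = datetime.datetime.strptime(dtm_part.strip(), "%m/%d/%Y %I:%M:%S %p")
    return naive.replace(tzinfo=tz)


# Hypothetical sample data in the format from the question
samples = ["2/24/2021 3:37:04 PM UTC-6", "12/1/2020 12:05:00 AM UTC+2"]
parsed = [parse_utc_offset_string(s) for s in samples]
for dt in parsed:
    print(dt.isoformat())
# prints 2021-02-24T15:37:04-06:00
# prints 2020-12-01T00:05:00+02:00
```

Using partition instead of split avoids the try/except, at the cost of silently accepting strings without "UTC"; if that should be an error in your data, raise one instead of defaulting to 0.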

Alternatively, you can skip datetime.strptime entirely and extract the relevant components with a regular expression:

import datetime
import re

rex = r"(\d{1,2})\/(\d{1,2})\/(\d{4}) (\d{1,2}):(\d{2}):(\d{2}) (AM|PM) UTC(\+|-)(\d{1,2})"

input_string = '2/24/2021 3:37:04 PM UTC-6'

r = re.findall(rex, input_string)
# gives:  [('2', '24', '2021', '3', '37', '04', 'PM', '-', '6')]

mm = int(r[0][0])
dd = int(r[0][1])
yy = int(r[0][2])

hrs = int(r[0][3])
mins = int(r[0][4])
secs = int(r[0][5])

# Convert the 12-hour clock to 24-hour: 12 AM -> 0, 12 PM -> 12
hrs = hrs % 12
if r[0][6].upper() == "PM":
    hrs = hrs + 12

tzoffset = int(f"{r[0][7]}{r[0][8]}")

tzinfo = datetime.timezone(datetime.timedelta(hours=tzoffset))

result_datetime = datetime.datetime(yy, mm, dd, hrs, mins, secs, tzinfo=tzinfo)
print(result_datetime)
# prints 2021-02-24 15:37:04-06:00

The regular expression: (\d{1,2})\/(\d{1,2})\/(\d{4}) (\d{1,2}):(\d{2}):(\d{2}) (AM|PM) UTC(\+|-)(\d{1,2})

Explanation:

  • (\d{1,2}) : One or two digits. The surrounding parentheses make this a capturing group. The same construct is used for the month, the day, the hour, and the UTC offset.
  • \/ : A forward slash
  • (\d{4}) : Exactly four digits, for the year. Also a capturing group. The minutes and seconds use the same construct with exactly two digits, (\d{2}).
  • (AM|PM) : Either "AM" or "PM"
  • UTC(\+|-)(\d{1,2}) : "UTC", followed by a plus or minus sign, followed by one or two digits.
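
As a variation on the pattern explained above, the capturing groups can be given names so the fields are read by name rather than by position. This is a sketch, not part of the original answer: the group names and the helper parse_with_regex are illustrative, and it assumes whole-hour offsets and uppercase AM/PM as in the question.

```python
import datetime
import re

# Same pattern as above, rewritten with named groups (the group names are illustrative)
PATTERN = re.compile(
    r"(?P<month>\d{1,2})/(?P<day>\d{1,2})/(?P<year>\d{4}) "
    r"(?P<hour>\d{1,2}):(?P<minute>\d{2}):(?P<second>\d{2}) "
    r"(?P<ampm>AM|PM) UTC(?P<sign>\+|-)(?P<offset>\d{1,2})"
)


def parse_with_regex(value):
    """Hypothetical helper: return an aware datetime, or None if the pattern does not match."""
    m = PATTERN.fullmatch(value)
    if m is None:
        return None
    # Convert the 12-hour clock: 12 AM -> 0, 12 PM -> 12, 3 PM -> 15
    hour = int(m["hour"]) % 12 + (12 if m["ampm"] == "PM" else 0)
    offset = int(m["sign"] + m["offset"])
    tz = datetime.timezone(datetime.timedelta(hours=offset))
    return datetime.datetime(
        int(m["year"]), int(m["month"]), int(m["day"]),
        hour, int(m["minute"]), int(m["second"]), tzinfo=tz,
    )


print(parse_with_regex("2/24/2021 3:37:04 PM UTC-6"))
# prints 2021-02-24 15:37:04-06:00
```

fullmatch rejects strings with extra text before or after the timestamp, which makes malformed entries in a list easy to spot because they come back as None.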

