简体   繁体   中英

How should I store times with timezones in postgresql using python3?

I've been reading about best practices for handling datetimes in Python and for storing them into postgresql (using utc as much as I can, with pytz for conversions, avoiding the tzinfo parameter in datetime constructor, etc...).

But my doubt now is that I'm surprised not finding anything about the datetime.time object and its best practices.

For the sake of example, imagine I want to store only a time, like 20:30, because I'm scheduling some task every week a couple of days on that time, but the day in the week could change every week. And probably the user has entered the time in his/her timezone. In my situation it would be a user in the Spanish timezone 'Europe/Madrid'.

My questions are:

  • 1) Once I get the time as a datetime.time, how do I should store the timezone info into a datetime.time variable. Is it ok to use

    datetime.time(h, m, s, ms, tzinfo=pytz_spanish_timezone) ???

  • 2) If not with the former line of code, how do I localize properly a naive time? datetime.datetime uses my_datetime.localize(pytz_spanish_timezone)

  • 3) How do I convert one datetime.time object from a timezone to another, given that with datetime and pytz it would be using

    new_tz_datetime = my_datetime.astimezone(pytz_spanish_timezone)

    but with time there is no similar method

  • 4) How should I store the datetime.time in postgresql database? I know there are time and timetz data types. I suppose I should store the time as UTC. Would the timezone matter? Should I store it somehow?

  • 5) How to parse a time from a string without going through datetime? (I've made myself a function, but I'm sure there must exist some way using datetime or some robust module that probably covers cases I do not).

2) [H]ow do I localize properly a naive time? datetime.datetime uses my_datetime.localize(pytz_spanish_timezone)

Actually, it is the other way around. localize is a pytz timezone method, not a datetime method:

import pytz
madrid = pytz.timezone('Europe/Madrid')
aware_datetime = madrid.localize(naive_datetime)

You need a datetime.datetime here. There is no equivalent for datetime.time objects. See below for the reason why.

3) How do I convert one datetime.time object from a timezone to another?

Consider the following situation: We know the time is 20:30 and the timezone is Europe/Madrid and we wish to convert it to UTC. The result will be different depending on whether or not the date falls within Daylight Savings Time (CEST) or not (CET): For example,

import datetime as DT
import pytz
madrid = pytz.timezone('Europe/Madrid')
utc = pytz.utc

CET_date = madrid.localize(DT.datetime(2019, 3, 30, 20, 30, 0), is_dst=None)
# the most recent transition occurred at `2019-03-31 02:00:00+01:00 CEST` 
CEST_date = madrid.localize(DT.datetime(2019, 3, 31, 20, 30, 0), is_dst=None)
print(CET_date.astimezone(utc))
print(CEST_date.astimezone(utc))

# 2019-03-30 19:30:00+00:00
# 2019-03-31 18:30:00+00:00

Notice that when the date is in CET, the time 20:30 is "converted" to 19:30, but when the date is in CEST, the time is converted to 18:30. There is no (simple) answer to your question without first knowing the date.

4a) How should I store the datetime.time in postgresql database? I know there are time and timetz data types.

Per the docs :

The type time with time zone is defined by the SQL standard, but the definition exhibits properties which lead to questionable usefulness.

I think the docs are alluding to the issue shown above. Don't use time with time zone . If you want to store a time, use the PostgreSQL plain time type.

You could store the time and the timezone in the database, and reconstitute a timezone-aware datetime later once you have a date. But note that there are pitfalls:

  1. There are ambiguous local datetimes

     import datetime as DT import pytz madrid = pytz.timezone('Europe/Madrid') date = madrid.localize(DT.datetime(2019, 10, 27, 2, 0, 0), is_dst=None) 

    raises pytz.exceptions.AmbiguousTimeError: 2019-10-27 02:00:00 . To avoid the AmbiguousTimeError , is_dst must be specified explicitly:

     import datetime as DT import pytz madrid = pytz.timezone('Europe/Madrid') date = madrid.localize(DT.datetime(2019, 10, 27, 2, 0, 0), is_dst=False) print(date) date = madrid.localize(DT.datetime(2019, 10, 27, 2, 0, 0), is_dst=True) print(date) # 2019-10-27 02:00:00+01:00 # 2019-10-27 02:00:00+02:00 
  2. There are nonexistent local datetimes

     import datetime as DT import pytz madrid = pytz.timezone('Europe/Madrid') madrid.localize(DT.datetime(2019, 3, 31, 2, 0, 0), is_dst=None) 

    raises pytz.exceptions.NonExistentTimeError: 2019-03-31 02:00:00

    You can avoid the NonExistentTimeError by specifying whether or not the naive localtime refers to a time during DST (Daylight Savings Time):

     import datetime as DT import pytz madrid = pytz.timezone('Europe/Madrid') date = madrid.normalize(madrid.localize(DT.datetime(2019, 3, 31, 2, 0, 0), is_dst=False)) print(date) date = madrid.normalize(madrid.localize(DT.datetime(2019, 3, 31, 2, 0, 0), is_dst=True)) print(date) # 2019-03-31 03:00:00+02:00 # 2019-03-31 01:00:00+01:00 
  3. There can be datetimes which are unrepresentable given a local datetime and a particular timezone.

    The AmbiguousTimeError and NonExistentTimeError above show the importance of specifying the value of is_dst . To avoid these errors you would need to store a boolean is_dst along with time and timezone in the database.

    You might think that you could avoid the problem simply choosing one value of is_dst for all times. But you'd be mistaken. Here is a peculiar example (taken from the pytz docs ) which shows if you always choose is_dst = False (or is_dst = True ) there can be UTC datetimes which can not be expressed given just a naive localtime and a timezone!

     import datetime as DT import pytz warsaw = pytz.timezone('Europe/Warsaw') utc = pytz.utc date1 = warsaw.localize(DT.datetime(1915, 8, 4, 23, 35, 59), is_dst=False).astimezone(utc) date2 = warsaw.localize(DT.datetime(1915, 8, 4, 23, 36, 0), is_dst=False).astimezone(utc) print('Datetimes between {} and {} can not be expressed if we assume is_dist=False.'.format(date1, date2)) date3 = warsaw.localize(DT.datetime(1915, 8, 4, 23, 59, 59), is_dst=True).astimezone(utc) date4 = warsaw.localize(DT.datetime(1915, 8, 5, 0, 0, 0), is_dst=True).astimezone(utc) print('Datetimes between {} and {} can not be expressed if we assume is_dist=True.'.format(date1, date2)) 

    prints

     Datetimes between 1915-08-04 22:11:59+00:00 and 1915-08-04 22:36:00+00:00 can not be expressed if we assume is_dist=False. Datetimes between 1915-08-04 22:11:59+00:00 and 1915-08-04 22:36:00+00:00 can not be expressed if we assume is_dist=True. 

4b) I suppose I should store the time as UTC. Would the timezone matter? Should I store it somehow?

For the reasons shown above there is no such thing as a time (without a date) in UTC. But you can avoid the problems mentioned above by simplying storing datetimes in UTC.

If you create a table with a column with timestamptz data type, then you can use a database adapter such as psycopg2 to store Python timezone-aware datetimes as PostgreSQL timestamptz s. When you query the database, psycopg2 will convert the timestamptz s back into timezone-aware datetimes for you.

Internally, PostgreSQL stores all timestamptz s in UTC, but it reports the values with respect to the PostgreSQL user's timezone setting. On the Python side, given a timezone-aware datetime, you can use its astimezone method to convert it to any timezone you like.

You don't need to store the timezone separately unless you want to report different datetimes with respect to different timezones.

5) How to parse a time from a string without going through datetime?

You could use regex to parse time strings:

import re
import datetime as DT
atime = DT.time(*map(int, re.search(r'(\d{,2}):(\d{,2}):(\d{,2})', 'blueberry jam at 13:32:02').groups()))
print(repr(atime))
# datetime.time(13, 32, 2)

Above, the regex pattern \\d matches a single digit. \\d{1,2} matches 1 or 2 digits.

Alternatively, the 3rd-party dateutil package can parse time strings in a variety of formats:

import dateutil.parser as DP
print(DP.parse("13:32:02").time())
# 13:32:02

print(DP.parse("blueberry jam at 13:32:02", fuzzy=True).time())
# 13:32:02

print(DP.parse("30 minutes 12 hours").time())
# 12:30:00

print(DP.parse("2:30pm").time())
# 14:30:00

There is a lot to digest here, and there is probably more that could be said about each of these issues. In the future, you may want to split your post into multiple questions. This will lower the barrier for people who may wish to answer one question but not all of them, and will help you get more answers quicker.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM