
How can I interpret a year-naive, RFC 3339 datetime string in Python?

I am interfacing with an API which gives year-naive RFC 3339 datetime strings for representing a user's birthday. Naturally, I want to interpret this as some sort of datetime object. However, Python's datetime module doesn't support datetime strings with a year value less than one.

Here's an example datetime string given by the API: 0000-09-01T00:00:00-00:00 (notice the year is set to 0000). If I just throw this into datetime.fromisoformat, it unsurprisingly raises an error:

In [1]: from datetime import datetime

In [2]: datetime.fromisoformat("0000-09-01T00:00:00-00:00")
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-2-e1d8a5624d92> in <module>
----> 1 datetime.fromisoformat("0000-09-01T00:00:00-00:00")

ValueError: year 0 is out of range

If I remove the year section of the string entirely, it gives the following:

In [1]: from datetime import datetime

In [2]: datetime.fromisoformat("09-01T00:00:00-00:00")
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-2-a027335f00c1> in <module>
----> 1 datetime.fromisoformat("09-01T00:00:00-00:00")

ValueError: Invalid isoformat string: '09-01T00:00:00-00:00'

At first, I thought this was a bug or limitation. But after a little research, I found that the RFC3339 Standard states the following in its introduction:

All dates and times are assumed to be in the "current era", somewhere between 0000AD and 9999AD.

Assuming that this range is inclusive (this is based on the other uses of the term "between" within the standard, although it is never strictly specified), it follows that the datetime module does not conform to the RFC3339 standard, as it hard-codes a minimum and maximum year value and also makes the year a required value. However, it never claims that it does conform to the standard. So the new issue is: if the standard library doesn't support RFC3339, what does?

My question is: Is there a way to interpret this string as some kind of datetime object, either with the standard library or with a third-party library?

There is no year 0 in the Anno Domini date presentation system.

A quick look at the common datetime alternatives (Pendulum, Arrow) shows that the ValueError for parsing an ISO format string with 0000- as the year is universal. That is not a valid year, and the error lies with the data source.

A date with only a month and a day is not really a date - it is ambiguous. Is the date 2/23 before or after 3/1 ? Is 2/23 + 6 days the end of February or the first of March? In both cases, it depends entirely on the year.

It appears that the Square API is using 0000- as a flag for the year being optional since some people do not want to disclose their age.

If your data is standardized to year 0000 , you can probably just do a string replacement to standardize on year 1:

>>> from datetime import datetime
>>> s = "0000-09-01T00:00:00-00:00"
>>> datetime.fromisoformat(s.replace("0000-", "0001-"))
datetime.datetime(1, 9, 1, 0, 0, tzinfo=datetime.timezone.utc)

Or, as stated in comments, perhaps use 0004 to accommodate 2/29 as a birthday:

>>> s = "0000-02-29T00:00:00-00:00"
>>> datetime.fromisoformat(s.replace("0000-", "0004-"))
datetime.datetime(4, 2, 29, 0, 0, tzinfo=datetime.timezone.utc)
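One caveat with a plain str.replace: it rewrites every occurrence of "0000-", not just the year. If the API ever sends fractional seconds (for example .000000-00:00), the tail of the microseconds field matches too and gets silently altered. A slightly more defensive sketch only touches the leading year. The function name, the constant, and the default placeholder year are my own choices here, not anything the API or the datetime module defines:

```python
from datetime import datetime

# Assumption: a leading "0000-" is how the API flags "no year given".
YEAR_UNKNOWN_PREFIX = "0000-"


def parse_with_placeholder_year(s: str, placeholder_year: int = 4) -> datetime:
    """Parse an ISO string, swapping only a leading 0000 year for a placeholder.

    The default placeholder is year 4 because it is a leap year in the
    datetime module's calendar, so a 02-29 birthday still parses.
    """
    if s.startswith(YEAR_UNKNOWN_PREFIX):
        s = f"{placeholder_year:04d}-" + s[len(YEAR_UNKNOWN_PREFIX):]
    return datetime.fromisoformat(s)


dt = parse_with_placeholder_year("0000-02-29T00:00:00.000000-00:00")
print(dt.month, dt.day, dt.microsecond)
```

Strings that already carry a real year pass through untouched and are parsed as they are.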

This is a partial solution at best. Again, a date without a year is not a date and you will need to write and validate a lot of code to try and solve the ambiguity for sorting, comparisons, date offsets, presentation, etc.
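If you would rather not carry a fake year around at all, one option is to keep only the month and day and mark the year as unknown. The following is a rough sketch under a few assumptions: the Birthday class and both function names are invented for this example, the time and UTC offset in the string are discarded, and a February 29 birthday is simply skipped in non-leap years (you may prefer Feb 28 or Mar 1 instead):

```python
from dataclasses import dataclass
from datetime import date, datetime
from typing import Optional


@dataclass(frozen=True)
class Birthday:
    month: int
    day: int
    year: Optional[int] = None  # None when the source sent year 0000


def parse_birthday(s: str) -> Birthday:
    """Parse an ISO string, treating a leading 0000 year as 'year unknown'."""
    if s.startswith("0000-"):
        # Borrow leap year 0004 only to validate the month/day, then drop it.
        parsed = datetime.fromisoformat("0004-" + s[5:])
        return Birthday(parsed.month, parsed.day, None)
    parsed = datetime.fromisoformat(s)
    return Birthday(parsed.month, parsed.day, parsed.year)


def next_occurrence(b: Birthday, today: date) -> date:
    """Return the first date on or after `today` matching the month and day."""
    # Leap days can be up to 8 years apart (e.g. around the year 2100).
    for year in range(today.year, today.year + 9):
        try:
            candidate = date(year, b.month, b.day)
        except ValueError:
            continue  # Feb 29 does not exist in this year
        if candidate >= today:
            return candidate
    raise ValueError(f"no valid occurrence found for {b}")


b = parse_birthday("0000-09-01T00:00:00-00:00")
print(b, next_occurrence(b, date(2024, 9, 2)))
```

Sorting and comparisons then have to be defined explicitly, for example by sorting on (month, day), which is exactly the kind of extra code mentioned above.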
