
Convert an RFC 3339 time to a standard Python timestamp

Is there an easy way to convert an RFC 3339 time into a regular Python timestamp?

I've got a script which is reading an ATOM feed and I'd like to be able to compare the timestamp of an item in the ATOM feed to the modification time of a file.

I notice from the ATOM spec that ATOM dates include a time zone offset (Z<a number>) but, in my case, there's nothing after the Z, so I guess we can assume GMT.

I suppose I could parse the time with a regex of some sort but I was hoping Python had a built-in way of doing it that I just haven't been able to find.

You don't include an example, but if you don't have a Z-offset or timezone, and assuming you don't want durations but just the basic time, then maybe this will suit you:

>>> import datetime as dt
>>> dt.datetime.strptime('1985-04-12T23:20:50.52', '%Y-%m-%dT%H:%M:%S.%f')
datetime.datetime(1985, 4, 12, 23, 20, 50, 520000)

The strptime() function was added to the datetime module in Python 2.5 so some people don't yet know it's there.

Edit : The time.strptime() function has existed for a while though, and works about the same to give you a struct_time value:

>>> import time
>>> ts = time.strptime('1985-04-12T23:20:50.52', '%Y-%m-%dT%H:%M:%S.%f')
>>> ts
time.struct_time(tm_year=1985, tm_mon=4, tm_mday=12, tm_hour=23, tm_min=20, tm_sec=50, tm_wday=4, tm_yday=102, tm_isdst=-1)
>>> time.mktime(ts)
482210450.0
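
One caveat when comparing against a file's modification time: time.mktime() interprets the struct_time as local time, so the 482210450.0 above depends on the machine's time zone. Below is a minimal sketch for a UTC timestamp ending in Z that uses calendar.timegm() instead. The helper names rfc3339_utc_to_epoch and feed_item_is_newer are made up for illustration, and the format string assumes there are no fractional seconds:

```python
import calendar
import os
import time


def rfc3339_utc_to_epoch(stamp):
    """Convert 'YYYY-MM-DDTHH:MM:SSZ' (UTC, no fractional seconds) to epoch seconds."""
    parsed = time.strptime(stamp, '%Y-%m-%dT%H:%M:%SZ')
    # timegm() treats the struct_time as UTC; mktime() would treat it as local time.
    return calendar.timegm(parsed)


def feed_item_is_newer(stamp, path):
    """Return True if the feed timestamp is later than the file's modification time."""
    return rfc3339_utc_to_epoch(stamp) > os.path.getmtime(path)


print(rfc3339_utc_to_epoch('1985-04-12T23:20:50Z'))  # 482196050
```

os.path.getmtime() also returns seconds since the epoch, so the two values can be compared directly.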

I struggled with RFC3339 datetime format a lot, but I found a suitable solution to convert date_string <=> datetime_object in both directions.

You need two different external modules, because each of them only handles the conversion in one direction (unfortunately):

first install:

sudo pip install rfc3339
sudo pip install iso8601

then include:

import datetime     # for general datetime object handling
import rfc3339      # for date object -> date string
import iso8601      # for date string -> date object

So that I don't need to remember which module handles which direction, I wrote two simple helper functions:

def get_date_object(date_string):
  return iso8601.parse_date(date_string)

def get_date_string(date_object):
  return rfc3339.rfc3339(date_object)

which inside your code you can easily use like this:

input_string = '1989-01-01T00:18:07-05:00'
test_date = get_date_object(input_string)
# >>> datetime.datetime(1989, 1, 1, 0, 18, 7, tzinfo=<FixedOffset '-05:00' datetime.timedelta(-1, 68400)>)

test_string = get_date_string(test_date)
# >>> '1989-01-01T00:18:07-05:00'

test_string == input_string # >>> True

Heureka! Now you can easily ( haha ) use your date strings and date objects in a usable format.

No builtin, afaik.

feed.date.rfc3339: this is a Python library module with functions for converting timestamp strings in RFC 3339 format to Python time float values, and vice versa. RFC 3339 is the timestamp format used by the Atom feed syndication format.

It is BSD-licensed.

http://home.blarg.net/~steveha/pyfeed.html

(Edited so it's clear I didn't write it. :-)

http://pypi.python.org/pypi/iso8601/ seems to be able to parse ISO 8601, of which RFC 3339 is a subset. Maybe this could be useful but, again, it is not built-in.

If you're using Django, you could use Django's function parse_datetime :

>>> from django.utils.dateparse import parse_datetime
>>> parse_datetime("2016-07-19T07:30:36+05:00")
datetime.datetime(2016, 7, 19, 7, 30, 36, tzinfo=<django.utils.timezone.FixedOffset object at 0x101c0c1d0>)

http://bugs.python.org/issue15873 (duplicate of http://bugs.python.org/issue5207 )

Looks like there isn't a built-in as of yet.

feedparser.py provides a robust, extensible way to parse the various date formats that may be encountered in real-world Atom/RSS feeds:

>>> from feedparser import _parse_date as parse_date
>>> parse_date('1985-04-12T23:20:50.52Z')
time.struct_time(tm_year=1985, tm_mon=4, tm_mday=12, tm_hour=23, tm_min=20,
                 tm_sec=50, tm_wday=4, tm_yday=102, tm_isdst=1)

The new datetime.fromisoformat(date_string) method which was added in Python 3.7 will parse most RFC 3339 timestamps, including those with time zone offsets. It's not a full implementation, so be sure to test your use case.

>>> from datetime import datetime
>>> datetime.fromisoformat('2011-11-04')
datetime.datetime(2011, 11, 4, 0, 0)
>>> datetime.fromisoformat('2011-11-04T00:05:23')
datetime.datetime(2011, 11, 4, 0, 5, 23)
>>> datetime.fromisoformat('2011-11-04 00:05:23.283')
datetime.datetime(2011, 11, 4, 0, 5, 23, 283000)
>>> datetime.fromisoformat('2011-11-04 00:05:23.283+00:00')
datetime.datetime(2011, 11, 4, 0, 5, 23, 283000, tzinfo=datetime.timezone.utc)
>>> datetime.fromisoformat('2011-11-04T00:05:23+04:00')   
datetime.datetime(2011, 11, 4, 0, 5, 23,
    tzinfo=datetime.timezone(datetime.timedelta(seconds=14400)))
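
One gap worth testing for: none of the examples above ends in a bare Z, and to my knowledge fromisoformat() only started accepting that suffix in Python 3.11. Earlier versions also appear to require exactly 3 or 6 fractional digits. A small sketch that works around the Z case by rewriting it as an explicit offset (parse_rfc3339_stdlib is a hypothetical helper name, not part of the standard library):

```python
from datetime import datetime, timezone


def parse_rfc3339_stdlib(stamp):
    """Parse an RFC 3339 timestamp with fromisoformat(), tolerating a trailing Z."""
    # Assumption: rewrite "Z" as "+00:00" so interpreters older than 3.11 accept it.
    if stamp.endswith(('Z', 'z')):
        stamp = stamp[:-1] + '+00:00'
    return datetime.fromisoformat(stamp)


parsed = parse_rfc3339_stdlib('2011-11-04T00:05:23Z')
print(parsed)              # 2011-11-04 00:05:23+00:00
print(parsed.timestamp())  # 1320365123.0
```

Because the result is timezone-aware, timestamp() gives epoch seconds that do not depend on the local time zone.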

Try this; it works fine for me:

from datetime import datetime
datetime_obj = datetime.strptime("2014-01-01T00:00:00Z", '%Y-%m-%dT%H:%M:%SZ')

or

datetime_obj = datetime.strptime("Mon, 01 Jun 2015 16:41:40 GMT", '%a, %d %b %Y %H:%M:%S GMT')
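
Note that matching Z (or GMT) as literal text produces a naive datetime with no tzinfo. A short sketch of two ways to get a timezone-aware result instead, assuming the input really is UTC; the variable names are illustrative only:

```python
from datetime import datetime, timezone

# Matching "Z" as a literal yields a naive datetime, so attach UTC explicitly.
naive = datetime.strptime('2014-01-01T00:00:00Z', '%Y-%m-%dT%H:%M:%SZ')
aware = naive.replace(tzinfo=timezone.utc)

# On Python 3.7+, %z also accepts "Z" and numeric offsets such as "+05:00".
also_aware = datetime.strptime('2014-01-01T00:00:00Z', '%Y-%m-%dT%H:%M:%S%z')

print(aware.timestamp())    # 1388534400.0
print(aware == also_aware)  # True
```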

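I came across the great dateutil.parser module in another question and tried it on my RFC 3339 problem. It appears to handle everything I throw at it more sanely than any of the other answers to this question.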

Using Python 3, you can use a regular expression to break the RFC 3339 timestamp into its components, then create the datetime object directly, with no additional modules needed:

import re
import datetime

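def parse_rfc3339(dt):
    broken = re.search(r'([0-9]{4})-([0-9]{2})-([0-9]{2})T([0-9]{2}):([0-9]{2}):([0-9]{2})(\.([0-9]+))?(Z|([+-][0-9]{2}):([0-9]{2}))', dt)
    if broken is None:
        raise ValueError("not an RFC 3339 timestamp: %r" % (dt,))
    # Scale the fraction to microseconds: ".52" is 520000, not 52.
    fraction = (broken.group(8) or "0")[:6].ljust(6, "0")
    # The offset sign applies to the minutes as well as the hours.
    hours = int(broken.group(10) or "0")
    minutes = int(broken.group(11) or "0")
    if (broken.group(10) or "").startswith("-"):
        minutes = -minutes
    return datetime.datetime(
        year = int(broken.group(1)),
        month = int(broken.group(2)),
        day = int(broken.group(3)),
        hour = int(broken.group(4)),
        minute = int(broken.group(5)),
        second = int(broken.group(6)),
        microsecond = int(fraction),
        tzinfo = datetime.timezone(datetime.timedelta(hours = hours, minutes = minutes)))

This example treats "Z" and missing fractional seconds as 0 and raises ValueError when the input does not match, but might need additional error checking. Cheers, Alex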

The simplest solution for me has been the dateutil library (a third-party package installed as python-dateutil, not part of the standard library).

from dateutil.parser import parse

dt = "2020-11-23T11:08:23.022277705Z"
print(parse(dt))

Output:

2020-11-23 11:08:23.022277+00:00

If you don't need the timezone element, simply set the timezone info to None:

print(parse(dt).replace(tzinfo=None))

The output is a nice and clean datetime object:

2020-11-23 11:08:23.022277
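
The nine-digit fraction in that example is cut down to microseconds because datetime stores nothing finer. If pulling in dateutil is not an option, a stdlib-only sketch of the same truncation for UTC timestamps could look like the following. parse_utc_nanos is a hypothetical helper name, and it only handles the Z form, not numeric offsets:

```python
import re
from datetime import datetime, timezone

_UTC_STAMP = re.compile(r'(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d+))?Z')


def parse_utc_nanos(stamp):
    """Parse 'YYYY-MM-DDTHH:MM:SS[.fraction]Z', truncating the fraction to microseconds."""
    match = _UTC_STAMP.fullmatch(stamp)
    if match is None:
        raise ValueError('unsupported timestamp: %r' % (stamp,))
    base, fraction = match.groups()
    # Keep at most six digits and right-pad, so ".5" becomes 500000 microseconds.
    micros = int((fraction or '0')[:6].ljust(6, '0'))
    parsed = datetime.strptime(base, '%Y-%m-%dT%H:%M:%S')
    return parsed.replace(microsecond=micros, tzinfo=timezone.utc)


print(parse_utc_nanos('2020-11-23T11:08:23.022277705Z'))  # 2020-11-23 11:08:23.022277+00:00
```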

You could use the Google API Core package. It has a really straightforward datetime to RFC 3339 conversion function. You can find more info in their docs.

Its usage is as simple as:

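from datetime import datetime, timezone
from google.api_core.datetime_helpers import to_rfc3339

# to_rfc3339 treats its argument as UTC by default, so pass a UTC time
rfc3339_str = to_rfc3339(datetime.now(timezone.utc))

They even have functions that work the other way around: from_rfc3339 and from_rfc3339_nanos.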

I have been doing a deep dive into datetimes and RFC 3339, recently came across the arrow library, and have just used it to solve my problem:

import arrow

date_string = "2015-11-24 00:00:00+00:00"
my_datetime = arrow.get(date_string).datetime

The rfc3339 library: http://henry.precheur.org/python/rfc3339
