Get year, month or day from numpy datetime64

Question

I have an array of datetime64 type:

dates = np.datetime64(['2010-10-17', '2011-05-13', "2012-01-15"])

Is there a better way than looping through each element just to get an np.array of years:

years = f(dates)
#output:
array([2010, 2011, 2012], dtype=int8) #or dtype = string

I'm using the stable numpy version 1.6.2.

Answer 1

I find the following tricks give a 2x to 4x speedup over the pandas method described in another answer here (i.e. pd.DatetimeIndex(dates).year etc.). I find the speed of [dt.year for dt in dates.astype(object)] to be similar to the pandas method. These tricks can also be applied directly to ndarrays of any shape (2D, 3D, etc.).

dates = np.arange(np.datetime64('2000-01-01'), np.datetime64('2010-01-01'))
years = dates.astype('datetime64[Y]').astype(int) + 1970
months = dates.astype('datetime64[M]').astype(int) % 12 + 1
days = dates - dates.astype('datetime64[M]') + 1
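As a quick check of the casts above, here is a minimal sketch applied to the three dates from the question. It assumes a day-resolution array, and the extra `.astype(int)` on the day difference is an addition for illustration, used only to get plain integers instead of timedelta64 values.

```python
import numpy as np

dates = np.array(['2010-10-17', '2011-05-13', '2012-01-15'], dtype='datetime64[D]')

# Years since the 1970 epoch, shifted to calendar years.
years = dates.astype('datetime64[Y]').astype(int) + 1970
# Months since the epoch, wrapped into the range 1..12.
months = dates.astype('datetime64[M]').astype(int) % 12 + 1
# Offset from the first day of the month, as a plain integer day number.
days = (dates - dates.astype('datetime64[M]')).astype(int) + 1

print(years.tolist())   # [2010, 2011, 2012]
print(months.tolist())  # [10, 5, 1]
print(days.tolist())    # [17, 13, 15]
```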

Answer 2

As datetime is not stable in numpy, I would use pandas for this:

In [52]: import pandas as pd

In [53]: dates = pd.DatetimeIndex(['2010-10-17', '2011-05-13', "2012-01-15"])

In [54]: dates.year
Out[54]: array([2010, 2011, 2012], dtype=int32)

Pandas uses numpy datetime internally, but seems to avoid the shortcomings that numpy has had up to now.
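The same index also exposes month and day. The following is a small sketch rather than a transcript of a real session. Depending on the pandas version, the accessors may return an Index instead of a plain array, so `.tolist()` is used here to get ordinary Python integers either way.

```python
import pandas as pd

idx = pd.DatetimeIndex(['2010-10-17', '2011-05-13', '2012-01-15'])

# Each accessor yields one integer per timestamp.
years = idx.year.tolist()
months = idx.month.tolist()
days = idx.day.tolist()

print(years)   # [2010, 2011, 2012]
print(months)  # [10, 5, 1]
print(days)    # [17, 13, 15]
```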

Answer 3

There should be an easier way to do this, but, depending on what you're trying to do, the best route might be to convert to a regular Python datetime object:

import numpy as np

datetime64Obj = np.datetime64('2002-07-04T02:55:41-0700')
print(datetime64Obj.astype(object).year)
# 2002
print(datetime64Obj.astype(object).day)
# 4

Based on comments below, this seems to only work in Python 2.7.x and Python 3.6+.
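Independent of the Python version, the unit of the datetime64 value also affects what `astype(object)` returns. The sketch below uses a sample timestamp of my own choosing to show the three cases I am aware of: a `datetime.datetime` for second resolution, a `datetime.date` for day resolution, and a plain integer for nanosecond resolution. Casting to microseconds first is one possible workaround for the last case.

```python
import datetime
import numpy as np

stamp = np.datetime64('2002-07-04T02:55:41')  # second resolution

# Second resolution converts to a datetime.datetime.
as_dt = stamp.astype(object)
print(type(as_dt).__name__, as_dt.year, as_dt.day)

# Day resolution converts to a datetime.date.
as_date = stamp.astype('datetime64[D]').astype(object)
print(type(as_date).__name__, as_date.year)

# Nanosecond resolution does not fit datetime.datetime, so an int comes back.
as_ns = stamp.astype('datetime64[ns]').astype(object)
print(type(as_ns).__name__)

# Possible workaround: drop to microseconds before converting.
safe = stamp.astype('datetime64[ns]').astype('datetime64[us]').astype(object)
print(safe.year, safe.month, safe.day)
```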

Answer 4

This is how I do it.

import numpy as np

def dt2cal(dt):
    """
    Convert array of datetime64 to a calendar array of year, month, day, hour,
    minute, seconds, microsecond with these quantities indexed on the last axis.

    Parameters
    ----------
    dt : datetime64 array (...)
        numpy.ndarray of datetimes of arbitrary shape

    Returns
    -------
    cal : uint32 array (..., 7)
        calendar array with last axis representing year, month, day, hour,
        minute, second, microsecond
    """

    # allocate output 
    out = np.empty(dt.shape + (7,), dtype="u4")
    # decompose calendar floors
    Y, M, D, h, m, s = [dt.astype(f"M8[{x}]") for x in "YMDhms"]
    out[..., 0] = Y + 1970 # Gregorian Year
    out[..., 1] = (M - Y) + 1 # month
    out[..., 2] = (D - M) + 1 # day
    out[..., 3] = (dt - D).astype("m8[h]") # hour
    out[..., 4] = (dt - h).astype("m8[m]") # minute
    out[..., 5] = (dt - m).astype("m8[s]") # second
    out[..., 6] = (dt - s).astype("m8[us]") # microsecond
    return out

It's vectorized across arbitrary input dimensions, it's fast, it's intuitive, it works on numpy v1.15.4, and it doesn't use pandas.

I really wish numpy supported this functionality; it's required all the time in application development. I always get super nervous when I have to roll my own stuff like this, because I always feel like I'm missing an edge case.
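One way to ease the edge-case worry is to compare the floor-and-subtract idea against Python's own calendar arithmetic. The sketch below does this for a few hand-picked dates (a pre-1970 date, a non-leap century year, a leap day, and a year boundary). The sample values and variable names are assumptions for illustration, and it checks only year, month, and day, not the full `dt2cal` output.

```python
import datetime
import numpy as np

# Hypothetical test values chosen to probe common edge cases.
samples = ['1969-12-31', '1900-03-01', '2000-02-29', '2024-01-01']
dt = np.array(samples, dtype='datetime64[D]')

# Floor to year and month resolution, then take differences.
Y = dt.astype('M8[Y]')
M = dt.astype('M8[M]')

year = Y.astype(int) + 1970
month = (M - Y).astype(int) + 1
day = (dt - M).astype(int) + 1

# Reference values from the standard library.
expected = [datetime.date.fromisoformat(s) for s in samples]
assert year.tolist() == [d.year for d in expected]
assert month.tolist() == [d.month for d in expected]
assert day.tolist() == [d.day for d in expected]
print("numpy and datetime agree on all samples")
```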

Answer 5

Using numpy version 1.10.4 and pandas version 0.17.1,

dates = np.array(['2010-10-17', '2011-05-13', '2012-01-15'], dtype=np.datetime64)
pd.to_datetime(dates).year

I get what you're looking for:

array([2010, 2011, 2012], dtype=int32)

Answer 6

Use dates.tolist() to convert to native datetime objects, then simply access year. Example:

>>> dates = np.array(['2010-10-17', '2011-05-13', '2012-01-15'], dtype='datetime64')
>>> [x.year for x in dates.tolist()]
[2010, 2011, 2012]

This is basically the same idea presented in https://stackoverflow.com/a/35281829/2192272, but using simpler syntax.

Tested with python 3.6 / numpy 1.18.

Answer 7

Another possibility is:

np.datetime64(dates,'Y') - returns - numpy.datetime64('2010')

or

np.datetime64(dates,'Y').astype(int)+1970 - returns - 2010

but this works only on scalar values; it won't take an array.

Answer 8

If you upgrade to numpy 1.7 (where datetime is still labeled as experimental), the following should work.

dates/np.timedelta64(1,'Y')

Answer 9

Anon's answer works great for me, but I just need to modify the statement for days

from:

days = dates - dates.astype('datetime64[M]') + 1

to:

days = dates.astype('datetime64[D]') - dates.astype('datetime64[M]') + 1
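The reason for the change becomes visible with timestamps finer than one day. In the sketch below (sample values are my own), the original expression yields a difference counted in seconds, while the modified one yields whole days.

```python
import numpy as np

# Second-resolution timestamps (hypothetical sample values).
stamps = np.array(['2010-10-17T08:30:00', '2011-05-13T23:59:59'],
                  dtype='datetime64[s]')

month_start = stamps.astype('datetime64[M]')

# Original expression: the difference is in seconds, and "+ 1" adds one second.
raw = stamps - month_start + 1
print(raw.dtype)  # timedelta64[s]

# Modified expression: floor to whole days first, so the result is in days.
days = stamps.astype('datetime64[D]') - month_start + 1
print(days.dtype)                 # timedelta64[D]
print(days.astype(int).tolist())  # [17, 13]
```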

Answer 10

There's no direct way to do it yet, unfortunately, but there are a couple of indirect ways:

[dt.year for dt in dates.astype(object)]

or

[datetime.datetime.strptime(repr(d), "%Y-%m-%d %H:%M:%S").year for d in dates]

both inspired by the examples here.

Both of these work for me on Numpy 1.6.1. You may need to be a bit more careful with the second one, since the repr() for the datetime64 might have a fractional part after a decimal point.
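Because the text returned by repr() may differ between NumPy versions, a variant of the second approach that formats the dates explicitly may be more robust. This is a sketch using `np.datetime_as_string`, which is not part of the answer above, with `unit='D'` chosen so that the format string only has to cover the date part.

```python
import datetime
import numpy as np

dates = np.array(['2010-10-17', '2011-05-13', '2012-01-15'], dtype='datetime64[D]')

# Render every element as 'YYYY-MM-DD' regardless of how repr() looks.
as_text = np.datetime_as_string(dates, unit='D')

# Parse each string with the standard library and read the year.
years = [datetime.datetime.strptime(s, "%Y-%m-%d").year for s in as_text]
print(years)  # [2010, 2011, 2012]
```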

Answer 11

convert `np.datetime64` to float-year

In this solution, you can see, step-by-step, how to process np.datetime64 datatypes.

In the following, dt64 is of type np.datetime64 (or even a numpy.ndarray of that type):

year = dt64.astype('M8[Y]') contains just the year. If you want a float array of that, do 1970 + year.astype(float).
The days (since January 1st) you can access by days = (dt64 - year).astype('timedelta64[D]').
You can also deduce if a year is a leap year or not (compare days_of_year).

See also the numpy tutorial

import numpy as np

def dt64_to_float(dt64):
    """Converts numpy.datetime64 to year as float.

    Rounded to days

    Parameters
    ----------
    dt64 : np.datetime64 or np.ndarray(dtype='datetime64[X]')
        date data

    Returns
    -------
    float or np.ndarray(dtype=float)
        Year in floating point representation
    """

    year = dt64.astype('M8[Y]')
    # print('year:', year)
    days = (dt64 - year).astype('timedelta64[D]')
    # print('days:', days)
    year_next = year + np.timedelta64(1, 'Y')
    # print('year_next:', year_next)
    days_of_year = (year_next.astype('M8[D]') - year.astype('M8[D]')
                    ).astype('timedelta64[D]')
    # print('days_of_year:', days_of_year)
    dt_float = 1970 + year.astype(float) + days / (days_of_year)
    # print('dt_float:', dt_float)
    return dt_float

if __name__ == "__main__":

    dt_str = '2011-11-11'
    dt64 = np.datetime64(dt_str)
    print(dt_str, 'as float:', dt64_to_float(dt64))
    print()

    dates = np.array([
        '1970-01-01', '2014-01-01', '2020-12-31', '2019-12-31', '2010-04-28'],
        dtype='datetime64[D]')
    float_dates = dt64_to_float(dates)


    print('dates:      ', dates)
    print('float_dates:', float_dates)

output

2011-11-11 as float: 2011.8602739726027

dates:       ['1970-01-01' '2014-01-01' '2020-12-31' '2019-12-31' '2010-04-28']
float_dates: [1970.         2014.         2020.99726776 2019.99726027 2010.32054795]
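The leap-year remark can be turned into a small helper along the same lines. This is a sketch only: the function name `is_leap_year` and the sample dates are assumptions made for illustration, and the check simply tests whether the year spans 366 days.

```python
import numpy as np

def is_leap_year(dt64):
    """Return True where the year of dt64 has 366 days (hypothetical helper)."""
    year = dt64.astype('M8[Y]')
    year_next = year + np.timedelta64(1, 'Y')
    # Number of days between January 1st of this year and of the next year.
    days_of_year = (year_next.astype('M8[D]') - year.astype('M8[D]')).astype(int)
    return days_of_year == 366

dates = np.array(['2019-12-31', '2020-12-31', '1900-03-01', '2000-01-01'],
                 dtype='datetime64[D]')
leap = is_leap_year(dates)
print(leap.tolist())  # [False, True, False, True]
```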

Answer 12

This is obviously quite late, but I benefitted from one of the answers, so I'm sharing my bit here.

The answer by Anon 🤔 is quite right: the numpy method is far faster than first casting to a pandas datetime series and then extracting the dates. Although the offsetting and conversion of results after the numpy transformations is a bit clumsy, a cleaner helper can be written, like so:

import numpy as np

def from_numpy_datetime_extract(date: np.datetime64, extract_attribute: str = None):
    _YEAR_OFFSET = 1970
    _MONTH_OFFSET = 1
    _MONTH_FACTOR = 12
    _DAY_OFFSET = 1

    if extract_attribute == 'year':
        return date.astype('datetime64[Y]').astype(int) + _YEAR_OFFSET
    elif extract_attribute == 'month':
        return date.astype('datetime64[M]').astype(int)%_MONTH_FACTOR + _MONTH_OFFSET
    elif extract_attribute == 'day':
        # floor to whole days first so the result does not depend on the input unit
        return (date.astype('datetime64[D]') - date.astype('datetime64[M]')).astype(int) + _DAY_OFFSET
    else:
        raise ValueError("extract_attribute should be either of 'year', 'month' or 'day'")

Solving the ask for dates = np.array(['2010-10-17', '2011-05-13', "2012-01-15"], dtype = 'datetime64'):

Numpy method (using the helper above)

%timeit -r10 -n1000 [from_numpy_datetime_extract(x, "year") for x in dates]
# 14.3 µs ± 4.03 µs per loop (mean ± std. dev. of 10 runs, 1000 loops each)

Pandas method

%timeit -r10 -n1000 pd.to_datetime(dates).year.tolist()
# 304 µs ± 32.2 µs per loop (mean ± std. dev. of 10 runs, 1000 loops each)
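The `%timeit` lines above only run inside IPython. For a plain script, a rough comparison can be sketched with the standard `timeit` module. The two function names below are hypothetical, and the comparison is between a per-element loop and a single whole-array cast. No timings are quoted here because they depend on the machine and library versions.

```python
import timeit
import numpy as np

dates = np.array(['2010-10-17', '2011-05-13', '2012-01-15'], dtype='datetime64[D]')

def years_per_element(arr):
    # One cast per element, as in the list comprehension timed above.
    return [int(x.astype('datetime64[Y]').astype(int)) + 1970 for x in arr]

def years_whole_array(arr):
    # One cast for the entire array.
    return (arr.astype('datetime64[Y]').astype(int) + 1970).tolist()

# Both variants should produce the same years.
assert years_per_element(dates) == years_whole_array(dates)

for fn in (years_per_element, years_whole_array):
    seconds = timeit.timeit(lambda: fn(dates), number=1000)
    print(f"{fn.__name__}: {seconds / 1000 * 1e6:.1f} µs per call")
```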

Answer 13

How about simply converting to string?

Probably the easiest way:

import numpy as np

date = np.datetime64("2000-01-01")
date_strings = date.astype(str).split('-')
# >> ['2000', '01', '01']

year_int = int(date_strings[0])
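The same string idea can be extended from a single value to the array in the question. This is a minimal sketch that assumes day-resolution dates, so that each string has exactly the form 'YYYY-MM-DD'. It loops in Python and is therefore likely slower than the cast-based answers.

```python
import numpy as np

dates = np.array(['2010-10-17', '2011-05-13', '2012-01-15'], dtype='datetime64[D]')

# str() of a day-resolution datetime64 gives 'YYYY-MM-DD'; split it into ints.
parts = [[int(p) for p in str(d).split('-')] for d in dates]

years = [p[0] for p in parts]
months = [p[1] for p in parts]
days = [p[2] for p in parts]

print(years)   # [2010, 2011, 2012]
print(months)  # [10, 5, 1]
print(days)    # [17, 13, 15]
```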

Answer summary (13 answers; score and date posted)

Answer 1: 74, 2014-11-12 19:58:59
Answer 2: 50 (accepted), 2012-11-30 19:59:02
Answer 3: 18, 2016-02-09 00:21:42
Answer 4: 12, 2019-05-22 15:15:17
Answer 5: 10, 2017-03-30 15:00:41
Answer 6: 5, 2020-02-07 11:44:09
Answer 7: 2, 2017-07-14 13:38:22
Answer 8: 1, 2012-11-30 16:47:30
Answer 9: 1, 2017-06-17 21:35:12
Answer 10: 0, 2012-11-30 23:07:41
Answer 11: 0, 2021-04-17 18:06:20
Answer 12: 0, 2021-06-09 05:50:54
Answer 13: 0, 2022-05-11 05:56:09
