Convert pandas DatetimeIndex to Unix time?

Question

What is the idiomatic way of converting a pandas DatetimeIndex to (an iterable of) Unix time? This is probably not the way to go:

[time.mktime(t.timetuple()) for t in my_data_frame.index.to_pydatetime()]

Answer 1

As DatetimeIndex is an ndarray under the hood, you can do the conversion without a list comprehension, which is much faster.

In [1]: import numpy as np

In [2]: import pandas as pd

In [3]: from datetime import datetime

In [4]: dates = [datetime(2012, 5, 1), datetime(2012, 5, 2), datetime(2012, 5, 3)]
   ...: index = pd.DatetimeIndex(dates)
   ...: 
In [5]: index.astype(np.int64)
Out[5]: array([1335830400000000000, 1335916800000000000, 1336003200000000000], 
        dtype=int64)

In [6]: index.astype(np.int64) // 10**9
Out[6]: array([1335830400, 1335916800, 1336003200], dtype=int64)

%timeit [t.value // 10 ** 9 for t in index]
10000 loops, best of 3: 119 us per loop

%timeit index.astype(np.int64) // 10**9
100000 loops, best of 3: 18.4 us per loop
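The integer cast above relies on the index storing nanoseconds. A variant that does not depend on the stored unit is to subtract the epoch and floor-divide by one second. This is a minimal sketch, assuming a timezone-naive index whose values are already in UTC; the names `epoch` and `unix_seconds` are illustrative.

```python
from datetime import datetime

import pandas as pd

dates = [datetime(2012, 5, 1), datetime(2012, 5, 2), datetime(2012, 5, 3)]
index = pd.DatetimeIndex(dates)

# Subtracting the epoch yields a TimedeltaIndex; floor-dividing by one
# second turns it into whole seconds since 1970-01-01.
epoch = pd.Timestamp("1970-01-01")
unix_seconds = (index - epoch) // pd.Timedelta("1s")

print(list(unix_seconds))
```

The result should match the values in Out[6] above, since both count whole seconds since the epoch.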

Answer 2

Note: a Timestamp's value is just Unix time in nanoseconds (so floor-divide it by 10**9):

[t.value // 10 ** 9 for t in tsframe.index]

For example:

In [1]: t = pd.Timestamp('2000-02-11 00:00:00')

In [2]: t
Out[2]: <Timestamp: 2000-02-11 00:00:00>

In [3]: t.value
Out[3]: 950227200000000000L

In [4]: time.mktime(t.timetuple())
Out[4]: 950227200.0
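One caveat about the comparison above: time.mktime interprets the time tuple as local time, so it only agrees with t.value // 10**9 on a machine whose local timezone is UTC. As a hedged sketch, assuming the Timestamp is timezone-naive and meant as UTC, calendar.timegm gives a result that does not depend on the local timezone:

```python
import calendar

import pandas as pd

t = pd.Timestamp("2000-02-11 00:00:00")

# Timestamp.value is nanoseconds since the epoch.
seconds_from_value = t.value // 10**9

# calendar.timegm treats the time tuple as UTC, unlike time.mktime,
# which treats it as local time.
seconds_from_timegm = calendar.timegm(t.timetuple())

print(seconds_from_value, seconds_from_timegm)
```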

As @root points out, it's faster to extract the array of values directly:

tsframe.index.astype(np.int64) // 10 ** 9

Answer 3

A summary of other answers:

df['<time_col>'].astype(np.int64) // 10**9

If you want to keep the milliseconds, divide by 10**6 instead.
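The difference between the two divisors is plain integer arithmetic, shown here on a single made-up nanosecond value (the number is illustrative, not taken from any real data):

```python
# Hypothetical nanosecond timestamp: 2012-05-01 00:00:00.123456789 UTC.
ns = 1335830400123456789

seconds = ns // 10**9   # drops everything below one second
millis = ns // 10**6    # keeps millisecond precision

print(seconds)  # 1335830400
print(millis)   # 1335830400123
```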

Answer 4

Complementing the other answers: // 10**9 does a floor division, which gives the number of fully elapsed seconds rather than the nearest value in seconds. A simple way to get more reasonable rounding, if that is desired, is to add 5*10**8 - 1 before doing the floor division.
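To make the effect of the offset concrete, here is a small sketch on three made-up nanosecond values around the 1.5 second mark. It also shows that the "- 1" decides which way an exact half rounds: with it, exactly 1.5 s rounds down; with a plain 5*10**8 offset, it rounds up.

```python
import numpy as np

# Hypothetical values just below, exactly at, and just above 1.5 seconds.
ns = np.array([1_499_999_999, 1_500_000_000, 1_500_000_001], dtype=np.int64)

floored = ns // 10**9                          # full elapsed seconds
nearest_half_down = (ns + 5 * 10**8 - 1) // 10**9  # offset from the answer
nearest_half_up = (ns + 5 * 10**8) // 10**9        # alternative offset

print(floored.tolist())            # [1, 1, 1]
print(nearest_half_down.tolist())  # [1, 1, 2]
print(nearest_half_up.tolist())    # [1, 2, 2]
```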

Answer 5

To address the case of NaT, which the above solutions will convert to large negative ints, a possible solution in pandas>=0.24 would be:

def datetime_to_epoch(ser):
    """Don't convert NaT to large negative values."""
    if ser.hasnans:
        res = ser.dropna().astype('int64').astype('Int64').reindex(index=ser.index)
    else:
        res = ser.astype('int64')

    return res // 10**9

In the case of missing values this will return the nullable int type 'Int64' (ExtensionType pd.Int64Dtype):

In [5]: dt = pd.to_datetime(pd.Series(["2019-08-21", "2018-07-28", np.nan]))                                                                                                                                                                                                    
In [6]: datetime_to_epoch(dt)                                                                                                                                                                                                                                                   
Out[6]: 
0    1566345600
1    1532736000
2           NaN
dtype: Int64

Otherwise it returns a regular int64:

In [7]: datetime_to_epoch(dt[:2])                                                                                                                                                                                                                                               
Out[7]: 
0    1566345600
1    1532736000
dtype: int64
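If a result dtype that changes with the input is inconvenient, one possible variant is to always return the nullable 'Int64' type. The sketch below follows the same dropna/reindex idea but computes seconds by subtracting the epoch instead of casting; the function name `datetime_to_epoch_nullable` is hypothetical, and the Series is assumed to be timezone-naive.

```python
import numpy as np
import pandas as pd


def datetime_to_epoch_nullable(ser):
    """Return whole seconds since the epoch as 'Int64', keeping NaT as <NA>."""
    valid = ser.dropna()
    # Timedelta Series floor-divided by one second gives integer seconds.
    secs = (valid - pd.Timestamp("1970-01-01")) // pd.Timedelta("1s")
    # Reindexing restores the dropped rows as missing values.
    return secs.astype("Int64").reindex(ser.index)


dt = pd.to_datetime(pd.Series(["2019-08-21", "2018-07-28", np.nan]))
result = datetime_to_epoch_nullable(dt)
print(result)
```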

Answer 6

If you have tried this on the datetime column of your dataframe:

dframe['datetime'].astype(np.int64) // 10**9

and you are struggling with the following error: TypeError: int() argument must be a string, a bytes-like object or a number, not 'Timestamp', you can just use these two lines:

dframe.index = pd.DatetimeIndex(dframe['datetime'])
dframe['datetime']= dframe.index.astype(np.int64)// 10**9
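One plausible cause of this error (an assumption, since the answer does not say) is that the column has object dtype and merely contains Timestamp objects, rather than being a real datetime64 column. Under that assumption, converting the column with pd.to_datetime first is an alternative that leaves the dataframe's index untouched. The column name `epoch` is illustrative.

```python
import pandas as pd

# Assumed situation: an object-dtype column holding Timestamp objects.
stamps = [pd.Timestamp("2012-05-01"), pd.Timestamp("2012-05-02")]
dframe = pd.DataFrame({"datetime": pd.Series(stamps, dtype=object)})

# Convert to a proper datetime64 column before doing any arithmetic.
parsed = pd.to_datetime(dframe["datetime"])

# Whole seconds since the epoch, written to a new column.
dframe["epoch"] = (parsed - pd.Timestamp("1970-01-01")) // pd.Timedelta("1s")
print(dframe["epoch"].tolist())
```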

Answer 7

The code from the other answers

dframe['datetime'].astype(np.int64) // 10**9

prints the following warning as of the time of my post:

FutureWarning: casting datetime64[ns] values to int64 with .astype(...) is deprecated and will raise in a future version. Use .view(...) instead.

So, on a pandas version that emits this warning, use the following instead:

dframe['datetime'].view(np.int64) // 10 ** 9

7 answers

Answer 1 91 accepted 2013-03-04 14:47:36

Answer 2 40 2013-03-04 14:31:21

Answer 3 7 2018-09-21 20:05:54

Answer 4 0 2019-06-09 16:35:36

Answer 5 0 2019-08-23 14:16:47

Answer 6 0 2019-09-13 14:29:21

Answer 7 0 2022-01-27 19:02:00
