
timedelta64 and datetime conversion

I have two datetime (Timestamp) columns in my dataframe, df['start'] and df['end']. I'd like to get the duration between the two dates, so I create a duration column:

df['duration'] = df['start'] - df['end']

However, the duration column now holds numpy.timedelta64 values instead of datetime.timedelta, as I would expect:

>>> df['duration'][0]
numpy.timedelta64(0,'ns')

While

>>> df['start'][0] - df['end'][0]
datetime.timedelta(0)

Can someone explain why the array subtraction changes the timedelta type? Is there a way to keep datetime.timedelta, since it is easier to work with?

This was one of the motivations for implementing a Timedelta scalar in pandas 0.15.0. See the pandas documentation on time deltas for details.

In >= 0.15.0 a timedelta64[ns] Series is still stored as np.timedelta64[ns] under the hood, but this is completely hidden from the user behind Timedelta, a scalar that subclasses datetime.timedelta (and is basically a useful superset of datetime.timedelta and the numpy version).

In [1]: df = pd.DataFrame([[pd.Timestamp('20130102'),pd.Timestamp('20130101')]],columns=list('AB'))

In [2]: df['diff'] = df['A']-df['B']

In [3]: df.dtypes
Out[3]: 
A        datetime64[ns]
B        datetime64[ns]
diff    timedelta64[ns]
dtype: object

# this will return a Timedelta in 0.15.2
In [4]: df['A'][0]-df['B'][0]
Out[4]: datetime.timedelta(1)

In [5]: (df['A']-df['B'])[0] 
Out[5]: Timedelta('1 days 00:00:00')
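If you still need a plain datetime.timedelta or a numpy.timedelta64, a Timedelta scalar can be converted explicitly. The following is a minimal sketch, assuming a reasonably recent pandas; the column names and sample dates are made up for illustration and are not taken from the question's data.

```python
import datetime

import numpy as np
import pandas as pd

# Hypothetical sample data: one row with a start and an end timestamp.
df = pd.DataFrame({
    "start": pd.to_datetime(["2013-01-01 00:00"]),
    "end": pd.to_datetime(["2013-01-02 06:00"]),
})
df["duration"] = df["end"] - df["start"]

# Pulling a single element out of the column gives a pandas Timedelta.
td = df["duration"].iloc[0]

# Timedelta subclasses datetime.timedelta, so isinstance checks pass.
is_subclass = isinstance(td, datetime.timedelta)

# Explicit conversions to the standard-library and numpy types.
py_td = td.to_pytimedelta()
np_td = td.to_timedelta64()

# For whole-column arithmetic, the .dt accessor avoids per-element conversion.
hours = df["duration"].dt.total_seconds() / 3600
```

Because Timedelta is a subclass of datetime.timedelta, code written against datetime.timedelta will usually accept it directly, so the explicit conversion is only needed when an exact type is required.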
