Calculate elapsed time (timedelta) from a pandas DatetimeIndex

I have a pandas DataFrame with a DatetimeIndex. I would like to create a column that contains the elapsed time. I'm calculating it like this:

startTime = df.index[0]
elapsed = df.index - startTime

Result:

TypeError                                 Traceback (most recent call last)
<ipython-input-56-279fd541b1e2> in <module>()
----> 1 df.index - startTime

C:\Python27\lib\site-packages\pandas\tseries\index.pyc in __sub__(self, other)
    612             return self.shift(-other)
    613         else:  # pragma: no cover
--> 614             raise TypeError(other)
    615 
    616     def _add_delta(self, delta):

TypeError: 2014-07-14 14:47:57

The weird thing is that, for example:

df.index[1] - startTime

returns:

datetime.timedelta(0, 1)

I thought the problem might be caused by the fact that it's a DatetimeIndex and not a plain Series. However, when I first create a new Series with df.index as the data argument and then attempt the subtraction, I get a whole load of warnings saying that I'm implicitly casting between two incompatible types and that this will not work in the future:

timeStamps = pd.Series(data=df.index)
elapsed = timeStamps - timeStamps[0]

returns

C:\Python27\lib\site-packages\pandas\core\format.py:1851: DeprecationWarning:     Implicitly casting between incompatible kinds. In a future numpy release, this will raise an error. Use casting="unsafe" if this is intentional.
  elif format_short and x == 0:

Although I do get a correct series of timedeltas with the latter method, I don't like to rely on deprecated code. Is there a 'proper' way to calculate elapsed times?

Here is a piece of the CSV file that I get the data from:

Timestamp   Bubbler_Temperature_Setpoint
14-7-2014 14:47:57  13.000000
14-7-2014 14:47:58  13.000000
14-7-2014 14:47:59  13.000000
14-7-2014 14:48:00  13.000000
14-7-2014 14:48:01  13.000000
14-7-2014 14:48:02  13.000000
14-7-2014 14:48:03  13.000000
14-7-2014 14:48:04  13.000000
14-7-2014 14:48:05  13.000000

I read it into a DataFrame with the 'read_csv' function:

df = pd.read_csv('test.csv',sep='\t',parse_dates='Timestamp',index_col='Timestamp')
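
For readers reproducing this on a current pandas installation, here is a minimal sketch of loading the same kind of data. It assumes the file is tab-separated and that the dates are day-first (14-7-2014 can only be 14 July). The in-memory csv_text is a hypothetical stand-in for test.csv, and the explicit format string is an assumption based on the sample rows above.

```python
import io

import pandas as pd

# Hypothetical in-memory stand-in for test.csv (tab-separated, day-first dates).
csv_text = (
    "Timestamp\tBubbler_Temperature_Setpoint\n"
    "14-7-2014 14:47:57\t13.000000\n"
    "14-7-2014 14:47:58\t13.000000\n"
    "14-7-2014 14:47:59\t13.000000\n"
)

# Read the timestamp column as the index, still as plain strings.
df = pd.read_csv(io.StringIO(csv_text), sep="\t", index_col="Timestamp")

# Parse explicitly so that "14-7-2014" is read as day-month-year.
df.index = pd.to_datetime(df.index, format="%d-%m-%Y %H:%M:%S")
```

Parsing with an explicit format avoids relying on pandas guessing whether the day or the month comes first.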

I'm using pandas version 0.13.1.

You are de facto doing this:

In [30]: ts = Series(13,date_range('20140714 14:47:57',periods=10,freq='s'))

In [31]: ts
Out[31]: 
2014-07-14 14:47:57    13
2014-07-14 14:47:58    13
2014-07-14 14:47:59    13
2014-07-14 14:48:00    13
2014-07-14 14:48:01    13
2014-07-14 14:48:02    13
2014-07-14 14:48:03    13
2014-07-14 14:48:04    13
2014-07-14 14:48:05    13
2014-07-14 14:48:06    13
Freq: S, dtype: int64

# iirc this is available in 0.13.1 (if not, use ``Series(ts.index)``)
In [32]: x = ts.index.to_series()

In [33]: x-x.iloc[0]
Out[33]: 
2014-07-14 14:47:57   00:00:00
2014-07-14 14:47:58   00:00:01
2014-07-14 14:47:59   00:00:02
2014-07-14 14:48:00   00:00:03
2014-07-14 14:48:01   00:00:04
2014-07-14 14:48:02   00:00:05
2014-07-14 14:48:03   00:00:06
2014-07-14 14:48:04   00:00:07
2014-07-14 14:48:05   00:00:08
2014-07-14 14:48:06   00:00:09
Freq: S, dtype: timedelta64[ns]

Doing df.index - df.index[0] in your example is NOT a timedelta operation, but a SET operation. See here
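
As a minimal sketch of the to_series approach applied to a DataFrame: the DataFrame below is a hypothetical stand-in for the question's data, and the column names elapsed and elapsed_s are made up for illustration. It is written against a current pandas installation rather than 0.13.1.

```python
import pandas as pd

# Hypothetical sample mirroring the question: one row per second.
index = pd.date_range("2014-07-14 14:47:57", periods=10, freq="s")
df = pd.DataFrame({"Bubbler_Temperature_Setpoint": 13.0}, index=index)

# Convert the index to a Series so that "-" is element-wise datetime arithmetic.
timestamps = df.index.to_series()
df["elapsed"] = timestamps - timestamps.iloc[0]

# Optional: the same elapsed time as plain float seconds.
df["elapsed_s"] = df["elapsed"].dt.total_seconds()
```

Because the Series produced by to_series() is indexed by the same timestamps as df, the assignment aligns row by row.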

I just changed 我改变了

elapsed = df.index - startTime

to

df['elapsed'] = df.index - startTime

to get the time change column. Isn't that all you need?
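
Note that on pandas 0.13.1 this subtraction raises the TypeError shown in the question. In recent pandas releases, subtracting a Timestamp from a DatetimeIndex is element-wise and yields a TimedeltaIndex, so the direct form does work there. A minimal sketch, using a hypothetical stand-in DataFrame and an illustrative elapsed column:

```python
import pandas as pd

# Hypothetical sample mirroring the question: one row per second.
index = pd.date_range("2014-07-14 14:47:57", periods=5, freq="s")
df = pd.DataFrame({"Bubbler_Temperature_Setpoint": 13.0}, index=index)

start_time = df.index[0]

# In recent pandas this is element-wise and returns a TimedeltaIndex.
elapsed = df.index - start_time
df["elapsed"] = elapsed

# Elapsed time as float seconds, if a numeric column is preferred.
elapsed_seconds = elapsed.total_seconds()
```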
