DateTimeIndex Pandas .Series attribute Error
Right now, my data frame has two columns: a DateTimeIndex and a Load column. I want to add a third column with a consecutive second count, from zero, based on the DateTimeIndex.
import pandas as pd
import matplotlib.pyplot as plt
from scipy import signal
import numpy as np
# Create sample Data
df = pd.DataFrame([['2020-07-25 09:26:28',2],['2020-07-25 09:26:29',10],['2020-07-25 09:26:32',203],['2020-07-25 09:26:33',30]],
columns = ['Time','Load'])
df['Time'] = pd.to_datetime(df['Time'])
df = df.set_index("Time")
rng = pd.date_range(df.index[0], df.index[-1], freq='s')
df = df.reindex(rng).fillna(0)
## Create Elapsed Seconds Timeseries from DateTimeIndex
ts = pd.Series(df.index(range(len(df.index)), index=df.index))
# Desired Output
                      Load  CountS
2020-07-25 09:26:28    2.0       1
2020-07-25 09:26:29   10.0       2
2020-07-25 09:26:30    0.0       3
2020-07-25 09:26:31    0.0       4
2020-07-25 09:26:32  203.0       5
2020-07-25 09:26:33   30.0       6
# Actual Output
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-32-02bfe0dcc12d> in <module>
17 ## Create Elapsed Seconds Column from DateTimeIndex
18
---> 19 ts = pd.Series(df.index(range(len(df.index)), index=df.index))
20
21 # df["Seconds"] =
TypeError: 'DatetimeIndex' object is not callable
It seems the issue is the expression
df.index(range(len(df.index))
You are calling df.index(), and that is what raises the "not callable" error (a simple way to look at it: parentheses are for calling methods, square brackets are for indexing). If you want a slice of df.index, use the syntax df.index[].
Since it is not clear what you want to achieve, I can't recommend a better solution.
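To make the parentheses-versus-brackets point concrete, here is a minimal sketch. The variable names (idx, counter) are made up for illustration, and the last line is only one plausible reading of what the original pd.Series(...) call was meant to build.

```python
import pandas as pd

# Illustrative index covering the same six seconds as the question's data.
idx = pd.date_range("2020-07-25 09:26:28", periods=6, freq="s")

# Square brackets index into the DatetimeIndex.
first = idx[0]    # a single Timestamp
head = idx[:3]    # a shorter DatetimeIndex

# Parentheses try to call the index, which fails because it is not a function.
try:
    idx(range(len(idx)))
    call_error = None
except TypeError as exc:
    call_error = str(exc)

# Assumption: the intent was a 0-based counter aligned to the index.
counter = pd.Series(range(len(idx)), index=idx)
```

Note that the range goes in as the Series values and the DatetimeIndex goes in as the index argument; nothing is called on df.index itself.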
UPDATE:
After looking at your desired output, you can achieve that by doing
df.asfreq('s').fillna(0)
Output:
                      Load
Time
2020-07-25 09:26:28    2.0
2020-07-25 09:26:29   10.0
2020-07-25 09:26:30    0.0
2020-07-25 09:26:31    0.0
2020-07-25 09:26:32  203.0
2020-07-25 09:26:33   30.0
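As far as I can tell, asfreq('s') fills the gaps the same way as the date_range plus reindex steps in the question. A small sketch to check that on the sample data (the names via_reindex and via_asfreq are just for this example):

```python
import pandas as pd

# Sample data from the question, indexed by timestamp.
df = pd.DataFrame(
    {"Load": [2, 10, 203, 30]},
    index=pd.to_datetime([
        "2020-07-25 09:26:28",
        "2020-07-25 09:26:29",
        "2020-07-25 09:26:32",
        "2020-07-25 09:26:33",
    ]),
)
df.index.name = "Time"

# Approach from the question: build the full one-second range, then reindex.
rng = pd.date_range(df.index[0], df.index[-1], freq="s")
via_reindex = df.reindex(rng).fillna(0)

# Approach from the answer: let asfreq build the one-second range.
via_asfreq = df.asfreq("s").fillna(0)
```

Both frames end up with six rows, with zeros filled in for the two missing seconds.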
Regarding the seconds, there might be a simpler way, but this is what I have for you:
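# .dt.total_seconds() returns plain floats; on recent pandas versions astype('timedelta64[s]') may keep a timedelta dtype instead
df['CountS'] = df.index.to_series().diff().dt.total_seconds().fillna(0).cumsum() + 1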
                      Load  CountS
Time
2020-07-25 09:26:28    2.0     1.0
2020-07-25 09:26:29   10.0     2.0
2020-07-25 09:26:30    0.0     3.0
2020-07-25 09:26:31    0.0     4.0
2020-07-25 09:26:32  203.0     5.0
2020-07-25 09:26:33   30.0     6.0
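The column above comes out as floats, while the desired output in the question shows integers. A minimal sketch of the same diff-and-cumsum idea with an integer cast at the end; it assumes the frame has already been filled to one-second frequency, and the intermediate name step is only for illustration:

```python
import pandas as pd

# Frame already resampled to one row per second, as in the output above.
idx = pd.date_range("2020-07-25 09:26:28", "2020-07-25 09:26:33", freq="s")
df = pd.DataFrame({"Load": [2.0, 10.0, 0.0, 0.0, 203.0, 30.0]}, index=idx)

# Gap between consecutive rows in seconds; the first row has no predecessor.
step = df.index.to_series().diff().dt.total_seconds().fillna(0)

# Running total of the gaps, shifted to start at 1 and stored as integers.
df["CountS"] = (step.cumsum() + 1).astype(int)
```

Because the count is built from actual time gaps, it would still track elapsed seconds if some rows were missing, rather than just numbering the rows.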
In case anyone else asks a question similar to mine in a similarly confusing way (sorry, longtime users; I am still learning to ask questions better), here is the code that elegantly does what I want.
# Convert the DatetimeIndex to a TimedeltaIndex by subtracting its first timestamp.
# Convert to integers by appending .seconds to the timedelta.
# Assign the values to the new column "Count".
df["Count"] = (df.index - df.index[0]).seconds
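One caveat worth keeping in mind: .seconds is only the seconds component of each timedelta (0 to 86399), so it wraps around once the data spans more than a day. A minimal sketch with made-up timestamps, using total_seconds() as an alternative that keeps counting:

```python
import pandas as pd

# Hypothetical timestamps; the last one is a day and one second after the first.
idx = pd.to_datetime([
    "2020-07-25 09:26:28",
    "2020-07-25 09:26:30",
    "2020-07-26 09:26:29",
])

# Subtracting the first timestamp yields a TimedeltaIndex.
elapsed = idx - idx[0]

# Seconds component only: the whole day in the last entry is dropped.
seconds_component = list(elapsed.seconds)

# Full elapsed time in seconds, cast from float to int.
total = list(elapsed.total_seconds().astype(int))
```

For the six-second sample in the question the two give the same numbers, so the choice only matters for longer recordings.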