Efficient way to get evenly spaced data / pandas DataFrame.reindex

Question

In order to compare different data sets, I need a way to put them on a common time base. What is the most efficient way to achieve this?

I've tried a few approaches, and as far as I understand the easiest should be pandas DataFrame.reindex:

I have an unevenly spaced time array with associated values for the new status (on/off), which persists after each entry. So I want to carry the previous value of the status column forward until a new status value is set at a later time.

A typical array looks like this; df is a one-column DataFrame with time as the index and status as the column:

In [58]: df
Out[58]: 
           status
time             
1632160022      0
1632986376   <NA>
1632986496      0
1633448715      1
1633452437      0
1633454358      1
1633461201      0
1633534763      1
1633551686      0 
...

From the docs of pandas DataFrame.reindex I understand that re-indexing with the fill method pad / ffill should yield the previous value:

# creating evenly-spaced time base for observation duration
tmin = min(df.index)
tmax = max(df.index)
tspacing = 120
tbase = [t for t in range(tmin,tmax,tspacing)]

# create the temporally evenly-spaced DataFrame
ndf = df.reindex(index=tbase, method='pad', tolerance=120)

However, the result differs from what I expect: all subsequent status entries are assigned NaN instead of the forward-filled value:

In[62]: ndf
Out[62]: 
           status
time             
1632160022      0
1632160142      0
1632160262    NaN
1632160382    NaN
1632160502    NaN
          ...

Any idea what I'm missing or doing wrong? Or, if this method doesn't do the trick, is there another ready-made method available?

Answer 1

As such I want to use the previous value of the status column until a new value at a new time for the status is set.

IIUC (if I understand correctly), drop the tolerance argument:

ndf = df.reindex(tbase, method='ffill')
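The difference between the two calls appears to come down to tolerance: it caps the allowed distance between a new label and the original label it is filled from, so with tolerance=120 only grid points within 120 s of an original timestamp receive a value. A minimal sketch of that behaviour follows, assuming a small sample of the question's data (the <NA> row is left out, and the names with_tol and no_tol are only illustrative):

```python
import pandas as pd

# Sample rows from the question; the <NA> row is omitted (assumption for brevity).
df = pd.DataFrame(
    {"status": [0, 0, 1, 0]},
    index=pd.Index(
        [1632160022, 1632986496, 1633448715, 1633452437], name="time"
    ),
)

# Evenly spaced time base, 120 s apart, as in the question.
tspacing = 120
tbase = list(range(df.index.min(), df.index.max(), tspacing))

# With tolerance: forward-fill only reaches labels within 120 s of an original one.
with_tol = df.reindex(tbase, method="ffill", tolerance=tspacing)

# Without tolerance: the last known status is carried forward indefinitely.
no_tol = df.reindex(tbase, method="ffill")

print(with_tol.head())
print(no_tol.head())
```

Note that filling during reindex compares index labels only, not values, so a missing value already stored in df (like the <NA> row in the question) is carried forward as-is. If that is not wanted, one option is to call dropna() on df before reindexing.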
