
Resampling a Pandas dataframe while forward filling (ffill) the values

I have a dataframe, a snippet of which looks like this:

    Time                    Temperature
19  2019-01-01 11:48:51     23.798
20  2019-01-01 11:48:53     23.832
21  2019-01-01 11:48:54     NaN
22  2019-01-01 11:48:55     23.817
23  2019-01-01 11:48:56     NaN

I want to resample this to '2S' while making sure that the last measured value replaces any NaNs.

df.resample('2S', on='Time').mean().ffill()

A snippet of the result looks like this:

                        Temperature
Time            
2019-01-01 11:48:52     23.832
2019-01-01 11:48:54     23.817
2019-01-01 11:48:56     23.809

Notice the value at timestamp t=54s. What I want is the temperature 23.832 from t=53s, since that is the last value recorded before this timestamp. Instead, it is filled with the value from t=55s.
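To see where the value from t=55s comes from, here is a minimal sketch that rebuilds only the five rows shown above (the variable names `binned` and `filled` are illustrative, and `'2s'` is used as the lowercase spelling of the `'2S'` alias). With the default left-closed bins, the bin labelled 11:48:54 covers 11:48:54 up to but not including 11:48:56, and `mean()` skips the NaN at t=54s, so the bin already holds 23.817 before `ffill()` runs.

```python
import numpy as np
import pandas as pd

# Only the five rows visible in the question; the real frame has more data.
df = pd.DataFrame(
    {
        "Time": pd.to_datetime(
            [
                "2019-01-01 11:48:51",
                "2019-01-01 11:48:53",
                "2019-01-01 11:48:54",
                "2019-01-01 11:48:55",
                "2019-01-01 11:48:56",
            ]
        ),
        "Temperature": [23.798, 23.832, np.nan, 23.817, np.nan],
    },
    index=range(19, 24),
)

# Bins are left-closed and labelled by their left edge by default.
binned = df.resample("2s", on="Time").mean()
print(binned)  # the 11:48:54 bin averages t=54s (NaN, skipped) and t=55s

# ffill only touches bins that are still NaN after aggregation.
filled = binned.ffill()
print(filled)
```

Because this uses only the visible rows, the value at 11:48:56 will differ from the 23.809 in the question, which presumably comes from rows that are not shown.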

Edit 1: After a reply, I tried the following:

df.ffill().resample('2S', on='Time').first()

But this gives the following result, where the new t=52s is equal to the old t=53s, which is not the behavior I am after...

                        Temperature
Time            
2019-01-01 11:48:50     23.798
2019-01-01 11:48:52     23.832
2019-01-01 11:48:54     23.832
2019-01-01 11:48:56     23.817
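The behavior above follows from how rows are assigned to bins. The sketch below, again using only the five visible rows, floors each timestamp to 2 seconds to show the label of the left-closed bin it lands in (the `Bin` column and `bin_label` name are illustrative). The row at t=53s lands in the bin labelled 11:48:52, so `first()` reports a value that was measured after the label time.

```python
import numpy as np
import pandas as pd

# Only the five rows visible in the question.
df = pd.DataFrame(
    {
        "Time": pd.to_datetime(
            [
                "2019-01-01 11:48:51",
                "2019-01-01 11:48:53",
                "2019-01-01 11:48:54",
                "2019-01-01 11:48:55",
                "2019-01-01 11:48:56",
            ]
        ),
        "Temperature": [23.798, 23.832, np.nan, 23.817, np.nan],
    },
    index=range(19, 24),
)

# Flooring to 2 s gives the left edge (the default label) of each row's bin.
bin_label = df["Time"].dt.floor("2s")
print(df.assign(Bin=bin_label))

# Same call as in Edit 1: each bin reports its first row after forward filling.
result = df.ffill().resample("2s", on="Time").first()
print(result)
```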

EDIT 2: To make it easier to understand, this is the output I desire. I don't care whether it is sampled on odd or even seconds.

                        Temperature
Time            
2019-01-01 11:48:52     23.798
2019-01-01 11:48:54     23.832
2019-01-01 11:48:56     23.817

Edit #3:

idx = df.resample('2S').asfreq().index
df.reindex(df.index.union(idx)).ffill().resample('2S').asfreq()

Output:

                     Temperature
Time                            
2019-01-01 11:48:50          NaN
2019-01-01 11:48:52       23.798
2019-01-01 11:48:54       23.832
2019-01-01 11:48:56       23.817
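The snippet above calls `resample` without `on=`, so it appears to assume that `Time` is already the index. A minimal sketch of the same steps on the five rows from the question, with an explicit `set_index` added as an assumption (`ts` and `grid` are illustrative names):

```python
import numpy as np
import pandas as pd

# Only the five rows visible in the question.
df = pd.DataFrame(
    {
        "Time": pd.to_datetime(
            [
                "2019-01-01 11:48:51",
                "2019-01-01 11:48:53",
                "2019-01-01 11:48:54",
                "2019-01-01 11:48:55",
                "2019-01-01 11:48:56",
            ]
        ),
        "Temperature": [23.798, 23.832, np.nan, 23.817, np.nan],
    },
    index=range(19, 24),
)

# Assumption: move Time into the index so resample/reindex operate on it.
ts = df.set_index("Time")

# 1. The regular 2-second grid we want to sample on.
grid = ts.resample("2s").asfreq().index

# 2. Add the grid points to the original timestamps, 3. carry the last
#    measurement forward, 4. keep only the grid points.
result = ts.reindex(ts.index.union(grid)).ffill().resample("2s").asfreq()
print(result)
```

The first grid point (11:48:50) stays NaN because no measurement precedes it in this snippet.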

Edit #2:

idx = df.resample('2S').asfreq().index
df.reindex(df.index.union(idx)).bfill().resample('2S').first()

Output:

                     Temperature
Time                            
2019-01-01 11:48:50       23.798
2019-01-01 11:48:52       23.832
2019-01-01 11:48:54       23.817
2019-01-01 11:48:56          NaN

EDIT:

df.reindex(df.index.union(df.resample('2S').asfreq().index))\
  .interpolate().resample('2S').asfreq()

Output:

                     Temperature
Time                            
2019-01-01 11:48:50          NaN
2019-01-01 11:48:52      23.8150
2019-01-01 11:48:54      23.8245
2019-01-01 11:48:56      23.8170

Do you want the 2-second bins to fall on odd seconds or on even seconds? On odd seconds:

df.ffill().resample('2S', on='Time', base=1).mean()

Output:

                    Temperature
Time                            
2019-01-01 11:48:51       23.798
2019-01-01 11:48:53       23.832
2019-01-01 11:48:55       23.817
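Note that the `base` argument was deprecated in pandas 1.1 and removed in pandas 2.0. A sketch of the same odd-second alignment with the `offset` argument, which replaced it, on the five rows from the question:

```python
import numpy as np
import pandas as pd

# Only the five rows visible in the question.
df = pd.DataFrame(
    {
        "Time": pd.to_datetime(
            [
                "2019-01-01 11:48:51",
                "2019-01-01 11:48:53",
                "2019-01-01 11:48:54",
                "2019-01-01 11:48:55",
                "2019-01-01 11:48:56",
            ]
        ),
        "Temperature": [23.798, 23.832, np.nan, 23.817, np.nan],
    },
    index=range(19, 24),
)

# offset="1s" shifts the bin edges by one second, so bins start on odd seconds.
result = df.ffill().resample("2s", on="Time", offset="1s").mean()
print(result)
```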

Or on even seconds:

df.ffill().resample('2S', on='Time').mean()

Output:

                     Temperature
Time                            
2019-01-01 11:48:50      23.7980
2019-01-01 11:48:52      23.8320
2019-01-01 11:48:54      23.8245
2019-01-01 11:48:56      23.8170

EDITED to use last, not first. It probably doesn't matter with your sample data, but if you have multiple records in a 2-second period, this ensures you take the most recent one.

There is an option when resampling to specify which edge of the bin to label the data with. The default for S is left, i.e. the start of the 2-second period. Changing it to right gives, I believe, what you are after.

df.resample('2S', on='Time', label='right').last().ffill()

Time                Temperature
2019-01-01 11:48:52 23.798
2019-01-01 11:48:54 23.832
2019-01-01 11:48:56 23.817
2019-01-01 11:48:58 23.817
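A minimal sketch of this answer on the five rows from the question, followed by a variant that is not part of the original answer. With `label='right'` alone the bins stay left-closed, so a non-NaN reading taken exactly on an even second would be reported under the next label. Adding `closed='right'` (an assumption about the desired semantics) makes each label cover the readings up to and including the label time.

```python
import numpy as np
import pandas as pd

# Only the five rows visible in the question.
df = pd.DataFrame(
    {
        "Time": pd.to_datetime(
            [
                "2019-01-01 11:48:51",
                "2019-01-01 11:48:53",
                "2019-01-01 11:48:54",
                "2019-01-01 11:48:55",
                "2019-01-01 11:48:56",
            ]
        ),
        "Temperature": [23.798, 23.832, np.nan, 23.817, np.nan],
    },
    index=range(19, 24),
)

# As in the answer: left-closed bins, labelled by their right edge.
# last() takes the last non-NaN value in each bin; ffill covers empty bins.
answer = df.resample("2s", on="Time", label="right").last().ffill()
print(answer)

# Variant (assumption): right-closed bins, so the label time itself
# belongs to the bin it names.
variant = (
    df.resample("2s", on="Time", closed="right", label="right").last().ffill()
)
print(variant)
```

On this snippet both versions agree at 11:48:52, 11:48:54 and 11:48:56. The first also emits a trailing 11:48:58 row because the NaN reading at t=56s opens a new left-closed bin.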
