
How to plot a smooth curve using interp1d for time-series data?

I have the following dataframe, called new_df:

     period1  intercept     error
0   2018-01-10 -33.707010  0.246193
1   2018-01-11 -36.151656  0.315618
2   2018-01-14 -37.846709  0.355960
3   2018-01-20 -37.170161  0.343631
4   2018-01-26 -31.785060  0.350386
..         ...        ...       ...
121 2020-05-03 -37.654889  0.489900
122 2020-05-06 -36.575763  0.559362
123 2020-06-10 -39.084314  0.756743
124 2020-06-11 -36.240442  0.705487
125 2020-06-14 -45.530748  0.991380

I am trying to plot a smooth curve (spline) with 'period1' on the x-axis and 'intercept' on the y-axis. Plotting this normally, without any interpolation, I get:

[image: plot of the raw data]

To smooth this curve, I tried the following using the interp1d function from scipy:

from matplotlib import dates
from scipy.interpolate import interp1d
import numpy as np
import matplotlib.pyplot as plt

x = new_df.period1.values # convert period1 column to a numpy array
y = new_df.intercept.values # convert the intercept column to a numpy array
x_dates = np.array([dates.date2num(i) for i in x]) # period1 values are datetime objects, this line converts them to numbers

f = interp1d(x_dates, y, kind = 'cubic')
x_smooth = np.linspace(x_dates.min(), x_dates.max(), endpoint = True) # unsure if this line is right?

plt.plot(x_dates, y, 'o', x_smooth, f(x_smooth),'--')
plt.xlabel('Date')
plt.ylabel('Intercept')
plt.legend(['data', 'cubic spline'], loc = 'lower right')
plt.show()

This gives the output:

[image: plot of the interpolated output]

This is not the smooth curve I'm trying to get. Is there something I am doing wrong somewhere? Also, how can I revert the xticks back to dates?

NB: There isn't a fixed interval between the dates in the period1 column; they are completely random.

Any help is appreciated. Thanks!

Instead of interpolation (or perhaps in addition to it), try data smoothing (i.e. 'convolution').

The basic concept is simple: replace the value at a point t with the average of that point and the ones around it.

This removes the noise between adjacent points and makes the plot look more like the overall trend in the data.
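As a rough illustration of that idea, here is a minimal moving-average sketch using numpy's convolve. The helper name moving_average, the window size of 3, and the five sample values (the first rows of the intercept column shown in the question) are only assumptions for the example. It also averages over neighbouring rows and ignores the uneven spacing between dates, which may or may not be acceptable for this data.

```python
import numpy as np

def moving_average(values, window=3):
    """Replace each point with the mean of a window centred on it (hypothetical helper)."""
    values = np.asarray(values, dtype=float)
    kernel = np.ones(window) / window  # equal weights that sum to 1
    # mode="valid" keeps only positions where the window fully overlaps the data,
    # so the result has len(values) - window + 1 points.
    return np.convolve(values, kernel, mode="valid")

# First five 'intercept' values from the question, used purely as sample data.
y = np.array([-33.707010, -36.151656, -37.846709, -37.170161, -31.785060])
smoothed = moving_average(y, window=3)
print(smoothed)
```

A larger window smooths more aggressively but also shortens the output further and flattens genuine short-term changes.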

While it's easy to write this yourself or to use numpy's convolve, scipy has a specialized method for this: savgol_filter, which offers a few helpful features out of the box.

savgol_filter is in scipy.signal, so you could check out the examples there.
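A minimal sketch of how savgol_filter might be applied here, assuming the ten rows visible in the question as sample data. The window_length of 5 and polyorder of 2 are arbitrary choices for illustration, and plot_smoothed is a hypothetical helper. Note that a Savitzky-Golay filter assumes evenly spaced samples, so with irregular dates the result is only an approximation of the trend.

```python
import numpy as np
from scipy.signal import savgol_filter

# The ten rows visible in the question, used purely as sample data.
dates = np.array(
    ["2018-01-10", "2018-01-11", "2018-01-14", "2018-01-20", "2018-01-26",
     "2020-05-03", "2020-05-06", "2020-06-10", "2020-06-11", "2020-06-14"],
    dtype="datetime64[D]",
)
y = np.array([-33.707010, -36.151656, -37.846709, -37.170161, -31.785060,
              -37.654889, -36.575763, -39.084314, -36.240442, -45.530748])

# Fit a degree-2 polynomial in a sliding 5-point window (both values are assumptions).
# The output has the same length as the input.
y_smooth = savgol_filter(y, window_length=5, polyorder=2)

def plot_smoothed(dates, y, y_smooth):
    """Plot raw and smoothed values against the original dates (hypothetical helper)."""
    # Imported here so the smoothing above runs even without matplotlib installed.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.plot(dates, y, "o", label="data")
    ax.plot(dates, y_smooth, "--", label="Savitzky-Golay")
    ax.set_xlabel("Date")
    ax.set_ylabel("Intercept")
    ax.legend(loc="lower right")
    fig.autofmt_xdate()  # rotate the date tick labels so they stay readable
    plt.show()
```

Calling plot_smoothed(dates, y, y_smooth) plots against the datetime values directly, so matplotlib labels the x-axis with dates and no date2num conversion is needed. With the real dataframe, new_df.period1.values and new_df.intercept.values would presumably take the place of the sample arrays.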
