
Python Matplotlib - Smooth plot line for x-axis with date values

I'm trying to smooth a graph line, but since the x-axis values are dates I'm having great trouble doing this. Say we have a dataframe as follows:

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
%matplotlib inline

startDate = '2015-05-15'
endDate = '2015-12-5'
index = pd.date_range(startDate, endDate)
data = np.random.normal(0, 1, size=len(index))
cols = ['value']

df = pd.DataFrame(data, index=index, columns=cols)

Then we plot the data:

fig, axs = plt.subplots(1,1, figsize=(18,5))
x = df.index
y = df.value
axs.plot(x, y)
fig.show()

We get a plot of the raw, unsmoothed data.

There are already some useful Stack Overflow questions on smoothing a line like this, but I just can't seem to get any of that code working for my example. Any suggestions?

You can use the interpolation functionality that ships with pandas. Because your dataframe already has a value for every index entry, you can reindex it with a denser index, which fills every previously non-existent index entry with NaN values. Then, after choosing one of the many interpolation methods available, interpolate and plot your data:

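# upsample to hourly; newer pandas versions prefer the lowercase alias 'h' over 'H'
index_hourly = pd.date_range(startDate, endDate, freq='1H')
# the new hourly rows are NaN until interpolated; 'cubic' requires SciPy
df_smooth = df.reindex(index=index_hourly).interpolate('cubic')
df_smooth = df_smooth.rename(columns={'value':'smooth'})

df_smooth.plot(ax=axs, alpha=0.7)
df.plot(ax=axs, alpha=0.7)
fig.show()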
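The reindex-then-interpolate step can also be checked without plotting. The following is a minimal sketch, not the answer author's code: the helper name `upsample_and_interpolate`, the fixed random seed, and the shortened date range are assumptions made for illustration. It relies on the fact that `interpolate` only fills NaN entries, so the original daily values stay untouched.

```python
import numpy as np
import pandas as pd


def upsample_and_interpolate(df, freq="h", method="cubic"):
    """Reindex onto a denser DatetimeIndex and fill the new NaN rows.

    Hypothetical helper for illustration; 'cubic' needs SciPy installed.
    """
    dense_index = pd.date_range(df.index[0], df.index[-1], freq=freq)
    return df.reindex(dense_index).interpolate(method)


# Small synthetic frame (assumed data, seeded so the run is repeatable)
rng = np.random.default_rng(0)
index = pd.date_range("2015-05-15", "2015-05-25")
df = pd.DataFrame({"value": rng.normal(0, 1, size=len(index))}, index=index)

df_smooth = upsample_and_interpolate(df)

# Every hourly row now has a value, and the daily rows are unchanged
print(len(df), "daily rows ->", len(df_smooth), "hourly rows")
print("NaN rows left:", int(df_smooth["value"].isna().sum()))
```

Passing `df_smooth.index` and `df_smooth["value"]` to `axs.plot` should then draw the smoothed line on a real date axis.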


There is a workaround: create two plots, 1) non-smoothed (not interpolated) with date labels, and 2) smoothed without date labels.

Plot 1) using the argument linestyle=" " and convert the dates to be plotted on the x-axis to string type.

Plot 2) using the argument linestyle="-", interpolating the x-axis and y-axis using np.linspace and make_interp_spline respectively.

The following applies this workaround to your code.

# your initial code
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from scipy.interpolate import make_interp_spline
%matplotlib inline
startDate = "2015-05-15"
endDate = "2015-07-5" #reduced the end date so smoothness is clearly seen
index = pd.date_range(startDate, endDate)
data = np.random.normal(0, 1, size=len(index))
cols = ["value"]

df = pd.DataFrame(data, index=index, columns=cols)
fig, axs = plt.subplots(1, 1, figsize=(40, 12))
x = df.index
y = df.value

# workaround: interpolate over the integer positions 0..n-1 instead of the dates
positions = np.arange(len(df.index))
# stop at the last data position (n-1) so the spline never extrapolates
# past the data, which is what produced very large values at the end
x_new = np.linspace(0, len(df.index) - 1, 300)
a_BSpline = make_interp_spline(positions, df.value, k=5)
y_new = a_BSpline(x_new)

# plot this new curve with linestyle = "-"
axs.plot(x_new, y_new, "-", label="interpolated")

# to get the date on x axis we will keep our previous plot but linestyle will be None so it won't be visible
x = list(x.astype(str))
axs.plot(x, y, linestyle=" ", alpha=0.75, label="initial")
xt = [x[i] for i in range(0,len(x),5)]
plt.xticks(xt,rotation="vertical")
plt.legend()
fig.show()

Resulting plot.

Overlapped plot to see the smoothing.
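As an alternative to plotting against string labels, the dates can be converted to numbers before fitting the spline and converted back afterwards, so the smoothed curve keeps real datetime x-values. This is a sketch under stated assumptions, not the method used in the answer above: the seeded synthetic data, the variable names, and the choice of `k=3` (instead of `k=5`) are all illustrative.

```python
import numpy as np
import pandas as pd
from scipy.interpolate import make_interp_spline

# Synthetic daily data (assumed, seeded for repeatability)
rng = np.random.default_rng(0)
index = pd.date_range("2015-05-15", "2015-07-05")
df = pd.DataFrame({"value": rng.normal(0, 1, size=len(index))}, index=index)

# Express each date as elapsed days since the first date (plain floats)
x_days = ((df.index - df.index[0]) / pd.Timedelta(days=1)).to_numpy()

# Fit an interpolating spline on the numeric axis and evaluate it on a
# denser grid that stays inside the data range (no extrapolation)
spline = make_interp_spline(x_days, df["value"].to_numpy(), k=3)
x_new_days = np.linspace(x_days[0], x_days[-1], 300)
y_new = spline(x_new_days)

# Convert the dense numeric grid back to timestamps for plotting
x_new_dates = df.index[0] + pd.to_timedelta(x_new_days, unit="D")

print(x_new_dates[0], "->", x_new_dates[-1], "with", len(y_new), "points")
```

With this, `axs.plot(x_new_dates, y_new)` and `axs.plot(df.index, df["value"], linestyle=" ", marker="o")` can share one date axis, so no manual tick handling should be needed.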

Depending on what exactly you mean by "smoothing," the easiest way may be to use savgol_filter or something similar. Unlike interpolated splines, this method produces a smoothed line that does not pass through the measured points, effectively filtering out higher-frequency noise.

from scipy.signal import savgol_filter

...
windowSize = 21
polyOrder = 1
smoothed = savgol_filter(values, windowSize, polyOrder)
axes.plot(datetimes, smoothed, color=chart.color)

For a fixed window size, the higher the polynomial order, the closer the smoothed line is to the raw data.

Here is an example: a comparison plot of the raw and smoothed data.
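The effect of the polynomial order can be checked numerically. The sketch below is illustrative only: the synthetic signal (a sine wave plus noise), the seed, and the variable names are assumptions, and it presumes evenly spaced samples, since `savgol_filter` works on the values alone and never sees the dates.

```python
import numpy as np
from scipy.signal import savgol_filter

# Synthetic, evenly spaced "daily" series: slow sine wave plus noise (assumed data)
rng = np.random.default_rng(0)
t = np.arange(205)
values = 2.0 * np.sin(2 * np.pi * t / 30) + rng.normal(0, 0.3, size=t.size)

window_size = 21  # must not exceed the number of samples
rms_residual = {}
for poly_order in (1, 3, 5):
    smoothed = savgol_filter(values, window_size, poly_order)
    # Root-mean-square distance between the raw and the smoothed series
    rms_residual[poly_order] = float(np.sqrt(np.mean((values - smoothed) ** 2)))

for poly_order, rms in rms_residual.items():
    print(f"polyorder={poly_order}: RMS distance from raw data = {rms:.3f}")
```

Because the filter leaves the x-values alone, the smoothed array can be plotted directly against the original datetime index.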
