matplotlib plotting trendline for pandas series

I have been trying to plot a trendline for a pandas series and have succeeded, but I am getting multiple trendlines where I expect only one.

Here is my code:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

df = pd.read_excel( 'cleaned_wind_turbine_data.xlsx' , index_col = 'Date' , parse_dates = True )
df_columns = df.columns.to_list()

df_1 = df.loc[  '2021-02-01 00:00:00' : '2021-02-28 23:50:00' ]

z1 = np.polyfit( df_1['Wind Speed (m/s)'] , df_1['Power ac (kW)'] , 6)
p1 = np.poly1d(z1)

plt.plot( df_1['Wind Speed (m/s)'] , df_1['Power ac (kW)'] , 'bx' , 
         df_1['Wind Speed (m/s)'] , p1(df_1['Wind Speed (m/s)']) , 'r--' ,  markersize = 0.5 , linewidth = 1)
 

I am not getting an error, but I am getting multiple trendlines. Why is that?

You are getting "multiple" trendlines because the values in your wind-speed column are not in sorted order. For example, your wind-speed array is probably something like

np.array([0.0,5.2,1.0,8.8])

matplotlib draws a line between those points sequentially, in the order given, so the fitted curve doubles back on itself and looks like several lines. Instead, for your best-fit line, you need an ordered x that is equally spaced (something like np.array([0.0,0.1,0.2... ).

To do that:

x_trendline = np.arange(df_1['Wind Speed (m/s)'].min(), df_1['Wind Speed (m/s)'].max(), 0.05)
y_trendline = p1(x_trendline)

Then, when you plot:

plt.plot( df_1['Wind Speed (m/s)'] , df_1['Power ac (kW)'] , 'bx' , 
          x_trendline, y_trendline , 'r--' ,  markersize = 0.5 , linewidth = 1)
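
For anyone without the spreadsheet, here is a minimal self-contained sketch of the same idea. The data is synthetic and made up for illustration (random wind speeds, a noisy cubic power curve), and the degree-3 fit is an assumption chosen to suit that fake data rather than the degree 6 used in the question. It uses plain NumPy arrays; pandas Series can be passed to np.polyfit in the same way. It also uses np.linspace instead of np.arange, since np.arange leaves out the stop value while np.linspace includes both endpoints.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic stand-in for the real data (assumption): wind speeds in random
# order and a noisy, roughly cubic power curve.
rng = np.random.default_rng(0)
wind = rng.uniform(0.0, 15.0, size=200)
power = 0.5 * wind**3 + rng.normal(0.0, 20.0, size=200)

# Fit on the raw data; the least-squares fit does not depend on row order.
coeffs = np.polyfit(wind, power, 3)
trend = np.poly1d(coeffs)

# Evaluate the fit on an ordered, evenly spaced grid so the line is drawn
# once, from left to right.
x_trendline = np.linspace(wind.min(), wind.max(), 200)
y_trendline = trend(x_trendline)

fig, ax = plt.subplots()
ax.plot(wind, power, 'bx', markersize=2)
ax.plot(x_trendline, y_trendline, 'r--', linewidth=1)
ax.set_xlabel('Wind Speed (m/s)')
ax.set_ylabel('Power ac (kW)')
# Call plt.show() or fig.savefig(...) here to view the result.
```

If you would rather not build a separate grid, sorting the original x values with np.sort before evaluating the polynomial also gives a single line, although the points along it will not be evenly spaced.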
