Using sklearn linear regression fit on timeseries + plotting
I have the following timeseries outputted by get_DP():
DP
date
1900-01-31 0.0357
1900-02-28 0.0362
1900-03-31 0.0371
1900-04-30 0.0379
... ...
2015-09-30 0.0219
[1389 rows x 1 columns]
Note: there is a DP value for every month from 1900 to 2015; I omitted most of them here to avoid clutter.
I want to run a simple regression on this DataFrame to calculate the alpha and beta (intercept and coefficient, respectively) of this financial variable. I have the following code that is intended to do so:
reg = linear_model.LinearRegression()
df = get_DP()
df=df.reset_index()
reg.fit(df['date'].values.reshape((1389,1)), df['DP'].values)
print("beta: {}".format(reg.coef_))
print("alpha: {}".format(reg.intercept_))
plt.scatter(df['date'].values.reshape((1389,1)), df['DP'].values, color='black')
plt.plot(df['date'].values.reshape((1389,1)), df['DP'].values, color='blue', linewidth=3)
However, I believe the reshaping of my x-axis data (the dates) messes up the entire regression, because the plot looks like so:
Am I making a mistake? I'm not entirely sure what the best tool is for regression with DataFrames, since pandas removed its OLS function in version 0.20.
Try this one:
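from sklearn import linear_model
import matplotlib.pyplot as plt

reg = linear_model.LinearRegression()
df = get_DP()
df=df.reset_index()
# reshape(-1, 1) lets NumPy infer the row count instead of hard-coding 1389
reg.fit(df.date.values.reshape(-1, 1), df.DP.values.reshape(-1, 1))
print("beta: {}".format(reg.coef_))
print("alpha: {}".format(reg.intercept_))
# df.date.dt.date converts the timestamps to datetime.date objects for the x-axis
plt.scatter(df.date.dt.date, df.DP.values, color='black')
plt.plot(df.date.dt.date, df.DP.values, color='blue', linewidth=3)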
See the reshape documentation.
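As a complementary sketch (not part of the original answer), one way to make the regression input explicitly numeric is to convert the dates to "days since the first observation" before fitting, and to draw the fitted line from reg.predict rather than from the raw DP values. Since get_DP() is not available here, the DataFrame below is a hypothetical stand-in with made-up DP values; the variable names x, y and fitted are likewise only illustrative.

```python
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

# Hypothetical stand-in for get_DP(): 24 month-end dates with made-up DP values.
dates = pd.date_range("1900-02-01", periods=24, freq="MS") - pd.Timedelta(days=1)
dp = np.linspace(0.0357, 0.0450, num=len(dates))  # assumption: synthetic data
df = pd.DataFrame({"DP": dp}, index=dates.rename("date")).reset_index()

# Turn the dates into a plain numeric feature: days elapsed since the first row.
x = (df["date"] - df["date"].min()).dt.days.to_numpy().reshape(-1, 1)
y = df["DP"].to_numpy()

reg = LinearRegression()
reg.fit(x, y)
print("beta: {}".format(reg.coef_[0]))    # slope, in DP units per day
print("alpha: {}".format(reg.intercept_))  # fitted DP value at the first date

# Evaluate the fitted line at the observed x values.
fitted = reg.predict(x)

fig, ax = plt.subplots()
ax.scatter(df["date"], y, color="black", label="DP")
ax.plot(df["date"], fitted, color="blue", linewidth=3, label="linear fit")
ax.legend()
# Call plt.show() (with an interactive backend) or fig.savefig(...) to view the plot.
```

With this encoding, beta is expressed per day and alpha is the fitted value at the first observation; a different numeric encoding of the dates (for example, a month counter) would rescale both.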