Linear regression problems with statsmodels
I have a pandas DataFrame that looks like this:
broker-value-current  broker-value-prior  consensus-after
              590.00              510.00           462.55
               32.74               31.98            30.72
               33.00               30.00            30.04

pctch_broker  pctch_consensus  pctch_frstrec_eps
   15.686275         1.599051           1.421657
    2.376485         0.195695         -82.098455
   10.000000         0.805369         -82.098455

pctch_frstrec_rev
         1.243782
        -1.258936
        -1.258936
The last few columns are created like this:
data['pctch_broker'] = ((data['broker-value-current']-data['broker-value-prior'])/data['broker-value-prior'])*100
data['pctch_consensus'] = ((data['consensus-after']-data['consensus-before'])/data['consensus-before'])*100
data['pctch_frstrec_eps'] = ((data['frstrec_eps_announced']-data['frstrec_eps_forecast'])/data['frstrec_eps_forecast'])*100
data['pctch_frstrec_rev'] = ((data['frstrec_rev_announced']-data['frstrec_rev_forecast'])/data['frstrec_rev_forecast'])*100
I also drop the NA values with this line:
cleaned_data = data.dropna()
I import statsmodels like this:
import statsmodels.formula.api as sm
However, when I try to regress "pctch_consensus" or "pctch_broker" (as the dependent variable) on "pctch_frstrec_rev" or "pctch_frstrec_eps" (as the independent variable) with the following code:
reg1 = sm.ols(formula="pctch_consensus ~ pctch_frstrec_rev", data=cleaned_data).fit()
I get this error:
RuntimeWarning: invalid value encountered in greater
  return (S > tol).sum(axis=-1)
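This happens because your DataFrame contains infinite values. You created those infinities by dividing by zero when you built the new variables. Note that dropna() only removes NaN; it leaves inf values in place, so they survive your cleaning step.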
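To see the cause in isolation, here is a minimal sketch. The toy values are made up for illustration (only the column names come from the question), and it assumes pandas' default settings, where inf is not treated as a missing value:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: the second forecast is 0, so the percent change
# divides by zero; the third forecast is missing.
data = pd.DataFrame({
    "frstrec_rev_announced": [1.02, 0.50, 2.00],
    "frstrec_rev_forecast": [1.00, 0.00, np.nan],
})
data["pctch_frstrec_rev"] = (
    (data["frstrec_rev_announced"] - data["frstrec_rev_forecast"])
    / data["frstrec_rev_forecast"]
) * 100

# dropna() removes the row with the missing forecast, but not the inf row.
cleaned_data = data.dropna()

# Count infinite entries per column; a non-zero count means the
# regression input still contains non-finite values.
inf_counts = np.isinf(cleaned_data).sum()
print(inf_counts)
```

Running the same np.isinf(...).sum() check on your own cleaned_data should show which columns carry the infinities.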
Replacing the infinities with NaN should fix it:
import numpy as np
cleaned_data = data.replace([np.inf, -np.inf], np.nan).dropna()  # inf -> NaN, then drop those rows
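Putting the pieces together, an end-to-end run might look roughly like the sketch below. The numbers are invented for illustration, and the alias smf is my own choice (the question imports the same module as sm):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical toy data; one row holds an infinity left over from a
# zero denominator.
data = pd.DataFrame({
    "pctch_consensus": [1.6, 0.2, 0.8, 1.1, 0.5],
    "pctch_frstrec_rev": [1.2, -1.3, np.inf, 0.4, -0.2],
})

# Turn +/-inf into NaN, then drop every row that has a missing value.
cleaned_data = data.replace([np.inf, -np.inf], np.nan).dropna()

# Sanity check before fitting: every remaining value is finite.
assert np.isfinite(cleaned_data.to_numpy()).all()

reg1 = smf.ols(formula="pctch_consensus ~ pctch_frstrec_rev",
               data=cleaned_data).fit()
print(reg1.params)
```

The explicit np.isfinite check is optional, but it fails loudly before the fit instead of leaving you with a warning from deep inside the linear algebra code.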