statsmodel：模拟数据并运行简单的线性回归

Question

I'm new to python statsmodels package. 我是python statsmodels包的新手。 I'm trying to simulate some data linearly related to log(x) and run a simple linear regression using statsmodels formula interface. 我正在尝试模拟一些与log（x）线性相关的数据，并使用statsmodels公式接口运行简单的线性回归。 Here are the codes: 以下是代码：

import pandas as pd
import numpy as np
import statsmodels.formula.api as smf

B0 = 3
B1 = 0.5
x = np.linspace(10, 1e4, num = 1000)
epsilon = np.random.normal(0,3, size=1000)

y=B0 + B1*np.log(x)+epsilon
df1 = pd.DataFrame({'Y':y, 'X':x})

model = smf.OLS ('Y~np.log(X)', data=df1).fit()

I got error below: 我在下面出现错误：

ValueError                                Traceback (most recent call last)
<ipython-input-34-c0ab32ca2acf> in <module>()
      7 y=B0 + B1*np.log(X)+epsilon
      8 df1 = pd.DataFrame({'Y':y, 'X':X})
----> 9 smf.OLS ('Y~np.log(X)', data=df1)

/Users/tiger/anaconda/lib/python3.5/site-packages/statsmodels/regression/linear_model.py in __init__(self, endog, exog, missing, hasconst, **kwargs)
    689                  **kwargs):
    690         super(OLS, self).__init__(endog, exog, missing=missing,
--> 691                                   hasconst=hasconst, **kwargs)
    692         if "weights" in self._init_keys:
    693             self._init_keys.remove("weights")

/Users/tiger/anaconda/lib/python3.5/site-packages/statsmodels/regression/linear_model.py in __init__(self, endog, exog, weights, missing, hasconst, **kwargs)
    584             weights = weights.squeeze()
    585         super(WLS, self).__init__(endog, exog, missing=missing,
--> 586                                   weights=weights, hasconst=hasconst, **kwargs)
    587         nobs = self.exog.shape[0]
    588         weights = self.weights

/Users/tiger/anaconda/lib/python3.5/site-packages/statsmodels/regression/linear_model.py in __init__(self, endog, exog, **kwargs)
     89     """
     90     def __init__(self, endog, exog, **kwargs):
---> 91         super(RegressionModel, self).__init__(endog, exog, **kwargs)
     92         self._data_attr.extend(['pinv_wexog', 'wendog', 'wexog', 'weights'])
     93 

/Users/tiger/anaconda/lib/python3.5/site-packages/statsmodels/base/model.py in __init__(self, endog, exog, **kwargs)
    184 
    185     def __init__(self, endog, exog=None, **kwargs):
--> 186         super(LikelihoodModel, self).__init__(endog, exog, **kwargs)
    187         self.initialize()
    188 

/Users/tiger/anaconda/lib/python3.5/site-packages/statsmodels/base/model.py in __init__(self, endog, exog, **kwargs)
     58         hasconst = kwargs.pop('hasconst', None)
     59         self.data = self._handle_data(endog, exog, missing, hasconst,
---> 60                                       **kwargs)
     61         self.k_constant = self.data.k_constant
     62         self.exog = self.data.exog

/Users/tiger/anaconda/lib/python3.5/site-packages/statsmodels/base/model.py in _handle_data(self, endog, exog, missing, hasconst, **kwargs)
     82 
     83     def _handle_data(self, endog, exog, missing, hasconst, **kwargs):
---> 84         data = handle_data(endog, exog, missing, hasconst, **kwargs)
     85         # kwargs arrays could have changed, easier to just attach here
     86         for key in kwargs:

/Users/tiger/anaconda/lib/python3.5/site-packages/statsmodels/base/data.py in handle_data(endog, exog, missing, hasconst, **kwargs)
    562         exog = np.asarray(exog)
    563 
--> 564     klass = handle_data_class_factory(endog, exog)
    565     return klass(endog, exog=exog, missing=missing, hasconst=hasconst,
    566                  **kwargs)

/Users/tiger/anaconda/lib/python3.5/site-packages/statsmodels/base/data.py in handle_data_class_factory(endog, exog)
    551     else:
    552         raise ValueError('unrecognized data structures: %s / %s' %
--> 553                          (type(endog), type(exog)))
    554     return klass
    555 

ValueError: unrecognized data structures: <class 'str'> / <class 'NoneType'>

I checked the documentations and everything seems to be right. 我检查了文档，一切似乎都正确。 Spent long time trying to understand why I got these errors but could not figure out. 花了很长时间试图了解我为什么会出现这些错误，但无法弄清楚。 Help is very much appreciated. 非常感谢您的帮助。

Answer 1

In statsmodels.formula.api the ols method is lowercase. 在statsmodels.formula.api中，ols方法为小写。 In statsmodels.api the OLS is all caps. 在statsmodels.api中，OLS都是大写字母。 In your case you need... 在您的情况下，您需要...

model = smf.ols('Y~np.log(X)', data=df1).fit()

statsmodel：模拟数据并运行简单的线性回归

问题描述

1 个解决方案

解决方案1
7 已采纳 2017-02-02 14:51:21

statsmodel：模拟数据并运行简单的线性回归

问题描述

1 个解决方案

解决方案1 7 已采纳 2017-02-02 14:51:21

解决方案1
7 已采纳 2017-02-02 14:51:21