
How to get an equation for nonlinear multivariate regression in which one variable depends on two independent variables in Python

I have a set of 5000 data points of the form (x, y, z), e.g. (0, 1, 50), or x=1, y=2, z=120. Using these 5000 entries, I need to obtain an equation
that, given x and y, returns the value of z.

You can use the OLS class from statsmodels (statsmodels.api.OLS). Some sample data, assuming you can create a pd.DataFrame from your (x, y, z) data:

import numpy as np
import pandas as pd

# 150 rows of random integers in [0, 100) as stand-in data
df = pd.DataFrame(np.random.randint(100, size=(150, 3)), columns=list('XYZ'))
df.info()

RangeIndex: 150 entries, 0 to 149
Data columns (total 3 columns):
X    150 non-null int64
Y    150 non-null int64
Z    150 non-null int64
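If your data is a list of (x, y, z) tuples rather than random numbers, a minimal sketch of building the same kind of DataFrame might look like this. The list name `points` and the third tuple are placeholders for illustration; only the first two tuples come from the question.

```python
import pandas as pd

# Placeholder data: the first two tuples are the examples from the question,
# the third is made up so that the frame has a few rows
points = [(0, 1, 50), (1, 2, 120), (2, 3, 200)]

# Each tuple becomes one row; the columns are named to match the rest of the answer
df = pd.DataFrame(points, columns=["X", "Y", "Z"])
print(df)
```

With the real data, `points` would hold all 5000 tuples and the rest of the answer applies unchanged.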

Now estimate the linear regression parameters (no constant column is added here, so the model is fitted without an intercept):

import numpy as np
import statsmodels.api as sm

model = sm.OLS(df['Z'], df[['X', 'Y']])
results = model.fit()

to get:

print(results.summary())

                            OLS Regression Results                            
==============================================================================
Dep. Variable:                      Z   R-squared:                       0.652
Model:                            OLS   Adj. R-squared:                  0.647
Method:                 Least Squares   F-statistic:                     138.6
Date:                Fri, 17 Jun 2016   Prob (F-statistic):           1.21e-34
Time:                        13:48:38   Log-Likelihood:                -741.94
No. Observations:                 150   AIC:                             1488.
Df Residuals:                     148   BIC:                             1494.
Df Model:                           2                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
X              0.5224      0.076      6.874      0.000         0.372     0.673
Y              0.3531      0.076      4.667      0.000         0.204     0.503
==============================================================================
Omnibus:                        5.869   Durbin-Watson:                   1.921
Prob(Omnibus):                  0.053   Jarque-Bera (JB):                2.990
Skew:                          -0.000   Prob(JB):                        0.224
Kurtosis:                       2.308   Cond. No.                         2.70
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.

to predict, use:

params = results.params
# In-sample predictions from the fitted coefficients (same values as results.fittedvalues)
df['predictions'] = model.predict(params)

which yields:

    X   Y   Z  predictions
0  31  85  75    54.701830
1  36  46  43    34.828605
2  77  42   8    43.795386
3  78  84  65    66.932761
4  27  54  50    36.737606
