简体   繁体   中英

Constrained regression in Python

I have this simple regression model:

 y = a + b * x + c * z + error

with a constraint on parameters:

c = b - 1

There are similar questions posted on SO (like Constrained Linear Regression in Python ). However, the constraints' type is lb <= parameter =< ub .

What are the available options to handle this specific constrained linear regression problem?

This is how it can be done using GLM:

import statsmodels
import statsmodels.api as sm
import numpy as np

# Set the link function to identity
statsmodels.genmod.families.links.identity()

OLS_from_GLM = sm.GLM(y, sm.add_constant(np.column_stack(x, z)))

 '''Setting the restrictions on parameters in the form of (R, q), where R 
 and q are constraints' matrix and constraints' values, respectively. As
 for the restriction in the aforementioned regression model, i.e., 
 c = b - 1 or b - c = 1, R = [0, 1, -1] and q = 1.'''

res_OLS_from_GLM = OLS_from_GLM.fit_constrained(([0, 1.0, -1.0], 1))

print(res_OLS_from_GLM.summary())

There are a few constrained optimization packages in Python such as CVX, CASADI, GEKKO, Pyomo, and others that can solve the problem. I develop Gekko for linear, nonlinear, and mixed integer optimization problems with differential or algebraic constraints.

import numpy as np
from gekko import GEKKO

# Data
x = np.random.rand(10)
y = np.random.rand(10)
z = np.random.rand(10)

# Gekko for constrained regression
m = GEKKO(remote=False); m.options.IMODE=2
a,b,c  = m.Array(m.FV,3)
a.STATUS=1; b.STATUS=1; c.STATUS=1
x=m.Param(x); z=m.Param(z)
y = m.Var();  ym=m.Param(y)
m.Equation(y==a+b*x+c*z)
m.Equation(c==b-1)
m.Minimize((ym-y)**2)
m.options.SOLVER=1
m.solve(disp=True)
print(a.value[0],b.value[0],c.value[0])

This gives the solution that may be different when you run it because it uses random values for the data.

-0.021514129645 0.45830726553 -0.54169273447

The constraint c = b - 1 is satisfied with -0.54169273447 = 0.45830726553 - 1 . Here is a comparison to other linear regression packages in Python with an without constraints:

约束线性回归

import numpy as np
from scipy.stats import linregress
import statsmodels.api as sm
import matplotlib.pyplot as plt
from gekko import GEKKO

# Data
x = np.array([4,5,2,3,-1,1,6,7])
y = np.array([0.3,0.8,-0.05,0.1,-0.8,-0.5,0.5,0.65])

# calculate R^2
def rsq(y1,y2):
    yresid= y1 - y2
    SSresid = np.sum(yresid**2)
    SStotal = len(y1) * np.var(y1)
    r2 = 1 - SSresid/SStotal
    return r2

# Method 1: scipy linregress
slope,intercept,r,p_value,std_err = linregress(x,y)
a = [slope,intercept]
print('R^2 linregress = '+str(r**2))

# Method 2: numpy polyfit (1=linear)
a = np.polyfit(x,y,1); print(a)
yfit = np.polyval(a,x)
print('R^2 polyfit    = '+str(rsq(y,yfit)))

# Method 3: numpy linalg solution
#       y =     X a
#   X^T y = X^T X a
X = np.vstack((x,np.ones(len(x)))).T
# matrix operations
XX = np.dot(X.T,X)
XTy = np.dot(X.T,y)
a = np.linalg.solve(XX,XTy)
# same solution with lstsq
a = np.linalg.lstsq(X,y,rcond=None)[0]
yfit = a[0]*x+a[1]; print(a)
print('R^2 matrix     = '+str(rsq(y,yfit)))

# Method 4: statsmodels ordinary least squares
X = sm.add_constant(x,prepend=False)
model = sm.OLS(y,X).fit()
yfit = model.predict(X)
a = model.params
print(model.summary())

# Method 5: Gekko for constrained regression
m = GEKKO(remote=False); m.options.IMODE=2
c  = m.Array(m.FV,2); c[0].STATUS=1; c[1].STATUS=1
c[1].lower=-0.5
xd = m.Param(x); yd = m.Param(y); yp = m.Var()
m.Equation(yp==c[0]*xd+c[1])
m.Minimize((yd-yp)**2)
m.solve(disp=False)
c = [c[0].value[0],c[1].value[1]]
print(c)

# plot data and regressed line
plt.plot(x,y,'ko',label='data')
xp = np.linspace(-2,8,100)
slope     = str(np.round(a[0],2))
intercept = str(np.round(a[1],2))
eqn = 'LstSQ: y='+slope+'x'+intercept
plt.plot(xp,a[0]*xp+a[1],'r-',label=eqn)
slope     = str(np.round(c[0],2))
intercept = str(np.round(c[1],2))
eqn = 'Constraint: y='+slope+'x'+intercept
plt.plot(xp,c[0]*xp+c[1],'b--',label=eqn)
plt.grid()
plt.legend()
plt.show()

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM