
How to add "greater than 0 and sums to 1" constraint to a regression in Python?

I am using statsmodels (open to other Python options) to run some linear regressions. My problem is that I need the regression to have no intercept and to constrain the coefficients to the range (0, 1), with the coefficients also summing to 1.

I tried something like this (for the sum of 1, at least):

from statsmodels.formula.api import glm
import pandas as pd

df = pd.DataFrame({'revised_guess':[0.6], "self":[0.55], "alter_1":[0.45], "alter_2":[0.2],"alter_3":[0.8]})
mod = glm("revised_guess ~ self + alter_1 + alter_2 + alter_3 - 1", data=df)
res = mod.fit_constrained(["self + alter_1 + alter_2 + alter_3  = 1"],
                          start_params=[0.25,0.25,0.25,0.25])
res.summary()

but I am still struggling to enforce the non-negativity constraint on the coefficients.
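One way to express both constraints at once (this is not from the original post, just a sketch) is to minimize the sum of squared errors directly with `scipy.optimize.minimize` using the SLSQP solver, which accepts bounds and an equality constraint. The data below is synthetic, chosen only to illustrate the setup:

```python
import numpy as np
from scipy.optimize import minimize

# Synthetic data: 4 predictors, no intercept, known weights for illustration
rng = np.random.default_rng(0)
X = rng.random((50, 4))
true_w = np.array([0.4, 0.3, 0.2, 0.1])
y = X @ true_w + rng.normal(scale=0.01, size=50)

def sse(w):
    # Sum of squared residuals of the no-intercept linear model
    r = X @ w - y
    return r @ r

res = minimize(
    sse,
    x0=np.full(4, 0.25),                 # start from equal weights
    method="SLSQP",
    bounds=[(0.0, 1.0)] * 4,             # each coefficient in [0, 1]
    constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1.0}],  # sum to 1
)
w = res.x
```

After the fit, `w` satisfies both constraints up to solver tolerance; you lose statsmodels' inference output, but the point estimates respect the constraints exactly rather than approximately.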

You could use NNLS (Non-Negative Least Squares), available in scipy. It is based on a FORTRAN non-negative least squares solver. You can't add general constraints to it, but you can encode the sum requirement by appending an extra equation such as x1 + x2 + x3 = 1 to the input system.

import numpy as np
from scipy.optimize import nnls

## Define the input matrix and target vector; the last row ([1, 1, 1])
## encodes the constraint on the coefficient sum (here x1+x2+x3 ≈ 2;
## set the last target entry to 1 for the question's case)
A = np.array([[1., 2., 5.],
              [5., 6., 4.],
              [1., 1., 1.]])

b = np.array([4., 7., 2.])

## Calculate the NNLS solution
x, residual_norm = nnls(A, b)

## Find the residuals
print(A @ x - b)

Performing NNLS over this augmented matrix returns coefficients that are non-negative by construction, along with the residual norm; the appended row pulls the solution toward the desired coefficient sum.
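Because the sum condition enters only as one more least-squares row, it is satisfied approximately, not exactly. A common refinement (an assumption on my part, not from the answer) is to give that row a large weight so the solver treats it as near-mandatory. A minimal sketch with made-up data:

```python
import numpy as np
from scipy.optimize import nnls

# Hypothetical design matrix and targets (4 observations, 3 predictors)
A = np.array([[1., 2., 5.],
              [5., 6., 4.],
              [2., 3., 1.],
              [4., 1., 2.]])
b = np.array([4., 7., 2., 3.])

# Append the sum-to-one condition as a heavily weighted extra row:
# weight * (x1 + x2 + x3) ≈ weight * 1
weight = 1e6
A_aug = np.vstack([A, weight * np.ones(3)])
b_aug = np.append(b, weight * 1.0)

x, rnorm = nnls(A_aug, b_aug)
print(x, x.sum())  # non-negative coefficients whose sum is very close to 1
```

The larger the weight, the closer the coefficient sum gets to 1, at the cost of a slightly worse fit on the data rows.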

Simply do an L1-regularized regression:

import statsmodels.api as sm

# Y is the response vector and X the design matrix (no intercept column)
model = sm.OLS(Y, X)
model2 = model.fit_regularized(method='elastic_net', alpha=0.0, L1_wt=1.0,
                               start_params=None, profile_scale=False,
                               refit=False)
model2.params

... and tune hyperparameters.
