
Sklearn Lasso Regression is orders of magnitude worse than Ridge Regression?

I've currently implemented Ridge and Lasso regression using the sklearn.linear_model module.

However, Lasso regression seems to perform about 3 orders of magnitude worse on the same dataset!

I'm not sure what's wrong, because mathematically this shouldn't be happening. Here's my code:

import numpy as np
from sklearn import linear_model

def ridge_regression(X_train, Y_train, X_test, Y_test, model_alpha):
    # Fit an L2-penalized linear model and return the test sum of squared errors.
    clf = linear_model.Ridge(alpha=model_alpha)
    clf.fit(X_train, Y_train)
    predictions = clf.predict(X_test)
    loss = np.sum((predictions - Y_test)**2)
    return loss

def lasso_regression(X_train, Y_train, X_test, Y_test, model_alpha):
    # Fit an L1-penalized linear model and return the test sum of squared errors.
    clf = linear_model.Lasso(alpha=model_alpha)
    clf.fit(X_train, Y_train)
    predictions = clf.predict(X_test)
    loss = np.sum((predictions - Y_test)**2)
    return loss


from sklearn.model_selection import train_test_split  # replaces the removed sklearn.cross_validation module

X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.1, random_state=0)
# Note: alpha=0 makes Lasso plain least squares; scikit-learn recommends LinearRegression for that case.
for alpha in [0, 0.01, 0.1, 0.5, 1, 2, 5, 10, 100, 1000, 10000]:
    print("Lasso loss for alpha=" + str(alpha) + ": " + str(lasso_regression(X_train, Y_train, X_test, Y_test, alpha)))

for alpha in [1, 1.25, 1.5, 1.75, 2, 5, 10, 100, 1000, 10000, 100000, 1000000]:
    print("Ridge loss for alpha=" + str(alpha) + ": " + str(ridge_regression(X_train, Y_train, X_test, Y_test, alpha)))

And here's my output:

Lasso loss for alpha=0: 20575.7121727
Lasso loss for alpha=0.01: 19762.8763969
Lasso loss for alpha=0.1: 17656.9926418
Lasso loss for alpha=0.5: 15699.2014387
Lasso loss for alpha=1: 15619.9772649
Lasso loss for alpha=2: 15490.0433166
Lasso loss for alpha=5: 15328.4303197
Lasso loss for alpha=10: 15328.4303197
Lasso loss for alpha=100: 15328.4303197
Lasso loss for alpha=1000: 15328.4303197
Lasso loss for alpha=10000: 15328.4303197
Ridge loss for alpha=1: 61.6235890425
Ridge loss for alpha=1.25: 61.6360790934
Ridge loss for alpha=1.5: 61.6496312133
Ridge loss for alpha=1.75: 61.6636076713
Ridge loss for alpha=2: 61.6776331539
Ridge loss for alpha=5: 61.8206621527
Ridge loss for alpha=10: 61.9883144732
Ridge loss for alpha=100: 63.9106882674
Ridge loss for alpha=1000: 69.3266510866
Ridge loss for alpha=10000: 82.0056669678
Ridge loss for alpha=100000: 88.4479064159
Ridge loss for alpha=1000000: 91.7235727543

Any idea why?

Thanks!

Interesting problem. I can confirm that it's not an issue with the implementation of the algorithm, but rather the correct response to your input.

Here's a thought: I believe from your description that you are not normalizing the data. This can lead to instability, as your features have significantly different orders of magnitude and variance. Lasso is more "all or nothing" than ridge (you've probably noticed it chooses many more zero coefficients than ridge), so that instability is magnified. In fact, the flat Lasso loss from alpha=5 onward suggests the penalty has driven every coefficient to zero, leaving an intercept-only model.
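You can check this directly. The sketch below (assuming the X_train and Y_train arrays from the question; count_zero_coefficients is a hypothetical helper name, not part of scikit-learn) counts how many fitted coefficients are exactly zero for each model:

import numpy as np
from sklearn import linear_model

def count_zero_coefficients(model_cls, X_train, Y_train, alpha):
    # Hypothetical helper: fit the model and count coefficients driven exactly to zero.
    clf = model_cls(alpha=alpha)
    clf.fit(X_train, Y_train)
    return int(np.sum(clf.coef_ == 0))

for alpha in [0.1, 1, 5, 10]:
    n_lasso = count_zero_coefficients(linear_model.Lasso, X_train, Y_train, alpha)
    n_ridge = count_zero_coefficients(linear_model.Ridge, X_train, Y_train, alpha)
    print(f"alpha={alpha}: Lasso zeroed {n_lasso} coefficients, Ridge zeroed {n_ridge}")

Ridge shrinks coefficients toward zero but essentially never sets them exactly to zero, so the contrast in counts illustrates the "all or nothing" behavior.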

Try to normalize your data, and see if you like your results better.
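One way to do that, sketched below under the assumption that the X_train/X_test/Y_train/Y_test splits from the question are available, is to standardize the features inside a pipeline before fitting (StandardScaler and make_pipeline are standard scikit-learn utilities):

import numpy as np
from sklearn import linear_model
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def lasso_regression_scaled(X_train, Y_train, X_test, Y_test, model_alpha):
    # Standardize features to zero mean and unit variance before fitting,
    # so the single alpha penalizes every coefficient on a comparable scale.
    clf = make_pipeline(StandardScaler(), linear_model.Lasso(alpha=model_alpha))
    clf.fit(X_train, Y_train)
    predictions = clf.predict(X_test)
    return np.sum((predictions - Y_test)**2)

Fitting the scaler inside the pipeline ensures the test set is transformed using statistics learned from the training set only, avoiding leakage.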

Another thought: that might be intentional on the part of the Berkeley teachers, to highlight the fundamentally different behavior of ridge and lasso.
