
How to apply regularization only to one layer in pytorch?

Let's imagine a network with 2 layers (X1, X2). I want to use the L1 norm on X1 and then do (loss + L1).backward() on X1. X2 is still trained, but without the regularization. My goal is to make X1 become sparse.
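For clarity, here is a minimal, self-contained sketch of the kind of setup I mean (the layer sizes, the 0.01 penalty weight, and the dummy data are just placeholders):

import torch
import torch.nn as nn

# Placeholder two-layer model standing in for X1 and X2.
model = nn.Sequential(nn.Linear(10, 20), nn.Linear(20, 2))
X1, X2 = model[0], model[1]

optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
criterion = nn.MSELoss()
inputs, targets = torch.randn(8, 10), torch.randn(8, 2)

optimizer.zero_grad()
loss = criterion(model(inputs), targets)
# L1 penalty built only from X1's weights.
l1_regularization = 0.01 * X1.weight.abs().sum()
(loss + l1_regularization).backward()
optimizer.step()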

I have already tried this; unfortunately, the regularization ends up applied to all layers, even though it only uses parameters from one layer.

I have also tried to freeze X1, do loss.backward(), and then freeze X2 and do loss.backward() again, this time including the regularization. Like this:

# First backward pass: freeze X1 so only X2 gets gradients from the plain loss.
for parameter in model.X1.parameters():
    parameter.requires_grad = False

loss.backward(retain_graph=True)

# Second backward pass: unfreeze X1 and freeze X2, then add the L1 term.
for parameter in model.X1.parameters():
    parameter.requires_grad = True
for parameter in model.X2.parameters():
    parameter.requires_grad = False

# l1_regularization: L1 penalty computed from X1's parameters (defined earlier).
loss += l1_regularization
loss.backward()
optimizer.step()

The outcome is not as expected, though: X2 does not get updated at all anymore, and the values in X1 seem to be too low (all weights become very close to zero).

What am I doing wrong, and is there any way to reach my goal? Thanks for your help.

Your second implementation should work. However, it doesn't show the part where you set requires_grad = True for X2 afterwards (or at the start, where you freeze X1). If that part is indeed missing from your code, then from the second iteration onward X2 will not get trained.
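For illustration, the missing reset could look something like this at the top of each training iteration (a sketch based on the layer names in your snippet):

# Re-enable gradients on X2 (and X1, in case it is still frozen) before the
# next iteration's forward and backward passes.
for parameter in model.X2.parameters():
    parameter.requires_grad = True
for parameter in model.X1.parameters():
    parameter.requires_grad = True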
