Why are copies of a network's weights in PyTorch automatically updated after backpropagation?

Question

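I wrote the following code as a test because my original network uses a ModuleDict and, depending on the index I feed it, slices out and trains only part of the network.

I wanted to make sure that only the sliced layers update their weights, so I wrote some test code to double-check, and I am getting strange results. Say my model has two layers: layer1 is a fully connected (FC) layer and layer2 is a Conv2d. If I slice the network and use only layer2, I expect layer1's weights to stay unchanged because they are unused, and layer2's weights to be updated after training.

So my plan was to grab all the weights from the network with a for loop before training, and to do the same again after one optimizer.step(). Both times I store the weights in two separate Python lists so that I can compare them later. For some reason, the two lists are identical when I compare them with torch.equal(). I thought this might be because some hidden link still exists in memory, so I tried calling .detach() on the weights as I grab them in the loop, but the result is the same. Layer2's weights should differ in this case, because the first list should contain the network's weights from before training.

Note that in the code below I actually use layer1 and ignore layer2.

Full code: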

import torch
import torch.nn as nn
import torch.optim as optim

class mymodel(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer1 = nn.Linear(10, 5)
        self.layer2 = nn.Conv2d(1, 5, 4, 2, 1)
        self.act = nn.Sigmoid()

    def forward(self, x):
        # only layer1 and act are used; layer2 is ignored, so only layer1's weights should be updated
        x = self.layer1(x)
        x = self.act(x)
        return x

model = mymodel()

weights = []

for param in model.parameters():  # loop over the model's weights before updating and store them
    print(param.size())
    weights.append(param)

criterion = nn.BCELoss()  # criterion and optimizer setup
optimizer = optim.Adam(model.parameters(), lr=0.001)

foo = torch.randn(3, 10)  # fake input
target = torch.rand(3, 5)  # fake target; BCELoss expects targets in [0, 1]

result = model(foo)  # predictions, comparison, and backprop
loss = criterion(result, target)
optimizer.zero_grad()
loss.backward()
optimizer.step()

weights_after_backprop = []  # weights after backprop
for param in model.parameters():
    weights_after_backprop.append(param)  # only layer1's weights should update; layer2 is not used

for i in zip(weights, weights_after_backprop):
    print(torch.equal(i[0], i[1]))

# Prints True for every parameter, although layer1's weight and bias should differ.
# I have also tried calling param.detach() in the loop, but I got the same result.

Answer 1

You have to clone the parameters; otherwise you are only copying a reference.
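
To make the difference concrete, here is a minimal sketch (not part of the original answer) that compares three ways of "saving" a parameter. The names `alias`, `detached`, and `snapshot` are illustrative, and the in-place `add_` is only a stand-in for what an optimizer step does. It also suggests why `.detach()` did not help in the question: a detached tensor is a new tensor object, but it still shares the parameter's underlying storage.

```python
import torch
import torch.nn as nn

layer = nn.Linear(2, 2)
param = layer.weight

alias = param                      # same Python object as the parameter
detached = param.detach()          # new tensor object, but same underlying storage
snapshot = param.detach().clone()  # independent copy of the current values

# Stand-in for an optimizer step: optimizers update parameters in place.
with torch.no_grad():
    param.add_(1.0)

same_object = alias is param
shares_memory = detached.data_ptr() == param.data_ptr()
alias_equal = torch.equal(alias, param)
detached_equal = torch.equal(detached, param)
snapshot_equal = torch.equal(snapshot, param)

print("alias is param:", same_object)
print("detached shares memory:", shares_memory)
print("alias equal after update:", alias_equal)
print("detached equal after update:", detached_equal)
print("snapshot equal after update:", snapshot_equal)
```

Only the cloned snapshot keeps the old values; the alias and the detached tensor both follow the in-place update. Applied to the code from the question, the same idea looks like this: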

# Continues from the question's code: torch, nn, optim, and model are already defined.
weights = []

for param in model.parameters():
    weights.append(param.clone())  # clone() copies the values instead of storing a reference

criterion = nn.BCELoss()  # criterion and optimizer setup
optimizer = optim.Adam(model.parameters(), lr=0.001)

foo = torch.randn(3, 10)  # fake input
target = torch.rand(3, 5)  # fake target; BCELoss expects targets in [0, 1]

result = model(foo)  # predictions, comparison, and backprop
loss = criterion(result, target)
optimizer.zero_grad()
loss.backward()
optimizer.step()

weights_after_backprop = []  # weights after backprop
for param in model.parameters():
    weights_after_backprop.append(param.clone())  # only layer1's weights should update; layer2 is not used

for i in zip(weights, weights_after_backprop):
    print(torch.equal(i[0], i[1]))

This prints:

False
False
True
True
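
The four lines follow the order of `model.parameters()`: layer1's weight and bias (changed), then layer2's weight and bias (unchanged). As a hedged sketch of how the same check could be made easier to read, the snippet below keys the snapshots by parameter name using `named_parameters()`. The class name `MyModel` and the variables `before`, `changed`, and `no_grad` are illustrative, not from the original post. It also lists the parameters whose `.grad` is still `None` after `backward()`, which is why the optimizer leaves the unused layer alone.

```python
import torch
import torch.nn as nn
import torch.optim as optim


class MyModel(nn.Module):  # same layout as the model in the question
    def __init__(self):
        super().__init__()
        self.layer1 = nn.Linear(10, 5)
        self.layer2 = nn.Conv2d(1, 5, 4, 2, 1)
        self.act = nn.Sigmoid()

    def forward(self, x):
        return self.act(self.layer1(x))  # layer2 is never used


torch.manual_seed(0)
model = MyModel()

# Snapshot every parameter by name; clone() makes the copy independent.
before = {name: p.detach().clone() for name, p in model.named_parameters()}

criterion = nn.BCELoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

loss = criterion(model(torch.randn(3, 10)), torch.rand(3, 5))
optimizer.zero_grad()
loss.backward()
optimizer.step()

# Compare the current values against the snapshot, one name at a time.
changed = {
    name: not torch.equal(before[name], p.detach())
    for name, p in model.named_parameters()
}
# Parameters that took no part in the forward pass receive no gradient.
no_grad = [name for name, p in model.named_parameters() if p.grad is None]

for name, flag in changed.items():
    print(name, "changed" if flag else "unchanged")
print("no gradient:", no_grad)
```

With this layout, the layer1 entries should be reported as changed and the layer2 entries as unchanged, matching the output above.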

Answer 1: accepted; 2018-08-07 08:17:32
