Torch neural network does not train

I have implemented a very simple neural network in the torch framework:

import math
import torch

def mlp(sizes, activation, output_activation=torch.nn.Identity):
    layers = []
    for j in range(len(sizes)-1):
        act = activation if j < len(sizes)-1 else output_activation
        layers += [torch.nn.Linear(sizes[j], sizes[j+1]), act()]
    return torch.nn.Sequential(*layers)

In order to train the network to perform regression on the function y = sin(x), I generate the data:

# device and dtype are assumed to be defined earlier,
# e.g. device = torch.device("cpu") and dtype = torch.float32
x = torch.linspace(-math.pi, math.pi, 2000, device=device, dtype=dtype)
y = torch.sin(x)

The training code is:

size = [1,20,20,1]
activation = torch.nn.ReLU
model = mlp(size, activation)

optimizer = torch.optim.SGD(model.parameters(), lr=0.002)

n_epoch = 600
mse_loss = torch.nn.MSELoss()
X = x.unsqueeze(-1)
for i in range(n_epoch):
    y_pred = model(X)
    step_loss = mse_loss(y_pred, y)
    optimizer.zero_grad()
    step_loss.backward()
    optimizer.step()

Unfortunately, the network only learns an almost constant function $y=0$. I have already tried many things:

  1. Change Hyperparameters of the network
  2. Add mini batches in training
  3. Change the number of epochs and learning rate

But nothing seems to work. The problem is so simple that I think there is an error in the code.

I am not sure if this is the main cause, but the statement

act = activation if j < len(sizes)-1 else output_activation

appears to be logically incorrect. In the loop, j takes values from 0 to len(sizes)-2 (range(len(sizes)-1) stops before len(sizes)-1), so the condition j < len(sizes)-1 is always true. This means that your network has a ReLU right at the end, and so can only ever produce non-negative outputs, which is why it cannot fit the negative half of sin(x). This can be corrected by changing that statement to:

act = activation if j < len(sizes)-2 else output_activation
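
With that one-line change, the final Linear layer is followed by Identity instead of ReLU, so the model can output negative values. Here is a minimal sketch of the corrected builder (the rest of your training setup is unchanged; the print at the end is only there to verify the layer ordering):

import torch

def mlp(sizes, activation, output_activation=torch.nn.Identity):
    layers = []
    for j in range(len(sizes)-1):
        # hidden layers get the given activation; the final layer gets
        # output_activation, which is Identity by default
        act = activation if j < len(sizes)-2 else output_activation
        layers += [torch.nn.Linear(sizes[j], sizes[j+1]), act()]
    return torch.nn.Sequential(*layers)

model = mlp([1, 20, 20, 1], torch.nn.ReLU)
print(model)

The printed Sequential should now end with Identity() rather than ReLU(), confirming that the network is no longer constrained to non-negative outputs.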
