I have implemented a very simple neural network in PyTorch:
import math
import torch

def mlp(sizes, activation, output_activation=torch.nn.Identity):
    layers = []
    for j in range(len(sizes)-1):
        act = activation if j < len(sizes)-1 else output_activation
        layers += [torch.nn.Linear(sizes[j], sizes[j+1]), act()]
    return torch.nn.Sequential(*layers)
I want to train the network to perform regression on the function $y=\sin(x)$:
x = torch.linspace(-math.pi, math.pi, 2000, device=device, dtype=dtype)
y = torch.sin(x)
The training code is:
size = [1,20,20,1]
activation = torch.nn.ReLU
model = mlp(size, activation)
optimizer = torch.optim.SGD(model.parameters(), lr=0.002)
n_epoch = 600
mse_loss = torch.nn.MSELoss()
X = x.unsqueeze(-1)
for i in range(n_epoch):
    y_pred = model(X)
    step_loss = mse_loss(y_pred, y)
    optimizer.zero_grad()
    step_loss.backward()
    optimizer.step()
Unfortunately, the network only learns an almost constant function, $y=0$. I have already tried many things, but nothing seems to work. The problem is so simple that I suspect there is an error in the code.
I am not sure if this is the main cause, but the statement
act = activation if j < len(sizes)-1 else output_activation
appears to be logically incorrect. In the loop, j takes values from 0 to len(sizes)-2, so the condition is always true. This means that your network has a ReLU right at the end, and so can only ever give non-negative outputs. This can be corrected by changing that statement to:
act = activation if j < len(sizes)-2 else output_activation
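To make the off-by-one concrete, here is a minimal sketch of the corrected function together with a quick structural check. The helper names `n_layers`, `loop_indices`, `last_module` and `n_relu` are mine and purely illustrative. The last few lines also build the target with the same shape as the model output. That part is an extra assumption on my side rather than something I verified against your run: in the original code `y` has shape (2000,) while `y_pred` has shape (2000, 1), so `MSELoss` would broadcast the two against each other, which may also push the fit toward a constant.

```python
import math
import torch

def mlp(sizes, activation, output_activation=torch.nn.Identity):
    """Build a fully connected network; only hidden layers get `activation`."""
    layers = []
    n_layers = len(sizes) - 1            # number of Linear layers
    for j in range(n_layers):            # j runs from 0 to len(sizes)-2
        # The last Linear layer has index n_layers-1 == len(sizes)-2,
        # so this is the same test as `j < len(sizes)-2`.
        act = activation if j < n_layers - 1 else output_activation
        layers += [torch.nn.Linear(sizes[j], sizes[j + 1]), act()]
    return torch.nn.Sequential(*layers)

sizes = [1, 20, 20, 1]
model = mlp(sizes, torch.nn.ReLU)

# Structural check: which indices the loop visits and what ends the network.
loop_indices = list(range(len(sizes) - 1))
last_module = model[-1]
n_relu = sum(isinstance(m, torch.nn.ReLU) for m in model)
print(loop_indices)                  # [0, 1, 2]
print(type(last_module).__name__)    # Identity
print(n_relu)                        # 2

# Assumption (not part of the fix above): keep input and target the same shape.
X = torch.linspace(-math.pi, math.pi, 2000).unsqueeze(-1)   # shape (2000, 1)
y = torch.sin(X)                                            # shape (2000, 1)
y_pred = model(X)
step_loss = torch.nn.MSELoss()(y_pred, y)
```

With the output activation set to Identity, the final layer is a plain linear map, so the network is at least able to produce the negative half of the sine curve. Whether 600 SGD steps at lr=0.002 are enough to fit it is a separate question that this sketch does not answer.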