Neural network - loss does not converge

Question

This network contains an input layer and an output layer, with no nonlinearities. The output is just a linear combination of the input. I am using a regression loss to train the network. I generated some random 1D test data according to a simple linear function, with Gaussian noise added. The problem is that the loss function doesn't converge to zero.

import numpy as np
import matplotlib.pyplot as plt

n = 100
alp = 1e-4
a0 = np.random.randn(100,1) # Also x
y = 7*a0+3+np.random.normal(0,1,(100,1))

w = np.random.randn(100,100)*0.01
b = np.random.randn(100,1)

def compute_loss(a1,y,w,b):
    return np.sum(np.power(y-w*a1-b,2))/2/n

def gradient_step(w,b,a1,y):

    w -= (alp/n)*np.dot((a1-y),a1.transpose())
    b -= (alp/n)*(a1-y)  
    return w,b

loss_vec = []
num_iterations = 10000

for i in range(num_iterations):

    a1 = np.dot(w,a0)+b
    loss_vec.append(compute_loss(a1,y,w,b))
    w,b = gradient_step(w,b,a1,y)
plt.plot(loss_vec)

Answer 1

Convergence also depends on the value of alpha (alp, the learning rate) you use. I played with your code a bit, and for

alp = 5e-3

I get the following convergence, plotted on a logarithmic x-axis:

plt.semilogx(loss_vec)

[Plot: output of plt.semilogx(loss_vec)]
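To illustrate how strongly the step size affects convergence, here is a minimal sketch. It is not the code from the question: it assumes a scalar weight and bias with the standard mean-squared-error gradient, and the names train_linear and seed are hypothetical. It trains on the same kind of data (slope 7, intercept 3, unit Gaussian noise) with both values of alp.

```python
import numpy as np

def train_linear(alp, num_iterations=10000, n=100, seed=0):
    """Gradient descent on a scalar weight and bias; returns the loss per iteration."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n, 1))
    y = 7 * x + 3 + rng.normal(0, 1, (n, 1))
    w, b = 0.0, 0.0  # assumption: start from zero instead of random values
    loss_vec = []
    for _ in range(num_iterations):
        err = w * x + b - y                      # prediction minus target
        loss_vec.append(float(np.sum(err ** 2) / (2 * n)))
        w -= alp * float(np.sum(err * x)) / n    # gradient of the loss w.r.t. w
        b -= alp * float(np.sum(err)) / n        # gradient of the loss w.r.t. b
    return loss_vec

loss_slow = train_linear(alp=1e-4)
loss_fast = train_linear(alp=5e-3)
print("final loss, alp=1e-4:", loss_slow[-1])
print("final loss, alp=5e-3:", loss_fast[-1])
```

In this sketch the smaller step size is still far from its plateau after 10000 iterations, while the larger one has levelled off. Passing either list to plt.semilogx gives the same kind of plot as above.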

Answer 2

If I understand your code correctly, you only have one weight matrix and one bias vector despite the fact that you have 2 layers. This is odd and might be at least part of your problem.
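On the parameter count: the data in the question come from a line with a single slope (7) and a single intercept (3), so a scalar weight and bias are enough to represent it. The sketch below assumes that two-parameter model rather than the question's 100x100 w, and the helper name half_mse is hypothetical. It evaluates the loss at the generating parameters and at the closed-form least-squares fit from np.polyfit. Because the targets include unit-variance Gaussian noise, both values should sit near 0.5 rather than 0, so a plateau above zero is expected for this data.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
x = rng.standard_normal((n, 1))
y = 7 * x + 3 + rng.normal(0, 1, (n, 1))

def half_mse(w, b):
    """Same loss form as the question: sum of squared errors / 2 / n."""
    return float(np.sum((y - w * x - b) ** 2) / (2 * n))

# Loss at the parameters that generated the data: only the noise remains.
loss_true = half_mse(7.0, 3.0)

# Closed-form least-squares line: the lowest loss any (w, b) can reach on this sample.
w_ls, b_ls = np.polyfit(x.ravel(), y.ravel(), deg=1)
loss_ls = half_mse(w_ls, b_ls)

print("loss at w=7, b=3:", loss_true)
print("least-squares fit:", w_ls, b_ls, "loss:", loss_ls)
```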

2 answers

Answer 1
1 2018-09-16 15:15:53

Answer 2
0 2018-09-16 15:00:55
