
Why will my network not learn?

So I created a convolutional network in TensorFlow, but the accuracy will not change at all. I am trying to get it to tell the difference between triangles and circles. They are different colors and similar sizes. This is the code for the network. Also, when I tried a fully connected network the accuracy was almost 1.

x = tf.placeholder("float", shape=[None, 3072])
y_ = tf.placeholder("float", shape=[None, 2])

W = tf.Variable(tf.zeros([3072, 2]))
b = tf.Variable(tf.zeros([2]))

y = tf.nn.softmax(tf.matmul(x,W) + b)

def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)

def conv2d(x, W):
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

def max_pool_2x2(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
                           strides=[1, 2, 2, 1], padding='SAME')

W_conv1 = weight_variable([4, 4, 3, 32])
b_conv1 = bias_variable([32])

x_image = tf.reshape(x, [-1,32,32,3])

h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
h_pool1 = max_pool_2x2(h_conv1)

W_conv2 = weight_variable([4, 4, 32, 64])
b_conv2 = bias_variable([64])

h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)

W_fc1 = weight_variable([8 * 8 * 64, 1024])
b_fc1 = bias_variable([1024])

h_pool2_flat = tf.reshape(h_pool2, [-1, 8*8*64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

keep_prob = tf.placeholder("float")
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

W_fc2 = weight_variable([1024, 2])
b_fc2 = bias_variable([2])

y_conv=tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)

cross_entropy = -tf.reduce_sum(y_*tf.log(y_conv + 1e-9))
train_step = tf.train.AdamOptimizer(1e-6).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y_conv,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

self.feedin = np.reshape(self.inlist, (-1, 1, 32, 32, 3))
print(self.feedin)
sess.run(tf.initialize_all_variables())
for i in range(10000):
    j = i%int(self.ent)
    if i%100 == 0:
        train_accuracy = accuracy.eval(feed_dict={
            x:self.inlist, y_: self.outListm, keep_prob: 1.0})
        print("step %d, training accuracy %g"%(i, train_accuracy))
    train_step.run(feed_dict={x: self.feedin[j], y_: self.outListm[j], keep_prob: 0.5})

These are two of the images.

[example images: a triangle and a circle]

This is what I used to create self.inlist. I have changed it so that the shape of the image is preserved, but the problem is still there.

name = QtGui.QFileDialog.getOpenFileNames(self, 'Open File')
fname = [str(each) for each in name]
flist = []
dlist = []
self.inlist = [11, 32, 32, 3]
for n, val in enumerate(name):
    flist.append(val)
    img = Image.open(flist[n])
    img.load()
    data = np.asarray(img, dtype = "float32")
    dlist.append(data)
self.inlist = np.concatenate([arr[np.newaxis] for arr in dlist])

For the output labels (self.outListm) I have a list with 2 elements, where the first element is 2 if it is a triangle and the second element is 2 if it is a circle.
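For reference, a minimal sketch of the label layout described above (the helper name is hypothetical, not from the post):

def make_label(is_triangle):
    # encodes the two-element labels as described, with the stated
    # value of 2 in the position of the true class
    return [2.0, 0.0] if is_triangle else [0.0, 2.0]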

You cannot initialize all parameters to zeros (or any constant); this is close to common sense for nearly all kinds of neural networks.

Let's imagine the simplest feed-forward network with all weight matrices initialized to the same constant (including, but not limited to, zero). What's going to happen? No matter what your input vector is, the activations (outputs) of all neurons in the same layer will be the same, which is definitely not what you want. And in your case you initialize them all to zeros, which makes it even worse, since besides the downside above, ReLU is not even differentiable at zero.
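A minimal NumPy sketch (not from the original answer) illustrating the symmetry problem: with every weight set to the same constant, every neuron in a layer produces the same activation for any input, so the neurons can never differentiate from one another.

import numpy as np

x = np.random.randn(4)        # an arbitrary input vector
W = np.full((4, 3), 0.5)      # all weights set to the same constant
h = np.maximum(0.0, x @ W)    # a ReLU layer
print(h)                      # all three activations are identical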

So the best practice for you is to initialize your weight matrix (W) to random values so as to "break the symmetry". You can just do it with random.randn(), but there are many tricks to do this for even better performance, such as Xavier initialization, MSRA initialization, etc. For the ReLU activation function in your case, one thing that might guide your selection among these initialization strategies is that it is better to initialize slightly positive, so that the inputs of the ReLU functions do not start out negative, which might make ReLU units become "dead" ones (gradients being zero forever).
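As a sketch of what that could look like in the question's TF 1.x code (the exact initializer is a choice; tf.contrib.layers.xavier_initializer is one built-in option in that API):

# replace the zero initializers at the top with small random values, e.g.
W = tf.Variable(tf.truncated_normal([3072, 2], stddev=0.1))
b = tf.Variable(tf.constant(0.1, shape=[2]))  # slightly positive bias

# or use a named scheme such as Xavier initialization (TF 1.x):
W_fc1 = tf.get_variable("W_fc1", shape=[8 * 8 * 64, 1024],
                        initializer=tf.contrib.layers.xavier_initializer())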

Your learning rate of 1e-6 is too low, so the accuracy improves very little on each training step.
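For instance, the optimizer line in the question could use a larger step size; 1e-4 is the value used in the TensorFlow MNIST tutorial this code appears to be based on:

train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)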

Like many people have said, you cannot initialize weight parameters with zeros: the weights would then always update with the same numerical values.

Therefore, we initialize with random values. In other comments you asked how to do this. This piece is already in your code: call your function weight_variable to obtain a weight matrix with random initialization. Or, if you want to do it inline:

tf.Variable(tf.truncated_normal(shape, stddev=0.1))
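Applied to the zero-initialized variables at the top of the question, that becomes:

W = weight_variable([3072, 2])
b = bias_variable([2])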
