
Simple Feedforward Neural Network with TensorFlow won't learn

I am trying to build a simple neural network with TensorFlow. The goal is to find the center of a rectangle in a 32 × 32 pixel image. The rectangle is described by five vectors: the first is the position vector, the other four are direction vectors that make up the rectangle's edges. Each vector has two components (x and y).

[image: the example rectangle drawn on the 32 × 32 pixel grid]

The corresponding input for this image would be (2,5)(0,4)(6,0)(0,-4)(-6,0). The center (and therefore the desired output) is located at (5,7).
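For reference, the center is just the position vector plus half of the two adjacent edge vectors; a quick sanity check with numpy (only for illustration, not part of the network):

import numpy as np

# Position vector and the four edge (direction) vectors from the example above.
position = np.array([2, 5])
edges = np.array([[0, 4], [6, 0], [0, -4], [-6, 0]])

# The center sits at the position corner plus half of the two adjacent edges.
center = position + (edges[0] + edges[1]) / 2
print(center)  # [5. 7.]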

The code I came up with looks like the following:

import tensorflow as tf
import numpy as np
import Rectangle_Records

def init_weights(shape):
    """ Weight initialization """
    weights = tf.random_normal(shape, stddev=0.1)
    return tf.Variable(weights)

def forwardprop(x, w_1, w_2):
    """ Forward-propagation """
    h = tf.nn.sigmoid(tf.matmul(x, w_1))
    y_predict = tf.matmul(h, w_2)
    return y_predict

def main():
    x_size = 10
    y_size = 2
    h_1_size = 256

    # Prepare input data
    input_data = Rectangle_Records.DataSet()

    x = tf.placeholder(tf.float32, shape=[None, x_size])
    y_label = tf.placeholder(tf.float32, shape=[None, y_size])

    # Weight initializations
    w_1 = init_weights((x_size, h_1_size))
    w_2 = init_weights((h_1_size, y_size))

    # Forward propagation
    y_predict = forwardprop(x, w_1, w_2)

    # Backward propagation
    cost = tf.reduce_mean(tf.square(y_predict - y_label))

    updates = tf.train.GradientDescentOptimizer(0.01).minimize(cost)

    # Run
    sess = tf.Session()
    init = tf.global_variables_initializer()
    sess.run(init)

    for i in range(200):
        batch = input_data.next_batch(10)
        sess.run(updates, feed_dict={x: batch[0], y_label: batch[1]})

    sess.close()

if __name__ == "__main__":
    main()

Sadly, the network won't learn properly. The result is too far off. For example, [[ 3.74561882, 3.70766664]] when it should be around [[ 3., 7.]]. What am I doing wrong?

The main problem is that your whole training runs for only one epoch, which is not enough training. Try the following changes:

sess = tf.Session()
init = tf.global_variables_initializer()
sess.run(init)
for j in range(30):
    input_data = Rectangle_Records.DataSet()
    for i in range(200):
        batch = input_data.next_batch(10)
        loss, _ = sess.run([cost,updates], feed_dict = {x: batch[0], y_label: batch[1]})

    pred = sess.run(y_predict, feed_dict={x: batch[0]})
    print('Cost:', loss)
    print('pred:', pred)
    print('actual:', batch[1])
sess.close()

Change your optimizer to a momentum-based one, such as Adam, for faster convergence: tf.train.AdamOptimizer(0.01).minimize(cost)

You have also forgotten to add biases.

def init_bias(shape):
    biases = tf.random_normal(shape)
    return tf.Variable(biases)

def forwardprop(x, w_1, w_2, b_1, b_2):
    """ Forward-propagation """
    h = tf.nn.sigmoid(tf.matmul(x, w_1) + b_1)
    y_predict = tf.matmul(h, w_2) + b_2
    return y_predict

Inside main(), change the setup to this:

w_1 = init_weights((x_size, h_1_size))
w_2 = init_weights((h_1_size, y_size))
b_1 = init_bias((h_1_size,))
b_2 = init_bias((y_size,))

# Forward propagation
y_predict = forwardprop(x, w_1, w_2, b_1, b_2)

This will give you much better accuracy. You can then try adding more layers, different activation functions, etc., as mentioned above, to improve it further.
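As a rough sketch of "more layers" and "different activation functions" (the names forwardprop_deeper and h_2_size are only for this illustration, reusing init_weights and init_bias from above):

def forwardprop_deeper(x, w_1, w_2, w_3, b_1, b_2, b_3):
    """ Two ReLU hidden layers, linear output layer """
    h_1 = tf.nn.relu(tf.matmul(x, w_1) + b_1)
    h_2 = tf.nn.relu(tf.matmul(h_1, w_2) + b_2)
    y_predict = tf.matmul(h_2, w_3) + b_3
    return y_predict

# Inside main(), with e.g. h_2_size = 128:
w_1 = init_weights((x_size, h_1_size))
w_2 = init_weights((h_1_size, h_2_size))
w_3 = init_weights((h_2_size, y_size))
b_1 = init_bias((h_1_size,))
b_2 = init_bias((h_2_size,))
b_3 = init_bias((y_size,))
y_predict = forwardprop_deeper(x, w_1, w_2, w_3, b_1, b_2, b_3)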

There are lots of ways to improve the performance of a neural net. Try one or more of the following:

  1. add more layers, or more nodes per layer
  2. change your activation function (I've found relu to be quite effective)
  3. use an ensemble of NNs where each NN gets a vote weighted by its R^2 score
  4. bring in more training data
  5. perform a grid search to optimize parameters (a minimal sketch follows this list)
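For the grid-search point, a minimal sketch could look like the following. It assumes a hypothetical helper train_and_evaluate(learning_rate, hidden_size) that builds the model above, trains it on the rectangle data, and returns a validation cost; the helper and the parameter values are illustrative, not part of the original code.

import itertools

learning_rates = [0.001, 0.01, 0.1]
hidden_sizes = [64, 128, 256]

best = None
for lr, h in itertools.product(learning_rates, hidden_sizes):
    # train_and_evaluate is a hypothetical helper: build the graph,
    # train on the rectangle data, return the final validation cost.
    val_cost = train_and_evaluate(learning_rate=lr, hidden_size=h)
    if best is None or val_cost < best[0]:
        best = (val_cost, lr, h)

print('best cost %.4f with learning_rate=%s, hidden_size=%s' % best)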

The problem your network is trying to learn looks so easy that even a single-layer, two-neuron perceptron should be able to solve it. A ReLU activation function could be the best choice, since the problem is linear.
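As a rough illustration of that point (reusing the question's x, y_label, x_size and y_size; the rest of the training loop stays the same), a purely linear model would be:

# Linear model: no hidden layer, no activation function.
W = tf.Variable(tf.zeros([x_size, y_size]))
b = tf.Variable(tf.zeros([y_size]))
y_predict = tf.matmul(x, W) + b
cost = tf.reduce_mean(tf.square(y_predict - y_label))
updates = tf.train.GradientDescentOptimizer(0.1).minimize(cost)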

200 iterations isn't much. Try more iterations, like 1000 or more. Print the cost value every 100 iterations, for example, or gather the data and plot it at the end to see how the learning progressed.

import matplotlib.pyplot as plt
cost_history = np.zeros(learning_steps)
...
for epoch in range(learning_steps):
  ...
  cost_history[epoch] = sess.run(cost, feed_dict={x: batch[0], y_label: batch[1]})

plt.plot(cost_history, 'r', label='Cost fn')
plt.yscale('log')
plt.legend()
plt.show()

If the line goes down, that's fine. If it's very rough and doesn't descend, the learning rate might be too large. In your case the learning rate is quite low, which is why you don't get good results after as few as 200 iterations. Try a larger value instead, like 0.1 or even more; the NN may still converge. And watch the learning curve.
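For example, the learning rate is just the optimizer's first argument in the question's code, so trying a larger value is a one-line change:

# Same setup, larger learning rate (0.1 instead of 0.01):
updates = tf.train.GradientDescentOptimizer(0.1).minimize(cost)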
