Low accuracy in Deep Neural Network with Tensorflow

I am following the third Jupyter notebook of the Tensorflow examples.

Running problem 4, I tried to implement a function that automatically builds a number of hidden layers, without manually coding the configuration of each layer.

However, the model trains with very low accuracy (10%), so I suspected that such a function might not be compatible with the Tensorflow graph builder.

My code is the following:

import tensorflow as tf

def hlayers(n_layers, n_nodes, i_size, a, r=0, keep_p=1):
  """Stacks n_layers fully connected ReLU layers with dropout.

  Returns the final activation tensor and the accumulated L2 regularizer.
  """
  for i in range(n_layers):
    if i > 0:
      # From the second layer on, the input size equals the previous width.
      i_size = n_nodes
    w = tf.Variable(tf.truncated_normal([i_size, n_nodes]), name=f'W{i}')
    b = tf.Variable(tf.zeros([n_nodes]), name=f'b{i}')
    pa = tf.nn.relu(tf.add(tf.matmul(a, w), b))
    a = tf.nn.dropout(pa, keep_prob=keep_p, name=f'a{i}')
    r += tf.nn.l2_loss(w, name=f'r{i}')

  return a, r

batch_size = 128
num_nodes = 1024
beta = 0.01

graph = tf.Graph()
with graph.as_default():

  # Input data. For the training data, we use a placeholder that will be fed
  # at run time with a training minibatch.
  tf_train_dataset = tf.placeholder(
    tf.float32,
    shape=(batch_size, image_size * image_size),
    name='Dataset')
  tf_train_labels = tf.placeholder(
    tf.float32,
    shape=(batch_size, num_labels),
    name='Labels')
  tf_valid_dataset = tf.constant(valid_dataset)
  tf_test_dataset = tf.constant(test_dataset)

  keep_p = tf.placeholder(tf.float32, name='KeepProb')

  # Hidden layers.
  a, r = hlayers(
    n_layers=3,
    n_nodes=num_nodes,
    i_size=image_size * image_size,
    a=tf_train_dataset,
    keep_p=keep_p)

  # Output layer.
  wo = tf.Variable(tf.truncated_normal([num_nodes, num_labels]), name='Wo')
  bo = tf.Variable(tf.zeros([num_labels]), name='bo')
  logits = tf.add(tf.matmul(a, wo), bo, name='Logits')
  loss = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(
      labels=tf_train_labels, logits=logits))

  # Regularizer.
  regularizers = tf.add(r, tf.nn.l2_loss(wo))
  loss = tf.reduce_mean(loss + beta * regularizers, name='Loss')

  # Optimizer.
  optimizer = tf.train.GradientDescentOptimizer(0.5).minimize(loss)

  # Predictions for the training, validation, and test data.
  train_prediction = tf.nn.softmax(logits)

  a, _ = hlayers(
    n_layers=3,
    n_nodes=num_nodes,
    i_size=image_size * image_size,
    a=tf_valid_dataset)
  valid_prediction = tf.nn.softmax(tf.add(tf.matmul(a, wo), bo))

  a, _ = hlayers(
    n_layers=3,
    n_nodes=num_nodes,
    i_size=image_size * image_size,
    a=tf_test_dataset)
  test_prediction = tf.nn.softmax(tf.add(tf.matmul(a, wo), bo))

num_steps = 3001

with tf.Session(graph=graph) as session:
  tf.global_variables_initializer().run()
  print("Initialized")
  for step in range(num_steps):
    # Pick an offset within the training data, which has been randomized.
    # Note: we could use better randomization across epochs.
    offset = (step * batch_size) % (train_labels.shape[0] - batch_size)
    # Generate a minibatch.
    batch_data = train_dataset[offset:(offset + batch_size), :]
    batch_labels = train_labels[offset:(offset + batch_size), :]
    # Prepare a dictionary telling the session where to feed the minibatch.
    # The key of the dictionary is the placeholder node of the graph to be fed,
    # and the value is the numpy array to feed to it.
    feed_dict = {
      tf_train_dataset : batch_data,
      tf_train_labels : batch_labels,
      keep_p : 0.5}
    _, l, predictions = session.run(
      [optimizer, loss, train_prediction], feed_dict=feed_dict)
    if (step % 500 == 0):
      print("Minibatch loss at step %d: %f" % (step, l))
      print("Minibatch accuracy: %.1f%%" % accuracy(predictions, batch_labels))
      print("Validation accuracy: %.1f%%" % accuracy(
        valid_prediction.eval(), valid_labels))
  print("Test accuracy: %.1f%%" % accuracy(test_prediction.eval(), test_labels))

With more layers, the accumulated L2 penalty grows, so the effective weight regularization is stronger. Therefore you could try reducing the regularization strength and see if the accuracy increases.
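For example, a quick way to test this suggestion (the value below is an illustrative guess, not a tuned setting) is to weaken the L2 penalty and rerun training:

beta = 0.001  # was 0.01; a 10x weaker penalty, roughly offsetting the extra layers

# The rest of the graph is unchanged; the penalty enters the loss as before:
loss = tf.reduce_mean(loss + beta * regularizers, name='Loss')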

The problem was caused by NaN values in the loss function and weights, as described in this question.
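A quick diagnostic (not in the original notebook) is to test the minibatch loss right after the existing session.run call in the training loop:

import numpy as np  # at the top of the script

if np.isnan(l):
  # With unscaled truncated-normal weights, the ReLU pre-activations grow
  # with depth and the cross-entropy can overflow within a few steps.
  print("Loss became NaN at step %d" % step)
  break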

By introducing a different standard deviation for each weight tensor based on its dimensions (as described in this answer, and originally in He et al. [1]), I was able to train the network successfully.
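A minimal sketch of that change applied to the hlayers function above: each weight matrix is drawn with stddev = sqrt(2 / fan_in), the ReLU-specific scaling from He et al. [1].

import math

def hlayers(n_layers, n_nodes, i_size, a, r=0, keep_p=1):
  for i in range(n_layers):
    if i > 0:
      i_size = n_nodes
    # He initialization: scale the stddev by the fan-in so that ReLU
    # activations neither explode nor vanish as depth increases.
    stddev = math.sqrt(2.0 / i_size)
    w = tf.Variable(
      tf.truncated_normal([i_size, n_nodes], stddev=stddev), name=f'W{i}')
    b = tf.Variable(tf.zeros([n_nodes]), name=f'b{i}')
    pa = tf.nn.relu(tf.add(tf.matmul(a, w), b))
    a = tf.nn.dropout(pa, keep_prob=keep_p, name=f'a{i}')
    r += tf.nn.l2_loss(w, name=f'r{i}')
  return a, r

The output-layer weights wo would presumably benefit from the same scaling, i.e. tf.truncated_normal([num_nodes, num_labels], stddev=math.sqrt(2.0 / num_nodes)).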

[1]: He et al. (2015) Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification
