TensorFlow MNIST with TFRecord and Dataset: low accuracy

I converted the MNIST dataset using the script here: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/how_tos/reading_data/convert_to_records.py

Below is the code that I use to read the TFRecord, build the model, and train.

import tensorflow as tf

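epoch = 20
BATCH_SIZE = 32  # missing from the posted listing; 32 is consistent with the batch counts printed below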

n_hidden_1 = 256 # 1st layer number of neurons
n_hidden_2 = 256 # 2nd layer number of neurons
num_input = 784 # MNIST data input (img shape: 28*28)
num_classes = 10 # MNIST total classes (0-9 digits)

def parse_func(serialized_data):
    keys_to_features = {'image_raw': tf.FixedLenFeature([],tf.string),
                        'label': tf.FixedLenFeature([], tf.int64)}

    parsed_features = tf.parse_single_example(serialized_data, keys_to_features)
    prices = tf.decode_raw(parsed_features['image_raw'],tf.float32)
    label = tf.cast(parsed_features['label'], tf.int32)
    return prices,tf.one_hot(label - 1, 10)

def input_fn(filenames):
    dataset = tf.data.TFRecordDataset(filenames=filenames)
    dataset = dataset.map(parse_func,num_parallel_calls=8)
    dataset = dataset.batch(BATCH_SIZE).prefetch(50)
    # dataset = dataset.shuffle(2000)

    return dataset.make_initializable_iterator()

weights = {
    'h1': tf.Variable(tf.random_normal([num_input, n_hidden_1])),
    'h2': tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2])),
    'out': tf.Variable(tf.random_normal([n_hidden_2, num_classes]))
}
biases = {
    'b1': tf.Variable(tf.random_normal([n_hidden_1])),
    'b2': tf.Variable(tf.random_normal([n_hidden_2])),
    'out': tf.Variable(tf.random_normal([num_classes]))
}

# Create model
def neural_net(x):
    # Hidden fully connected layer with 256 neurons
    layer_1 = tf.add(tf.matmul(x, weights['h1']), biases['b1'])
    layer_1 = tf.nn.relu(layer_1)
    # Hidden fully connected layer with 256 neurons
    layer_2 = tf.add(tf.matmul(layer_1, weights['h2']), biases['b2'])
    layer_2 = tf.nn.relu(layer_2)
    # Output fully connected layer with a neuron for each class
    out_layer = tf.matmul(layer_2, weights['out']) + biases['out']
    return out_layer

def inference(input):
    input = tf.reshape(input,[-1,784])
    dense = tf.layers.dense(inputs=input, units=1024, activation=tf.nn.relu)

    # Logits Layer
    output = tf.layers.dense(inputs=dense, units=10)
    return output

train_iter = input_fn('train_mnist.tfrecords')
valid_iter = input_fn('validation_mnist.tfrecords')

is_training  = tf.placeholder(shape=[],dtype=tf.bool)

img,labels = tf.cond(is_training,lambda :train_iter.get_next(),lambda :valid_iter.get_next())
# img,labels = train_iter.get_next()

logits = neural_net(img)
loss_op = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(logits=logits, labels=labels))
train_op = tf.train.AdamOptimizer().minimize(loss_op)

prediction = tf.nn.softmax(logits)
correct_pred = tf.equal(tf.argmax(prediction, 1), tf.argmax(labels, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, "float"))

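with tf.Session() as sess:
    # Variables must be initialized before the first training step
    sess.run(tf.global_variables_initializer())
    for e in range(epoch):
        # Initializable iterators have to be (re)started at the beginning of every epoch
        sess.run([train_iter.initializer, valid_iter.initializer])
        epoch_loss = 0
        count = 0
        while True:
            try:
                count +=1
                _,c = sess.run([train_op,loss_op],feed_dict={is_training:True})
                epoch_loss += c
            except tf.errors.OutOfRangeError:
                # Training data exhausted: the epoch is over
                break

        print('Epoch', e, ' completed out of ', epoch, ' Epoch loss: ',epoch_loss,' count :',count)

        total_acc = 0
        count = 0
        while True:
            try:
                count += 1
                acc = sess.run(accuracy,feed_dict={is_training:False})
                total_acc += acc
            except tf.errors.OutOfRangeError:
                # Validation data exhausted
                break

        print('Accuracy: ', total_acc/count,' count ',count)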

I don't know if I did anything wrong, but the loss and accuracy do not improve after a few epochs. I tested the model above the traditional way, with the feed_dict method. Everything worked fine, and I could reach 85% accuracy with that model. Here is the output of the code above:

Epoch 0  completed out of  20  Epoch loss:  295472940.19140625  count : 1720
Accuracy:  0.5727848101265823  count  158
Epoch 1  completed out of  20  Epoch loss:  2170057598.328125  count : 1720
Accuracy:  0.22231012658227847  count  158
Epoch 2  completed out of  20  Epoch loss:  6578130587.9375  count : 1720
Accuracy:  0.29944620253164556  count  158
Epoch 3  completed out of  20  Epoch loss:  13321823489.0  count : 1720
Accuracy:  0.13310917721518986  count  158
Epoch 4  completed out of  20  Epoch loss:  22460952288.75  count : 1720
Accuracy:  0.20787183544303797  count  158
Epoch 5  completed out of  20  Epoch loss:  34615459125.0  count : 1720
Accuracy:  0.28560126582278483  count  158
Epoch 6  completed out of  20  Epoch loss:  50057282083.0  count : 1720
Accuracy:  0.11748417721518987  count  158  

I checked the output of the Dataset. Everything looks normal and has the correct shape. Can somebody point out what I did wrong here?

EDIT: This is the working code, which uses the traditional feed_dict method:

# Import MNIST data
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)

import tensorflow as tf

epoch = 5
BATCH_SIZE = 32  # missing from the posted listing; assumed to match the Dataset version

# Network Parameters
n_hidden_1 = 256 # 1st layer number of neurons
n_hidden_2 = 256 # 2nd layer number of neurons
num_input = 784 # MNIST data input (img shape: 28*28)
num_classes = 10 # MNIST total classes (0-9 digits)

# tf Graph input
X = tf.placeholder("float", [None, num_input])
Y = tf.placeholder("float", [None, num_classes])

# Store layers weight & bias
weights = {
    'h1': tf.Variable(tf.random_normal([num_input, n_hidden_1])),
    'h2': tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2])),
    'out': tf.Variable(tf.random_normal([n_hidden_2, num_classes]))
}
biases = {
    'b1': tf.Variable(tf.random_normal([n_hidden_1])),
    'b2': tf.Variable(tf.random_normal([n_hidden_2])),
    'out': tf.Variable(tf.random_normal([num_classes]))
}

# Create model
def neural_net(x):
    # Hidden fully connected layer with 256 neurons
    layer_1 = tf.add(tf.matmul(x, weights['h1']), biases['b1'])
    # Hidden fully connected layer with 256 neurons
    layer_2 = tf.add(tf.matmul(layer_1, weights['h2']), biases['b2'])
    # Output fully connected layer with a neuron for each class
    out_layer = tf.matmul(layer_2, weights['out']) + biases['out']
    return out_layer

# Construct model
logits = neural_net(X)
loss_op = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(logits=logits, labels=Y))
train_op = tf.train.AdamOptimizer().minimize(loss_op)

prediction = tf.nn.softmax(logits)
correct_pred = tf.equal(tf.argmax(prediction, 1), tf.argmax(Y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, "float"))

# Start training
with tf.Session() as sess:

    # Run the initializer
    sess.run(tf.global_variables_initializer())

    for e in range(epoch):
        epoch_loss = 0
        for _ in range(int(mnist.train.num_examples / BATCH_SIZE)):
            epoch_x, epoch_y = mnist.train.next_batch(BATCH_SIZE)
            _, c = sess.run([train_op, loss_op], feed_dict={X: epoch_x, Y: epoch_y})
            epoch_loss += c

        print('Epoch', e, ' completed out of ', epoch, ' Epoch loss: ', epoch_loss)

        # Calculate accuracy for MNIST test images
        print("Testing Accuracy:",sess.run(accuracy, feed_dict={X: mnist.test.images,Y: mnist.test.labels}))

Without seeing your tfrecords files it's difficult to say for sure, but if your data is sorted by label (i.e. the first 10% of labels are 0s, the second 10% are 1s, etc.), then failing to shuffle will have a significant effect on your results. 57% accuracy after a single epoch also seems quite surprising (though I've never looked at results at that point), so it's possible your evaluation metric (accuracy) isn't correct (though I can't see anything clearly wrong).
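
To illustrate the shuffling point, here is a minimal sketch in plain Python (no TensorFlow). The toy label stream, the batch size of 50, and the helper name `batches` are all made up for the illustration and are not taken from the question. It counts how many distinct classes each batch contains with and without shuffling.

```python
import random

def batches(items, batch_size):
    """Split a list into consecutive batches of at most batch_size items."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]

# Hypothetical toy label stream: 10 classes, 100 examples each, sorted by label.
labels = [digit for digit in range(10) for _ in range(100)]

sorted_batches = batches(labels, 50)

shuffled = labels[:]
random.Random(0).shuffle(shuffled)  # fixed seed so the run is repeatable
shuffled_batches = batches(shuffled, 50)

# Number of distinct classes seen in each batch
sorted_classes = [len(set(batch)) for batch in sorted_batches]
shuffled_classes = [len(set(batch)) for batch in shuffled_batches]

print("sorted  :", sorted_classes[:5])
print("shuffled:", shuffled_classes[:5])
```

In the sorted stream every batch contains a single class, so each gradient step pulls the model towards predicting only that digit. Note also that in a tf.data pipeline, a shuffle placed after batch reorders whole batches rather than individual examples, so shuffle should normally come before batch.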

If you haven't visualized your inputs (i.e. the actual images and labels, not just the shape), definitely do that as a first step.

Quite apart from your question, one clear weakness of your code is the lack of non-linearities: a linear layer followed immediately by a linear layer is equivalent to a single linear layer. To get more complex behaviour and better results, add a non-linearity such as tf.nn.relu after each layer except the last, e.g.

layer_1 = tf.add(tf.matmul(x, weights['h1']), biases['b1'])
layer_1 = tf.nn.relu(layer_1)
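
The claim that stacked linear layers collapse into one can be checked numerically. The following is a minimal sketch in plain Python; the small integer matrices and the helper names (`matmul`, `add_bias`, `relu`) are hypothetical and chosen only so that the arithmetic is exact.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def add_bias(m, bias):
    """Add a bias vector to every row of a matrix."""
    return [[value + bias[j] for j, value in enumerate(row)] for row in m]

def relu(m):
    """Element-wise max(0, value)."""
    return [[max(0, value) for value in row] for row in m]

# Hypothetical toy values (a batch of two 2-dimensional inputs)
x = [[1, -2], [3, 4]]
W1, b1 = [[1, 0, -1], [2, 1, 0]], [1, -1, 0]
W2, b2 = [[1, 2], [0, -1], [3, 1]], [0, 5]

# Two linear layers applied one after the other
two_layers = add_bias(matmul(add_bias(matmul(x, W1), b1), W2), b2)

# The single linear layer they are equivalent to:
# W = W1 @ W2 and b = b1 @ W2 + b2
W = matmul(W1, W2)
b = add_bias(matmul([b1], W2), b2)[0]
one_layer = add_bias(matmul(x, W), b)

# Same two layers with a ReLU in between
with_relu = add_bias(matmul(relu(add_bias(matmul(x, W1), b1)), W2), b2)

print(two_layers == one_layer)  # True
print(with_relu == one_layer)   # False
```

Without the ReLU the two-layer network computes exactly the same function as one linear layer, so the extra parameters add no expressive power.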

Finally, prefetching a large number of dataset elements defeats the purpose of prefetching. Because prefetch comes after batch in your pipeline, each prefetched element is a whole batch, so prefetch(50) buffers 50 batches. A buffer of 1 or 2 is generally enough.

@Thien, I downloaded all your files and ran them to generate the tfrecords and then load them. I inspected your tfrecords, and the image batch comes back with shape (32, 196), which is 14x14 per image, not 28x28. I then used matplotlib to look at the images: they don't look like digits at all and don't look like the original MNIST data. Your encoding/decoding into tfrecords is the problem. Consider writing an encoding function and a decoding function for your tfrecords, and then testing that tfdecode(tfencode(a)) == a.

    x,y = train_iter.get_next()
    a = sess.run(x)
    import matplotlib.pyplot as plt
    plt.imshow( a[0].reshape(14,14) )
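
A 14x14 result is what one would expect if 784 one-byte pixels are decoded as 4-byte floats, since 784 / 4 = 196 = 14x14. The sketch below reproduces that effect and the suggested round-trip test with Python's standard `array` module instead of TFRecords. It assumes the images are written as raw uint8 bytes, which is what the linked conversion script appears to do; `encode` and `decode` are hypothetical helper names, not TensorFlow functions.

```python
from array import array

def encode(pixels):
    """Serialize an array of pixel values to raw bytes (hypothetical helper)."""
    return pixels.tobytes()

def decode(raw, typecode):
    """Rebuild an array from raw bytes, interpreting them with the given typecode."""
    out = array(typecode)
    out.frombytes(raw)
    return out

# Assumption: one 28x28 image stored as 784 unsigned bytes ('B' = uint8).
image = array('B', [i % 256 for i in range(28 * 28)])
raw = encode(image)

same_dtype = decode(raw, 'B')   # decoded with the dtype used for writing
wrong_dtype = decode(raw, 'f')  # decoded as 4-byte floats instead

print(same_dtype == image)                # True: the round trip succeeds
print(len(same_dtype), len(wrong_dtype))  # 784 196 where a C float is 4 bytes
```

If this is indeed the cause, decoding with the same dtype that was used for writing (and converting to float afterwards) would restore the 784-value images; a round-trip test of this kind catches the mismatch before any training is run.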


I found my mistake. In the parse function, I converted the label to a one-hot vector using

tf.one_hot(label - 1, 10)

It should be

tf.one_hot(label, 10)
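
To see what the off-by-one does to the targets, here is a small plain-Python stand-in for tf.one_hot on a scalar index. The function below is a hypothetical re-implementation for illustration only; it mirrors the TensorFlow behaviour that an index outside [0, depth) produces an all-zero vector.

```python
def one_hot(index, depth):
    """Plain-Python stand-in for tf.one_hot on a scalar index.

    Indices outside [0, depth) yield an all-zero vector.
    """
    return [1.0 if position == index else 0.0 for position in range(depth)]

for label in (0, 1, 9):
    shifted = one_hot(label - 1, 10)  # the version from the question
    correct = one_hot(label, 10)      # the fixed version
    print(label, "shifted:", shifted)
    print(label, "correct:", correct)
```

With label - 1, the digit 0 maps to index -1 and gets an all-zero target, which contributes nothing to the softmax cross-entropy loss, and every other digit is assigned the class one below its true value. MNIST labels already run from 0 to 9, so no shift is needed.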

