
training accuracy drops in tensorflow

I am trying to create a model for character recognition. The model works fine on a 28*28 dataset with the characters 0-9, but if I change it to 64*64 with characters ranging over 0-9, a-z, A-Z, the training accuracy drops. It climbs during the iterations up to about 0.3 and then stays there. I also tried training with a different dataset, but the same thing happens. Changing the learning rate to 0.001 does not help either. Can anyone tell me what the problem is?

import numpy as np
from PIL import Image
import matplotlib.pyplot as plt
import random as ran
import os
import tensorflow as tf

def TRAIN_SIZE(num):
    images = np.load("data/train/images64.npy").reshape([2852,4096])
    labels = np.load("data/train/labels.npy")
    print ('Total Training Images in Dataset = ' + str(images.shape))
    print ('--------------------------------------------------')
    x_train = images[:num,:]
    print ('x_train Examples Loaded = ' + str(x_train.shape))
    y_train = labels[:num,:]
    print ('y_train Examples Loaded = ' + str(y_train.shape))
    print('')
    return x_train, y_train

def TEST_SIZE(num):
    images = np.load("data/test/images64.npy").reshape([558,4096])
    labels = np.load("data/test/labels.npy")
    print ('Total testing Images in Dataset = ' + str(images.shape))
    print ('--------------------------------------------------')
    x_test = images[:num,:]
    print ('x_test Examples Loaded = ' + str(x_test.shape))
    y_test = labels[:num,:]
    print ('y_test Examples Loaded = ' + str(y_test.shape))
    print('')
    return x_test, y_test

def display_digit(num):
    # print(y_train[num])
    label = y_train[num].argmax(axis=0)
    image = x_train[num].reshape([64,64])
    # plt.axis("off")
    plt.title('Example: %d  Label: %d' % (num, label))
    plt.imshow(image, cmap=plt.get_cmap('gray_r'))
    plt.show()

def display_mult_flat(start, stop):
    images = x_train[start].reshape([1,4096])
    for i in range(start+1,stop):
        images = np.concatenate((images, x_train[i].reshape([1,4096])))
    plt.imshow(images, cmap=plt.get_cmap('gray_r'))
    plt.show()

def get_char(a):
    if(a<10):
        return a
    elif(a>=10 and a<36):
        return chr(a+55)
    else:
        return chr(a+61)

x_train, y_train = TRAIN_SIZE(2850)
x_test, y_test = TRAIN_SIZE(1900)

x = tf.placeholder(tf.float32, shape=[None, 4096])           
y_ = tf.placeholder(tf.float32, shape=[None, 62])
W = tf.Variable(tf.zeros([4096,62]))
b = tf.Variable(tf.zeros([62]))
y = tf.nn.softmax(tf.matmul(x,W) + b)

with tf.Session() as sess:

    # x_test = x_test[1400:,:]
    # y_test = y_test[1400:,:]
    x_test, y_test =TEST_SIZE(400)
    LEARNING_RATE = 0.2
    TRAIN_STEPS = 1000

    sess.run(tf.global_variables_initializer())
    cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))
    training = tf.train.GradientDescentOptimizer(LEARNING_RATE).minimize(cross_entropy)
    correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

    for i in range(TRAIN_STEPS+1):
        sess.run(training, feed_dict={x: x_train, y_: y_train}) 
        if i%100 == 0:
            print('Training Step:' + str(i) + '  Accuracy =  ' + str(sess.run(accuracy, feed_dict={x: x_test, y_: y_test})) + '  Loss = ' + str(sess.run(cross_entropy, {x: x_train, y_: y_train})))

    savedPath = tf.train.Saver().save(sess, "/tmp/model.ckpt")
    print("Model saved at: " ,savedPath)

You are trying to classify 62 different digits and characters, but you are doing it with a single fully connected layer. Your model simply does not have enough parameters for that task. In other words, you are underfitting the data. So either extend your network by adding parameters (layers), and/or use a CNN, which generally performs well on image-classification tasks.
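For reference, a minimal sketch of what such a CNN could look like, written in the same TensorFlow 1.x placeholder style as the question's code. The layer sizes (32 and 64 filters, a 1024-unit fully connected layer) and the Adam learning rate are illustrative assumptions, not values from the question:

import tensorflow as tf

x = tf.placeholder(tf.float32, shape=[None, 4096])
y_ = tf.placeholder(tf.float32, shape=[None, 62])
x_image = tf.reshape(x, [-1, 64, 64, 1])          # back to 64x64 grayscale images

# two conv + max-pool blocks, then a fully connected layer and the 62-way output
conv1 = tf.layers.conv2d(x_image, filters=32, kernel_size=5, padding='same', activation=tf.nn.relu)
pool1 = tf.layers.max_pooling2d(conv1, pool_size=2, strides=2)   # 64x64 -> 32x32
conv2 = tf.layers.conv2d(pool1, filters=64, kernel_size=5, padding='same', activation=tf.nn.relu)
pool2 = tf.layers.max_pooling2d(conv2, pool_size=2, strides=2)   # 32x32 -> 16x16
flat = tf.reshape(pool2, [-1, 16 * 16 * 64])
fc1 = tf.layers.dense(flat, 1024, activation=tf.nn.relu)
logits = tf.layers.dense(fc1, 62)

# numerically stable cross-entropy on the logits instead of log(softmax(...))
cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits_v2(labels=y_, logits=logits))
training = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(logits, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

Training would then proceed the same way as in the question, feeding x and y_ through sess.run, ideally in mini-batches rather than the whole training set at once.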

Try different CNN models, such as Inception v1, v2, v3, AlexNet, etc.
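If you want to reuse a published architecture rather than hand-building layers, a minimal sketch (assuming tf.keras.applications is available in the installed TensorFlow version) could look like the following. MobileNetV2 is used here purely as a stand-in that accepts 64x64 inputs; InceptionV3 in tf.keras.applications requires inputs of at least 75x75, so the images would have to be resized first:

import numpy as np
import tensorflow as tf

# load the same data as the question and reshape it into image tensors
x_train = np.load("data/train/images64.npy").reshape([-1, 64, 64, 1]).astype("float32")
y_train = np.load("data/train/labels.npy")
x_train_rgb = np.repeat(x_train, 3, axis=-1)      # published models expect 3 channels

base = tf.keras.applications.MobileNetV2(input_shape=(64, 64, 3), include_top=False,
                                         weights=None, pooling='avg')
outputs = tf.keras.layers.Dense(62, activation='softmax')(base.output)
model = tf.keras.Model(base.input, outputs)

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(x_train_rgb, y_train, epochs=10, batch_size=64)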
