
How can I make the training error of a deep learning model converge?

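My task is to build a deep learning model that, given an image, returns the coordinates of a particular feature. Visually, the feature point looks like _ | |. I implemented it by treating the point where the lower line segment meets the central line segment as the ground truth.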

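Detecting this point with image processing and computer vision algorithms works well, so I started studying deep learning and tried it on the same problem. However, I have no intuition for it yet, and the results are not as good as I expected.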

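The approximate model structure is: an input image of [140, 240], 2x2 max pooling, dropout with probability 0.5, and locally connected (convolutional) layers with 3x3 weights forming [8, 12, 32, 48] channels; the fully connected layers go 62 -> 2 (output). Weights use random normal initialization, the optimizer is Adam, and the learning rate is 0.001. There are 3,500 samples in total, of which only 10% are used as the test set.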
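As a rough check of the layer sizes, here is a small sketch that assumes the 143 x 240 input and the five stride-2 pooling layers with 'SAME' padding that appear in the code further down. With 'SAME' padding, each stride-2 step yields ceil(size / 2). The helper names are hypothetical.

```python
import math

def same_pool_output(size, stride=2):
    """Output length of a 'SAME'-padded pooling step: ceil(size / stride)."""
    return math.ceil(size / stride)

def shapes_after_pooling(height, width, num_layers):
    """Return the (height, width) after each of num_layers stride-2 pooling steps."""
    shapes = []
    for _ in range(num_layers):
        height, width = same_pool_output(height), same_pool_output(width)
        shapes.append((height, width))
    return shapes

# Assumed values taken from the code: 143 x 240 input, 5 pooling layers,
# 48 channels in the last convolutional layer.
shapes = shapes_after_pooling(143, 240, 5)
flat_size = shapes[-1][0] * shapes[-1][1] * 48

print(shapes)
print(flat_size)
```

The final spatial size and the flattened length should agree with the `5 * 8 * 48` used for the first fully connected layer in the code.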

My first question is about the coordinate values. I normalize them to the range 0 to 1. Is there any problem with that? The image size is [140, 240].

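Second, in general, if the error diverges during training, what is the most likely problem with the model structure? I set the number of epochs to 30 and ran it; the error bottomed out at about 0.3 around the tenth epoch and then diverged considerably...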

Thank you.

import croping
import tensorflow as tf
tf.set_random_seed(777)  # reproducibility

# hyper parameters
learning_rate = 0.001
training_epochs = 30
batch_size = 100

data = croping.getData()

class Model:
    def __init__(self, sess, name):
        self.sess = sess
        self.name = name
        self._build_net()

    def _build_net(self):
        with tf.variable_scope(self.name):
            # dropout (keep_prob) rate  0.7~0.5 on training, but should be 1
            # for testing
            self.keep_prob = tf.placeholder(tf.float32)

            # input place holders
            self.X = tf.placeholder(tf.float32, [None, 143, 240])  # TODO: check x/y axis order
            # img 143x240x1 (grayscale)
            X_img = tf.reshape(self.X, [-1, 143, 240, 1])
            self.Y = tf.placeholder(tf.float32, [None, 2])

            # L1 ImgIn shape=(?, 143, 240, 1)
            W1 = tf.Variable(tf.random_normal([3, 3, 1, 8], stddev=0.01))
            #    Conv     -> (?, 143, 240, 8)
            #    Pool     -> (?, 72, 120, 8)
            L1 = tf.nn.conv2d(X_img, W1, strides=[1, 1, 1, 1], padding='SAME')
            L1 = tf.nn.relu(L1)
            L1 = tf.nn.max_pool(L1, ksize=[1, 2, 2, 1],
                                strides=[1, 2, 2, 1], padding='SAME')
            L1 = tf.nn.dropout(L1, keep_prob=self.keep_prob)

            # L2 ImgIn shape=(?, 72, 120, 8)
            W2 = tf.Variable(tf.random_normal([3, 3, 8, 12], stddev=0.01))
            #    Conv      ->(?, 72, 120, 12)
            #    Pool      ->(?, 36, 60, 12)
            L2 = tf.nn.conv2d(L1, W2, strides=[1, 1, 1, 1], padding='SAME')
            L2 = tf.nn.relu(L2)
            L2 = tf.nn.max_pool(L2, ksize=[1, 2, 2, 1],
                                strides=[1, 2, 2, 1], padding='SAME')
            L2 = tf.nn.dropout(L2, keep_prob=self.keep_prob)

            # L3 ImgIn shape=(?, 36, 60, 12)
            W3 = tf.Variable(tf.random_normal([3, 3, 12, 20], stddev=0.01))
            #    Conv      ->(?, 36, 60, 20)
            #    Pool      ->(?, 18, 30, 20)
            L3 = tf.nn.conv2d(L2, W3, strides=[1, 1, 1, 1], padding='SAME')
            L3 = tf.nn.relu(L3)
            L3 = tf.nn.max_pool(L3, ksize=[1, 2, 2, 1], strides=[
                1, 2, 2, 1], padding='SAME')
            L3 = tf.nn.dropout(L3, keep_prob=self.keep_prob)

            # L4 ImgIn shape=(?, 18, 30, 20)
            W4 = tf.Variable(tf.random_normal([3, 3, 20, 32], stddev=0.01))
            #    Conv      ->(?, 18, 30, 32)
            #    Pool      ->(?, 9, 15, 32)
            L4 = tf.nn.conv2d(L3, W4, strides=[1, 1, 1, 1], padding='SAME')
            L4 = tf.nn.relu(L4)
            L4 = tf.nn.max_pool(L4, ksize=[1, 2, 2, 1], strides=[
                1, 2, 2, 1], padding='SAME')
            L4 = tf.nn.dropout(L4, keep_prob=self.keep_prob)

            # L5 ImgIn shape=(?, 9, 15, 32)
            W5 = tf.Variable(tf.random_normal([3, 3, 32, 48], stddev=0.01))
            #    Conv      ->(?, 9, 15, 48)
            #    Pool      ->(?, 5, 8, 48)
            L5 = tf.nn.conv2d(L4, W5, strides=[1, 1, 1, 1], padding='SAME')
            L5 = tf.nn.relu(L5)
            L5 = tf.nn.max_pool(L5, ksize=[1, 2, 2, 1], strides=[
                1, 2, 2, 1], padding='SAME')
            L5 = tf.nn.dropout(L5, keep_prob=self.keep_prob)

            L5_flat = tf.reshape(L5, [-1, 5*8*48])

            W6 = tf.get_variable("W6", shape=[5 * 8 * 48, 64],
                                 initializer=tf.contrib.layers.xavier_initializer())
            b6 = tf.Variable(tf.random_normal([64]))
            L6 = tf.nn.relu(tf.matmul(L5_flat, W6) + b6)
            L6 = tf.nn.dropout(L6, keep_prob=self.keep_prob)

            W7 = tf.get_variable("W7", shape=[64, 2],
                                 initializer=tf.contrib.layers.xavier_initializer())
            b7 = tf.Variable(tf.random_normal([2]))
            self.logits = tf.matmul(L6, W7) + b7

        # define cost/loss & optimizer
        self.cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(
          logits=self.logits, labels=self.Y))
        self.optimizer = tf.train.AdamOptimizer(
          learning_rate=learning_rate).minimize(self.cost)
        correct_prediction = tf.equal(
          tf.argmax(self.logits, 1), tf.argmax(self.Y, 1))
        self.accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

    def predict(self, x_test, keep_prop=1.0):
        return self.sess.run(self.logits, feed_dict={self.X: x_test, self.keep_prob: keep_prop})

    def get_accuracy(self, x_test, y_test, keep_prop=1.0):
        return self.sess.run(self.accuracy, feed_dict={self.X: x_test, self.Y: y_test, self.keep_prob: keep_prop})

    def train(self, x_data, y_data, keep_prop=0.7):
        return self.sess.run([self.cost, self.optimizer], feed_dict={
            self.X: x_data, self.Y: y_data, self.keep_prob: keep_prop})

# initialize
sess = tf.Session()
m1 = Model(sess, "m1")

sess.run(tf.global_variables_initializer())

print('Learning Started!')


# train my model
for epoch in range(training_epochs):
    avg_cost = 0
    total_batch = int(data.num_train / batch_size)

    for i in range(total_batch):
        batch_xs, batch_ys = data.next_batch(batch_size)
        c, _ = m1.train(batch_xs, batch_ys)
        avg_cost += c / total_batch

    print('Epoch:', '%04d' % (epoch + 1), 'cost =', '{:.9f}'.format(avg_cost))

print('Learning Finished!')

# Test model and check accuracy
print('Accuracy:', m1.get_accuracy(data.x_label, data.y_label))

Normalizing the coordinates is not a problem.
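A minimal sketch of such a normalization, assuming the 143 x 240 image size from the question's code. The helper names are hypothetical, and dividing by `size - 1` (so that the last pixel maps exactly to 1.0) is one possible convention, not necessarily the one the asker used.

```python
# Assumed image size from the question's code: 143 rows (height), 240 columns (width).
HEIGHT, WIDTH = 143, 240

def normalize_point(x_px, y_px, width=WIDTH, height=HEIGHT):
    """Map pixel coordinates to [0, 1] so both axes share the same scale."""
    return x_px / (width - 1), y_px / (height - 1)

def denormalize_point(x_norm, y_norm, width=WIDTH, height=HEIGHT):
    """Map a normalized network output back to pixel coordinates."""
    return x_norm * (width - 1), y_norm * (height - 1)

# Hypothetical ground-truth point in pixels.
x_norm, y_norm = normalize_point(120, 71)
x_px, y_px = denormalize_point(x_norm, y_norm)
print(x_norm, y_norm, x_px, y_px)
```

Predictions from the network would be passed through `denormalize_point` before comparing them with pixel-space detections.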

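First, I would not use max pooling in a network that has to infer coordinates, because max pooling destroys a lot of positional information. Use strided convolutions instead.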
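The loss of positional information can be illustrated with a small 1D sketch in plain Python (not TensorFlow). The functions and the kernel weights are hypothetical and chosen only for illustration: two inputs whose peak sits at different positions pool to the same output, while a stride-2 convolution with unequal weights keeps them distinguishable.

```python
def max_pool_1d(values, size=2):
    """Non-overlapping max pooling over a list (stride equals the window size)."""
    return [max(values[i:i + size]) for i in range(0, len(values), size)]

def strided_conv_1d(values, kernel, stride=2):
    """Valid 1D convolution (cross-correlation) with the given stride."""
    k = len(kernel)
    return [sum(w * v for w, v in zip(kernel, values[i:i + k]))
            for i in range(0, len(values) - k + 1, stride)]

# The single peak sits at index 2 in one input and at index 3 in the other.
peak_at_2 = [0, 0, 1, 0, 0, 0]
peak_at_3 = [0, 0, 0, 1, 0, 0]

# Max pooling maps both inputs to the same output, so the position is lost.
pooled_a = max_pool_1d(peak_at_2)
pooled_b = max_pool_1d(peak_at_3)

# A strided convolution with (hypothetical) unequal weights keeps them apart.
kernel = [1, 2]
conv_a = strided_conv_1d(peak_at_2, kernel)
conv_b = strided_conv_1d(peak_at_3, kernel)

print(pooled_a == pooled_b, conv_a == conv_b)
```

In the question's TensorFlow 1.x code, the equivalent change would be to drop the `tf.nn.max_pool` calls and pass `strides=[1, 2, 2, 1]` to `tf.nn.conv2d`, so that the downsampling weights are learned.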

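Second, you are using a softmax cross-entropy loss, which is best suited to classification. Here you are not doing classification but regression, so you should use a more appropriate loss, such as the mean squared error.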
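To see why softmax cross-entropy is a poor fit for coordinate targets, here is a minimal sketch in plain Python with hypothetical values. Softmax is unchanged when the same constant is added to every logit, so the cross-entropy cannot tell a correct prediction from one that is off by 100 in both coordinates. The mean squared error can.

```python
import math

def softmax_cross_entropy(logits, labels):
    """Cross-entropy between labels and softmax(logits), computed stably."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return -sum(y * (z - log_sum_exp) for y, z in zip(labels, logits))

def mean_squared_error(predictions, targets):
    """Average squared difference between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)

target = [0.25, 0.60]      # hypothetical normalized (x, y) ground truth
exact = [0.25, 0.60]       # prediction equal to the target
shifted = [100.25, 100.60] # prediction off by 100 in both coordinates

ce_exact = softmax_cross_entropy(exact, target)
ce_shifted = softmax_cross_entropy(shifted, target)
mse_exact = mean_squared_error(exact, target)
mse_shifted = mean_squared_error(shifted, target)

print(ce_exact, ce_shifted)    # practically identical
print(mse_exact, mse_shifted)  # zero versus a very large error
```

In the question's TensorFlow 1.x code, one way to make this change would be to define the cost as `tf.reduce_mean(tf.square(self.logits - self.Y))`. The argmax-based accuracy would then no longer be meaningful either and could be replaced by a distance between predicted and true coordinates.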

