
(MNIST ML training) Why do I keep getting a high and fluctuating loss? What is wrong with my code?

I am using Google Colaboratory and trying to train a model on the MNIST dataset. The first column of the dataset holds the digit labels (0~9). The dataset is 60000 x 785 (1 label column + 28 x 28 = 784 image pixels).

Could someone please advise me on what is wrong with my code? I think I did it correctly, but I keep getting a high and fluctuating loss.

```python
import numpy as np
import tensorflow as tf  # TensorFlow 1.x API (tf.placeholder, tf.Session)

xy_data = np.loadtxt('/content/drive/MyDrive/Machine-Learning Study/GAN/MNIST_data/mnist_train.csv', delimiter=',', dtype=np.float32)
xy_test = np.loadtxt('/content/drive/MyDrive/Machine-Learning Study/GAN/MNIST_data/mnist_test.csv', delimiter=',', dtype=np.float32)

# 60000 x 785 array
# first column is the digit label (0 ~ 9)
x_data = xy_data[:, 1:]
y_data = xy_data[:, [0]].astype(np.int32)  # tf.one_hot expects integer indices
# print(x_data.shape, y_data.shape)
# (60000, 784) (60000, 1)

nb_classes = 10

X = tf.placeholder(tf.float32, shape=[None, 784])
Y = tf.placeholder(tf.int32, shape=[None, nb_classes])

# one-hot encode the labels: (N, 1) -> (N, 1, 10), then reshape to (N, 10)
Y_one_hot = tf.one_hot(y_data, nb_classes)
Y_one_hot = tf.reshape(Y_one_hot, [-1, nb_classes])
# feed_dict cannot take a tensor, so evaluate the tensor into a NumPy array that can be fed into Y
# converting with .eval only works in TF 1.x
y_data_array = Y_one_hot.eval(session=tf.Session())

W = tf.Variable(tf.random_normal([784, nb_classes]))
b = tf.Variable(tf.random_normal([nb_classes]))

logits = tf.matmul(X, W) + b
hypothesis = tf.nn.softmax(logits)

# softmax cross-entropy per sample, then averaged over the batch
# note: the loss and accuracy read Y_one_hot directly, so the Y placeholder is fed but unused
loss_i = tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=Y_one_hot)
loss = tf.reduce_mean(loss_i)

optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(loss)

# tf.argmax replaces the deprecated tf.arg_max
is_correct = tf.equal(tf.argmax(hypothesis, 1), tf.argmax(Y_one_hot, 1))
accuracy = tf.reduce_mean(tf.cast(is_correct, tf.float32))

training_epochs = 101

sess = tf.Session()
sess.run(tf.global_variables_initializer())

# full-batch training: every step uses all 60000 samples
for epoch in range(training_epochs):
    loss_val, acc, _ = sess.run([loss, accuracy, optimizer], feed_dict={X: x_data, Y: y_data_array})
    if epoch % 5 == 0:
        print("Epochs: {:}\tLoss: {:.4f}\tAcc: {:.2%}".format(epoch, loss_val, acc))
```
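As a side note on the one-hot step in the code above: the conversion goes through a TensorFlow session only to get a NumPy array back. A minimal NumPy-only sketch could produce the same kind of array directly. The helper name `one_hot` and the three sample labels are illustrative assumptions, not part of the original post.

```python
import numpy as np

def one_hot(labels, nb_classes):
    """Return an (N, nb_classes) float32 array with 1.0 at each label's index."""
    # Accept labels shaped (N,) or (N, 1), stored as ints or floats.
    labels = np.asarray(labels).astype(np.int64).reshape(-1)
    encoded = np.zeros((labels.shape[0], nb_classes), dtype=np.float32)
    encoded[np.arange(labels.shape[0]), labels] = 1.0
    return encoded

# Illustrative stand-in for xy_data[:, [0]]: float labels in an (N, 1) column.
sample_labels = np.array([[3.0], [0.0], [9.0]], dtype=np.float32)
y_one_hot = one_hot(sample_labels, 10)
print(y_one_hot.shape)
```

An array built this way could be passed to `feed_dict` without creating a second session.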


Results:

Epochs: 0   Loss: 4227.7871  Acc: 9.71%
Epochs: 5   Loss: 17390.2520 Acc: 41.26%
Epochs: 10  Loss: 8494.0889  Acc: 52.81%
Epochs: 15  Loss: 1412.1642  Acc: 82.48%
Epochs: 20  Loss: 1620.4032  Acc: 82.48%
Epochs: 25  Loss: 1891.1475  Acc: 81.31%
Epochs: 30  Loss: 2770.4656  Acc: 77.99%
Epochs: 35  Loss: 1659.1884  Acc: 79.90%
Epochs: 40  Loss: 1134.2424  Acc: 84.61%
Epochs: 45  Loss: 2560.7073  Acc: 80.17%
Epochs: 50  Loss: 1440.0392  Acc: 82.33%
Epochs: 55  Loss: 1219.5104  Acc: 83.87%
Epochs: 60  Loss: 1002.9220  Acc: 86.11%
Epochs: 65  Loss: 635.6382   Acc: 89.84%
Epochs: 70  Loss: 574.5991   Acc: 90.13%
Epochs: 75  Loss: 544.4010   Acc: 90.15%
Epochs: 80  Loss: 2215.5605  Acc: 80.56%
Epochs: 85  Loss: 4700.1890  Acc: 77.99%
Epochs: 90  Loss: 3243.2017  Acc: 78.18%
Epochs: 95  Loss: 1040.0907  Acc: 85.05%
Epochs: 100 Loss: 1999.5754  Acc: 82.24%

Your code is fine; the problem is that your learning rate is too high.
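Why a step size that is too large makes the loss jump around can be seen on a toy problem. The sketch below is a one-dimensional quadratic, not the MNIST model; the function name `gradient_descent` and the curvature value 25.0 are assumptions chosen only for illustration.

```python
def gradient_descent(learning_rate, curvature=25.0, w0=1.0, steps=10):
    """Minimize f(w) = 0.5 * curvature * w**2 and return the loss after each step."""
    w = w0
    losses = []
    for _ in range(steps):
        grad = curvature * w          # f'(w)
        w = w - learning_rate * grad  # plain gradient-descent update
        losses.append(0.5 * curvature * w * w)
    return losses

# Each update multiplies w by (1 - learning_rate * curvature).
high = gradient_descent(learning_rate=0.1)    # factor -1.5: overshoots and grows
low = gradient_descent(learning_rate=0.005)   # factor 0.875: shrinks toward 0

print("lr=0.1   first/last loss:", high[0], high[-1])
print("lr=0.005 first/last loss:", low[0], low[-1])
```

With the larger step every update jumps past the minimum and lands farther away than it started, while the smaller step moves steadily toward it. In the real model, overshooting along steep directions would plausibly show up as the kind of fluctuation seen in the question rather than as clean divergence.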

I tried lr=0.005 and monitored training for 150 epochs; it behaves as you expect.

Epochs: 0   Loss: 3659.2244 Acc: 4.97%
Epochs: 5   Loss: 1218.3916 Acc: 30.38%
Epochs: 10  Loss: 767.9141  Acc: 46.95%
Epochs: 15  Loss: 582.4928  Acc: 55.63%
Epochs: 20  Loss: 480.8191  Acc: 61.28%
Epochs: 25  Loss: 416.9088  Acc: 65.28%
Epochs: 30  Loss: 372.9733  Acc: 68.19%
Epochs: 35  Loss: 340.5632  Acc: 70.34%
Epochs: 40  Loss: 315.6934  Acc: 72.09%
Epochs: 45  Loss: 296.0419  Acc: 73.57%
Epochs: 50  Loss: 280.1195  Acc: 74.72%
Epochs: 55  Loss: 266.9192  Acc: 75.74%
Epochs: 60  Loss: 255.7594  Acc: 76.58%
Epochs: 65  Loss: 246.1218  Acc: 77.29%
Epochs: 70  Loss: 237.6666  Acc: 77.91%
Epochs: 75  Loss: 230.2098  Acc: 78.47%
Epochs: 80  Loss: 223.5687  Acc: 79.02%
Epochs: 85  Loss: 217.6027  Acc: 79.42%
Epochs: 90  Loss: 212.1969  Acc: 79.81%
Epochs: 95  Loss: 207.2774  Acc: 80.16%
Epochs: 100 Loss: 202.7701  Acc: 80.53%
Epochs: 105 Loss: 198.6335  Acc: 80.86%
Epochs: 110 Loss: 194.8041  Acc: 81.12%
Epochs: 115 Loss: 191.2343  Acc: 81.38%
Epochs: 120 Loss: 187.8969  Acc: 81.59%
Epochs: 125 Loss: 184.7562  Acc: 81.78%
Epochs: 130 Loss: 181.7817  Acc: 81.98%
Epochs: 135 Loss: 178.9837  Acc: 82.20%
Epochs: 140 Loss: 176.3420  Acc: 82.36%
Epochs: 145 Loss: 173.8274  Acc: 82.53%
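For readers without a TensorFlow 1.x environment, the same kind of experiment can be approximated with a small NumPy softmax regression trained by full-batch gradient descent. This is a rough sketch under stated assumptions, not the code that produced the log above: the data is synthetic (200 random samples with 20 features in [0, 1], unlike raw MNIST pixels), and the names `train_softmax_regression`, `x_demo` and `y_demo` are hypothetical.

```python
import numpy as np

def train_softmax_regression(x, y_one_hot, learning_rate, epochs, seed=0):
    """Full-batch gradient descent on softmax cross-entropy; returns the loss per epoch."""
    rng = np.random.default_rng(seed)
    n_features, n_classes = x.shape[1], y_one_hot.shape[1]
    W = rng.normal(size=(n_features, n_classes))  # random-normal init, as in the question
    b = rng.normal(size=n_classes)
    losses = []
    for _ in range(epochs):
        logits = x @ W + b
        logits -= logits.max(axis=1, keepdims=True)  # stabilize exp()
        log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
        # Loss is recorded before this epoch's update.
        losses.append(float(-(y_one_hot * log_probs).sum(axis=1).mean()))
        # Gradient of the mean cross-entropy with respect to the logits.
        grad_logits = (np.exp(log_probs) - y_one_hot) / x.shape[0]
        W -= learning_rate * (x.T @ grad_logits)
        b -= learning_rate * grad_logits.sum(axis=0)
    return losses

# Synthetic stand-in for MNIST (assumption): 200 samples, 20 features, 10 classes.
rng = np.random.default_rng(1)
x_demo = rng.random((200, 20))
labels = rng.integers(0, 10, size=200)
y_demo = np.eye(10)[labels]

losses = train_softmax_regression(x_demo, y_demo, learning_rate=0.005, epochs=150)
print("first loss: {:.4f}  last loss: {:.4f}".format(losses[0], losses[-1]))
```

Passing a different `learning_rate` to the same function is a quick way to compare step sizes on a given dataset before committing to a long training run.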
