TensorFlow neural network: multi-layer perceptron for regression example
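I am trying to write an MLP with TensorFlow (I have just started learning it, so apologies for the code!) for multivariate regression (no MNIST, please). Here is my MWE, for which I chose the linnerud dataset from sklearn. (In reality I am using a much larger dataset, and I use only one hidden layer here to keep the MWE small, but I can add more if needed.) By the way, I am using shuffle = False in train_test_split because in reality I am working with a time series dataset.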
MWE
######################### import stuff ##########################
import numpy as np
import pandas as pd
import tensorflow as tf
from sklearn.datasets import load_linnerud
from sklearn.model_selection import train_test_split
######################## prepare the data ########################
X, y = load_linnerud(return_X_y = True)
X_train, X_test, y_train, y_test = train_test_split(X, y, shuffle = False, test_size = 0.33)
######################## set learning variables ##################
learning_rate = 0.0001
epochs = 100
batch_size = 3
######################## set some variables #######################
x = tf.placeholder(tf.float32, [None, 3], name = 'x') # 3 features
y = tf.placeholder(tf.float32, [None, 3], name = 'y') # 3 outputs
# input-to-hidden layer1
W1 = tf.Variable(tf.truncated_normal([3,300], stddev = 0.03), name = 'W1')
b1 = tf.Variable(tf.truncated_normal([300]), name = 'b1')
# hidden layer1-to-output
W2 = tf.Variable(tf.truncated_normal([300,3], stddev = 0.03), name= 'W2')
b2 = tf.Variable(tf.truncated_normal([3]), name = 'b2')
######################## Activations, outputs ######################
# output hidden layer 1
hidden_out = tf.nn.relu(tf.add(tf.matmul(x, W1), b1))
# total output
y_ = tf.nn.relu(tf.add(tf.matmul(hidden_out, W2), b2))
####################### Loss Function #########################
mse = tf.losses.mean_squared_error(y, y_)
####################### Optimizer #########################
optimizer = tf.train.GradientDescentOptimizer(learning_rate = learning_rate).minimize(mse)
###################### Initialize, Accuracy and Run #################
# initialize variables
init_op = tf.global_variables_initializer()
# accuracy for the test set
accuracy = tf.reduce_mean(tf.square(tf.subtract(y, y_))) # or could use tf.losses.mean_squared_error
#run
with tf.Session() as sess:
    sess.run(init_op)
    total_batch = int(len(y_train) / batch_size)
    for epoch in range(epochs):
        avg_cost = 0
        for i in range(total_batch):
            batch_x, batch_y = X_train[i*batch_size:min(i*batch_size + batch_size, len(X_train)), :], y_train[i*batch_size:min(i*batch_size + batch_size, len(y_train)), :]
            _, c = sess.run([optimizer, mse], feed_dict = {x: batch_x, y: batch_y})
            avg_cost += c / total_batch
        print('Epoch:', (epoch+1), 'cost =', '{:.3f}'.format(avg_cost))
    print(sess.run(mse, feed_dict = {x: X_test, y:y_test}))
This prints something like this:
...
Epoch: 98 cost = 10992.617
Epoch: 99 cost = 10992.592
Epoch: 100 cost = 10992.566
11815.1
Clearly something is wrong. I suspect the problem is the cost function/accuracy or the way I use batches, but I can't figure it out.
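On the batching suspicion: the loop above runs int(len(y_train) / batch_size) iterations, so any rows left over after the last full batch are never fed to the optimizer. Here is a minimal sketch in plain NumPy of a generator that also yields the final short batch. The name iter_batches and the 13-row demo arrays are assumptions for illustration, not part of the original post.

```python
import numpy as np


def iter_batches(features, targets, batch_size):
    """Yield consecutive (features, targets) slices, including a final short batch."""
    n_samples = len(features)
    for start in range(0, n_samples, batch_size):
        stop = min(start + batch_size, n_samples)
        yield features[start:stop], targets[start:stop]


# Hypothetical stand-in for the training split: 13 rows, 3 features, 3 targets.
X_demo = np.arange(39, dtype=np.float32).reshape(13, 3)
y_demo = X_demo * 2.0

batch_size = 3
# Number of iterations the original loop performs (full batches only).
full_batches = int(len(y_demo) / batch_size)
# Batch lengths produced when the remainder is kept.
batch_sizes = [len(bx) for bx, _ in iter_batches(X_demo, y_demo, batch_size)]

print(full_batches, batch_sizes)
```

With batches of unequal length, averaging the per-batch cost by the number of batches weights the short batch slightly more per row, so treat the printed average as approximate.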
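As far as I can tell, the model is learning. I tried tuning some of the hyperparameters (most importantly, the learning rate and the hidden layer size) and got better results. Here is the full code: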
######################### import stuff ##########################
import numpy as np
import pandas as pd
import tensorflow as tf
from sklearn.datasets import load_linnerud
from sklearn.model_selection import train_test_split
######################## prepare the data ########################
X, y = load_linnerud(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, shuffle=False)
######################## set learning variables ##################
learning_rate = 0.0005
epochs = 2000
batch_size = 3
######################## set some variables #######################
x = tf.placeholder(tf.float32, [None, 3], name='x') # 3 features
y = tf.placeholder(tf.float32, [None, 3], name='y') # 3 outputs
# hidden layer 1
W1 = tf.Variable(tf.truncated_normal([3, 10], stddev=0.03), name='W1')
b1 = tf.Variable(tf.truncated_normal([10]), name='b1')
# hidden layer 1-to-output
W2 = tf.Variable(tf.truncated_normal([10, 3], stddev=0.03), name='W2')
b2 = tf.Variable(tf.truncated_normal([3]), name='b2')
######################## Activations, outputs ######################
# output hidden layer 1
hidden_out = tf.nn.relu(tf.add(tf.matmul(x, W1), b1))
# total output
y_ = tf.nn.relu(tf.add(tf.matmul(hidden_out, W2), b2))
####################### Loss Function #########################
mse = tf.losses.mean_squared_error(y, y_)
####################### Optimizer #########################
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(mse)
###################### Initialize, Accuracy and Run #################
# initialize variables
init_op = tf.global_variables_initializer()
# accuracy for the test set
accuracy = tf.reduce_mean(tf.square(tf.subtract(y, y_))) # or could use tf.losses.mean_squared_error
# run
with tf.Session() as sess:
    sess.run(init_op)
    total_batch = int(len(y_train) / batch_size)
    for epoch in range(epochs):
        avg_cost = 0
        for i in range(total_batch):
            batch_x, batch_y = X_train[i * batch_size:min(i * batch_size + batch_size, len(X_train)), :], \
                               y_train[i * batch_size:min(i * batch_size + batch_size, len(y_train)), :]
            _, c = sess.run([optimizer, mse], feed_dict={x: batch_x, y: batch_y})
            avg_cost += c / total_batch
        # report the average training cost every 10 epochs
        if epoch % 10 == 0:
            print('Epoch:', (epoch + 1), 'cost =', '{:.3f}'.format(avg_cost))
    # MSE on the held-out test set
    print(sess.run(mse, feed_dict={x: X_test, y: y_test}))
Output:
Epoch: 1901 cost = 173.914
Epoch: 1911 cost = 171.928
Epoch: 1921 cost = 169.993
Epoch: 1931 cost = 168.110
Epoch: 1941 cost = 166.277
Epoch: 1951 cost = 164.492
Epoch: 1961 cost = 162.753
Epoch: 1971 cost = 161.061
Epoch: 1981 cost = 159.413
Epoch: 1991 cost = 157.808
482.433
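I think you could tune it further, but with so little data there is not much point. I haven't tried regularization, but I'm sure you would need L2 regularization or dropout to avoid overfitting.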
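To make the regularization suggestion concrete, here is a minimal NumPy sketch of the two ideas: an L2 penalty added to the MSE, and inverted dropout applied to hidden activations. The function names, the coefficient lam = 0.01, and keep_prob = 0.5 are hypothetical choices for illustration. None of this was run against the linnerud data.

```python
import numpy as np


def mse(y_true, y_pred):
    """Mean squared error over all elements."""
    return float(np.mean((y_true - y_pred) ** 2))


def l2_penalty(weights, lam):
    """lam times the sum of squared entries of every weight matrix (biases left out)."""
    return lam * sum(float(np.sum(w ** 2)) for w in weights)


def dropout(activations, keep_prob, rng):
    """Inverted dropout: keep each unit with probability keep_prob, rescale the kept ones."""
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob


rng = np.random.default_rng(0)

# Weight shapes mirror the answer's network: 3 inputs, 10 hidden units, 3 outputs.
W1 = rng.normal(scale=0.03, size=(3, 10))
W2 = rng.normal(scale=0.03, size=(10, 3))

# Made-up targets and predictions, only to have numbers to plug in.
y_true = rng.normal(size=(4, 3))
y_pred = rng.normal(size=(4, 3))

lam = 0.01  # hypothetical regularization strength
data_loss = mse(y_true, y_pred)
total_loss = data_loss + l2_penalty([W1, W2], lam)

# Dropout is applied to hidden activations during training only.
hidden = np.ones((4, 10))
dropped = dropout(hidden, keep_prob=0.5, rng=rng)

print(data_loss, total_loss)
```

In the TensorFlow 1.x graph above, the natural counterparts would be tf.nn.l2_loss (which returns half the sum of squares) added to mse, and tf.nn.dropout applied to hidden_out.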