TensorFlow neural network: multilayer perceptron for regression, example

Question

I am trying to write an MLP with TensorFlow (which I just started to learn, so apologies for the code!) for multivariate REGRESSION (no MNIST, please). Here is my MWE, where I chose to use the linnerud dataset from sklearn. (In reality I am using a much larger dataset; I am also using only one hidden layer here to keep the MWE small, but I can add more if necessary.) By the way, I am using shuffle = False in train_test_split only because in reality I am working with a time series dataset.

MWE

######################### import stuff ##########################
import numpy as np
import pandas as pd
import tensorflow as tf
from sklearn.datasets import load_linnerud
from sklearn.model_selection import train_test_split


######################## prepare the data ########################
X, y = load_linnerud(return_X_y = True)
X_train, X_test, y_train, y_test = train_test_split(X, y, shuffle = False, test_size = 0.33)


######################## set learning variables ##################
learning_rate = 0.0001
epochs = 100
batch_size = 3


######################## set some variables #######################
x = tf.placeholder(tf.float32, [None, 3], name = 'x')   # 3 features
y = tf.placeholder(tf.float32, [None, 3], name = 'y')   # 3 outputs

# input-to-hidden layer1
W1 = tf.Variable(tf.truncated_normal([3,300], stddev = 0.03), name = 'W1')
b1 = tf.Variable(tf.truncated_normal([300]), name = 'b1')  

# hidden layer1-to-output
W2 = tf.Variable(tf.truncated_normal([300,3], stddev = 0.03), name=  'W2')    
b2 = tf.Variable(tf.truncated_normal([3]), name = 'b2')   


######################## Activations, outputs ######################
# output hidden layer 1
hidden_out = tf.nn.relu(tf.add(tf.matmul(x, W1), b1))   

# total output
y_ = tf.nn.relu(tf.add(tf.matmul(hidden_out, W2), b2)) 


####################### Loss Function  #########################
mse = tf.losses.mean_squared_error(y, y_)


####################### Optimizer      #########################
optimizer = tf.train.GradientDescentOptimizer(learning_rate = learning_rate).minimize(mse)  


###################### Initialize, Accuracy and Run #################
# initialize variables
init_op = tf.global_variables_initializer()

# accuracy for the test set
accuracy = tf.reduce_mean(tf.square(tf.subtract(y, y_))) # or could use tf.losses.mean_squared_error

#run
with tf.Session() as sess:
     sess.run(init_op)
     total_batch = int(len(y_train) / batch_size)  
     for epoch in range(epochs):
         avg_cost = 0
         for i in range(total_batch):
              batch_x, batch_y =  X_train[i*batch_size:min(i*batch_size + batch_size, len(X_train)), :], y_train[i*batch_size:min(i*batch_size + batch_size, len(y_train)), :] 
              _, c = sess.run([optimizer, mse], feed_dict = {x: batch_x, y: batch_y}) 
              avg_cost += c / total_batch
         print('Epoch:', (epoch+1), 'cost =', '{:.3f}'.format(avg_cost))
     print(sess.run(mse, feed_dict = {x: X_test, y:y_test}))

This prints out something like this:

...
Epoch: 98 cost = 10992.617
Epoch: 99 cost = 10992.592
Epoch: 100 cost = 10992.566
11815.1

So obviously there is something wrong. I suspect that the problem is either in the cost function/accuracy or in the way I am using batches, but I can't quite figure it out.
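On the batching suspicion: the slicing in the MWE is valid, but `int(len(y_train) / batch_size)` rounds down, so any trailing rows that do not fill a whole batch are never used for training. Below is a minimal sketch of that arithmetic in plain Python. The helper `batch_slices` and the sizes 13 and 3 are illustrative assumptions, not part of the original code.

```python
import math


def batch_slices(n_samples, batch_size, keep_remainder=True):
    """Return (start, stop) index pairs for slicing the training rows."""
    if keep_remainder:
        # Round up so a final, smaller batch picks up the leftover rows.
        n_batches = math.ceil(n_samples / batch_size)
    else:
        # Same rounding as int(len(y_train) / batch_size) in the MWE.
        n_batches = n_samples // batch_size
    return [(i * batch_size, min((i + 1) * batch_size, n_samples))
            for i in range(n_batches)]


# Hypothetical sizes: 13 training rows, batches of 3.
print(batch_slices(13, 3, keep_remainder=False))  # last row is never visited
print(batch_slices(13, 3))                        # final batch holds one row
```

With the rounded-down count the loop simply skips the remainder. That wastes a little data but would not by itself explain a cost that barely moves.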

Answer 1

As far as I can see, the model is learning. I tuned some of the hyperparameters (most significantly the learning rate and the hidden layer size) and got much better results. Here's the full code:

######################### import stuff ##########################
import numpy as np
import pandas as pd
import tensorflow as tf
from sklearn.datasets import load_linnerud
from sklearn.model_selection import train_test_split

######################## prepare the data ########################
X, y = load_linnerud(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, shuffle=False)

######################## set learning variables ##################
learning_rate = 0.0005
epochs = 2000
batch_size = 3

######################## set some variables #######################
x = tf.placeholder(tf.float32, [None, 3], name='x')  # 3 features
y = tf.placeholder(tf.float32, [None, 3], name='y')  # 3 outputs

# hidden layer 1
W1 = tf.Variable(tf.truncated_normal([3, 10], stddev=0.03), name='W1')
b1 = tf.Variable(tf.truncated_normal([10]), name='b1')

# output layer
W2 = tf.Variable(tf.truncated_normal([10, 3], stddev=0.03), name='W2')
b2 = tf.Variable(tf.truncated_normal([3]), name='b2')

######################## Activations, outputs ######################
# output hidden layer 1
hidden_out = tf.nn.relu(tf.add(tf.matmul(x, W1), b1))

# total output
y_ = tf.nn.relu(tf.add(tf.matmul(hidden_out, W2), b2))

####################### Loss Function  #########################
mse = tf.losses.mean_squared_error(y, y_)

####################### Optimizer      #########################
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(mse)

###################### Initialize, Accuracy and Run #################
# initialize variables
init_op = tf.global_variables_initializer()

# accuracy for the test set
accuracy = tf.reduce_mean(tf.square(tf.subtract(y, y_)))  # or could use tf.losses.mean_squared_error

# run
with tf.Session() as sess:
  sess.run(init_op)
  total_batch = int(len(y_train) / batch_size)
  for epoch in range(epochs):
    avg_cost = 0
    for i in range(total_batch):
      batch_x, batch_y = X_train[i * batch_size:min(i * batch_size + batch_size, len(X_train)), :], \
                         y_train[i * batch_size:min(i * batch_size + batch_size, len(y_train)), :]
      _, c = sess.run([optimizer, mse], feed_dict={x: batch_x, y: batch_y})
      avg_cost += c / total_batch
    if epoch % 10 == 0:
      print('Epoch:', (epoch + 1), 'cost =', '{:.3f}'.format(avg_cost))
  print(sess.run(mse, feed_dict={x: X_test, y: y_test}))

Output:

Epoch: 1901 cost = 173.914
Epoch: 1911 cost = 171.928
Epoch: 1921 cost = 169.993
Epoch: 1931 cost = 168.110
Epoch: 1941 cost = 166.277
Epoch: 1951 cost = 164.492
Epoch: 1961 cost = 162.753
Epoch: 1971 cost = 161.061
Epoch: 1981 cost = 159.413
Epoch: 1991 cost = 157.808
482.433

I think you can tune it even further, but that doesn't make much sense since the dataset is so small. I didn't experiment with regularization, but I'm sure you'll need L2 regularization or dropout to avoid overfitting.
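To make the L2 suggestion concrete, here is a minimal NumPy sketch of a mean squared error with an L2 penalty on the weight matrices. It was not part of the experiment above. The function name `l2_regularized_mse`, the coefficient `beta`, and the toy arrays are assumptions for illustration only.

```python
import numpy as np


def l2_regularized_mse(y_true, y_pred, weights, beta=0.01):
    """Mean squared error plus beta times the sum of squared weights."""
    mse = np.mean((y_true - y_pred) ** 2)
    penalty = sum(np.sum(w ** 2) for w in weights)
    return mse + beta * penalty


# Toy values with the same weight shapes as W1 and W2 in the code above.
y_true = np.array([[1.0, 2.0], [3.0, 4.0]])
y_pred = np.array([[1.0, 2.0], [3.0, 2.0]])
W1 = np.ones((3, 10))
W2 = np.ones((10, 3))

loss = l2_regularized_mse(y_true, y_pred, [W1, W2], beta=0.01)
print(loss)
```

In the TensorFlow 1.x graph above, a similar penalty could probably be added as `mse + beta * (tf.nn.l2_loss(W1) + tf.nn.l2_loss(W2))`. Note that `tf.nn.l2_loss` halves the sum of squares, so `beta` is not directly comparable between the two forms and would need tuning either way.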

Answer 1: score 3, accepted, 2017-10-19 15:00:29