Tensorflow always predicts the same output

Question

I'm trying to learn TensorFlow, and to do that I am building a classifier for something I thought would not be hard: predicting whether a number is odd or even. The problem is that TensorFlow always predicts the same output. I have searched for answers over the last few days, but nothing helped. I looked at the following questions:

- Tensorflow predicts always the same result

- TensorFlow always converging to same output for all items after training

- TensorFlow always return same result

Here's my code:

in:

df
    nb  y1
0   1   0
1   2   1
2   3   0
3   4   1
4   5   0
...
19  20  1

inputX = df.loc[:, ['nb']].as_matrix()
inputY = df.loc[:, ['y1']].as_matrix()
print(inputX.shape)
print(inputY.shape)

out:

(20, 1)
(20, 1)

in:

# Parameters
learning_rate = 0.00000001
training_epochs = 2000
display_step = 50
n_samples = inputY.size


x = tf.placeholder(tf.float32, [None, 1])   
W = tf.Variable(tf.zeros([1, 1]))           
b = tf.Variable(tf.zeros([1]))            
y_values = tf.add(tf.matmul(x, W), b)      
y = tf.nn.relu(y_values)                 
y_ = tf.placeholder(tf.float32, [None,1])  

# Cost function: Mean squared error
cost = tf.reduce_sum(tf.pow(y_ - y, 2))/(2*n_samples)
# Gradient descent
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)

# Initialize variables and the TensorFlow session
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)

for i in range(training_epochs):  
    sess.run(optimizer, feed_dict={x: inputX, y_: inputY}) # Take a gradient descent step using our inputs and labels

    # Display logs per epoch step
    if (i) % display_step == 0:
        cc = sess.run(cost, feed_dict={x: inputX, y_:inputY})
        print("Training step:", '%04d' % (i), "cost=", "{:.9f}".format(cc)) #, \"W=", sess.run(W), "b=", sess.run(b)

print ("Optimization Finished!")
training_cost = sess.run(cost, feed_dict={x: inputX, y_: inputY})
print ("Training cost=", training_cost, "W=", sess.run(W), "b=", sess.run(b), '\n')

out:

Training step: 0000 cost= 0.250000000
Training step: 0050 cost= 0.250000000
Training step: 0100 cost= 0.250000000
...
Training step: 1800 cost= 0.250000000
Training step: 1850 cost= 0.250000000
Training step: 1900 cost= 0.250000000
Training step: 1950 cost= 0.250000000
Optimization Finished!
Training cost= 0.25 W= [[ 0.]] b= [ 0.]

in:

sess.run(y, feed_dict={x: inputX })

out:

array([[ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.]], dtype=float32)

I tried playing with my hyperparameters, such as the learning rate and the number of training epochs. I changed the activation function from softmax to relu. I enlarged my dataframe to have more examples, but nothing changed. I also tried initializing my weights randomly, but that changed nothing either; the cost just started at a higher value.

Answer 1

From a quick look at the code, it looks OK to me (apart from initializing the weights to zero: usually you want small nonzero values to avoid a trivial solution). However, I don't think you can fit the parity of integers with a linear regression.

The point is that you are trying to fit

x % 2

with predictions of the form

activation(x * w + b)

and there is no way to find a good w and b that solve this problem.

Another way to understand this is to plot your data: the scatter plot of the parity of x consists of two horizontal rows of points, and the only way to fit them with a line is with a flat line (which will have a high cost anyway).
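To make the "flat line" point concrete, here is a minimal sketch that fits an ordinary least-squares line to the 20 points from the question. It assumes the labels follow the question's table (0 for odd, 1 for even) and uses plain mean squared error rather than the question's halved cost; the variable names are illustrative only.

```python
# Ordinary least-squares line through (x, parity) for x = 1..20.
# Assumption: labels follow the question's table, 0 for odd and 1 for even.
xs = list(range(1, 21))
ys = [1 - x % 2 for x in xs]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Closed-form slope and intercept of the best-fit line.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
s_xx = sum((x - mean_x) ** 2 for x in xs)
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x

# Mean squared error of the best line versus a flat line at the mean label.
mse_line = sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / n
mse_flat = sum((mean_y - y) ** 2 for y in ys) / n

print("slope=%.5f intercept=%.5f" % (slope, intercept))
print("MSE best line=%.5f, MSE flat line=%.5f" % (mse_line, mse_flat))
```

The fitted slope comes out very close to zero, and the error of the best line is barely lower than that of a constant prediction, which is what the scatter plot suggests.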

I think it would be better to start with different data, but if you want to tackle this problem, you should get some results by using a sine or a cosine as the activation function.
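As a rough illustration of why a periodic activation helps, the sketch below uses a cosine rescaled to the range [0, 1]. The values w = pi and b = 0 are picked by hand here, not learned, and the `predict` helper is hypothetical; it only shows that suitable parameters exist for this form of model.

```python
import math

xs = list(range(1, 21))
ys = [1 - x % 2 for x in xs]  # 0 for odd, 1 for even, as in the question


def predict(x, w=math.pi, b=0.0):
    """Cosine 'activation' rescaled to [0, 1]: (1 + cos(w * x + b)) / 2."""
    return (1.0 + math.cos(w * x + b)) / 2.0


# Round each prediction to the nearest class label and compare with the targets.
preds = [round(predict(x)) for x in xs]
print(preds == ys)
```

Whether gradient descent actually finds such parameters from a given starting point is a separate question that this sketch does not address.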

Answer 2

The main problem that I see is that you initialize the weights in the W matrix with zeros. The operation in your linear layer is basically Wx + b, and its output is then passed through a ReLU. With W and b both zero, the ReLU input is exactly 0 for every example, and the ReLU gradient is 0 there, so no gradient reaches W or b and the network cannot learn anything. Try using random initial values, as shown on tensorflow.org:

# Create two variables.
weights = tf.Variable(tf.random_normal([784, 200], stddev=0.35),
                      name="weights")
biases = tf.Variable(tf.zeros([200]), name="biases")
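To see the effect of zero initialization on the question's model without running TensorFlow, here is a hand-written sketch of the cost and its gradients for y = relu(w * x + b). It assumes the ReLU gradient is taken as 0 for inputs that are not strictly positive, and the helper names are made up for this example.

```python
def relu(z):
    return z if z > 0 else 0.0


def relu_grad(z):
    # Assumption: gradient is 0 for z <= 0, including exactly z == 0.
    return 1.0 if z > 0 else 0.0


def cost_and_grads(w, b, xs, ys):
    """Cost sum((y_ - y)^2) / (2n) from the question, plus d(cost)/dw and d(cost)/db."""
    n = len(xs)
    cost = grad_w = grad_b = 0.0
    for x, target in zip(xs, ys):
        z = w * x + b
        err = relu(z) - target
        cost += err ** 2 / (2 * n)
        grad_w += err * relu_grad(z) * x / n
        grad_b += err * relu_grad(z) / n
    return cost, grad_w, grad_b


xs = list(range(1, 21))
ys = [1 - x % 2 for x in xs]  # 0 for odd, 1 for even

zero_init = cost_and_grads(0.0, 0.0, xs, ys)     # zero initialization
small_init = cost_and_grads(0.01, 0.0, xs, ys)   # small positive weight
print("zero init :", zero_init)
print("small init:", small_init)
```

With zero initialization the cost is 0.25, matching the training log in the question, and both gradients are exactly zero, so a gradient descent step changes nothing. With a small positive weight the gradients are nonzero, although a learning rate as small as the one in the question would still move the parameters very slowly.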

Answer 3

First of all, I have to admit that I have never used TensorFlow. But I think you have a modelling problem here.

You are using the simplest network architecture possible (a 1-dimensional perceptron). You have two variables (w and b) that you want to learn, and your decision rule for the output looks like

w * x + b > 0

If you subtract b and divide by w (assuming w > 0), you get

x > -b / w

So you are basically looking for a threshold to separate odd and even numbers. No matter how you choose w and b, you will always misclassify about half of the numbers.
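The threshold argument can be checked by brute force on the question's data. The sketch below tries every threshold between 0.5 and 20.5 in both directions and reports the best accuracy; the `accuracy` helper and the 0/1 label convention (0 for odd, 1 for even) are assumptions made for this illustration.

```python
xs = list(range(1, 21))
ys = [1 - x % 2 for x in xs]  # 0 for odd, 1 for even


def accuracy(threshold, above_is_even):
    """Accuracy of the rule 'x > threshold means even' (or its mirror image)."""
    correct = 0
    for x, y in zip(xs, ys):
        pred = int(x > threshold) if above_is_even else int(x <= threshold)
        correct += (pred == y)
    return correct / len(xs)


# Candidate thresholds sit between consecutive integers: 0.5, 1.5, ..., 20.5.
thresholds = [k + 0.5 for k in range(0, 21)]
best = max(accuracy(t, side) for t in thresholds for side in (True, False))
print("best accuracy over all thresholds:", best)
```

No threshold does meaningfully better than guessing, because odd and even numbers alternate along the number line.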

Although deciding whether a number is odd or even seems to be a super trivial task for us humans, it is not for a single perceptron.

Answer 1
3 accepted 2017-06-06 09:26:59

Answer 2
3 2017-06-06 09:31:36

Answer 3
2 2017-06-06 09:53:46
