
TensorFlow always predicts the same output

I'm trying to learn TensorFlow, so I'm building a classifier for something that I think is not too hard: predicting whether a number is odd or even. The problem is that TensorFlow always predicts the same output. I have searched for answers over the last few days, but nothing helped. I looked at the following questions:

- Tensorflow predicts always the same result

- TensorFlow always converging to same output for all items after training

- TensorFlow always return same result

Here's my code:

in:

df
    nb  y1
0   1   0
1   2   1
2   3   0
3   4   1
4   5   0
...
19  20  1

inputX = df.loc[:, ['nb']].as_matrix()
inputY = df.loc[:, ['y1']].as_matrix()
print(inputX.shape)
print(inputY.shape)

out:

(20, 1) (20, 1)

in:

# Parameters
learning_rate = 0.00000001
training_epochs = 2000
display_step = 50
n_samples = inputY.size


x = tf.placeholder(tf.float32, [None, 1])   
W = tf.Variable(tf.zeros([1, 1]))           
b = tf.Variable(tf.zeros([1]))            
y_values = tf.add(tf.matmul(x, W), b)      
y = tf.nn.relu(y_values)                 
y_ = tf.placeholder(tf.float32, [None,1])  

# Cost function: Mean squared error
cost = tf.reduce_sum(tf.pow(y_ - y, 2))/(2*n_samples)
# Gradient descent
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)

# Initialize variables and TensorFlow session
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)

for i in range(training_epochs):  
    sess.run(optimizer, feed_dict={x: inputX, y_: inputY}) # Take a gradient descent step using our inputs and labels

    # Display logs per epoch step
    if (i) % display_step == 0:
        cc = sess.run(cost, feed_dict={x: inputX, y_:inputY})
        print("Training step:", '%04d' % (i), "cost=", "{:.9f}".format(cc)) #, \"W=", sess.run(W), "b=", sess.run(b)

print ("Optimization Finished!")
training_cost = sess.run(cost, feed_dict={x: inputX, y_: inputY})
print ("Training cost=", training_cost, "W=", sess.run(W), "b=", sess.run(b), '\n')

out:

Training step: 0000 cost= 0.250000000
Training step: 0050 cost= 0.250000000
Training step: 0100 cost= 0.250000000
...
Training step: 1800 cost= 0.250000000
Training step: 1850 cost= 0.250000000
Training step: 1900 cost= 0.250000000
Training step: 1950 cost= 0.250000000
Optimization Finished!
Training cost= 0.25 W= [[ 0.]] b= [ 0.]

in:

sess.run(y, feed_dict={x: inputX })

out:

array([[ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.]], dtype=float32)

I tried playing with the hyperparameters, such as the learning rate and the number of training epochs. I changed the activation function from softmax to relu. I enlarged my dataframe to have more examples, but nothing happened. I also tried initializing the weights randomly, but nothing changed; the cost just started at a higher value.

From a quick look at the code, it looks OK to me (apart, maybe, from initializing the weights to zero; usually you want a small nonzero number to avoid a trivial solution). However, I don't think you can fit the parity of integers with a linear regression.

The point is that you are trying to fit

x % 2

with predictions of the form

activation(x * w + b)

and there is no way to find a good w and b to solve this problem.

Another way to understand this is to plot your data: the scatter plot of the parity of x is two horizontal rows of points, and the only way to fit them with a line is with a flat line (which will have a high cost anyway).
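The flat-line argument can be checked numerically. The sketch below is a pure-Python illustration, not part of the original code: it assumes the 20 rows of the question's dataframe (x from 1 to 20, label 1 for even numbers), computes the closed-form least-squares line, and evaluates the question's cost for three predictors. All variable names are made up for the example.

```python
# Closed-form least-squares line through the parity data (pure Python).
xs = list(range(1, 21))
ys = [1.0 if x % 2 == 0 else 0.0 for x in xs]  # 1 = even, as in the y1 column
n = len(xs)

mean_x = sum(xs) / n
mean_y = sum(ys) / n
cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
var_x = sum((x - mean_x) ** 2 for x in xs)

slope = cov_xy / var_x
intercept = mean_y - slope * mean_x


def cost(predict):
    """Same cost as in the question: sum of squared errors / (2 * n)."""
    return sum((y - predict(x)) ** 2 for x, y in zip(xs, ys)) / (2 * n)


cost_best_line = cost(lambda x: slope * x + intercept)  # least-squares line
cost_flat_half = cost(lambda x: 0.5)                    # flat line at 0.5
cost_all_zero = cost(lambda x: 0.0)                     # always predict 0

print("slope:", slope, "intercept:", intercept)
print("costs:", cost_best_line, cost_flat_half, cost_all_zero)
```

The slope comes out below 0.01, so the best line is almost flat, and its cost is only marginally below the 0.125 of a constant 0.5 prediction. The all-zero predictor gives 0.25, which is the value the training log in the question is stuck at.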

I think it would be better to start with different data, but if you want to tackle this problem, you should get some results using a sine or a cosine as the activation function.
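To illustrate why a periodic activation makes the problem representable, here is a minimal pure-Python sketch. The values w = pi and b = 0 are picked by hand for the example, not learned, and the rescaling 0.5 * (1 + cos(...)) is an assumption added so that the output lands in [0, 1]; whether gradient descent would actually find such parameters is not shown here.

```python
import math


def predict(x, w=math.pi, b=0.0):
    """activation(x * w + b) with a rescaled cosine as the activation."""
    return 0.5 * (1.0 + math.cos(w * x + b))


xs = list(range(1, 21))
labels = [1 if x % 2 == 0 else 0 for x in xs]  # 1 = even, as in the question
preds = [round(predict(x)) for x in xs]

print(preds == labels)
```

With these hand-picked parameters, the model of the form activation(x * w + b) reproduces the labels for 1 to 20, which no monotonic activation such as relu can do with a single weight and bias.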

The main problem I see is that you initialize the weights in the W matrix with zeros. The operation in the linear layer is basically Wx + b, so the gradient with respect to x is W. If you start with zeros for W, that gradient is 0 as well and you are not able to learn anything. Try random initial values, as shown on tensorflow.org:

# Create two variables.
weights = tf.Variable(tf.random_normal([784, 200], stddev=0.35),
                      name="weights")
biases = tf.Variable(tf.zeros([200]), name="biases")
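The effect of the initial values can be traced by hand for the question's one-weight model. The sketch below is pure Python rather than TensorFlow, and it rests on two assumptions that are labeled in the code: the derivative of relu at exactly 0 is taken to be 0, and the data are the 20 rows from the question. The helper names and the starting value w = 0.1 are hypothetical.

```python
def relu(z):
    return z if z > 0 else 0.0


def relu_grad(z):
    # Assumption: the derivative at z == 0 is taken to be 0.
    return 1.0 if z > 0 else 0.0


xs = [float(x) for x in range(1, 21)]
ts = [1.0 if x % 2 == 0 else 0.0 for x in xs]  # targets: 1 = even
n = len(xs)


def gradients(w, b):
    """Gradient of sum((t - relu(w*x + b))**2) / (2*n) w.r.t. w and b."""
    grad_w = 0.0
    grad_b = 0.0
    for x, t in zip(xs, ts):
        z = w * x + b
        common = (relu(z) - t) * relu_grad(z) / n
        grad_w += common * x
        grad_b += common
    return grad_w, grad_b


zero_init = gradients(0.0, 0.0)    # W and b as initialized in the question
small_init = gradients(0.1, 0.0)   # hypothetical small positive start

learning_rate = 0.00000001         # value used in the question
first_step = learning_rate * small_init[0]

print("zero init gradients:", zero_init)
print("small init gradients:", small_init)
print("size of first update to w:", first_step)
```

Under these assumptions, the zero start gives zero gradients for both w and b because every pre-activation sits at 0, so nothing moves. A positive start does produce gradients, but with a learning rate of 1e-8 each update to w is far below 1e-6, which may be why random initialization alone appeared to change nothing.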

First of all, I have to admit that I have never used TensorFlow. But I think you have a modelling problem here.

You are using the simplest network architecture possible (a 1-dimensional perceptron). You have two variables (w and b) that you want to learn, and the decision rule for the output looks like

w * x + b > 0

If you subtract b and divide by w you get

x > -b / w    (for w > 0; the inequality flips for w < 0)

So you are basically looking for a threshold to separate odd and even numbers. No matter how you choose w and b, you will always misclassify roughly half of the numbers.

Although deciding whether a number is odd or even seems to be a super trivial task for us humans, it is not for a single perceptron.
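The threshold argument can be checked by brute force. The sketch below is a pure-Python illustration with made-up helper names: it assumes the numbers 1 to 20 from the question, tries every threshold that falls between two neighbouring integers, and counts the errors for both orientations of the rule (predict 1 above the threshold, or predict 1 below it).

```python
xs = list(range(1, 21))
labels = [1 if x % 2 == 0 else 0 for x in xs]  # 1 = even, as in the question


def errors(threshold, predict_one_above):
    """Count misclassified numbers for a single-threshold decision rule."""
    wrong = 0
    for x, label in zip(xs, labels):
        above = x > threshold
        pred = 1 if above == predict_one_above else 0
        if pred != label:
            wrong += 1
    return wrong


# Half-integer thresholds cover every distinct split of 1..20.
thresholds = [k + 0.5 for k in range(0, 21)]
counts = [errors(t, side) for t in thresholds for side in (True, False)]

print("fewest errors:", min(counts), "most errors:", max(counts))
```

On this data every threshold rule gets between 9 and 11 of the 20 numbers wrong, so the best case is 11 correct out of 20, which is barely better than guessing.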
