

Setting Up Simple Tensorflow Linear Regression Returning NaN Values?

I'm new to Tensorflow and I want to know why I'm getting nan values for cost, W and b at each epoch. I'm setting up a traffic game and I'd like to train a model to predict the best duration for a green light, based on previous rewards and previous green light durations. I tried following this guide to set it up, but it doesn't seem to be working. Any ideas? This code should replicate the issue I'm having, and I've added lots of prints to help someone more experienced than I am. Thanks.

import numpy as np
import random
import matplotlib.pyplot as plt
import tensorflow as tf
import warnings

warnings.simplefilter(action='once', category=FutureWarning) # future warnings annoy me

# add in a couple of rewards and light durations
current_reward = [-1000,-900,-950]
current_green = [10,12,12]

current_reward = np.array(current_reward)
current_green = np.array(current_green)

# Pass in reward and green_light
def green_light_duration_new(current_reward, current_green):
    # Predicting the best light duration based on previous rewards.
    # predict the best duration based on previous step's reward value, using simple linear regression model
    x = current_reward
    y = current_green
    n = len(x)
    # Plot of Training Data  
    plt.scatter(x, y) 
    plt.xlabel('Reward') 
    plt.ylabel('Green Light Duration') 
    plt.title("Training Data") 
    plt.show() 

    X = tf.placeholder("float") 
    Y = tf.placeholder("float") 
    W = tf.Variable(np.random.randn(), name = "W") 
    b = tf.Variable(np.random.randn(), name = "b") 
    learning_rate = 0.01
    training_epochs = 500
    # Hypothesis 
    y_pred = tf.add(tf.multiply(X, W), b) 
    print('y_pred : ', y_pred)
    print('y_pred dtype : ', y_pred.dtype)
    # Mean Squared Error Cost Function 
    cost = tf.reduce_sum(tf.pow(y_pred-Y, 2)) / (2 * n)
    print('cost : ', cost)
    print('cost dtype: ', cost.dtype)
    # Gradient Descent Optimizer 
    optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)    
    # Global Variables Initializer 
    init = tf.global_variables_initializer()
    # Starting the Tensorflow Session 
    with tf.Session() as sess: 
        # Initializing the Variables 
        sess.run(init) 
        # Iterating through all the epochs 
        for epoch in range(training_epochs): 
            # Feeding each data point into the optimizer using Feed Dictionary 
            for (_x, _y) in zip(x, y): 
                print('_x : ',_x)
                print('_y : ',_y)
                sess.run(optimizer, feed_dict = {X : _x, Y : _y}) 
            # Displaying the result after every 50 epochs 
            if (epoch + 1) % 50 == 0: 
                # Calculating the cost at every 50th epoch 
                c = sess.run(cost, feed_dict = {X : x, Y : y}) 
                print('c : ', c)
                print('c dtype : ', c.dtype)
                print("Epoch", (epoch + 1), ": cost =", c, "W =", sess.run(W), "b =", sess.run(b)) 
        # Storing necessary values to be used outside the Session 
        training_cost = sess.run(cost, feed_dict ={X: x, Y: y}) 
        print('training_cost : ', training_cost)
        print('training_cost dtype : ', training_cost.dtype)
        weight = sess.run(W)
        print('weight : ', weight)
        print('weight : ', weight.dtype)
        bias = sess.run(b)
        print('bias : ', bias)
        print('bias dtype : ', bias.dtype)
    # Calculating the predictions 
    green_light_duration_new = weight * x + bias 
    print("Training cost =", training_cost, "Weight =", weight, "bias =", bias, '\n')
    # Plotting the Results 
    plt.plot(x, y, 'ro', label ='Original data') 
    plt.plot(x, green_light_duration_new, label ='Fitted line') 
    plt.title('Linear Regression Result') 
    plt.legend() 
    plt.show() 
    return green_light_duration_new

# Go to the training function
new_green_dur = green_light_duration_new(current_reward, current_green)

# Append the predicted green light to its list
current_green.append(new_green_dur)

# Go on to run the rest of the simulation with the new green light duration,
# and append its subsequent reward to current_reward list to run again later.

UPDATE WITH PICTURES FROM THE SOLUTION BELOW: With the solution provided below, it's only plotting one data point, not the three I input, there's no line of best fit, and the axis coordinates at the bottom of the 2nd plot don't reflect where that one data point truly is.

Also, when you print(current_green) at the very end, after the concatenating, the array is just 3 zeros? Shouldn't it be 4 values - the 3 I first input, and then the latest prediction?

I don't understand what's happening here. Why scale the data? What I want is to be able to feed this regressor a new list of X values (the rewards) from previous runs and have it return/predict the best possible green light duration between 10 and 120 seconds, in the same scale as the data went in. After that, it should add that duration to the current_green list. Thanks a lot, I'm still new. The plotting is a nice feature, but it's not entirely necessary; I just wanted to see that it was working the way it was supposed to.

[Plot screenshot 1]

[Plot screenshot 2]
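
One way to get the prediction back into the original units is MinMaxScaler's inverse_transform. The sketch below is not part of the answer that follows - green_scaler, raw_green and the 0.7 prediction are made-up illustrations - and it reshapes the durations into a single column so the scaler sees three samples rather than three one-value features (with the (1, 3) shape used in the answer, every column is constant and MinMaxScaler maps it to 0, which is one reason current_green prints as zeros).

import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Keep a separate scaler for the durations so the prediction can be un-scaled later
raw_green = np.array([10.0, 12.0, 12.0]).reshape(-1, 1)   # seconds, one sample per row
green_scaler = MinMaxScaler()
scaled_green = green_scaler.fit_transform(raw_green)       # values in [0, 1]

# ... train on the scaled data and obtain a scaled prediction ...
new_green_scaled = np.array([[0.7]])                        # hypothetical model output

# Undo the scaling before storing the new duration
new_green_seconds = green_scaler.inverse_transform(new_green_scaled)
green_history = list(raw_green.ravel()) + list(new_green_seconds.ravel())
print(green_history)   # approximately [10.0, 12.0, 12.0, 11.4]

The reward list would get its own scaler in the same way, so new raw rewards can be transformed before being fed to the model and predictions can be un-scaled afterwards.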

There are two errors. First, please use MinMaxScaler to scale your data - during the calculations, when a number goes out of range, NaN pops up. Second, append doesn't work on a numpy array.
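
To illustrate the first point, here is a rough sketch (not from the original post) of the same per-sample gradient step done by hand in float32: with raw rewards around -1000 and a learning rate of 0.01, every update multiplies the weight by roughly lr * x**2 / n ≈ 3300, so it overflows to inf within one epoch, after which inf - inf produces NaN.

import numpy as np

# Hand-rolled version of the SGD update TensorFlow performs for
# cost = (W*x + b - y)**2 / (2*n), fed one raw (unscaled) sample at a time
x = np.float32(-1000.0)   # a raw reward
y = np.float32(10.0)      # a raw green light duration
W = np.float32(0.5)       # arbitrary starting weight
b = np.float32(0.0)
lr = np.float32(0.01)
n = np.float32(3.0)

for step in range(15):
    error = W * x + b - y                    # prediction error
    W = np.float32(W - lr * error * x / n)   # dcost/dW = error * x / n
    b = np.float32(b - lr * error / n)       # dcost/db = error / n
    print(step, W, b)

# |W| grows by a factor of about lr * x**2 / n each step, reaches inf in
# float32 after roughly a dozen updates, and the next update (inf - inf) is NaN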

Here's a complete solution to your problem:

import numpy as np
import random
import matplotlib.pyplot as plt
import tensorflow as tf
import warnings
from sklearn.preprocessing import MinMaxScaler

warnings.simplefilter(action='once', category=FutureWarning) # future warnings annoy me

# add in a couple of rewards and light durations
current_reward = [[-1000,-900,-950]]
current_green = [[10,12,12]]

current_reward = np.array(current_reward)
current_green = np.array(current_green)





scaler = MinMaxScaler()
scaler.fit(current_reward)
current_reward= scaler.transform(current_reward)

scaler.fit(current_green)
current_green=scaler.transform(current_green)

# Pass in reward and green_light
def green_light_duration_new(current_reward, current_green):
    # Predicting the best light duration based on previous rewards.
    # predict the best duration based on previous step's reward value, using simple linear regression model
    x = current_reward
    y = current_green
    n = len(x)




    # Plot of Training Data  
    plt.scatter(x, y) 
    plt.xlabel('Reward') 
    plt.ylabel('Green Light Duration') 
    plt.title("Training Data") 
    plt.show() 

    X = tf.placeholder("float") 
    Y = tf.placeholder("float") 
    W = tf.Variable(np.random.randn(), name = "W") 
    b = tf.Variable(np.random.randn(), name = "b") 
    learning_rate = 0.01
    training_epochs = 500
    # Hypothesis 
    y_pred = tf.add(tf.multiply(X, W), b) 
    print('y_pred : ', y_pred)
    print('y_pred dtype : ', y_pred.dtype)
    # Mean Squared Error Cost Function 
    cost = tf.reduce_sum(tf.pow(y_pred-Y, 2)) / (2 * n)
    print('cost : ', cost)
    print('cost dtype: ', cost.dtype)
    # Gradient Descent Optimizer 
    optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)    
    # Global Variables Initializer 
    init = tf.global_variables_initializer()
    # Starting the Tensorflow Session 
    with tf.Session() as sess: 
        # Initializing the Variables 
        sess.run(init) 
        # Iterating through all the epochs 
        for epoch in range(training_epochs): 
            # Feeding each data point into the optimizer using Feed Dictionary 
            for (_x, _y) in zip(x, y): 
                print('_x : ',_x)
                print('_y : ',_y)
                sess.run(optimizer, feed_dict = {X : _x, Y : _y}) 
            # Displaying the result after every 50 epochs 
            if (epoch + 1) % 50 == 0: 
                # Calculating the cost at every 50th epoch 
                c = sess.run(cost, feed_dict = {X : x, Y : y}) 
                print('c : ', c)
                print('c dtype : ', c.dtype)
                print("Epoch", (epoch + 1), ": cost =", c, "W =", sess.run(W), "b =", sess.run(b)) 
        # Storing necessary values to be used outside the Session 
        training_cost = sess.run(cost, feed_dict ={X: x, Y: y}) 
        print('training_cost : ', training_cost)
        print('training_cost dtype : ', training_cost.dtype)
        weight = sess.run(W)
        print('weight : ', weight)
        print('weight : ', weight.dtype)
        bias = sess.run(b)
        print('bias : ', bias)
        print('bias dtype : ', bias.dtype)
    # Calculating the predictions 
    green_light_duration_new = weight * x + bias 
    print("Training cost =", training_cost, "Weight =", weight, "bias =", bias, '\n')
    # Plotting the Results 
    plt.plot(x, y, 'ro', label ='Original data') 
    plt.plot(x, green_light_duration_new, label ='Fitted line') 
    plt.title('Linear Regression Result') 
    plt.legend() 
    plt.show() 
    return green_light_duration_new

# Go to the training function
new_green_dur = green_light_duration_new(current_reward, current_green)

# Append the predicted green light to its list
current_green = np.concatenate((current_green, new_green_dur))
#current_green.append(new_green_dur)   # numpy arrays have no append method

# Go on to run the rest of the simulation with the new green light duration,
# and append its subsequent reward to current_reward list to run again later.
