
TensorFlow - Linear Regression

I wrote a TensorFlow program for linear regression. I am using the gradient descent algorithm to optimize (minimize) the loss function, but the value of the loss function increases while the program runs. My program and its output follow.

    import tensorflow as tf
    W = tf.Variable([.3], dtype=tf.float32)
    b = tf.Variable([-.3], dtype=tf.float32)
    X = tf.placeholder(tf.float32)
    Y = tf.placeholder(tf.float32)
    sess = tf.Session()
    init = tf.global_variables_initializer()
    sess.run(init)
    lm = W*X + b
    delta = tf.square(lm - Y)
    loss = tf.reduce_sum(delta)
    optimizer = tf.train.GradientDescentOptimizer(0.01)
    train = optimizer.minimize(loss)
    for i in range(8):
        print(sess.run([W, b]))
        print("loss= %f" % sess.run(loss, {X: [10,20,30,40], Y: [1,2,3,4]}))
        sess.run(train, {X: [10,20,30,40], Y: [1,2,3,4]})
    sess.close()

The output of my program is:

2017-12-07 14:50:10.517685: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.

[array([ 0.30000001], dtype=float32), array([-0.30000001],dtype=float32)]
loss= 108.359993

[array([-11.09999943], dtype=float32), array([-0.676], dtype=float32)]
loss= 377836.000000

[array([ 662.25195312], dtype=float32), array([ 21.77807617],  dtype=float32)]
loss= 1318221568.000000

[array([-39110.421875], dtype=float32), array([-1304.26794434],  dtype=float32)]
loss= 4599107289088.000000

[array([ 2310129.25], dtype=float32), array([ 77021.109375],  dtype=float32)]
loss= 16045701465112576.000000

[array([ -1.36451664e+08], dtype=float32), array([-4549399.],  dtype=float32)]
loss= 55981405829796462592.000000

[array([  8.05974733e+09], dtype=float32), array([  2.68717856e+08],  dtype=float32)]
loss= 195312036582209632600064.000000

Please explain why the value of the loss is increasing instead of decreasing.

Did you try changing the learning rate? Using a lower learning rate (~1e-4) and more iterations should work.

Here is more justification for why a lower learning rate is required. Note that your loss function is

L = \sum (Wx + b - Y)^2

and dL/dW = \sum 2(Wx + b - Y) x

and the Hessian d^2L/dW^2 = \sum 2x^2

Now, your loss is diverging because the learning rate is larger than the inverse of the Hessian, which here is roughly 1/(2*3000). So you should try decreasing the learning rate.
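Plugging the x values from the question into that Hessian:

d^2L/dW^2 = 2(10^2 + 20^2 + 30^2 + 40^2) = 2 * 3000 = 6000

For a quadratic like this, gradient descent diverges once the learning rate exceeds roughly 2/6000 ≈ 3e-4, so the 0.01 used in the question overshoots by about a factor of 30, while the suggested 1e-4 is safely below the threshold.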

Note: I wasn't sure how to add math to a StackOverflow answer, so I had to add it this way.
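As a minimal sketch (assuming the session, placeholders, and `loss` from the question's code are still defined, TF 1.x), dropping the learning rate and running more steps is enough to make the loss shrink instead of blow up:

    # Smaller learning rate, well below the ~3e-4 divergence threshold computed above
    optimizer = tf.train.GradientDescentOptimizer(1e-4)
    train = optimizer.minimize(loss)
    # Re-initialize W and b after the diverged run
    sess.run(tf.global_variables_initializer())
    for i in range(1000):
        sess.run(train, {X: [10, 20, 30, 40], Y: [1, 2, 3, 4]})
    print(sess.run([W, b, loss], {X: [10, 20, 30, 40], Y: [1, 2, 3, 4]}))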

To do a linear regression, this is the code I've been using with NumPy:

    import numpy as np
    import tensorflow as tf
    import matplotlib.pyplot as plt
    import pandas as pd
    print(tf.__version__)

    %matplotlib inline
    plt.rcParams['figure.figsize'] = (10, 6)

    x = np.arange(start=0.0, stop=5.0, step=0.1)

    # You can adjust the slope and intercept to verify the changes in the graph
    W = 1
    b = 0

    # We define the linear equation
    y = W*x + b

    # And plot it thanks to matplotlib
    plt.plot(x, y)
    plt.ylabel('Dependent Variable')
    plt.xlabel('Independent Variable')
    plt.show()

[Plot of the line y = W*x + b]

With TensorFlow you can use something similar to the code below to do a linear regression:

    def graph_formula_vs_data(formula, x_vector, y_vector): 
        """
        This function graphs a formula in the form of a line, vs. data points
        """
        x = np.array(range(0, int(max(x_vector))))  
        y = eval(formula)
        plt.plot(x, y)
        plt.plot(x_vector, y_vector, "ro")
        plt.show()
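For example, the helper could be called like this (made-up numbers, just to illustrate that the formula string is evaluated with `x` as the variable):

    # Hypothetical example: plot the line y = 2*x + 1 against a few data points
    graph_formula_vs_data(formula="2*x + 1", x_vector=[1, 2, 3, 4], y_vector=[3.1, 4.9, 7.2, 8.8])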

    df = pd.read_csv('./linear_reg_exam_dataset.csv', usecols=[0, 1], skiprows=[0], header=None)
    d = df.values
    data = np.float32(d)

    dataset = pd.DataFrame({'x': data[:, 0], 'y': data[:, 1]})

    # Number of epochs (times we make the model go through all the data)
    n_epochs = 100

    # Model parameters
    W = tf.Variable([0.], dtype=tf.float32)
    b = tf.Variable([0.], dtype=tf.float32)

    y = dataset['y']  # define the target variable (dependent variable) as y
    x = dataset['x']
    msk = np.random.rand(len(df)) < 0.8

    # Model input and output
    x_train = x[msk].values.tolist()
    y_train = y[msk].values.tolist()

    # Validation data (with this we validate that the model has learned to generalize the problem)
    x_val = x[~msk].values.tolist()
    y_val = y[~msk].values.tolist()


    # Model definition
    @tf.function
    def linear_model(x, W, b):
        return W*x + b


    # Cost function
    loss = lambda: tf.reduce_sum(tf.math.squared_difference(y_train, linear_model(x_train, W, b)))
    # Optimizer to do the gradient descent
    optimizer = tf.optimizers.SGD(0.0000000000001)

    # We perform n_epochs training iterations
    for i in range(n_epochs):
        optimizer.minimize(loss, var_list=[W, b])

        # Every 10 epochs we print how W and b evolve and the amount of error there is
        if i % 10 == 0 or i == n_epochs - 1:
            print("Epoch {}".format(i))
            print("W: {}".format(W.numpy()))
            print("b: {}".format(b.numpy()))
            print("loss: {}".format(loss()))
            # This formula represents W * x + b in string form to be able to graph it
            stringfied_formula = str(W.numpy()) + "*x +" + str(b.numpy())
            graph_formula_vs_data(formula=stringfied_formula, x_vector=x_train, y_vector=y_train)
            print("\n")

Epoch 99
W: [0.39189553]
b: [0.00059491]
loss: 1458421628928.0

[Plot of the fitted line against the training data]

    # Evaluation of the model with validation data
    stringfied_formula = str(W.numpy()) + "*x +" + str(b.numpy())
    graph_formula_vs_data(formula=stringfied_formula, x_vector=x_val, y_vector=y_val)
    loss = lambda: tf.reduce_sum(tf.math.squared_difference(y_val, linear_model(x_val, W, b)))
    print("\nValidation: ")
    print("W: {}".format(W.numpy()))
    print("b: {}".format(b.numpy()))
    print("loss: {}".format(loss()))
    graph_formula_vs_data(formula=stringfied_formula, x_vector=x_val, y_vector=y_val)

Validation:
W: [75.017586]
b: [0.11139687]
loss: 8863.4775390625

[Plot of the fitted line against the validation data]
