
Simple linear regression in Tensorflow produces near zero coefficient

I am attempting a simple linear regression in Tensorflow with only one independent variable. A plot of my data shows the coefficient should be near 1, and in fact if I run it using sklearn.linear_model.LinearRegression I get a sensible result of about 0.90.

However, running it in Tensorflow using this tutorial produces a coefficient very near zero. I was able to get a reasonable result from Tensorflow using randomized numbers, and I have tried adjusting the learning rate and number of epochs without any meaningful effect.

The MRE below includes my actual data and should produce a coefficient of 0.8975 from sklearn but 0.00045 from Tensorflow. I have considered that it is getting caught in a local minimum, but none of the examples of that problem I can find apply to my issue.

import numpy as np
import tensorflow as tf
from sklearn import linear_model

learning_rate = 0.1
epochs = 100

x_train = np.array([-0.00055, 0.00509, -0.0046, -0.01687, -0.0047, 0.00348, 
                0.00042, -0.00208, -0.01207, -0.0007, 0.00408, -0.00182, 
                -0.00294, -0.00113, 0.0038, -0.00645, 0.00113, 0.00268, 
                -0.0045, -0.00381, 0.00298, 0, -0.00184, -0.00212, 
                -0.00213, -0.01224, 0.00072, 0, -0.00331, 0.00534, 
                0.00675, -0.00285, -0.00429, 0.00489, -0.00286, 0.00158, 
                0.00129, 0.00472, 0.00555, -0.00467, -0.00231, -0.00231, 
                0.00159, -0.00463, 0.00174, 0, -0.0029, 
                -0.00349, 0.01372, -0.00302])

y_train = np.array([0.00125, 0.00218, -0.00373, -0.00999, -0.00441, 
                0.00412, 0.00158, -0.00094, -0.01513, -0.00064, 0.00416, 
                -0.00191, -0.00607, 0.00161, 0.00289, -0.00416, 
                0.00096, 0.00321, -0.00672, -0.0029, 0.00129, -0.00032, 
                -0.00387, -0.00162, -0.00292, -0.01367, 0.00198, 
                0.00099, -0.00329, 0.00693, 0.00459, -0.00294, -0.00164, 
                0.00328, -0.00425, 0.00131, 0.00131, 0.00524, 0.00358,
                -0.00422, -0.00065, -0.00359, 0.00229, 0, 0.00196, 
                -0.00065, -0.00391, -0.0108, 0.01291, -0.00098])

regr = linear_model.LinearRegression()
regr.fit(x_train.reshape(-1, 1), y_train.reshape(-1, 1))
print('Coefficients: ', regr.coef_)

weight = tf.Variable(0.)
bias = tf.Variable(0.)

# Full-batch gradient descent on mean squared error.
for e in range(epochs):
    with tf.GradientTape() as tape:
        y_pred = weight*x_train + bias
        loss = tf.reduce_mean(tf.square(y_pred - y_train))
    # Compute gradients and apply the update outside the tape context.
    gradients = tape.gradient(loss, [weight, bias])
    weight.assign_sub(gradients[0]*learning_rate)
    bias.assign_sub(gradients[1]*learning_rate)

print(weight.numpy(), 'weight', bias.numpy(), 'bias')

In the posted example, the x and y values of the training dataset are very small, which makes the gradients very small as well. So while the model is training correctly on the data, it might take a few million iterations to converge.
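To make this concrete, here is a quick check of the very first gradient step; this is only a sketch, assuming the question's raw x_train and y_train numpy arrays and learning_rate are in scope:

    # With weight = bias = 0, dL/dweight = mean(-2 * y * x). The x and y
    # values are ~5e-3, so this gradient is on the order of 1e-5, and each
    # update moves the weight by only about learning_rate * 1e-5.
    weight = tf.Variable(0.)
    bias = tf.Variable(0.)
    with tf.GradientTape() as tape:
        y_pred = weight * x_train + bias
        loss = tf.reduce_mean(tf.square(y_pred - y_train))
    grads = tape.gradient(loss, [weight, bias])
    print('initial dL/dweight:', grads[0].numpy())  # tiny, ~1e-5 in magnitude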

The scikit-learn linear regression model solves the least-squares problem in closed form rather than iteratively, so it fits the dataset immediately regardless of the scale of the data.
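For comparison, that closed-form slope can be computed directly with numpy; a minimal sketch, again assuming the question's raw x_train and y_train arrays:

    # Ordinary least squares in closed form: slope = cov(x, y) / var(x).
    # No iterations are involved, so the tiny magnitudes are harmless.
    x_mean, y_mean = x_train.mean(), y_train.mean()
    slope = np.sum((x_train - x_mean) * (y_train - y_mean)) / np.sum((x_train - x_mean) ** 2)
    intercept = y_mean - slope * x_mean
    print(slope, intercept)  # slope should match sklearn's ~0.8975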

One suggestion to bring this down to a manageable 1000 iterations is to apply MinMaxScaler so that the x and y datasets lie between 0 and 1, which improves the gradients and yields a trained model. You should then inverse-transform the results after training, as shown in the modified code below.

    import numpy as np
    import tensorflow as tf
    from sklearn import linear_model
    from sklearn.preprocessing import MinMaxScaler
    import matplotlib.pyplot as plt
    learning_rate = 0.1
    epochs = 1000
    
    x_train0 = np.array([-0.00055, 0.00509, -0.0046, -0.01687, -0.0047, 0.00348,
                    0.00042, -0.00208, -0.01207, -0.0007, 0.00408, -0.00182,
                    -0.00294, -0.00113, 0.0038, -0.00645, 0.00113, 0.00268,
                    -0.0045, -0.00381, 0.00298, 0, -0.00184, -0.00212,
                    -0.00213, -0.01224, 0.00072, 0, -0.00331, 0.00534,
                    0.00675, -0.00285, -0.00429, 0.00489, -0.00286, 0.00158,
                    0.00129, 0.00472, 0.00555, -0.00467, -0.00231, -0.00231,
                    0.00159, -0.00463, 0.00174, 0, -0.0029,
                    -0.00349, 0.01372, -0.00302])
    # Scale x into [0, 1]; keep the scaler to invert the transform later.
    scaler1 = MinMaxScaler()
    x_train = scaler1.fit_transform(x_train0.reshape(-1,1))
    y_train0 = np.array([0.00125, 0.00218, -0.00373, -0.00999, -0.00441,
                    0.00412, 0.00158, -0.00094, -0.01513, -0.00064, 0.00416,
                    -0.00191, -0.00607, 0.00161, 0.00289, -0.00416,
                    0.00096, 0.00321, -0.00672, -0.0029, 0.00129, -0.00032,
                    -0.00387, -0.00162, -0.00292, -0.01367, 0.00198,
                    0.00099, -0.00329, 0.00693, 0.00459, -0.00294, -0.00164,
                    0.00328, -0.00425, 0.00131, 0.00131, 0.00524, 0.00358,
                    -0.00422, -0.00065, -0.00359, 0.00229, 0, 0.00196,
                    -0.00065, -0.00391, -0.0108, 0.01291, -0.00098])
    # Scale y into [0, 1] with its own scaler.
    scaler2 = MinMaxScaler()
    y_train = scaler2.fit_transform(y_train0.reshape(-1,1))
    
    # Reference fit with sklearn on the scaled data.
    regr = linear_model.LinearRegression()
    regr.fit(x_train, y_train)
    print('Coefficients: ', regr.coef_, ' intercept ', regr.intercept_)
    
    weight = tf.Variable(0.)
    bias = tf.Variable(0.)
    
    # Same training loop as before, now on the scaled data.
    for e in range(epochs):
        with tf.GradientTape() as tape:
            y_pred = weight*x_train + bias
            loss = tf.reduce_mean(tf.square(y_pred - y_train))
        # Compute gradients and apply the update outside the tape context.
        gradients = tape.gradient(loss, [weight, bias])
        weight.assign_sub(gradients[0]*learning_rate)
        bias.assign_sub(gradients[1]*learning_rate)
    
    
    print(weight.numpy(), 'weight', bias.numpy(), 'bias')
    
    # Plot in original units by inverting the y scaling.
    plt.plot(x_train0, scaler2.inverse_transform(y_pred.numpy()).flatten(), 'r', label='model output')
    plt.scatter(x_train0,y_train0,label='training dataset')
    plt.legend()
    plt.show()

Coefficients: [[0.97913471]] intercept [-0.00420121]

0.96772194 weight 0.0018798028 bias

[Figure: fitted curve over the training dataset]
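If the coefficient in the original units is wanted, the weight learned on the scaled data can be mapped back through the two scalers; a sketch, assuming the variables from the code above and using MinMaxScaler's data_range_ attribute:

    # y_scaled = w * x_scaled + b, with x_scaled = (x - x_min) / x_range and
    # likewise for y, so the slope in original units is w * y_range / x_range.
    raw_slope = weight.numpy() * scaler2.data_range_[0] / scaler1.data_range_[0]
    print('slope in original units:', raw_slope)  # should land near sklearn's ~0.8975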
