Simple linear regression in TensorFlow produces near zero coefficient
I am attempting a simple linear regression in TensorFlow with only one independent variable. A plot of my data shows the coefficient should be near 1, and in fact if I run it using sklearn.linear_model.LinearRegression I get a sensible result of about 0.90.
However, running it in TensorFlow using this tutorial produces a coefficient very near zero. I was able to get a reasonable result from TensorFlow using randomized numbers. I have tried adjusting the learning rate and the number of epochs without any meaningful effect.
The MRE includes actual data, and should produce a coefficient of 0.8975 from sklearn but 0.00045 from TensorFlow. I have considered that it is getting caught in a local minimum, but none of the examples I can find of such a problem apply to my issue.
import numpy as np
import tensorflow as tf
from sklearn import linear_model
learning_rate = 0.1
epochs = 100
x_train = np.array([-0.00055, 0.00509, -0.0046, -0.01687, -0.0047, 0.00348,
0.00042, -0.00208, -0.01207, -0.0007, 0.00408, -0.00182,
-0.00294, -0.00113, 0.0038, -0.00645, 0.00113, 0.00268,
-0.0045, -0.00381, 0.00298, 0, -0.00184, -0.00212,
-0.00213, -0.01224, 0.00072, 0, -0.00331, 0.00534,
0.00675, -0.00285, -0.00429, 0.00489, -0.00286, 0.00158,
0.00129, 0.00472, 0.00555, -0.00467, -0.00231, -0.00231,
0.00159, -0.00463, 0.00174, 0, -0.0029,
-0.00349, 0.01372, -0.00302])
y_train = np.array([0.00125, 0.00218, -0.00373, -0.00999, -0.00441,
0.00412, 0.00158, -0.00094, -0.01513, -0.00064, 0.00416,
-0.00191, -0.00607, 0.00161, 0.00289, -0.00416,
0.00096, 0.00321, -0.00672, -0.0029, 0.00129, -0.00032,
-0.00387, -0.00162, -0.00292, -0.01367, 0.00198,
0.00099, -0.00329, 0.00693, 0.00459, -0.00294, -0.00164,
0.00328, -0.00425, 0.00131, 0.00131, 0.00524, 0.00358,
-0.00422, -0.00065, -0.00359, 0.00229, 0, 0.00196,
-0.00065, -0.00391, -0.0108, 0.01291, -0.00098])
regr = linear_model.LinearRegression()
regr.fit(x_train.reshape(-1, 1), y_train.reshape(-1, 1))
print ('Coefficients: ', regr.coef_)
weight = tf.Variable(0.)
bias = tf.Variable(0.)
for e in range(epochs):
    with tf.GradientTape() as tape:
        y_pred = weight*x_train + bias
        loss = tf.reduce_mean(tf.square(y_pred - y_train))
    gradients = tape.gradient(loss, [weight,bias])
    weight.assign_sub(gradients[0]*learning_rate)
    bias.assign_sub(gradients[1]*learning_rate)
print(weight.numpy(), 'weight', bias.numpy(), 'bias')
In the posted example, the training x and y values are very small, which makes the gradients very small. The model is training correctly on the data, but at this learning rate it might take a few million iterations to converge.
The scikit-learn linear regression model uses least-squares fitting, which solves for the coefficients directly, so it gets the answer without any iterations.
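To make the two points above concrete, here is a minimal NumPy-only sketch. It is not the original poster's code: the gradient of the mean-squared-error loss is written out by hand instead of using tf.GradientTape, and the closed-form fit uses np.linalg.lstsq as a stand-in for what a least-squares solver does. The data are the arrays from the question.

```python
import numpy as np

# Data copied from the question.
x = np.array([-0.00055, 0.00509, -0.0046, -0.01687, -0.0047, 0.00348,
              0.00042, -0.00208, -0.01207, -0.0007, 0.00408, -0.00182,
              -0.00294, -0.00113, 0.0038, -0.00645, 0.00113, 0.00268,
              -0.0045, -0.00381, 0.00298, 0, -0.00184, -0.00212,
              -0.00213, -0.01224, 0.00072, 0, -0.00331, 0.00534,
              0.00675, -0.00285, -0.00429, 0.00489, -0.00286, 0.00158,
              0.00129, 0.00472, 0.00555, -0.00467, -0.00231, -0.00231,
              0.00159, -0.00463, 0.00174, 0, -0.0029,
              -0.00349, 0.01372, -0.00302])
y = np.array([0.00125, 0.00218, -0.00373, -0.00999, -0.00441,
              0.00412, 0.00158, -0.00094, -0.01513, -0.00064, 0.00416,
              -0.00191, -0.00607, 0.00161, 0.00289, -0.00416,
              0.00096, 0.00321, -0.00672, -0.0029, 0.00129, -0.00032,
              -0.00387, -0.00162, -0.00292, -0.01367, 0.00198,
              0.00099, -0.00329, 0.00693, 0.00459, -0.00294, -0.00164,
              0.00328, -0.00425, 0.00131, 0.00131, 0.00524, 0.00358,
              -0.00422, -0.00065, -0.00359, 0.00229, 0, 0.00196,
              -0.00065, -0.00391, -0.0108, 0.01291, -0.00098])

# Closed-form least squares: solve for [slope, intercept] in one step.
A = np.column_stack([x, np.ones_like(x)])
(slope, intercept), *_ = np.linalg.lstsq(A, y, rcond=None)

# Gradient of mean((w*x + b - y)**2) at the starting point w = 0, b = 0.
w, b = 0.0, 0.0
residual = w * x + b - y
grad_w = 2.0 * np.mean(residual * x)
grad_b = 2.0 * np.mean(residual)

# Plain gradient descent with the question's settings (lr = 0.1, 100 epochs).
learning_rate = 0.1
for _ in range(100):
    residual = w * x + b - y
    w -= learning_rate * 2.0 * np.mean(residual * x)
    b -= learning_rate * 2.0 * np.mean(residual)

print("closed-form slope:", slope)
print("initial gradient (w, b):", grad_w, grad_b)
print("weight after 100 gradient steps:", w)
```

Because every x and y is on the order of 1e-3, the products in grad_w are on the order of 1e-5, so each update barely moves the weight. That is consistent with the near-zero coefficient reported in the question, while the closed-form slope comes out close to the sklearn result.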
One way to bring the iteration count down to a manageable 1000 is to apply MinMaxScaler so that the x and y datasets lie between 0 and 1. This improves the gradients and lets the model finish training; however, you should inverse transform the results back after training, as shown in the modified code below.
import numpy as np
import tensorflow as tf
from sklearn import linear_model
from sklearn.preprocessing import MinMaxScaler
import matplotlib.pyplot as plt
learning_rate = 0.1
epochs = 1000
x_train0 = np.array([-0.00055, 0.00509, -0.0046, -0.01687, -0.0047, 0.00348,
0.00042, -0.00208, -0.01207, -0.0007, 0.00408, -0.00182,
-0.00294, -0.00113, 0.0038, -0.00645, 0.00113, 0.00268,
-0.0045, -0.00381, 0.00298, 0, -0.00184, -0.00212,
-0.00213, -0.01224, 0.00072, 0, -0.00331, 0.00534,
0.00675, -0.00285, -0.00429, 0.00489, -0.00286, 0.00158,
0.00129, 0.00472, 0.00555, -0.00467, -0.00231, -0.00231,
0.00159, -0.00463, 0.00174, 0, -0.0029,
-0.00349, 0.01372, -0.00302])
scaler1 = MinMaxScaler()
x_train = scaler1.fit_transform(x_train0.reshape(-1,1))
y_train0 = np.array([0.00125, 0.00218, -0.00373, -0.00999, -0.00441,
0.00412, 0.00158, -0.00094, -0.01513, -0.00064, 0.00416,
-0.00191, -0.00607, 0.00161, 0.00289, -0.00416,
0.00096, 0.00321, -0.00672, -0.0029, 0.00129, -0.00032,
-0.00387, -0.00162, -0.00292, -0.01367, 0.00198,
0.00099, -0.00329, 0.00693, 0.00459, -0.00294, -0.00164,
0.00328, -0.00425, 0.00131, 0.00131, 0.00524, 0.00358,
-0.00422, -0.00065, -0.00359, 0.00229, 0, 0.00196,
-0.00065, -0.00391, -0.0108, 0.01291, -0.00098])
scaler2 = MinMaxScaler()
y_train = scaler2.fit_transform(y_train0.reshape(-1,1))
regr = linear_model.LinearRegression()
regr.fit(x_train.reshape(-1, 1), y_train.reshape(-1, 1))
print ('Coefficients: ', regr.coef_, ' intercept ',regr.intercept_, )
weight = tf.Variable(0.)
bias = tf.Variable(0.)
for e in range(epochs):
    with tf.GradientTape() as tape:
        y_pred = weight*x_train + bias
        loss = tf.reduce_mean(tf.square(y_pred - y_train))
    gradients = tape.gradient(loss, [weight,bias])
    weight.assign_sub(gradients[0]*learning_rate)
    bias.assign_sub(gradients[1]*learning_rate)
print(weight.numpy(), 'weight', bias.numpy(), 'bias')
plt.plot(x_train0,scaler2.inverse_transform(y_pred.numpy()).flatten(),'r',label='model output')
plt.scatter(x_train0,y_train0,label='training dataset')
plt.legend()
plt.show()
Coefficients: [[0.97913471]] intercept [-0.00420121]
0.96772194 weight 0.0018798028 bias
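
Note that the weight and bias printed above live in the scaled 0-to-1 space, so they are not directly comparable to the 0.8975 from the question. A rough sketch of the conversion back to original units follows, assuming the usual min-max formula x_scaled = (x - x_min) / (x_max - x_min). The helper name unscale_line is hypothetical, and the min/max constants are read off the arrays posted above.

```python
def unscale_line(w, b, x_min, x_max, y_min, y_max):
    """Map a line fitted on min-max scaled data back to original units.

    Assumes x_s = (x - x_min) / (x_max - x_min) and the same for y,
    with the fitted model y_s = w * x_s + b.
    """
    x_range = x_max - x_min
    y_range = y_max - y_min
    slope = w * y_range / x_range
    intercept = y_min + y_range * b - slope * x_min
    return slope, intercept


# Minimum and maximum of the x and y arrays from the post.
X_MIN, X_MAX = -0.01687, 0.01372
Y_MIN, Y_MAX = -0.01513, 0.01291

# TensorFlow weight and bias as printed in the output above.
tf_slope, tf_intercept = unscale_line(0.96772194, 0.0018798028,
                                      X_MIN, X_MAX, Y_MIN, Y_MAX)

# scikit-learn coefficient and intercept on the scaled data, as printed above.
sk_slope, sk_intercept = unscale_line(0.97913471, -0.00420121,
                                      X_MIN, X_MAX, Y_MIN, Y_MAX)

print("TensorFlow slope in original units:", tf_slope)
print("scikit-learn slope in original units:", sk_slope)
```

With these inputs the scikit-learn slope comes back to about 0.8975, the value reported in the question, and the TensorFlow slope to about 0.887, so after 1000 epochs on scaled data the gradient-descent fit is close to the least-squares one but has not fully converged.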