
Problem with locally weighted linear regression

One problem with linear regression is that it tends to underfit the data, and one way to address this is a technique known as locally weighted linear regression. I read about this technique in Andrew Ng's CS229 lecture notes and tried to implement it with the following script:

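# Written against the TensorFlow 1.x API (tf.Session, tf.matrix_inverse)
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
from numpy import mat, eye, shape, exp

trX = np.linspace(0, 1, 100)
trY = trX + np.random.normal(0, 1, 100)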

sess = tf.Session()
xArr = []
yArr = []
for i in range(len(trX)):
    xArr.append([1.0,float(trX[i])])
    yArr.append(float(trY[i]))

xMat = mat(xArr)
yMat = mat(yArr).T

A_tensor = tf.constant(xMat)
b_tensor = tf.constant(yMat)

m = shape(xMat)[0]
weights = mat(eye((m)))
k = 1.0
for j in range(m):
    for i in range(m):
        diffMat = xMat[i]- xMat[j,:]
        weights[j,j] = exp(diffMat*diffMat.T/(-2.0*k**2))

weights_tensor = tf.constant(weights)
# Matrix inverse solution
wA = tf.matmul(weights_tensor, A_tensor)
tA_A = tf.matmul(tf.transpose(A_tensor), wA)
tA_A_inv = tf.matrix_inverse(tA_A)
product = tf.matmul(tA_A_inv, tf.transpose(A_tensor))
solution = tf.matmul(product, b_tensor)

solution_eval = sess.run(solution)

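# Extract coefficients; column 0 of xMat is the constant 1.0,
# so row 0 of the solution is the intercept and row 1 is the slope
y_intercept = solution_eval[0][0]
slope = solution_eval[1][0]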

print('slope: ' + str(slope))
print('y_intercept: ' + str(y_intercept))

# Get best fit line

best_fit = []
for i in xArr:
  best_fit.append(slope*i+y_intercept)

# Plot the results
plt.plot(xArr, yArr, 'o', label='Data')
plt.plot(xArr, best_fit, 'r-', label='Best fit line', linewidth=3)
plt.legend(loc='upper left')
plt.show()

When I run the script above, I get an error: TypeError: 'numpy.float64' object cannot be interpreted as an integer. The error is raised by this statement:

best_fit.append(slope*i+y_intercept)

I have tried to fix this but have not found a solution yet. Please help me.

In the loop, i is a list, e.g. [1.0, 1.0]. Multiplying a list by slope is not element-wise arithmetic, so you need to decide which value to take from the list before multiplying. For instance:

best_fit = []
for i in xArr:
    best_fit.append(slope*i[0]+y_intercept)
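As a minimal sketch of why the original line fails, the snippet below uses only built-in Python types. The names row, repeated, failed and scaled are made up for illustration, and 2.5 simply stands in for slope. It shows that multiplying a list means repetition, which needs an integer count, whereas indexing first yields a plain float:

```python
row = [1.0, 0.5]          # one element of xArr: [constant term, x value]

# Multiplying a list by an int repeats it; it is not element-wise arithmetic.
repeated = 2 * row        # [1.0, 0.5, 1.0, 0.5]

# Multiplying a list by a float is rejected, because a repeat count
# has to be an integer.
try:
    2.5 * row
    failed = False
except TypeError:
    failed = True

# Indexing first gives a plain float, which multiplies as expected.
scaled = 2.5 * row[1]     # 1.25
```

This is presumably why the error message talks about an integer: the list is asking for a repeat count, not a scale factor.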

However, the first element of each list is always 1.0, because it is the constant term added by xArr.append([1.0, ...]):

...
[1.0, 0.24242424242424243]
[1.0, 0.25252525252525254]
[1.0, 0.26262626262626265]
[1.0, 0.27272727272727276]
[1.0, 0.2828282828282829]
[1.0, 0.29292929292929293]
[1.0, 0.30303030303030304]
[1.0, 0.31313131313131315]
[1.0, 0.32323232323232326]
[1.0, 0.33333333333333337]
[1.0, 0.3434343434343435]
[1.0, 0.3535353535353536]
...

So I think you want the second element of the list, which is the x value taken from trX:

best_fit = []
for i in xArr:
    best_fit.append(slope*i[1]+y_intercept)
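For comparison, here is a NumPy-only sketch of the same computation, not a drop-in replacement for the TensorFlow script. It assumes the [1, x] column layout of xArr, which makes the first fitted coefficient the intercept and the second the slope. The helper lwlr_predict, the bandwidth name tau and the value tau=0.2 are hypothetical choices for illustration. The weights follow the Gaussian form from the CS229 notes and are centred on the query point:

```python
import numpy as np

def lwlr_predict(x_query, X, y, tau=1.0):
    """Locally weighted prediction at one query point (hypothetical helper)."""
    # Gaussian weights centred on the query point, one per training row.
    diff = X[:, 1] - x_query
    w = np.exp(-diff**2 / (2.0 * tau**2))
    # Weighted normal equations: (X^T W X) theta = X^T W y.
    XtW = X.T * w          # same as X.T @ np.diag(w), without the m x m matrix
    theta = np.linalg.solve(XtW @ X, XtW @ y)
    return theta[0] + theta[1] * x_query

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 100)
y = x + rng.normal(0, 1, 100)
X = np.column_stack([np.ones_like(x), x])   # columns: [1, x], as in xArr

# Ordinary least squares: theta[0] is the intercept, theta[1] the slope.
theta, *_ = np.linalg.lstsq(X, y, rcond=None)
y_intercept, slope = theta

# Index the x column instead of multiplying by a whole [1.0, x] row.
best_fit = slope * X[:, 1] + y_intercept

# One locally weighted prediction per training point.
local_fit = np.array([lwlr_predict(xq, X, y, tau=0.2) for xq in x])
```

Because the weights depend on the query point, locally weighted regression solves a separate small system for every prediction rather than producing a single slope and intercept. A smaller tau makes the fit follow the data more closely.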

