Linear regression model shapes - ValueError: x and y must have same first dimension, but have shapes (5,) and (1, 5)

I'm following this example https://www.analyticsvidhya.com/blog/2020/03/polynomial-regression-python/

I am trying to fit a line of best fit to my matplotlib graph. I keep getting an error saying that x and y do not have the same first dimension, but they both have a length of 5. What am I doing wrong?

ValueError: x and y must have same first dimension, but have shapes (5,) and (1, 5)

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.metrics import mean_squared_error
from sklearn.linear_model import LinearRegression

df = pd.read_csv('head_london_pm25vspm10.csv').dropna()
x = df['pm25_ugm3'].values
y = df['pm10'].values

# Training Model
lm = LinearRegression().fit(x.reshape(1, -1), y.reshape(1, -1))
y_pred = lm.predict(x.reshape(1, -1))

# plotting dataset
plt.figure(figsize=(10, 5))
plt.scatter(x, y, s=15)
plt.plot(x, y_pred, color='r')
plt.xlabel('pm25', fontsize=16)
plt.ylabel('pm10', fontsize=16)
plt.show()

print('RMSE for Linear Regression=>', np.sqrt(mean_squared_error(y, y_pred)))

CSV file - 'head_london_pm25vspm10.csv'

pm25_ugm3,pm10
3.8,7.9
4.1,10.5
4.2,10.5
4.5,10.9
4.7,11.2

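LinearRegression expects its input X as a 2-D array of shape (n_samples, n_features). Your x and y are 1-D arrays of shape (5,), so they have to be reshaped before fitting, which is what your reshape(1, -1) calls do.

The output of predict is then a 2-D array as well, here with shape (1, 5). The x you pass to plt.plot is still 1-D with shape (5,), and plt.plot requires both inputs to have the same first dimension. That mismatch is what raises the ValueError.

You can flatten the prediction back to 1-D so that it matches the shape of x again:

y_pred = lm.predict(x.reshape(1, -1)).reshape(-1)

Note that reshape(1, -1) presents the data to the model as a single sample with five features, so the fit is not a regression of pm10 on pm25. To treat the data as five samples of one feature, use x.reshape(-1, 1) instead. In that case y can stay 1-D and predict returns an array of shape (5,), which plt.plot accepts directly.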
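As a minimal sketch of the column-vector approach, assuming scikit-learn's usual (n_samples, n_features) convention: the five rows from the CSV above are inlined so the snippet runs without the file, and the name X is introduced here only for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Values copied from head_london_pm25vspm10.csv so the sketch runs without the file.
x = np.array([3.8, 4.1, 4.2, 4.5, 4.7])      # pm25_ugm3
y = np.array([7.9, 10.5, 10.5, 10.9, 11.2])  # pm10

# reshape(-1, 1): five samples with one feature each, shape (5, 1).
X = x.reshape(-1, 1)

# y can stay 1-D; predict then returns a 1-D array with the same shape as x.
lm = LinearRegression().fit(X, y)
y_pred = lm.predict(X)

print(X.shape, y_pred.shape)  # (5, 1) (5,)

rmse = np.sqrt(mean_squared_error(y, y_pred))
print('RMSE for Linear Regression=>', rmse)
```

With y_pred in this shape, plt.plot(x, y_pred, color='r') should work unchanged, since both arguments have first dimension 5.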
