I have created an SVR model using scikit-learn. I am trying to plot my data, but for some reason I am receiving the error:
ValueError: x and y must have same first dimension, but have shapes (4200,) and (16800, 1)
I have split my data into training and testing sets, trained the model and made a prediction. My code is:
import numpy
import matplotlib.pyplot as plt
from sklearn import svm
from sklearn import preprocessing as pre
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import TimeSeriesSplit
X_feature = wind_speed
X_feature = X_feature.reshape(-1, 1)  ## Reshaping array from 1D to 2D (a single column)
y_label = Power
y_label = y_label.reshape(-1, 1)
timeseries_split = TimeSeriesSplit(n_splits=3)  ## Splitting the data into 3 train/test splits
for train_index, test_index in timeseries_split.split(X_feature):  ## Print each split; the arrays from the last split are kept
    print("Training data:", train_index, "Testing data:", test_index)
    X_train, X_test = X_feature[train_index], X_feature[test_index]
    y_train, y_test = y_label[train_index], y_label[test_index]
scaler = pre.MinMaxScaler(feature_range=(0, 1)).fit(X_train)  ## Fit a min-max scaler (range 0 to 1) on the training data
scaled_wind_speed_train = scaler.transform(X_train)  ## Scale the wind speed training data
scaled_wind_speed_test = scaler.transform(X_test)  ## Scale the wind speed test data with the same scaler
SVR_model = svm.SVR(kernel='rbf',C=100,gamma=.001).fit(scaled_wind_speed_train,y_train)
y_prediction = SVR_model.predict(scaled_wind_speed_test)
SVR_model.score(scaled_wind_speed_test,y_test)
rmse=numpy.sqrt(mean_squared_error(y_label,y_prediction))
print("RMSE:",rmse)
fig, bx = plt.subplots(figsize=(19,8))
bx.plot(y_prediction, X_feature,'bs')
fig.suptitle('Wind Power Prediction v Wind Speed', fontsize=20)
plt.xlabel('Wind Power Data')
plt.ylabel('Predicted Power')
plt.xticks(rotation=30)
plt.show()
fig, bx = plt.subplots(figsize=(19,8))
bx.plot( y_prediction, y_label)
fig.suptitle('Wind Power Prediction v Measured Wind Power ', fontsize=20)
plt.xlabel('Wind Power Data')
plt.ylabel('Predicted Power')
fig, bx = plt.subplots(figsize=(19,8))
bx.plot(y_prediction)
fig.suptitle('Wind Power Prediction v Measured Wind Power ', fontsize=20)
plt.xlabel('Wind Power Data')
plt.ylabel('Predicted Power')
I believe this error is being generated when I try to compute the RMSE in the line:
rmse=numpy.sqrt(mean_squared_error(y_label,y_prediction))
The error also occurs when I comment this line out and try to plot my data.
My traceback error message is:
ValueError Traceback (most recent call last)
<ipython-input-57-ed11a9ca7fd8> in <module>()
79
80 fig, bx = plt.subplots(figsize=(19,8))
---> 81 bx.plot( y_prediction, y_label)
82 fig.suptitle('Wind Power Prediction v Measured Wind Power ', fontsize=20)
83 plt.xlabel('Wind Power Data')
~/anaconda3_501/lib/python3.6/site-packages/matplotlib/__init__.py in inner(ax, *args, **kwargs)
1715 warnings.warn(msg % (label_namer, func.__name__),
1716 RuntimeWarning, stacklevel=2)
-> 1717 return func(ax, *args, **kwargs)
1718 pre_doc = inner.__doc__
1719 if pre_doc is None:
~/anaconda3_501/lib/python3.6/site-packages/matplotlib/axes/_axes.py in plot(self, *args, **kwargs)
1370 kwargs = cbook.normalize_kwargs(kwargs, _alias_map)
1371
-> 1372 for line in self._get_lines(*args, **kwargs):
1373 self.add_line(line)
1374 lines.append(line)
~/anaconda3_501/lib/python3.6/site-packages/matplotlib/axes/_base.py in _grab_next_args(self, *args, **kwargs)
402 this += args[0],
403 args = args[1:]
--> 404 for seg in self._plot_args(this, kwargs):
405 yield seg
406
~/anaconda3_501/lib/python3.6/site-packages/matplotlib/axes/_base.py in _plot_args(self, tup, kwargs)
382 x, y = index_of(tup[-1])
383
--> 384 x, y = self._xy_from_xy(x, y)
385
386 if self.command == 'plot':
~/anaconda3_501/lib/python3.6/site-packages/matplotlib/axes/_base.py in _xy_from_xy(self, x, y)
241 if x.shape[0] != y.shape[0]:
242 raise ValueError("x and y must have same first dimension, but "
--> 243 "have shapes {} and {}".format(x.shape, y.shape))
244 if x.ndim > 2 or y.ndim > 2:
245 raise ValueError("x and y can be no greater than 2-D, but have "
ValueError: x and y must have same first dimension, but have shapes (4200,) and (16800, 1)
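I think you have mixed up the arguments to mean_squared_error. y_prediction only contains predictions for the test split, so it has to be compared with y_test rather than the full y_label. It should be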
rmse=numpy.sqrt(mean_squared_error(y_test,y_prediction))
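To make the size mismatch concrete, here is a minimal sketch of the same pipeline on synthetic data. The random wind speed and power values, the sample count of 400, and the name `model` are assumptions made for illustration; they are not the asker's data. It shows that the predictions cover only the test rows, which is why they pair with `y_test` and not with `y_label`.

```python
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import TimeSeriesSplit
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVR

# Synthetic stand-in data (an assumption, not the asker's measurements).
rng = np.random.default_rng(0)
wind_speed = rng.uniform(0.0, 25.0, size=400)
power = wind_speed ** 2 + rng.normal(0.0, 5.0, size=400)

X_feature = wind_speed.reshape(-1, 1)  # 2D, shape (n_samples, 1)
y_label = power                        # 1D, shape (n_samples,)

# The loop overwrites the arrays on each pass, so the last split is kept,
# as in the question's code.
for train_index, test_index in TimeSeriesSplit(n_splits=3).split(X_feature):
    X_train, X_test = X_feature[train_index], X_feature[test_index]
    y_train, y_test = y_label[train_index], y_label[test_index]

scaler = MinMaxScaler(feature_range=(0, 1)).fit(X_train)
model = SVR(kernel="rbf", C=100, gamma=0.001).fit(scaler.transform(X_train), y_train)
y_prediction = model.predict(scaler.transform(X_test))

# Predictions exist only for the test rows, so they match y_test, not y_label.
print(y_prediction.shape, y_test.shape, y_label.shape)
rmse = np.sqrt(mean_squared_error(y_test, y_prediction))
print("RMSE:", rmse)
```

With three splits, the test fold is a quarter of the data, which matches the 4200 versus 16800 in the error message.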
Update: as per the latest error, try this
fig, bx = plt.subplots(figsize=(19,8))
bx.plot(y_prediction, scaled_wind_speed_test,'bs')
fig.suptitle('Wind Power Prediction v Wind Speed', fontsize=20)
plt.xlabel('Wind Power Data')
plt.ylabel('Predicted Power')
plt.xticks(rotation=30)
plt.show()
Update 2: in case you get an error on the other plot, try this
fig, bx = plt.subplots(figsize=(19,8))
bx.plot( y_prediction, y_test)
fig.suptitle('Wind Power Prediction v Measured Wind Power ', fontsize=20)
plt.xlabel('Wind Power Data')
plt.ylabel('Predicted Power')
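One way to catch this kind of mismatch before it reaches matplotlib is to check the lengths yourself. The sketch below uses a hypothetical helper, `as_matching_1d`, which is not part of matplotlib or scikit-learn, and zero-filled arrays with the shapes from the question. It assumes each array holds a single column of values.

```python
import numpy as np

def as_matching_1d(x, y):
    """Hypothetical helper: flatten two single-column arrays and check they pair up."""
    x = np.asarray(x).ravel()
    y = np.asarray(y).ravel()
    if x.shape[0] != y.shape[0]:
        raise ValueError(f"length mismatch: {x.shape[0]} vs {y.shape[0]}")
    return x, y

# Placeholder arrays with the shapes reported in the question.
y_prediction = np.zeros(4200)
y_test = np.zeros((4200, 1))
y_label = np.zeros((16800, 1))

# Test-split predictions pair with test-split labels.
x, y = as_matching_1d(y_prediction, y_test)
print(x.shape, y.shape)

# Pairing them with the full label array fails, as bx.plot did.
try:
    as_matching_1d(y_prediction, y_label)
except ValueError as err:
    print(err)
```

The pair returned by the helper could then be passed to `bx.plot(x, y)`.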
scikit-learn's mean_squared_error function (from sklearn.metrics) expects two arrays with the same number of samples. The error you are getting implies that these two do not have the same size.
You can check your array sizes by
print(array_1.shape)
print(array_2.shape)
if the output you get is
output:
> (4200,)
> (4200, 1)
you can fix by doing
new_array_2 = array_2.transpose()[0]
and then
mean_squared_error(array_1, new_array_2)
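As a small illustration of the flattening step, the sketch below uses toy arrays of length 5 (an assumption, chosen only to keep the shapes readable). It shows three equivalent ways to drop the trailing length-1 axis, and why the extra axis matters in plain NumPy arithmetic.

```python
import numpy as np

array_1 = np.arange(5, dtype=float)                 # shape (5,)
array_2 = np.arange(5, dtype=float).reshape(-1, 1)  # shape (5, 1)

# Three equivalent ways to turn the (5, 1) column into a (5,) vector.
flat_a = array_2.transpose()[0]
flat_b = array_2.ravel()
flat_c = array_2[:, 0]
print(flat_a.shape, flat_b.shape, flat_c.shape)

# Without flattening, NumPy broadcasts (5,) against (5, 1) into a 5 x 5 grid.
print((array_1 - array_2).shape)

# After flattening, the subtraction is element by element.
print((array_1 - flat_b).shape)
```

`ravel()` and `[:, 0]` tend to read more clearly than `transpose()[0]`, but all three give the same values here.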
If instead the two input arguments, whatever they are, give you the following shapes
print(array_1.shape)
print(array_2.shape)
output:
> (4200,)
> (16800, 1)
then the arrays hold different numbers of samples, and no reshaping or scaling will make them match: scaler.transform changes the values, not the number of rows. You need to select the rows of the larger array that correspond to the smaller one. In the question, the 4200 predictions were made on the test split, so the matching labels are the test rows (y_test, that is, y_label[test_index]):
array_2_test = array_2[test_index]
Once you have two arrays of the same length, but one still comes with an extra dimension, flatten it as before
new_array_2 = array_2_test.transpose()[0]
and feed these to mean_squared_error, e.g.
mean_squared_error(array_1, new_array_2)