ValueError: x and y must have same first dimension, but have shapes (4200,) and (16800, 1)

Question

I have created an SVR model using scikit-learn and am trying to plot my data, but I am receiving the error:

ValueError: x and y must have same first dimension, but have shapes (4200,) and (16800, 1)

I have split my data into training and testing sets, trained the model and made a prediction. My code is:

## Imports assumed from the names used below
import numpy
import matplotlib.pyplot as plt
from sklearn import svm
from sklearn import preprocessing as pre
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import TimeSeriesSplit

X_feature = wind_speed
X_feature = X_feature.reshape(-1, 1)  ## Reshaping array from 1D to 2D (a single column)

y_label = Power
y_label = y_label.reshape(-1, 1)

timeseries_split = TimeSeriesSplit(n_splits=3)  ## Splitting training/testing data into 3 splits
for train_index, test_index in timeseries_split.split(X_feature):  ## Loop to print and keep each train/test split
    print("Training data:", train_index, "Testing data test:", test_index)
    X_train, X_test = X_feature[train_index], X_feature[test_index]
    y_train, y_test = y_label[train_index], y_label[test_index]

scaler = pre.MinMaxScaler(feature_range=(0, 1)).fit(X_train)  ## Min-max scaler is fitted on the training data

scaled_wind_speed_train = scaler.transform(X_train)  ## Wind speed training data is scaled to [0, 1]
scaled_wind_speed_test = scaler.transform(X_test)  ## Wind speed test data is scaled with the same scaler

SVR_model = svm.SVR(kernel='rbf', C=100, gamma=.001).fit(scaled_wind_speed_train, y_train)

y_prediction = SVR_model.predict(scaled_wind_speed_test)

SVR_model.score(scaled_wind_speed_test, y_test)

rmse = numpy.sqrt(mean_squared_error(y_label, y_prediction))
print("RMSE:", rmse)

fig, bx = plt.subplots(figsize=(19, 8))
bx.plot(y_prediction, X_feature, 'bs')
fig.suptitle('Wind Power Prediction v Wind Speed', fontsize=20)
plt.xlabel('Wind Power Data')
plt.ylabel('Predicted Power')
plt.xticks(rotation=30)
plt.show()

fig, bx = plt.subplots(figsize=(19, 8))
bx.plot(y_prediction, y_label)
fig.suptitle('Wind Power Prediction v Measured Wind Power ', fontsize=20)
plt.xlabel('Wind Power Data')
plt.ylabel('Predicted Power')

fig, bx = plt.subplots(figsize=(19, 8))
bx.plot(y_prediction)
fig.suptitle('Wind Power Prediction v Measured Wind Power ', fontsize=20)
plt.xlabel('Wind Power Data')
plt.ylabel('Predicted Power')

I believe this error is being generated when I try to compute the RMSE in the line:

rmse=numpy.sqrt(mean_squared_error(y_label,y_prediction))

This error also occurs when I comment this line out and try to plot my data.

My traceback is:

ValueError                                Traceback (most recent call last)
<ipython-input-57-ed11a9ca7fd8> in <module>()
     79 
     80     fig, bx = plt.subplots(figsize=(19,8))
---> 81     bx.plot( y_prediction, y_label)
     82     fig.suptitle('Wind Power Prediction v Measured Wind Power ', fontsize=20)
     83     plt.xlabel('Wind Power Data')

~/anaconda3_501/lib/python3.6/site-packages/matplotlib/__init__.py in inner(ax, *args, **kwargs)
   1715                     warnings.warn(msg % (label_namer, func.__name__),
   1716                                   RuntimeWarning, stacklevel=2)
-> 1717             return func(ax, *args, **kwargs)
   1718         pre_doc = inner.__doc__
   1719         if pre_doc is None:

~/anaconda3_501/lib/python3.6/site-packages/matplotlib/axes/_axes.py in plot(self, *args, **kwargs)
   1370         kwargs = cbook.normalize_kwargs(kwargs, _alias_map)
   1371 
-> 1372         for line in self._get_lines(*args, **kwargs):
   1373             self.add_line(line)
   1374             lines.append(line)

~/anaconda3_501/lib/python3.6/site-packages/matplotlib/axes/_base.py in _grab_next_args(self, *args, **kwargs)
    402                 this += args[0],
    403                 args = args[1:]
--> 404             for seg in self._plot_args(this, kwargs):
    405                 yield seg
    406 

~/anaconda3_501/lib/python3.6/site-packages/matplotlib/axes/_base.py in _plot_args(self, tup, kwargs)
    382             x, y = index_of(tup[-1])
    383 
--> 384         x, y = self._xy_from_xy(x, y)
    385 
    386         if self.command == 'plot':

~/anaconda3_501/lib/python3.6/site-packages/matplotlib/axes/_base.py in _xy_from_xy(self, x, y)
    241         if x.shape[0] != y.shape[0]:
    242             raise ValueError("x and y must have same first dimension, but "
--> 243                              "have shapes {} and {}".format(x.shape, y.shape))
    244         if x.ndim > 2 or y.ndim > 2:
    245             raise ValueError("x and y can be no greater than 2-D, but have "

ValueError: x and y must have same first dimension, but have shapes (4200,) and (16800, 1)

Answer 1

I think you have mixed up the arguments for mean_squared_error; it should be

rmse=numpy.sqrt(mean_squared_error(y_test,y_prediction))
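The shapes in the error message follow from the split itself: with 16800 rows and n_splits=3, each test fold of TimeSeriesSplit holds 4200 rows, so the prediction has 4200 entries while y_label still has 16800. A minimal sketch of that, assuming synthetic data in place of the real wind measurements and a noisy copy of y_test as a stand-in for the SVR prediction:

```python
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import TimeSeriesSplit

# Synthetic stand-ins for the question's data (assumed, not the real wind data).
rng = np.random.default_rng(0)
X_feature = rng.uniform(0, 25, size=16800).reshape(-1, 1)
y_label = (X_feature ** 3).reshape(-1, 1)

# Keep the indices of the last split, as the question's loop does.
for train_index, test_index in TimeSeriesSplit(n_splits=3).split(X_feature):
    pass
y_test = y_label[test_index]

# Stand-in for SVR_model.predict(...): one value per test row, shape (4200,).
y_prediction = y_test.ravel() + rng.normal(0, 1, size=len(test_index))

print(y_label.shape, y_test.shape, y_prediction.shape)

# Compare the prediction with the test labels only, both flattened to 1-D.
rmse = np.sqrt(mean_squared_error(y_test.ravel(), y_prediction))
print("RMSE:", rmse)
```

Comparing against y_test rather than y_label keeps both arguments at the same number of samples.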

Update: as per the latest error, try this

fig, bx = plt.subplots(figsize=(19,8))
bx.plot(y_prediction, scaled_wind_speed_test,'bs')
fig.suptitle('Wind Power Prediction v Wind Speed', fontsize=20)
plt.xlabel('Wind Power Data')
plt.ylabel('Predicted Power')
plt.xticks(rotation=30)
plt.show()

Update 2: in case you get an error on the other plot, try this

fig, bx = plt.subplots(figsize=(19,8))
bx.plot( y_prediction, y_test)
fig.suptitle('Wind Power Prediction v Measured Wind Power ', fontsize=20)
plt.xlabel('Wind Power Data')
plt.ylabel('Predicted Power')
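The traceback shows matplotlib comparing x.shape[0] with y.shape[0], so the same comparison can be run before plotting. A small sketch, assuming zero-filled arrays as stand-ins for the real prediction and labels; same_first_dimension is a hypothetical helper, not part of matplotlib:

```python
import numpy as np

def same_first_dimension(x, y):
    """Return True when x and y have the same length along axis 0."""
    x = np.asarray(x)
    y = np.asarray(y)
    return x.shape[0] == y.shape[0]

y_prediction = np.zeros(4200)    # stand-in for SVR_model.predict(...)
y_label = np.zeros((16800, 1))   # stand-in for the full label array
y_test = y_label[-4200:]         # stand-in for y_label[test_index]

print(same_first_dimension(y_prediction, y_label))  # the failing pair
print(same_first_dimension(y_prediction, y_test))   # the pair used in the fix
```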

Answer 2

scikit-learn's function mean_squared_error expects two arrays with the same number of samples. The error you are getting implies that these two do not have the same size.

You can check your array sizes with

print(array_1.shape)
print(array_2.shape)

If the output you get is

output:
> (4200,)
> (4200, 1)

you can fix it by doing

new_array_2 = array_2.transpose()[0]

and then

mean_squared_error(array_1, new_array_2)
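To see what the flattening step does, here is a small NumPy-only sketch on toy arrays (the five-element arrays are illustrative, not the question's data). It also shows why the extra axis matters if the error is computed by hand: subtracting a column vector from a 1-D array broadcasts to a 2-D grid instead of an elementwise difference.

```python
import numpy as np

array_1 = np.arange(5, dtype=float)                 # shape (5,)
array_2 = np.arange(5, dtype=float).reshape(-1, 1)  # shape (5, 1)

# Three equivalent ways to drop the trailing axis of a column vector.
flat_a = array_2.transpose()[0]
flat_b = array_2.ravel()
flat_c = array_2[:, 0]
print(flat_a.shape, flat_b.shape, flat_c.shape)

# Without flattening, elementwise arithmetic broadcasts to a 2-D grid.
print((array_1 - array_2).shape)

# With matching shapes, a hand-written mean squared error is a scalar.
mse = np.mean((array_1 - flat_b) ** 2)
print(mse)
```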

If the two input arguments, whatever they are, give you the following shapes

print(array_1.shape)
print(array_2.shape)

output:
> (4200,)
> (16800, 1)

then the arrays hold different numbers of samples, and no reshaping or scaling will reconcile them (scaler.transform keeps the number of rows unchanged). Select the rows of the larger array that correspond to the smaller one, for example the test rows:

new_array_2 = array_2[test_index]

Once you have two arrays of the same length, whether it's 16800 or 4200, but one or both still comes with an extra dimension, flatten it:

new_new_array_2 = new_array_2.ravel()

and feed these to mean_squared_error, e.g.

mean_squared_error(array_1, new_new_array_2)
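One caveat worth checking here: a fitted scaler's transform rescales values but keeps the row count, so by itself it cannot turn 16800 rows into 4200. Row selection by index is what changes the length. A minimal sketch, assuming synthetic data and hypothetical test indices covering the last 4200 rows:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Synthetic column vector standing in for the full 16800-row array.
array_2 = np.linspace(0.0, 10.0, 16800).reshape(-1, 1)

# Scaling changes the values, not the number of rows.
scaler = MinMaxScaler(feature_range=(0, 1)).fit(array_2)
scaled = scaler.transform(array_2)
print(scaled.shape)

# Hypothetical test indices: the last 4200 rows.
test_index = np.arange(12600, 16800)

# Indexing selects the matching rows; ravel() drops the extra axis.
subset = array_2[test_index].ravel()
print(subset.shape)
```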


Answer 1: 2, accepted, 2018-06-26 08:09:42

Answer 2: 0, 2018-06-26 08:13:04
