
Random Forest In Python [Error in r2_score]

I am new to Machine Learning and to Python. I am trying to build a Random Forest model to predict cement strength. There are two .csv files: train_data.csv and test_data.csv.

This is what I have done. I am trying to compute the r2_score here.

import pandas as pd
from sklearn import metrics
from sklearn.ensemble import RandomForestRegressor

df=pd.read_csv("train_data(1).csv")
X=df.drop('strength',axis=1)
y=df['strength']
model=RandomForestRegressor()
model.fit(X,y)
X_test=pd.read_csv("test_data.csv")
y_pred=model.predict(X_test)
acc_R=metrics.r2_score(y,y_pred)
acc_R

The problem here is that y and y_pred have different shapes, so I get this error:

ValueError: Found input variables with inconsistent numbers of samples: [721, 309]

How do I correct this? Can someone explain what I am doing wrong?

You need to compare y_pred with y_test, not with the y you used to train the model:

acc_R=metrics.r2_score(y_test,y_pred)

test_data.csv should contain a separate set of labels to use as y_test.

Try the following:
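The length check behind the error can be reproduced without the CSV files. The following is a minimal sketch using synthetic arrays (not the real cement data) sized to match the 721 and 309 in the error message; the variable names are illustrative assumptions.

```python
import numpy as np
from sklearn.metrics import r2_score

# Synthetic stand-ins (assumption: not the real data). 721 training labels
# and 309 test labels, matching the sizes in the error message.
rng = np.random.default_rng(0)
y_train = rng.normal(size=721)
y_test = rng.normal(size=309)
# Pretend predictions: the test labels plus a little noise.
y_pred = y_test + rng.normal(scale=0.1, size=309)

# Comparing training labels with test predictions fails the length check.
try:
    r2_score(y_train, y_pred)  # 721 vs 309 samples
    mismatch_raised = False
except ValueError as exc:
    mismatch_raised = True
    print(exc)

# Comparing test labels with test predictions works: both have 309 samples.
score = r2_score(y_test, y_pred)
print(score)
```

r2_score pairs each true value with the prediction at the same position, so both arguments must describe the same rows.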

df=pd.read_csv("train_data(1).csv")
X=df.drop('strength',axis=1)
y=df['strength']
model=RandomForestRegressor()
model.fit(X,y)
df1=pd.read_csv("test_data.csv") # we read the csv data from test
X_test=df1.drop('strength',axis=1) # get the fields that we will predict
y_test=df1['strength'] # get the correct labels for X_test
y_pred=model.predict(X_test) # get the predicted results
acc_R=metrics.r2_score(y_test,y_pred) # compare
acc_R

df_train = pd.read_csv("train_data(1).csv")
X_train = df_train.drop('strength',axis=1)
y_train = df_train['strength']
model=RandomForestRegressor()
model.fit(X_train,y_train)
df_test = pd.read_csv("test_data.csv")
X_test = df_test.drop('strength',axis=1) # only if your test data contains 'strength'
y_test = df_test['strength'] # only if your test data contains 'strength'
y_pred = model.predict(X_test)
acc_R = metrics.r2_score(y_test,y_pred)
acc_R
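The answers above assume test_data.csv has a 'strength' column. In the question, model.predict accepted the raw test file, which suggests (though this is only an inference) that the file may contain features only. In that case one option is to hold out part of the training data for scoring. The sketch below uses a synthetic stand-in for train_data.csv; the feature column names and the 30% split are assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for train_data.csv (assumption: the real columns differ).
rng = np.random.default_rng(42)
df = pd.DataFrame({
    "cement": rng.uniform(100, 500, size=721),
    "water": rng.uniform(120, 250, size=721),
    "age": rng.integers(1, 365, size=721),
})
df["strength"] = (
    0.1 * df["cement"] - 0.05 * df["water"] + rng.normal(scale=2.0, size=721)
)

X = df.drop("strength", axis=1)
y = df["strength"]

# Hold out 30% of the labelled rows so that true labels exist for scoring.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, random_state=0
)

model = RandomForestRegressor(random_state=0)
model.fit(X_train, y_train)

# Predictions and labels now refer to the same held-out rows.
y_val_pred = model.predict(X_val)
val_r2 = r2_score(y_val, y_val_pred)
print(val_r2)
```

With the real data, df would come from pd.read_csv("train_data(1).csv") instead, and the unlabeled test file would be used only for model.predict, not for r2_score.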

