Random Forest in Python [Error in r2_score]
I am new to machine learning and to Python. I am trying to build a Random Forest model to predict cement strength. There are two .csv files: train_data.csv and test_data.csv.
This is what I have done. I am trying to compute the r2_score here.
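import pandas as pd
from sklearn import metrics
from sklearn.ensemble import RandomForestRegressor

df=pd.read_csv("train_data(1).csv")
X=df.drop('strength',axis=1)
y=df['strength']
model=RandomForestRegressor()
model.fit(X,y)
X_test=pd.read_csv("test_data.csv")
y_pred=model.predict(X_test)
acc_R=metrics.r2_score(y,y_pred) # this line raises the ValueError shown below
acc_R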
The problem is that y and y_pred have different shapes, so I get this error:
ValueError: Found input variables with inconsistent numbers of samples: [721, 309]
How do I correct this? Can someone explain what I am doing wrong?
You need to compare y_pred with y_test, not with the y you used to train the model:
acc_R=metrics.r2_score(y_test,y_pred)
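To see why the original call fails, here is a minimal sketch that reproduces the mismatch with made-up arrays. The sizes 721 and 309 come from the error message; the values themselves are random stand-ins, not the real cement data, and the names y_train, y_test and y_pred are only illustrative.

```python
import numpy as np
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
y_train = rng.normal(size=721)  # stand-in for the 721 training labels
y_test = rng.normal(size=309)   # stand-in for the 309 test labels
# Hypothetical predictions: the test labels plus a little noise
y_pred = y_test + rng.normal(scale=0.1, size=309)

# Comparing training labels with test predictions: 721 vs 309 samples
try:
    r2_score(y_train, y_pred)
    mismatch_error = None
except ValueError as exc:
    mismatch_error = str(exc)
print(mismatch_error)

# Comparing test labels with test predictions: both have 309 samples
score = r2_score(y_test, y_pred)
print(score)
```

r2_score compares the two arrays element by element, so both must describe the same rows.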
test_data.csv should contain its own labels, which serve as y_test.
Try the following:
df=pd.read_csv("train_data(1).csv")
X=df.drop('strength',axis=1)
y=df['strength']
model=RandomForestRegressor()
model.fit(X,y)
df1=pd.read_csv("test_data.csv") # we read the csv data from test
X_test=df1.drop('strength',axis=1) # get the fields that we will predict
y_test=df1['strength'] # get the correct labels for X_test
y_pred=model.predict(X_test) # get the predicted results
acc_R=metrics.r2_score(y_test,y_pred) # compare
acc_R
df_train = pd.read_csv("train_data(1).csv")
X_train = df_train.drop('strength',axis=1)
y_train = df_train['strength']
model=RandomForestRegressor()
model.fit(X_train,y_train)
df_test = pd.read_csv("test_data.csv")
X_test = df_test.drop('strength',axis=1) # if your test data contains a 'strength' column
y_test = df_test['strength'] # if your test data contains a 'strength' column
y_pred = model.predict(X_test)
acc_R = metrics.r2_score(y_test,y_pred)
acc_R
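If test_data.csv turns out not to contain a 'strength' column (the question does not say either way), one possible workaround is to hold out part of the training data and score on that instead. The sketch below assumes this situation and uses a synthetic DataFrame in place of train_data.csv. The feature columns cement, water and age, and the formula generating strength, are invented for illustration only.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for train_data.csv; only 'strength' is a name from the question
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "cement": rng.uniform(100, 500, size=721),
    "water": rng.uniform(120, 250, size=721),
    "age": rng.integers(1, 365, size=721),
})
df["strength"] = 0.1 * df["cement"] - 0.05 * df["water"] + rng.normal(scale=2.0, size=721)

X = df.drop("strength", axis=1)
y = df["strength"]

# Keep 30% of the labelled rows aside for validation
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

model = RandomForestRegressor(random_state=0)
model.fit(X_train, y_train)

# Predictions and labels now come from the same held-out rows
y_val_pred = model.predict(X_val)
val_r2 = r2_score(y_val, y_val_pred)
print(len(y_val), len(y_val_pred), val_r2)
```

The score here measures performance on the held-out rows only. The unlabelled test file can still be passed to model.predict, but its predictions cannot be scored without the true values.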