Why is my predictive model's accuracy too high?
This is my train_test_split code:
from sklearn.model_selection import train_test_split
X_train,X_test, y_train, y_test = train_test_split(X,y, test_size= 0.20, random_state = 40)
print("x_train ",X_train.shape)
print("x_test ",X_test.shape)
print("y_train ",y_train.shape)
print("y_test ",y_test.shape)
x_train (32408, 29)
x_test (8103, 29)
y_train (32408,)
y_test (8103,)
from sklearn.ensemble import RandomForestClassifier
classifier = RandomForestClassifier(random_state=0, n_estimators=100,
                                    criterion='entropy', max_leaf_nodes=30, n_jobs=-1)
model_RF = classifier.fit(X_train, y_train)
acc_train_rf = round(classifier.score(X_train, y_train),2)*100
print(" Model accuracy within training data is : " + str(acc_train_rf) +"%")
Model accuracy within training data is: 100.0%
You are computing the score on your training data, which the model has already seen. Use your test data instead.
Change
acc_train_rf = round(classifier.score(X_train, y_train),2)*100
to
acc_train_rf = round(classifier.score(X_test, y_test),2)*100
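To see the difference between the two scores side by side, here is a minimal sketch. It assumes a synthetic dataset from make_classification as a stand-in for the asker's X and y (which are not shown in the question), and the helper name train_and_test_accuracy is hypothetical. The classifier settings are copied from the question.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split


def train_and_test_accuracy(X, y, test_size=0.20, random_state=40):
    """Fit a random forest and return (train_accuracy, test_accuracy)."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=random_state
    )
    classifier = RandomForestClassifier(
        random_state=0, n_estimators=100, criterion="entropy",
        max_leaf_nodes=30, n_jobs=-1,
    )
    classifier.fit(X_train, y_train)
    # score() returns the mean accuracy on whatever data it is given.
    train_acc = classifier.score(X_train, y_train)  # data the model has seen
    test_acc = classifier.score(X_test, y_test)     # held-out data
    return train_acc, test_acc


# Assumption: synthetic data standing in for the real 29-feature dataset.
X, y = make_classification(
    n_samples=2000, n_features=29, n_informative=10, random_state=0
)
train_acc, test_acc = train_and_test_accuracy(X, y)
print(f"Training accuracy: {train_acc:.2%}")
print(f"Test accuracy:     {test_acc:.2%}")
```

The test accuracy is the number to report as an estimate of how the model performs on unseen data. A large gap between the two values usually points to overfitting.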
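I ran into the same problem before. My model finished training with the default parameters extremely quickly, and every metric (accuracy/recall/precision/F1 score) was always 100%, whether on the training set, the validation set, or the test set.
My suggestion: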