The operations below concern logistic regression in Python with scikit-learn.
Here is the most important part of the code:
predictions = logistic_regression.predict(X_test)
prediction=logistic_regression.predict_proba(X_test)[:,:]
prediction=pd.DataFrame(data=predictions,
columns=['Prob of Bad credit (0)','Prob of Good credit (1)'])
prediction.head(10)
Yesterday this code gave the result I expected (the table title was different, but the result was the same).
But today, when I ran the same code again, I got an error, and I have absolutely no idea why:
ValueError: Shape of passed values is (300, 1), indices imply (300, 2)
How is it possible that it worked yesterday but not today? What can I do?
A sample of predictions looks like this:
print(predictions)
[1 1 1 1 1 1 1 1 0 0 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 0 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 0 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 0 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1]
I do not want 1 or 0 in the table; I would like the probability of 1 or 0 as a percentage, as in the example in the screenshot.
See the same table at the end of the prediction section in the source below; it uses the same code and it works: https://www.kaggle.com/neisha/heart-disease-prediction-using-logistic-regression
I think the error occurs because predictions (the output of predict) holds a single column of class labels, while you pass two column names:
prediction=pd.DataFrame(data=predictions,
columns=['Prob of Bad credit (0)','Prob of Good credit (1)'])
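To see where the (300, 1) versus (300, 2) mismatch comes from, here is a minimal sketch that compares the shapes returned by predict and predict_proba. The synthetic data from make_classification and the names model, labels, probas and cols are assumptions for illustration only, not your credit dataset or variables:

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for the credit data (an assumption, not the real dataset)
X, y = make_classification(n_samples=300, n_features=5, random_state=0)
model = LogisticRegression().fit(X, y)

labels = model.predict(X)        # hard class labels (0 or 1), shape (300,)
probas = model.predict_proba(X)  # one probability column per class, shape (300, 2)
print(labels.shape, probas.shape)

cols = ['Prob of Bad credit (0)', 'Prob of Good credit (1)']
try:
    # One column of data but two column names: pandas rejects this
    pd.DataFrame(data=labels, columns=cols)
    mismatch_raised = False
except ValueError as err:
    mismatch_raised = True
    print(err)
```

The 1-D label array is treated as a single column, so two column names cannot be matched to it.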
Based on the Kaggle code you linked:
y_pred_prob=logreg.predict_proba(x_test)[:,:]
y_pred_prob_df=pd.DataFrame(data=y_pred_prob, columns=['Prob of no heart disease (0)','Prob of Heart Disease (1)'])
y_pred_prob_df.head()
I think you should change your code to:
prediction_df = pd.DataFrame(data=prediction,
columns=['Prob of Bad credit (0)','Prob of Good credit (1)'])
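Be careful: the data argument should be prediction (the output of predict_proba, with one probability column per class), not predictions (the output of predict, which contains only the 0/1 class labels).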
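Since you asked for percentages, here is a hedged end-to-end sketch of the corrected version. The synthetic data and the train/test split are assumptions standing in for your credit dataset, and prediction_pct is a hypothetical name introduced only for this example:

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the credit data (an assumption); 30% of 1000 rows gives 300 test rows
X, y = make_classification(n_samples=1000, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

logistic_regression = LogisticRegression().fit(X_train, y_train)

# predict_proba returns shape (n_samples, n_classes);
# the columns follow the order of logistic_regression.classes_
prediction = logistic_regression.predict_proba(X_test)

prediction_df = pd.DataFrame(
    data=prediction,
    columns=['Prob of Bad credit (0)', 'Prob of Good credit (1)'],
)

# Convert probabilities (0 to 1) to percentages with one decimal place
prediction_pct = (prediction_df * 100).round(1)
print(prediction_pct.head(10))
```

Checking logistic_regression.classes_ before naming the columns is a cheap way to confirm that the first column really corresponds to class 0 and the second to class 1.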