dtype='numeric' is not compatible with arrays of bytes/strings.Convert your data to numeric values explicitly instead
I have this data that I want to use for a logistic regression problem. Shape of the data:
((108, 2),##train input
(108,),##train output
(35, 2), ##val input
(35,),##val output
(28, 2),##test input
(28,),##test output
(171, 3), ## all data
I did this:
'''
X = X_train.reshape(-2,2)
y = y_train.reshape(-1,1)
model_lr = LogisticRegression()
res = model_lr.fit(X,y)
X_test = np.array(X_test,dtype = float)
test = X_test.reshape(-2,2)
test = np.array(test,dtype = float)
pred = model_lr.predict(test)
from sklearn.metrics import roc_auc_score
from sklearn.metrics import roc_curve
output_test = y_test.reshape(-1,1)
output_test = np.array(output_test,dtype = float)
logit_roc_auc = roc_auc_score(output_test, model_lr.predict(test))
'''
and I get this error message:
logit_roc_auc = roc_auc_score(output_test, model_lr.predict(test))
ValueError: dtype='numeric' is not compatible with arrays of bytes/strings.Convert your data to numeric values explicitly instead.
Can anybody help? Thanks.
I tried reshaping the output variable, but I didn't succeed.
roc_auc_score should be able to handle an array of strings as y_true. But computing an ROC curve generally requires y_pred to be an array of floats.
Print your output_test and model_lr.predict(test) and make sure they look like the following. You'll probably see that you need to switch to model_lr.predict_proba(test):
from sklearn.metrics import roc_auc_score
y_true = ["A", "A", "A", "B", "B", "B"]
y_pred = [0.2, 0.3, 0.6, 0.4, 0.7, 0.8]
print(roc_auc_score(y_true, y_pred))
# 0.8888
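Applied to a fitted model, the switch to predict_proba might look like the sketch below. The data here is synthetic and the labels "A" and "B" are assumptions made for illustration, since the original arrays are not shown; only the shapes and the name model_lr come from the question. If the training labels are strings, predict returns strings too, which is probably what triggers the error, whereas predict_proba always returns floats.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Hypothetical stand-in data: two numeric features and string class labels.
X_train = rng.normal(size=(108, 2))
y_train = np.where(X_train[:, 0] + X_train[:, 1] > 0, "B", "A")
X_test = rng.normal(size=(28, 2))
y_test = np.where(X_test[:, 0] + X_test[:, 1] > 0, "B", "A")

model_lr = LogisticRegression()
model_lr.fit(X_train, y_train)  # y is passed as a 1-D array

# predict returns class labels with the same dtype as the training labels,
# so here it is an array of strings, not numbers.
print(model_lr.predict(X_test).dtype)

# predict_proba returns one float column per class, ordered like classes_.
# Take the column of the class treated as positive ("B" in this sketch).
pos_index = list(model_lr.classes_).index("B")
scores = model_lr.predict_proba(X_test)[:, pos_index]

auc = roc_auc_score(y_test, scores)
print(auc)
```

If the labels are really numbers stored as strings (for example "0" and "1"), converting y_train to a numeric dtype before fitting is another option, but roc_auc_score is still more informative with probabilities than with hard class predictions.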