How can I drop the index of a Pandas Series (pandas.core.series.Series) to return a numpy.ndarray?
I'm trying to show a confusion matrix for predicted test data (binary text classification), but I can't get y_pred to match y_test after running model.predict().
First, let's look at the test/true data:
y_test = (y_test > 0.5)
print(y_test)
print(type(y_test))
Output:
2 False
17 True
18 True
...
4980 True
4986 False
4990 True
pandas.core.series.Series
The missing indexes are contained in the training set.
Here's what happens when we predict based on test data:
y_pred = model.predict(data_test)
y_pred = (y_pred > 0.5)
print(y_pred)
print(type(y_pred))
Output:
[[ True]
[ True]
[ True]
[False]
...
[ True]
[ True]
[ True]]
numpy.ndarray
Test/True data:
y_test = (y_test > 0.5)
print(y_test)
Output:
2 False
17 True
18 True
...
4980 True
4986 False
4990 True
Ultimately I'm looking to build a confusion matrix, but the two arrays aren't in the same format.
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_pred)
What do you recommend?
Attempts so far:
y_test_np = y_test.values
Output:
[False True True ... True False True]
Closer, but it looks like I need each item to also be an array (e.g. [[ True] [False] [ True]]). How can I align the arrays?
Just for illustration, let's create some sample data.
import numpy as np
import pandas as pd
y_test = pd.Series([True, False])
y_pred = np.array([[True], [False]])
You can convert the pandas Series y_test to a numpy array
y_test.values
and squeeze the numpy array y_pred to obtain the same shape
np.squeeze(y_pred)
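Putting both steps together, here is a minimal sketch on made-up data that mimics the shapes in the question: a Series with a non-contiguous index and predictions of shape (n, 1). The sample values and the names y_true and y_hat are illustrative assumptions, not taken from the original data. It uses Series.to_numpy(), which returns the same array as .values here, and ravel(), which behaves like squeeze for an (n, 1) array but still returns a 1-D array when there is only a single sample.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: the index has gaps, as it would after a
# train/test split, and the predictions are shaped (n, 1).
y_test = pd.Series([False, True, True, False], index=[2, 17, 18, 4986])
y_pred = np.array([[True], [True], [False], [False]])

y_true = y_test.to_numpy()  # drops the index, shape (4,)
y_hat = y_pred.ravel()      # flattens (4, 1) to (4,)

print(type(y_true), y_true.shape, y_hat.shape)
print(y_true == y_hat)      # element-wise comparison now lines up by position
```

Once both arrays are 1-D and the same length, they can be passed positionally to confusion_matrix as in the question. Note that dropping the index means rows are matched purely by order, so this assumes y_pred was produced from the test rows in the same order as y_test.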