How do I remove the index of a Pandas Series (pandas.core.series.Series) to get a numpy.ndarray?

Question

I'm trying to show a confusion matrix for predicted test data (binary text classification), but I can't get y_pred to match y_test after running model.predict().

First, let's look at the test/true data:

y_test = (y_test > 0.5)
print(y_test)
print(type(y_test))

Output:

2       False
17       True
18       True
...
4980     True
4986    False
4990     True
pandas.core.series.Series

The index values missing from this output belong to rows in the training set.

Here's what happens when we predict based on the test data:

y_pred = model.predict(data_test)
y_pred = (y_pred > 0.5)
print(y_pred)
print(type(y_pred))

Output:

[[ True]
 [ True]
 [ True]
 [False]
 ...
 [ True]
 [ True]
 [ True]]
numpy.ndarray

Test/true data:

y_test = (y_test > 0.5)
print(y_test)

Output:

2       False
17       True
18       True
...
4980     True
4986    False
4990     True

Ultimately I'm looking to build a confusion matrix, but the two arrays aren't in the same format.

from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_pred)

What do you recommend?

Attempts so far:

y_test_np = y_test.values

Output:

[False  True  True ... True False  True]

Closer, but it looks like I need each item to also be an array (e.g. [[ True] [False] [ True]]). How can I align the arrays?

Answer 1

Just for illustration, let's create some sample data.

import numpy as np
import pandas as pd
y_test = pd.Series([True, False])
y_pred = np.array([[True], [False]])

You can convert the pandas Series y_test to a numpy array

y_test.values

and squeeze the numpy array y_pred to obtain the same shape

np.squeeze(y_pred)
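Putting the two steps together, here is a minimal sketch of the whole flow. The sample values and the index labels are made up for illustration, and the names y_true_np and y_pred_np are my own. It also assumes a pandas version that provides Series.to_numpy(), which returns the same kind of index-free array as .values for a boolean Series like this one.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import confusion_matrix

# Hypothetical sample data: a Series with a non-contiguous index (like the
# one left over after a train/test split) and column-shaped predictions.
y_test = pd.Series([False, True, True, False], index=[2, 17, 18, 4986])
y_pred = np.array([[True], [True], [True], [False]])

# Drop the index and keep only the values as a 1-D numpy.ndarray.
y_true_np = y_test.to_numpy()  # y_test.values gives the same array here

# Collapse the (n, 1) predictions to shape (n,). Passing axis=1 means only
# that axis is removed, so a single-row input would not become a scalar.
y_pred_np = np.squeeze(y_pred, axis=1)

print(y_true_np.shape, y_pred_np.shape)  # (4,) (4,)

# Rows are the true labels, columns are the predicted labels.
cm = confusion_matrix(y_true_np, y_pred_np)
print(cm)
```

Both arrays end up with shape (n,), so they line up element by element. y_pred.ravel() or y_pred.reshape(-1) would flatten the predictions just as well; squeeze with an explicit axis simply makes it obvious which dimension is being dropped.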

Answer 1 is dated 2018-10-06 19:49:10.
