I'm trying to show a confusion matrix for predicted test data (binary text classification), but I can't get y_pred to match y_test after running model.predict().
First, let's look at the test/true data:
y_test = (y_test > 0.5)
print(y_test)
print(type(y_test))
Output:
2 False
17 True
18 True
...
4980 True
4986 False
4990 True
<class 'pandas.core.series.Series'>
The missing indexes are contained in the training set.
Here's what happens when we predict based on test data:
y_pred = model.predict(data_test)
y_pred = (y_pred > 0.5)
print(y_pred)
print(type(y_pred))
Output:
[[ True]
[ True]
[ True]
[False]
...
[ True]
[ True]
[ True]]
<class 'numpy.ndarray'>
Ultimately I'm looking to build a confusion matrix, but the two aren't in the same format: y_test is a pandas Series and y_pred is a 2-D NumPy array.
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_pred)
What do you recommend?
Attempts so far:
y_test_np = y_test.values
Output:
[False True True ... True False True]
Closer, but it looks like I need each item to also be an array (e.g. [[ True] [False] [ True]]). How can I align the arrays?
Just for illustration, let's create some sample data.
import numpy as np
import pandas as pd
y_test = pd.Series([True, False])
y_pred = np.array([[True], [False]])
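To see the mismatch concretely, it can help to print the shapes of the two objects. This is a minimal sketch using the same two-element sample data; the real y_test and y_pred would of course be longer.

```python
import numpy as np
import pandas as pd

# Same illustrative sample data as above
y_test = pd.Series([True, False])
y_pred = np.array([[True], [False]])

# The Series is one-dimensional; the predictions form a column vector
print(y_test.shape)  # (2,)
print(y_pred.shape)  # (2, 1)
```

The extra trailing dimension of size 1 in y_pred is what the two objects disagree on.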
You can convert the pandas Series y_test to a NumPy array with
y_test.values
and squeeze the NumPy array y_pred to obtain the same shape with
np.squeeze(y_pred)
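Putting the two conversions together, the whole path to a confusion matrix might look like the sketch below. The values and index labels are made up for illustration (they stand in for the real test labels and the thresholded output of model.predict()), and it assumes the rows of y_test are in the same order as the rows passed to the model, since confusion_matrix compares the two arrays position by position.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import confusion_matrix

# Hypothetical stand-ins: a Series with a non-contiguous index (as after a
# train/test split) and a column vector of thresholded predictions
y_test = pd.Series([True, False, True, True], index=[2, 17, 18, 4980])
y_pred = np.array([[True], [False], [False], [True]])

y_test_np = y_test.values        # drops the pandas index, shape (4,)
y_pred_1d = np.squeeze(y_pred)   # removes the size-1 axis, shape (4,)

# Rows are true classes, columns are predicted classes, ordered [False, True]
cm = confusion_matrix(y_test_np, y_pred_1d)
print(cm)
# [[1 0]
#  [1 2]]
```

Both arrays end up one-dimensional with the same length, which is the layout the answer above is aiming for.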