Accuracy and confusion matrix for a BERT model (based on a tutorial)

Question

I am following this tutorial, https://skimai.com/fine-tuning-bert-for-sentiment-analysis/, to fine-tune a pretrained BERT model. I adapted the article's code to my own problem, which has 8 classes.

I use the following code to get the predicted probabilities on the test set:

from torch.utils.data import TensorDataset, SequentialSampler, DataLoader

# Predictions on the testing set
# Run `preprocessing_for_bert` on the test set
print('Tokenizing data...')
test_inputs, test_masks = preprocessing_for_bert(texts_test)

# Create the DataLoader for our test set
test_dataset = TensorDataset(test_inputs, test_masks)
test_sampler = SequentialSampler(test_dataset)
test_dataloader = DataLoader(test_dataset, sampler=test_sampler, batch_size=32)

# Compute predicted probabilities on the test set
probs = bert_predict(bert_classifier, test_dataloader)

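I have two questions:

How do I get the accuracy on the test set?

I tried this, but got the following error (the evaluate function can be found in the article)

How do I get the confusion matrix for the test set?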

Answer 1

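The simplest approach is to take the most probable class as the predicted label for each sample; once you have predicted labels you can compute whatever metric you like. Suppose the variable probs has shape test_samples * class_number and the variable y_test (the actual labels) has shape test_samples * 1. The only thing you have to do is assign each sample its most probable class (I would use the numpy.argmax function). You then have two arrays of shape test_samples * 1 (say y_pred and y_true). There are many built-in functions for computing all kinds of scores from them.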
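The argmax step described above might look like the following minimal sketch. The probs and y_test arrays here are small made-up stand-ins (4 samples, 3 classes) for the output of bert_predict and the real test labels; with the 8-class model, probs would have 8 columns.

```python
import numpy as np

# Hypothetical stand-in for the output of bert_predict: 4 samples, 3 classes
probs = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
    [0.2, 0.5, 0.3],
    [0.6, 0.3, 0.1],
])
# Hypothetical true labels with shape (test_samples, 1)
y_test = np.array([[0], [2], [1], [1]])

# Index of the largest probability in each row = predicted class per sample
y_pred = np.argmax(probs, axis=1)

# Flatten the labels so both arrays hold one integer label per sample
y_true = y_test.ravel()
```

If probs is a torch tensor rather than a NumPy array, it would need to be converted first (for example with .cpu().numpy()) before being passed to NumPy functions.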

from sklearn.metrics import accuracy_score
y_pred = [0, 2, 1, 3]
y_true = [0, 1, 2, 3]
accuracy_score(y_true, y_pred)
# output 0.5

from sklearn.metrics import confusion_matrix
y_true = [2, 0, 2, 2, 0, 1]
y_pred = [0, 0, 2, 2, 0, 2]
confusion_matrix(y_true, y_pred)
# output array([[2, 0, 0],
#               [0, 0, 1],
#               [1, 0, 2]])
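To make it clearer what those two scores measure, here is a plain-Python sketch of the same computations. The helper names accuracy and confusion are made up for illustration and are not part of scikit-learn; rows of the matrix are true classes and columns are predicted classes, matching the output above.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def confusion(y_true, y_pred, num_classes):
    """matrix[i][j] counts samples of true class i predicted as class j."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


y_true = [2, 0, 2, 2, 0, 1]
y_pred = [0, 0, 2, 2, 0, 2]

acc = accuracy(y_true, y_pred)
cm = confusion(y_true, y_pred, num_classes=3)
```

For the 8-class problem, num_classes=8 keeps the matrix 8 x 8 even when some class never appears in the test set. scikit-learn's confusion_matrix accepts a labels argument that serves the same purpose.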

1 solution

Solution 1
0 2022-01-16 06:53:50
