Changing thresholds in the Sigmoid Activation in Neural Networks

Hi, I am new to machine learning and I have a question about changing the threshold for the sigmoid function.

I know the sigmoid function's value is in the range [0, 1] and 0.5 is taken as the threshold: if h(theta) < 0.5 we assume its value is 0, and if h(theta) >= 0.5 then it's 1.
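For a single binary output, that thresholding is a one-liner in PyTorch (a minimal sketch; the logits values here are made up for illustration):

import torch

logits = torch.tensor([-1.2, 0.3, 2.0])  # raw outputs of a linear layer
probs = torch.sigmoid(logits)            # values in (0, 1)
labels = (probs >= 0.5).long()           # tensor([0, 1, 1])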

Thresholds are used only on the output layer of the network, and only when classifying. So, if you're trying to classify between 3 classes, can you give a different threshold for each class (0.2, 0.4, 0.4 - one per class)? Or can you specify a different threshold overall, like 0.8? I am unsure how to define this in the code below. Any guidance is appreciated.

import torch
import torch.nn as nn

# Hyper Parameters
input_size = 14
hidden_size = 40
hidden_size2 = 30
num_classes = 3
num_epochs = 600
batch_size = 34
learning_rate = 0.01


class Net(torch.nn.Module):
    def __init__(self, n_input, n_hidden, n_hidden2, n_output):
        super(Net, self).__init__()
        # define linear hidden layer output
        self.hidden = torch.nn.Linear(n_input, n_hidden)
        self.hidden2 = torch.nn.Linear(n_hidden, n_hidden2)
        # define linear output layer output
        self.out = torch.nn.Linear(n_hidden2, n_output)

    def forward(self, x):
        """
            In the forward function we define the process of performing
            forward pass, that is to accept a Variable of input
            data, x, and return a Variable of output data, y_pred.
        """
        # get hidden layer input
        h_input1 = self.hidden(x)
        # define activation function for hidden layer
        h_output1 = torch.sigmoid(h_input1)

        # get hidden layer input
        h_input2 = self.hidden2(h_output1)
        # define activation function for hidden layer
        h_output2 = torch.sigmoid(h_input2)

        # get output layer output
        out = self.out(h_output2)

        return out


net = Net(input_size, hidden_size, hidden_size2, num_classes)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(net.parameters(), lr=learning_rate)

all_losses = []

for epoch in range(num_epochs):
    total = 0
    correct = 0
    total_loss = 0
    for step, (batch_x, batch_y) in enumerate(train_loader):
        X = batch_x
        Y = batch_y.long()

        # Forward + Backward + Optimize
        optimizer.zero_grad()  # zero the gradient buffer
        outputs = net(X)
        loss = criterion(outputs, Y)
        all_losses.append(loss.item())
        loss.backward()
        optimizer.step()

        # accumulate loss and accuracy statistics for this epoch
        _, predicted = torch.max(outputs, 1)
        total = total + predicted.size(0)
        correct = correct + (predicted == Y).sum().item()
        total_loss = total_loss + loss.item()

    if epoch % 50 == 0:
        print(
            "Epoch [%d/%d], Loss: %.4f, Accuracy: %.2f %%"
            % (epoch + 1, num_epochs, total_loss, 100 * correct / total)
        )

train_input = train_data.iloc[:, :input_size]
train_target = train_data.iloc[:, input_size]

inputs = torch.Tensor(train_input.values).float()
targets = torch.Tensor(train_target.values - 1).long()

outputs = net(inputs)
_, predicted = torch.max(outputs, 1)

You can use any threshold you find suitable.

Neural networks are known to be often over-confident (e.g. assigning 0.95 to one of 50 classes), so it may be beneficial to use a different threshold in your case.

Your training is fine, but you should change the predictions (the last two lines) and use torch.nn.functional.softmax like this:

outputs = net(inputs) 
probabilities = torch.nn.functional.softmax(outputs, 1)

As mentioned in the other answer, you will get each row with probabilities summing to 1 (previously you had unnormalized probabilities, aka logits).
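To make that concrete, here is a quick check with made-up logits; each row of the softmax output sums to 1:

import torch

logits = torch.tensor([[2.0, 1.0, 0.1],
                       [0.5, 0.5, 3.0]])
probabilities = torch.nn.functional.softmax(logits, dim=1)
print(probabilities.sum(dim=1))  # tensor([1., 1.])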

Now, just use your desired threshold on those probabilities:

predictions = probabilities > 0.8

Please notice you may get only zeros in some cases (e.g. [0.2, 0.3, 0.5]).

This would mean the neural network isn't confident enough according to your standards, and would probably reduce the number of incorrect positive predictions (abstract, but say you are predicting which of 3 mutually exclusive diseases a patient has; it's better to make that call only when you are really sure).
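If you want to detect those "no class passed the threshold" rows explicitly, here is a sketch (the probabilities tensor is made up):

import torch

probabilities = torch.tensor([[0.20, 0.30, 0.50],
                              [0.05, 0.05, 0.90]])
predictions = probabilities > 0.8
# rows where no class passed the threshold -> the network abstains
abstained = ~predictions.any(dim=1)  # tensor([ True, False])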

Different thresholds for each class

This can be done as well, like this:

thresholds = torch.tensor([0.1, 0.1, 0.8]).unsqueeze(0)
predictions = probabilities > thresholds
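For example, with made-up probabilities, broadcasting compares each column against its own threshold:

import torch

probabilities = torch.tensor([[0.15, 0.05, 0.80],
                              [0.05, 0.90, 0.05]])
thresholds = torch.tensor([0.1, 0.1, 0.8]).unsqueeze(0)  # shape (1, 3)
predictions = probabilities > thresholds
# tensor([[ True, False, False],
#         [False,  True, False]])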

Final comments

Please notice that in the case of softmax only one class should be the answer (as pointed out in another answer), and this approach (and the mention of sigmoid) may indicate you are after multilabel classification.

If you want to train your network so it can predict multiple classes simultaneously, you should use sigmoid and change your loss to torch.nn.BCEWithLogitsLoss.
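A minimal sketch of that multilabel setup, reusing the Net defined in the question (the target tensor is made up; note that BCEWithLogitsLoss applies the sigmoid internally, so the network should still return raw logits):

import torch

criterion = torch.nn.BCEWithLogitsLoss()

logits = net(torch.randn(4, input_size))  # raw outputs, shape (4, num_classes)
targets = torch.tensor([[1., 0., 1.],     # each sample may now have
                        [0., 1., 0.],     # several active classes
                        [1., 1., 0.],
                        [0., 0., 1.]])
loss = criterion(logits, targets)
loss.backward()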

In multi-class classification you should have an output for each class. Then you can use the softmax function to normalize the outputs, so they sum to 1. The output with the biggest value is the one chosen as the classification.
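That "biggest value wins" rule is exactly what torch.max(outputs, 1) in the question's code already implements; equivalently, with made-up logits:

import torch

outputs = torch.tensor([[1.2, 0.3, -0.5]])
probabilities = torch.nn.functional.softmax(outputs, dim=1)
predicted = torch.argmax(probabilities, dim=1)  # tensor([0])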
