
How many neurons should be in the last layer of the neural network?

I use the following code to classify movie reviews into three classes (negative as -1, neutral as 0, and positive as 1). But is it correct to have only one output neuron in the last layer for a three-class classification problem?

import tensorflow as tf
import numpy as np
import pandas as pd

csvfilename_train = 'train(cleaned).csv'
csvfilename_test = 'test(cleaned).csv'

# Read .csv files as pandas dataframes
df_train = pd.read_csv(csvfilename_train)
df_test = pd.read_csv(csvfilename_test)

train_sentences  = df_train['Comment'].values
test_sentences  = df_test['Comment'].values

# Extract labels from dataframes
train_labels = df_train['Sentiment'].values
test_labels = df_test['Sentiment'].values

vocab_size = 10000
embedding_dim = 16
max_length = 30
trunc_type = 'post'
oov_tok = '<OOV>'

from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

tokenizer = Tokenizer(num_words = vocab_size, oov_token = oov_tok)
tokenizer.fit_on_texts(train_sentences)
word_index = tokenizer.word_index
sequences = tokenizer.texts_to_sequences(train_sentences)
padded = pad_sequences(sequences, maxlen = max_length, truncating = trunc_type)

test_sequences = tokenizer.texts_to_sequences(test_sentences)
test_padded = pad_sequences(test_sequences, maxlen = max_length, truncating = trunc_type)

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, embedding_dim, input_length = max_length),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(6, activation = 'relu'),
    tf.keras.layers.Dense(1, activation = 'sigmoid'),
])
model.compile(loss = 'binary_crossentropy', optimizer = 'adam', metrics = ['accuracy'])

num_epochs = 10
model.fit(padded, train_labels, epochs = num_epochs, validation_data = (test_padded, test_labels))

When I change tf.keras.layers.Dense(1, activation = 'sigmoid') to tf.keras.layers.Dense(2, activation = 'sigmoid'), it gives me the following error:

---> 10 model.fit(padded, train_labels, epochs = num_epochs, validation_data = (test_padded,test_labels))
     ValueError: logits and labels must have the same shape ((None, 2) vs (None, 1))

You should have 3 neurons if you are classifying between 3 categories.

Also, you should use the 'softmax' activation for your final layer, assuming that each observation belongs to exactly one class.

Next, you should use 'sparse_categorical_crossentropy' since your labels are not one-hot encoded. One-hot targets like [0,0,1], [0,1,0], [1,0,0] are optional; with the sparse loss you can pass integer class indices such as [1, 2, 0, 1, 2, 1, 0].

Finally, your targets should be [0, 1, 2] and not [-1, 0, 1], so I suggest you add 1 to your labels.

train_labels = df_train['Sentiment'].values + 1
test_labels = df_test['Sentiment'].values + 1
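
Putting these suggestions together, a minimal sketch of the corrected setup might look like the following. It keeps the hyperparameters from the question, but the data is an assumption: the random arrays `padded_demo` and `labels_demo` are hypothetical stand-ins for the padded sequences and the Sentiment column, so the snippet runs on its own.

```python
import numpy as np
import tensorflow as tf

vocab_size = 10000
embedding_dim = 16
max_length = 30
num_classes = 3

# Hypothetical stand-ins for the real data: random token ids and labels in {-1, 0, 1}.
rng = np.random.default_rng(0)
padded_demo = rng.integers(0, vocab_size, size=(64, max_length))
labels_demo = rng.integers(-1, 2, size=(64,))

# Shift the labels from {-1, 0, 1} to {0, 1, 2} so they are valid class indices.
shifted_labels = labels_demo + 1

model = tf.keras.Sequential([
    tf.keras.Input(shape=(max_length,)),
    tf.keras.layers.Embedding(vocab_size, embedding_dim),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(6, activation='relu'),
    # One output unit per class; softmax makes each row sum to 1.
    tf.keras.layers.Dense(num_classes, activation='softmax'),
])
# The sparse loss takes integer class indices, so no one-hot encoding is needed.
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

model.fit(padded_demo, shifted_labels, epochs=1, verbose=0)

# Map the predicted class index back to the original {-1, 0, 1} scale.
probs = model.predict(padded_demo, verbose=0)
predicted_sentiment = probs.argmax(axis=1) - 1
```

With the real data, `padded_demo` and `shifted_labels` would be replaced by `padded` and `train_labels` after the shift by 1.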

This is what happens if labels are [-1, 0, 1] instead of [0, 1, 2]:

import tensorflow as tf

sparse_entropy = tf.losses.SparseCategoricalCrossentropy()

a = tf.convert_to_tensor([[-1., 0., 1.]]) #+ 1
b = tf.convert_to_tensor([[.4, .2, .4], [.1, .7, .2], [.8, .1, .1]])

sparse_entropy(a, b)
nan

If you uncomment the + 1, which transforms the labels into [0, 1, 2], it works:

<tf.Tensor: shape=(), dtype=float32, numpy=1.1918503>
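
As a cross-check, the same value can be reproduced by hand. The following NumPy sketch (an illustration, not what TensorFlow runs internally) treats each shifted label as an index into the matching row of predicted probabilities and averages the negative logs:

```python
import numpy as np

probs = np.array([[.4, .2, .4], [.1, .7, .2], [.8, .1, .1]])
labels = np.array([-1, 0, 1]) + 1  # shifted to [0, 1, 2]

# Pick the predicted probability of the true class in each row: [0.4, 0.7, 0.1].
true_class_probs = probs[np.arange(len(labels)), labels]

# Sparse categorical cross-entropy is the mean negative log of those probabilities.
loss = -np.log(true_class_probs).mean()
print(loss)  # approximately 1.19185
```

This also shows why the labels must lie in [0, num_classes): each label is used as a class index. NumPy would silently wrap -1 around to the last column and give a wrong value, which is one more reason to shift the labels explicitly.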

Short answer:

One-hot encode your train labels and use categorical cross-entropy as the loss function.

Cause:

  1. Your logits have shape (n,2) but your labels have shape (n,1).
  2. Your logits and labels should both have shape (n,3) if you're using cross-entropy (unless it is the sparse variant).

Solution:

  1. One-hot encode the train labels, which gives them shape (n,3).
  2. Use categorical cross-entropy with a final Dense layer that has 3 units, which gives logits of shape (n,3).

Your model will start learning after this.
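
The one-hot step can be sketched with plain NumPy as below. The `raw_labels` array is a hypothetical sample standing in for the Sentiment column; `tf.keras.utils.to_categorical` is an alternative once the labels have been shifted to [0, 1, 2].

```python
import numpy as np

num_classes = 3
raw_labels = np.array([-1, 0, 1, 1, -1])  # hypothetical sample of Sentiment values

# Shift to class indices 0..2, then select the matching rows of an identity matrix.
class_ids = raw_labels + 1
one_hot = np.eye(num_classes, dtype=np.float32)[class_ids]

print(one_hot.shape)  # (5, 3)
print(one_hot[0])     # [1. 0. 0.], i.e. the "negative" class
```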

You have 3 classes, so num_classes = 3. Your last layer should look like this:

tf.keras.layers.Dense(num_classes, activation = 'softmax'),

The output will be an array with 3 probabilities per review. Moreover, you should change your loss to categorical_crossentropy (with one-hot encoded labels) because you are not solving a binary problem.
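
To see why the activation on the last layer matters for mutually exclusive classes, the following NumPy sketch compares an element-wise sigmoid with a softmax on one row of raw scores. The `scores` values are made up for illustration.

```python
import numpy as np

def sigmoid(z):
    # Squashes each score independently into (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    # Subtract the row maximum for numerical stability, then normalize.
    shifted = z - z.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

scores = np.array([[2.0, 0.5, -1.0]])  # hypothetical raw outputs for one review

sigmoid_out = sigmoid(scores)
softmax_out = softmax(scores)

print(sigmoid_out.sum(axis=1))  # greater than 1 here: not a distribution over classes
print(softmax_out.sum(axis=1))  # sums to 1 (up to floating-point rounding)
```

Only the softmax output can be read directly as a probability distribution over the three classes.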

