TensorFlow - Text classification - "Shapes (None,) and (None, 250, 100) are incompatible" error

Question

I want to classify text with multiple labels. I use a TextVectorization layer and the CategoricalCrossentropy loss. Here is my model code:

Text Vectorizer:

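import re
import string
import tensorflow as tf
from tensorflow.keras import layers, losses

def custom_standardization(input_data):
  # Lowercase, replace HTML line breaks with spaces, then strip punctuation.
  print(input_data[:5])
  lowercase = tf.strings.lower(input_data)
  stripped_html = tf.strings.regex_replace(lowercase, '<br />', ' ')
  return tf.strings.regex_replace(stripped_html,
                                  '[%s]' % re.escape(string.punctuation),
                                  '')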

max_features = 10000
sequence_length = 250

vectorize_layer = layers.TextVectorization(
    standardize=custom_standardization,
    max_tokens=max_features,
    output_mode='int',
    output_sequence_length=sequence_length)

Model generation:

MAX_TOKENS_NUM = 5000  # Maximum vocab size.
MAX_SEQUENCE_LEN = 40  # Sequence length to pad the outputs to.
EMBEDDING_DIMS = 100

model = tf.keras.models.Sequential()
model.add(tf.keras.Input(shape=(1,), dtype=tf.string))
model.add(vectorize_layer)
model.add(tf.keras.layers.Embedding(MAX_TOKENS_NUM + 1, EMBEDDING_DIMS))
model.summary()
model.compile(loss=losses.CategoricalCrossentropy(from_logits=True),
              optimizer='adam',
              metrics=tf.metrics.CategoricalAccuracy())

Fit:

epochs = 10
history = model.fit(
    x_train,
    y=y_train,
    epochs=epochs)

x_train is a list of texts like ['This is a text about science.', 'This is a text about art',...]

y_train is also a list of texts like ['Science','Art',...]

When I try to run the fitting code, it gives the following error:

ValueError: Shapes (None,) and (None, 250, 100) are incompatible

What am I doing wrong? I would also like to know whether this is a good approach/model for classifying text with multiple labels.
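The two shapes in the error message can be traced without TensorFlow. The sketch below is a NumPy stand-in, not the Keras implementation: random arrays play the role of the Embedding output, and `batch_size`, `num_classes` and the weight matrix are made-up values for illustration. It shows that a model ending at the Embedding layer emits one 100-dimensional vector per token, shape (batch, 250, 100), while the labels have shape (batch,). Averaging over the token axis (which is what GlobalAveragePooling1D does, ignoring masking) and applying a dense projection yields one score per class.

```python
import numpy as np

rng = np.random.default_rng(0)

batch_size = 4         # hypothetical batch size
sequence_length = 250  # output_sequence_length of the TextVectorization layer
embedding_dims = 100   # EMBEDDING_DIMS in the question
num_classes = 3        # hypothetical number of labels

# Stand-in for the Embedding output: one vector per token.
embedded = rng.normal(size=(batch_size, sequence_length, embedding_dims))
# Integer labels, one per example.
labels = np.array([0, 2, 1, 0])

# Average over the token axis, as GlobalAveragePooling1D does (ignoring masking).
pooled = embedded.mean(axis=1)

# Stand-in for Dense(num_classes): a linear projection to one logit per class.
weights = rng.normal(size=(embedding_dims, num_classes))
bias = np.zeros(num_classes)
logits = pooled @ weights + bias

print(embedded.shape)  # (4, 250, 100)
print(labels.shape)    # (4,)
print(logits.shape)    # (4, 3)
```

Only the last shape, one row of class scores per example, can be compared against one label per example by a categorical loss.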

EDIT:

I edited my code according to Frightera's answer. Here is my model:

MAX_TOKENS_NUM = 5000  # Maximum vocab size.
MAX_SEQUENCE_LEN = 40  # Sequence length to pad the outputs to.
EMBEDDING_DIMS = 100

model = tf.keras.models.Sequential()
model.add(tf.keras.Input(shape=(1,), dtype=tf.string))
model.add(vectorize_layer)
model.add(tf.keras.layers.Embedding(MAX_TOKENS_NUM + 1, EMBEDDING_DIMS))
model.add(layers.Dropout(0.2))
model.add(layers.GlobalAveragePooling1D())
model.add(layers.Dropout(0.2))
model.add(layers.Dense(len(labels)))
model.summary()
model.compile(loss=losses.SparseCategoricalCrossentropy(from_logits=True),
              optimizer='adam',
              metrics=tf.metrics.SparseCategoricalAccuracy())

I now pass y_train_int instead of y_train, converting the categories to indexes with y_train_int = [get_label_index(label) for label in y_train]

epochs = 10
history = model.fit(
    x_train,
    y=y_train_int,
    epochs=epochs)
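The question does not show `get_label_index`, so the following is only one plausible way to write it, with made-up sample labels. It builds a sorted list of unique labels and maps each string to its position in that list, which is the integer class index that SparseCategoricalCrossentropy expects.

```python
y_train = ['Science', 'Art', 'Science', 'History']  # made-up sample labels

# Sorted list of unique labels; the position in this list is the class index.
labels = sorted(set(y_train))
label_to_index = {label: index for index, label in enumerate(labels)}

def get_label_index(label):
    """Return the integer class index for a string label (hypothetical helper)."""
    return label_to_index[label]

y_train_int = [get_label_index(label) for label in y_train]

print(labels)       # ['Art', 'History', 'Science']
print(y_train_int)  # [2, 0, 2, 1]
```

With this layout, `len(labels)` is the number of classes, and the same `labels` list can later be used to turn predicted indexes back into strings.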

Now the model fits, but when I plot the loss with plt.plot(history.history['loss']), it is a flat line at zero.

Is this model good for classification? Do I need those layers between the input layer and the final Dense layer (Embedding etc.)? What am I doing wrong?
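One situation that produces an exactly zero loss, offered as a possibility rather than a diagnosis of this code, is a final Dense layer with a single unit (for example, if `len(labels)` evaluates to 1). Softmax over a single logit is always 1, so the cross-entropy is -log(1) = 0 whatever the weights are. The NumPy sketch below reimplements sparse categorical cross-entropy from logits by hand; the function name and the sample values are made up for illustration.

```python
import numpy as np

def sparse_categorical_crossentropy_from_logits(y_true, logits):
    """Mean cross-entropy for integer labels and unnormalized scores."""
    # Subtract the row maximum for numerical stability before exponentiating.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Pick the log-probability of the true class for every example.
    picked = log_probs[np.arange(len(y_true)), y_true]
    return -picked.mean()

# A single output unit: every example can only belong to class 0.
one_unit_logits = np.array([[3.2], [-1.5], [0.7]])
one_unit_loss = sparse_categorical_crossentropy_from_logits(
    np.array([0, 0, 0]), one_unit_logits)

# Three output units: the loss now depends on the logits.
three_unit_logits = np.array([[2.0, 0.5, -1.0],
                              [0.1, 0.2, 0.3],
                              [-1.0, 3.0, 0.0]])
three_unit_loss = sparse_categorical_crossentropy_from_logits(
    np.array([0, 2, 1]), three_unit_logits)

print(one_unit_loss == 0.0)   # True
print(three_unit_loss > 0.0)  # True
```

Checking `len(labels)` and the last line of `model.summary()` would show whether the output layer really has one unit per class.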

EDIT 2: I have the above model now. I am using SparseCategoricalCrossentropy and passing the number of labels, which is 78, to the last Dense layer, and now the model fits.

Now when I use model.predict(x_test), it gives the following results:

array([[ 1.3232083 ,  3.4263668 ,  0.3206688 , ..., -1.9279423 ,
        -0.83103067, -5.3442082 ],
       [ 0.11507592, -2.0753977 , -0.07149621, ..., -0.27729607,
        -1.132122  , -2.4074485 ],
       [ 0.87828857, -0.5063573 ,  1.5770453 , ...,  0.72519284,
         0.50958884,  3.7006462 ],
       ...,
       [ 0.35316354, -3.1919005 , -0.25520897, ..., -1.648859  ,
        -2.2707412 , -4.321298  ],
       [ 0.89357865,  1.3001428 ,  0.17324057, ..., -0.8185719 ,
        -1.4108973 , -3.674326  ],
       [ 1.6258209 , -0.59622926,  0.7382731 , ..., -0.8473997 ,
        -0.90670204, -4.043623  ]], dtype=float32)

How can I convert these to labels?
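Because the model is compiled with from_logits=True and the last Dense layer has no activation, these values are unnormalized scores (logits) rather than probabilities, which is why some of them are negative. If probabilities are wanted, a softmax over the last axis converts them. A NumPy sketch with made-up numbers for 2 examples and 3 classes:

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax; each output row sums to 1."""
    shifted = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

# Made-up logits, not the real model output.
logits = np.array([[1.32, 3.43, 0.32],
                   [0.12, -2.08, -0.07]], dtype=np.float32)
probabilities = softmax(logits)

print(probabilities.sum(axis=-1))     # each row sums to 1 (up to rounding)
print(probabilities.argmax(axis=-1))  # [1 0]
```

Softmax preserves the ordering within each row, so the most likely class can be read from the logits directly.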

Answer 1

Following the comments, I resolved this for text classification as follows:

End the model with a Dense layer whose number of units equals the number of unique labels.
Convert the string category labels to indexes and use SparseCategoricalCrossentropy and SparseCategoricalAccuracy in the model.
To convert results back to string labels, take the index of the largest output value for each prediction and look up that index in the labels list.
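The last step can be sketched as follows. The `labels` list and the `predictions` array are made-up stand-ins (the real ones would be the 78 labels and the output of model.predict(x_test)); the only requirement is that position i in `labels` is the class that was encoded as index i during training.

```python
import numpy as np

# Hypothetical label list; index i must be the class encoded as i for training.
labels = ['Art', 'History', 'Science']

# Made-up stand-in for model.predict(x_test): one row of logits per text.
predictions = np.array([[0.2, -1.0, 3.1],
                        [2.5, 0.3, -0.7],
                        [-0.4, 1.9, 0.6]], dtype=np.float32)

# Index of the largest logit in every row.
predicted_indexes = predictions.argmax(axis=1)
predicted_labels = [labels[index] for index in predicted_indexes]

print(predicted_labels)  # ['Science', 'Art', 'History']
```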

Solution 1
0 2023-01-08 18:45:52
