TensorFlow - Text Classification - Shapes (None,) and (None, 250, 100) are incompatible error
I want to classify text with multiple labels. I use a TextVectorization layer and the CategoricalCrossentropy loss. Here is my model code:
Text Vectorizer:
import re
import string
import tensorflow as tf
from tensorflow.keras import layers, losses

def custom_standardization(input_data):
    print(input_data[:5])  # debug output: first few raw inputs
    lowercase = tf.strings.lower(input_data)
    stripped_html = tf.strings.regex_replace(lowercase, '<br />', ' ')
    return tf.strings.regex_replace(stripped_html,
                                    '[%s]' % re.escape(string.punctuation),
                                    '')

max_features = 10000
sequence_length = 250
vectorize_layer = layers.TextVectorization(
    standardize=custom_standardization,
    max_tokens=max_features,
    output_mode='int',
    output_sequence_length=sequence_length)
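As a quick way to see what custom_standardization does to a single string, here is a plain-Python sketch of the same three steps (lowercase, replace '<br />' with a space, strip punctuation). The name standardize_py is hypothetical and not part of the original code, and the sketch assumes that Python's re module and tf.strings.regex_replace treat this simple character-class pattern alike.

```python
import re
import string

# Hypothetical plain-Python counterpart of custom_standardization, for illustration only.
PUNCTUATION_PATTERN = '[%s]' % re.escape(string.punctuation)

def standardize_py(text):
    lowercase = text.lower()
    stripped_html = lowercase.replace('<br />', ' ')
    # Remove every ASCII punctuation character.
    return re.sub(PUNCTUATION_PATTERN, '', stripped_html)

print(standardize_py('This is a text about Science.<br />Great!'))
```

Note that punctuation is deleted rather than replaced by a space, so words joined only by punctuation (for example "art/science") would be merged into one token.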
Model generation:
MAX_TOKENS_NUM = 5000 # Maximum vocab size.
MAX_SEQUENCE_LEN = 40 # Sequence length to pad the outputs to.
EMBEDDING_DIMS = 100
model = tf.keras.models.Sequential()
model.add(tf.keras.Input(shape=(1,), dtype=tf.string))
model.add(vectorize_layer)
model.add(tf.keras.layers.Embedding(MAX_TOKENS_NUM + 1, EMBEDDING_DIMS))
model.summary()
model.compile(loss=losses.CategoricalCrossentropy(from_logits=True),
              optimizer='adam',
              metrics=tf.metrics.CategoricalAccuracy())
Fit:
epochs = 10
history = model.fit(
    x_train,
    y=y_train,
    epochs=epochs)
x_train is a list of texts like ['This is a text about science.', 'This is a text about art', ...].
y_train is also a list of texts like ['Science', 'Art', ...].
When I try to run the fitting code, it gives the following error:
ValueError: Shapes (None,) and (None, 250, 100) are incompatible
What am I doing wrong? I'd also like to know whether this is a good approach/model for classifying text with multiple labels.
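The error message compares the shape of the targets with the shape of the model output. The model as posted ends at the Embedding layer, so its output is (None, 250, 100), while y_train is a flat list with shape (None,). Below is a rough, framework-free sketch that traces the shapes layer by layer. The helper trace_shapes is hypothetical, None stands for the batch dimension, and the pooling plus Dense head is just one possible way to bring the output down to one score per class.

```python
# Illustration only: trace output shapes layer by layer, with None as the batch size.
def trace_shapes(sequence_length, embedding_dims, num_classes=None):
    shapes = [('TextVectorization', (None, sequence_length))]
    shapes.append(('Embedding', (None, sequence_length, embedding_dims)))
    if num_classes is not None:
        # Averaging over the sequence axis removes the sequence dimension.
        shapes.append(('GlobalAveragePooling1D', (None, embedding_dims)))
        # One output unit per class.
        shapes.append(('Dense', (None, num_classes)))
    return shapes

original = trace_shapes(250, 100)
with_head = trace_shapes(250, 100, num_classes=78)  # 78 is the label count the post mentions later

print(original[-1])   # final layer and output shape of the model as posted
print(with_head[-1])  # final layer and output shape with a pooling + Dense head
```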
EDIT:
I edited my code according to Frightera's answer. Here is my model:
MAX_TOKENS_NUM = 5000 # Maximum vocab size.
MAX_SEQUENCE_LEN = 40 # Sequence length to pad the outputs to.
EMBEDDING_DIMS = 100
model = tf.keras.models.Sequential()
model.add(tf.keras.Input(shape=(1,), dtype=tf.string))
model.add(vectorize_layer)
model.add(tf.keras.layers.Embedding(MAX_TOKENS_NUM + 1, EMBEDDING_DIMS))
model.add(layers.Dropout(0.2))
model.add(layers.GlobalAveragePooling1D())
model.add(layers.Dropout(0.2))
model.add(layers.Dense(len(labels)))
model.summary()
model.compile(loss=losses.SparseCategoricalCrossentropy(from_logits=True),
              optimizer='adam',
              metrics=tf.metrics.SparseCategoricalAccuracy())
And I pass y_train_int instead of y_train, converting categories to indexes with y_train_int = [get_label_index(label) for label in y_train]:
epochs = 10
history = model.fit(
    x_train,
    y=y_train_int,
    epochs=epochs)
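The helper get_label_index is not shown in the post. A minimal sketch of what it might look like, assuming labels is a list of the unique category names (the sorted ordering and the dictionary label_to_index are assumptions made here for illustration):

```python
# Example data in the form described in the post.
y_train = ['Science', 'Art', 'Science']

# Assumption: `labels` holds each distinct category exactly once, in a fixed order.
labels = sorted(set(y_train))
label_to_index = {label: index for index, label in enumerate(labels)}

def get_label_index(label):
    # Map a category name to its integer class id.
    return label_to_index[label]

y_train_int = [get_label_index(label) for label in y_train]
print(y_train_int)
```

A quick sanity check is that len(set(y_train_int)) is greater than 1; targets that all share one index would make the task trivial.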
Now the model fits, but when I plot the loss with plt.plot(history.history['loss']), it is a flat line at zero.
Is this model good for classification? Do I need the layers between the input layer and the final Dense layer (Embedding etc.)? What am I doing wrong?
EDIT 2: I have the above model now. I am using SparseCategoricalCrossentropy and passing the number of labels, which is 78, as the size of the last Dense layer, and now the model fits.
Now when I use model.predict(x_test), it gives the following results:
array([[ 1.3232083 , 3.4263668 , 0.3206688 , ..., -1.9279423 ,
-0.83103067, -5.3442082 ],
[ 0.11507592, -2.0753977 , -0.07149621, ..., -0.27729607,
-1.132122 , -2.4074485 ],
[ 0.87828857, -0.5063573 , 1.5770453 , ..., 0.72519284,
0.50958884, 3.7006462 ],
...,
[ 0.35316354, -3.1919005 , -0.25520897, ..., -1.648859 ,
-2.2707412 , -4.321298 ],
[ 0.89357865, 1.3001428 , 0.17324057, ..., -0.8185719 ,
-1.4108973 , -3.674326 ],
[ 1.6258209 , -0.59622926, 0.7382731 , ..., -0.8473997 ,
-0.90670204, -4.043623 ]], dtype=float32)
How can I convert these to labels?
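Since the model is compiled with from_logits=True, each row returned by predict is a vector of raw scores (logits), one per class. Here is a minimal sketch of turning such rows into label names, assuming labels lists the category names in the same order that was used to build y_train_int. The three-class example data and the helper names softmax and predict_labels are hypothetical.

```python
import math

labels = ['Art', 'History', 'Science']  # hypothetical; must match the training index order

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    largest = max(logits)
    exps = [math.exp(value - largest) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]

def predict_labels(logit_rows):
    # The predicted class is the index of the largest logit in each row.
    indices = [max(range(len(row)), key=lambda i: row[i]) for row in logit_rows]
    return [labels[i] for i in indices]

example_logits = [[1.3, 3.4, 0.3], [0.9, -0.5, 3.7]]
print(predict_labels(example_logits))
print([round(p, 3) for p in softmax(example_logits[0])])
```

With the NumPy array returned by model.predict, np.argmax(predictions, axis=1) yields the same indices in one call. The softmax step is only needed if per-class probabilities are wanted, because it does not change which class has the highest score.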
I resolved this according to the comments as follows for text classification: