CRF layer ValueError: Dimensions must be equal, but are 75 and 8 for
I am using a BiLSTM-CRF for an NER problem. I can build the layers and generate the model summary successfully, but when I try to train the model I get a dimension error. It worked fine with the Keras and Keras-contrib packages, but those packages do not work on Python 3.8, so I had to move to tensorflow for the BiLSTM and tensorflow-addons for the CRF. Unfortunately, these packages give me errors I do not understand. I have been trying for about 3 weeks and could not find a solution. Please help.
The following are the layers in my code:
from keras.models import Model, Input
from keras.layers import LSTM, Embedding, Dense, TimeDistributed, Dropout, Bidirectional
from tensorflow_addons.layers import CRF
input = Input(shape=(max_len,))
model = Embedding(input_dim=n_words + 1, output_dim=20, input_length=max_len, mask_zero=True)(input) # 20-dim embedding
model = Bidirectional(LSTM(units=75, return_sequences=True, recurrent_dropout=0.1))(model)
model = TimeDistributed(Dense(75, activation="relu"))(model)
crf = CRF(n_tags) # CRF layer
out = crf(model) # output
model = Model(input,out)
model.compile('rmsprop', loss='mean_absolute_error', metrics=['accuracy'])
model.summary()
Model: "model_1"
_________________________________________________________________
Layer (type) Output Shape Param
=================================================================
input_2 (InputLayer) [(None, 75)] 0
embedding_1 (Embedding) (None, 75, 20) 259240
bidirectional_1 (Bidirectio (None, 75, 150) 57600
nal)
time_distributed_1 (TimeDis (None, 75, 75) 11325
tributed)
crf_1 (CRF) [(None, 75), 688
(None, 75, 8),
(None,),
(8, 8)]
=================================================================
Total params: 328,853
Trainable params: 328,853
Non-trainable params: 0
_________________________________________________________________
import numpy as np
history = model.fit(X_tr, np.array(y_tr), batch_size=22, epochs=20, validation_split=0.1, verbose=1)
**Error:**
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
C:\Users\BLACKP~1\AppData\Local\Temp/ipykernel_11164/2422502856.py in <module>
----> 1 history = model.fit(X_tr, np.array(y_tr), batch_size=22, epochs=20, validation_split=0.1, verbose=1)
c:\users\blackpearl\appdata\local\programs\python\python38\lib\site-packages\keras\utils\traceback_utils.py in error_handler(*args, **kwargs)
65 except Exception as e: # pylint: disable=broad-except
66 filtered_tb = _process_traceback_frames(e.__traceback__)
---> 67 raise e.with_traceback(filtered_tb) from None
68 finally:
69 del filtered_tb
c:\users\blackpearl\appdata\local\programs\python\python38\lib\site-packages\tensorflow\python\framework\func_graph.py in autograph_handler(*args, **kwargs)
1127 except Exception as e: # pylint:disable=broad-except
1128 if hasattr(e, "ag_error_metadata"):
-> 1129 raise e.ag_error_metadata.to_exception(e)
1130 else:
1131 raise
ValueError: in user code:
File "c:\users\blackpearl\appdata\local\programs\python\python38\lib\site-packages\keras\engine\training.py", line 878, in train_function *
return step_function(self, iterator)
File "c:\users\blackpearl\appdata\local\programs\python\python38\lib\site-packages\keras\engine\training.py", line 867, in step_function **
outputs = model.distribute_strategy.run(run_step, args=(data,))
File "c:\users\blackpearl\appdata\local\programs\python\python38\lib\site-packages\keras\engine\training.py", line 860, in run_step **
outputs = model.train_step(data)
File "c:\users\blackpearl\appdata\local\programs\python\python38\lib\site-packages\keras\engine\training.py", line 809, in train_step
loss = self.compiled_loss(
File "c:\users\blackpearl\appdata\local\programs\python\python38\lib\site-packages\keras\engine\compile_utils.py", line 201, in __call__
loss_value = loss_obj(y_t, y_p, sample_weight=sw)
File "c:\users\blackpearl\appdata\local\programs\python\python38\lib\site-packages\keras\losses.py", line 141, in __call__
losses = call_fn(y_true, y_pred)
File "c:\users\blackpearl\appdata\local\programs\python\python38\lib\site-packages\keras\losses.py", line 245, in call **
return ag_fn(y_true, y_pred, **self._fn_kwargs)
File "c:\users\blackpearl\appdata\local\programs\python\python38\lib\site-packages\keras\losses.py", line 1332, in mean_absolute_error
return backend.mean(tf.abs(y_pred - y_true), axis=-1)
ValueError: Dimensions must be equal, but are 75 and 8 for '{{node mean_absolute_error/sub}} = Sub[T=DT_INT32](model_1/crf_1/ReverseSequence_1, mean_absolute_error/Cast)' with input shapes: [?,75], [?,75,8].
According to the docs, the CRF layer has 4 outputs, and you are using a list of all four outputs as your model output. So try something like this:
import tensorflow as tf
from keras.models import Model, Input
from keras.layers import LSTM, Embedding, Dense, TimeDistributed, Dropout, Bidirectional
from tensorflow_addons.layers import CRF
input = Input(shape=(75,))
model = Embedding(input_dim=100 + 1, output_dim=20, input_length=75, mask_zero=True)(input) # 20-dim embedding
model = Bidirectional(LSTM(units=75, return_sequences=True, recurrent_dropout=0.1))(model)
model = TimeDistributed(Dense(75, activation="relu"))(model)
crf = CRF(75) # CRF layer
decoded_sequence, potentials, sequence_length, chain_kernel = crf(model) # output
model = Model(input, decoded_sequence)
model.compile('rmsprop', loss='mean_absolute_error', metrics=['accuracy'])
model.summary()
x = tf.random.uniform((5, 75), dtype=tf.int32, maxval=100)
print(model(x).shape)
(5, 75)
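The shapes in the error message are also worth a look: `[?,75]` is the decoded tag sequence and `[?,75,8]` looks like one-hot encoded labels. If that is the case (an assumption, since `y_tr` is not shown in the question), the labels would need to be converted to integer tag ids before they can be compared with `decoded_sequence`. A minimal NumPy sketch of the mismatch and the conversion, using made-up data:

```python
import numpy as np

# Sizes taken from the question: batch_size=22, max_len=75, 8 tags.
batch, max_len, n_tags = 22, 75, 8

# Made-up labels. Assumption: y_tr was one-hot encoded, giving shape (22, 75, 8).
rng = np.random.default_rng(0)
tag_ids = rng.integers(0, n_tags, size=(batch, max_len))
y_onehot = np.eye(n_tags, dtype=np.int32)[tag_ids]

# Stand-in for the CRF layer's decoded_sequence output, shape (22, 75).
decoded = np.zeros((batch, max_len), dtype=np.int32)

# Subtracting the two fails because the trailing dimensions (75 vs 8) differ.
try:
    decoded - y_onehot
    mismatch = False
except ValueError:
    mismatch = True

# Converting one-hot labels back to integer ids makes the shapes agree.
y_ids = np.argmax(y_onehot, axis=-1)   # shape (22, 75)
diff = decoded - y_ids                 # now well defined
```

If the labels are already integer ids of shape `(n_samples, 75)`, this conversion is not needed.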
On a side note, the loss function mean_absolute_error makes little sense for the integer sequences you are using.
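For context, a CRF is usually trained by minimizing the negative log-likelihood of the gold tag sequence rather than a regression loss. tensorflow-addons exposes this as `tfa.text.crf_log_likelihood`. The NumPy sketch below is not that implementation, only an illustration of the quantity for a single unmasked sequence. The function name `crf_neg_log_likelihood` and the convention that `transitions[i, j]` scores a move from tag `i` to tag `j` are assumptions made for the example.

```python
import numpy as np

def logsumexp(x, axis):
    """Numerically stable log(sum(exp(x))) along one axis."""
    m = np.max(x, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(x - m), axis=axis))

def crf_neg_log_likelihood(potentials, transitions, tags):
    """Negative log-likelihood of one tag sequence under a linear-chain CRF.

    potentials:  (T, K) per-step tag scores
    transitions: (K, K) scores, transitions[i, j] = score of tag i -> tag j (assumed)
    tags:        (T,) integer gold tag ids
    """
    T = potentials.shape[0]
    # Score of the gold path: unary scores plus transition scores along the path.
    gold = potentials[np.arange(T), tags].sum()
    gold += transitions[tags[:-1], tags[1:]].sum()
    # Forward algorithm: log of the summed exp-scores over all possible paths.
    alpha = potentials[0]
    for t in range(1, T):
        alpha = logsumexp(alpha[:, None] + transitions, axis=0) + potentials[t]
    log_z = logsumexp(alpha, axis=0)
    return log_z - gold

# Made-up example with 75 steps and 8 tags, matching the sizes in the question.
rng = np.random.default_rng(0)
potentials = rng.normal(size=(75, 8))
transitions = rng.normal(size=(8, 8))
tags = rng.integers(0, 8, size=75)
nll = crf_neg_log_likelihood(potentials, transitions, tags)
```

Unlike mean_absolute_error, this loss treats tag ids as categories, so tag 7 is not "further" from tag 0 than tag 1 is.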
It seems to be a problem with the input shape.
I modified your code. Here is a working sample:
import tensorflow as tf
from tensorflow_addons.layers import CRF
inputs = tf.keras.Input(shape=(10, 128))
dense_layer = tf.keras.layers.Dense(64, activation="relu")
outputs = tf.keras.layers.TimeDistributed(dense_layer)(inputs)
layer = CRF(4)
decoded_sequence, potentials, sequence_length, chain_kernel = layer(outputs)
print(decoded_sequence.shape)
print(potentials.shape)
print(sequence_length.shape)
print(chain_kernel.shape)
Output:
(None, 10)
(None, 10, 4)
(None,)
(4, 4)
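To make the relationship between these outputs concrete: `potentials` holds a per-step score for every tag, `chain_kernel` holds tag-to-tag transition scores, and `decoded_sequence` is the best-scoring tag path given both. The following is a rough NumPy sketch of Viterbi decoding for a single sequence, using the same sizes as above (10 steps, 4 tags). It is an illustration with random numbers, not the library's code, and `viterbi_decode` is a hypothetical helper name.

```python
import numpy as np

def viterbi_decode(potentials, transitions):
    """Return the highest-scoring tag path for one sequence.

    potentials:  (T, K) per-step tag scores
    transitions: (K, K) scores, transitions[i, j] = score of tag i -> tag j (assumed)
    """
    T, K = potentials.shape
    score = potentials[0].copy()
    backptr = np.zeros((T, K), dtype=np.int64)
    for t in range(1, T):
        # cand[i, j]: best score of a path ending in tag i at t-1, then moving to j.
        cand = score[:, None] + transitions
        backptr[t] = np.argmax(cand, axis=0)
        score = np.max(cand, axis=0) + potentials[t]
    # Walk the back-pointers from the best final tag to recover the path.
    path = [int(np.argmax(score))]
    for t in range(T - 1, 0, -1):
        path.append(int(backptr[t, path[-1]]))
    return path[::-1]

rng = np.random.default_rng(0)
potentials = rng.normal(size=(10, 4))    # one sequence: 10 steps, 4 tags
chain_kernel = rng.normal(size=(4, 4))   # tag-to-tag transition scores
decoded = viterbi_decode(potentials, chain_kernel)
```

The result has one tag id per time step, which is why `decoded_sequence` has shape `(None, 10)` while `potentials` has shape `(None, 10, 4)`.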