Using pre-trained BERT embeddings as input to a CNN with tensorflow.keras results in a ValueError

Question

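I am new to both NLP and deep learning, so this is probably a very basic question.

I am trying to build a binary classifier that uses pre-trained BERT embeddings as features. So far I have successfully created the embeddings and built a simple Sequential() model with tensorflow.keras. The following code works: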

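import tensorflow as tf
from tensorflow.keras.layers import Conv1D, Dense, GlobalMaxPooling1D

# Each sample is a single 768-dimensional BERT embedding vector.
model = tf.keras.Sequential([
    Dense(4, activation = 'relu', input_shape = (768,)),
    Dense(4, activation = 'relu'),
    Dense(1, activation = 'sigmoid')])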

model.compile(optimizer = 'adam',
              loss = 'binary_crossentropy',
              metrics = ['accuracy'])

What I would like to do now is adapt this code into a CNN. However, when I add a convolutional layer, I get an error:

model = tf.keras.Sequential([
    Conv1D(filters = 250, kernel_size = 3, padding='valid', activation='relu', strides=1, input_shape = (768,)),
    GlobalMaxPooling1D(),
    Dense(4, activation = 'relu'),
    Dense(1, activation = 'sigmoid')])

model.compile(optimizer = 'adam',
              loss = 'binary_crossentropy',
              metrics = ['accuracy'])

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-23-59695050a94e> in <module>()
      3     GlobalMaxPooling1D(),
      4     Dense(4, activation = 'relu'),
----> 5     Dense(1, activation = 'sigmoid')])
      6 
      7 model.compile(optimizer = 'adam',

5 frames
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/input_spec.py in assert_input_compatibility(input_spec, inputs, layer_name)
    178                          'expected ndim=' + str(spec.ndim) + ', found ndim=' +
    179                          str(ndim) + '. Full shape received: ' +
--> 180                          str(x.shape.as_list()))
    181     if spec.max_ndim is not None:
    182       ndim = x.shape.ndims

ValueError: Input 0 of layer conv1d_2 is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: [None, 768]

This is what the data I am using looks like.

Features:

train_features[0]

array([-4.97862399e-01,  1.49541467e-01,  5.81708886e-02,  1.63668215e-01,
       -2.77605206e-01,  3.57868642e-01,  1.70950562e-01,  2.69330859e-01,
       -3.29369396e-01,  2.12891083e-02, -4.02462274e-01, -1.98120754e-02,
       -2.18944401e-01,  4.34780568e-01, -2.75409579e-01,  2.03015730e-01,...

train_features[0].shape
(768,)

Labels:

train_labels.iloc[0:3]
turnout       
0        73446    0
1        53640    1
         16895    1
Name: turnout, dtype: int64

Any suggestions are much appreciated. Thank you very much!

Answer 1

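A 2D convolution requires 4D input: (batch_size, width1, width2, channels).

Your data is a single array of shape (batch_size, 768). If you really want to use convolutions (that is, if you think there may be spatial relationships in the data), you need to reshape it appropriately before feeding it to the model.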
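As one illustration of what "reshaping appropriately" could look like for a 2D convolution, here is a minimal NumPy sketch. The `features` array is a hypothetical stand-in for the real embeddings, and the 24 x 32 grid is an arbitrary factorization of 768 chosen only to show the mechanics; nothing about BERT embeddings implies that layout.

```python
import numpy as np

# Hypothetical stand-in for the BERT features: 5 samples, 768 values each.
features = np.zeros((5, 768), dtype="float32")

# Reshape to (batch_size, width1, width2, channels) for a 2D convolution.
# 24 * 32 == 768; the grid shape is an illustrative assumption.
features_2d = features.reshape(-1, 24, 32, 1)

print(features.shape)     # (5, 768)
print(features_2d.shape)  # (5, 24, 32, 1)
```

Whether any such grid is meaningful depends on whether neighbouring embedding dimensions are actually related, which is the caveat raised above.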

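A 1D convolution requires 3D input: (batch_size, length, channels). This is why the Conv1D layer in the question reports "expected ndim=3, found ndim=2": with input_shape = (768,), each batch has only two dimensions, [None, 768].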
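A minimal sketch of the Conv1D model from the question with a 3D input might look like the following. It assumes each 768-dimensional embedding is treated as a sequence of length 768 with a single channel. The random `train_features` array is a placeholder for the real data, and `tf.keras.Input` is used here to declare the input shape instead of the `input_shape` argument from the question.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Conv1D, Dense, GlobalMaxPooling1D

# Placeholder for the real BERT features: 8 samples, 768 values each.
train_features = np.random.rand(8, 768).astype("float32")

# Add a trailing channels axis: (batch_size, 768) -> (batch_size, 768, 1).
train_features_3d = np.expand_dims(train_features, axis=-1)

model = tf.keras.Sequential([
    # Declare the per-sample shape as (length, channels) = (768, 1).
    tf.keras.Input(shape=(768, 1)),
    Conv1D(filters=250, kernel_size=3, padding="valid",
           activation="relu", strides=1),
    GlobalMaxPooling1D(),
    Dense(4, activation="relu"),
    Dense(1, activation="sigmoid"),
])

model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=["accuracy"])

# One sigmoid probability per sample.
predictions = model.predict(train_features_3d, verbose=0)
print(predictions.shape)  # (8, 1)
```

The same reshaped array would be passed to `model.fit` together with the labels. This only makes the shapes compatible. Whether sliding a kernel across embedding dimensions helps the classifier is a separate question.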

Solution 1
1 accepted 2020-04-06 15:49:58