Tensorflow / Keras : Input 0 of layer lstm is incompatible with the layer: expected ndim=3, found ndim=2

I'm trying to implement a federated-training Keras/TensorFlow model for detecting fake news in text articles, but I'm having trouble with the model. When I try to run the code I get the following error:

 ValueError: Input 0 of layer lstm is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: [None, 50]

And the following warning:

WARNING:tensorflow:Model was constructed with shape (None, 400) for input Tensor("embedding_input:0", shape=(None, 400), dtype=float32), but it was called on an input with incompatible shape (None,).

Intuitively I understand that the Embedding layer output should be of shape (None, 400, 50), but for some reason the LSTM layer is fed just a 2D input; in other words, the layer expects a 3D tensor but is given a 2D one. However, I don't know how to fix it, or how to change the input/output shapes so that they match. I've been stuck on this issue for a couple of days, and I'm still new to the field of ML and neural networks. Any suggestions are appreciated; thank you very much in advance.

The model used:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense, Activation, Dropout

max_words = 2000
max_len = 400
embed_dim = 50
lstm_out = 64
batch_size = 32

def getTextModel():
    model = Sequential()
    model.add(Embedding(max_words, embed_dim, input_length = max_len, input_shape=preprocessed_sample_dataset.element_spec))
    model.add(LSTM(lstm_out))
    model.add(Dense(256))
    model.add(Activation('relu'))
    model.add(Dropout(0.5))
    model.add(Dense(1, name='out_layer'))
    model.add(Activation('sigmoid'))
    return model

Model summary:

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
embedding (Embedding)        (None, 400, 50)           100000    
_________________________________________________________________
lstm (LSTM)                  (None, 64)                29440     
_________________________________________________________________
dense (Dense)                (None, 256)               16640     
_________________________________________________________________
activation (Activation)      (None, 256)               0         
_________________________________________________________________
dropout (Dropout)            (None, 256)               0         
_________________________________________________________________
out_layer (Dense)            (None, 1)                 257       
_________________________________________________________________
activation_1 (Activation)    (None, 1)                 0         
=================================================================
Total params: 146,337
Trainable params: 146,337
Non-trainable params: 0

Other info:

Preprocessing of the data:

import collections
import tensorflow as tf

def preprocess(dataset):

  def batch_format_fn(element):
    """Return the text features and labels as an `OrderedDict`."""
    print(element['features'])
    return collections.OrderedDict(
        x=element['features'],
        y=tf.reshape(element['label'], [-1, 1])
    )

  return dataset.repeat(NUM_EPOCHS).shuffle(SHUFFLE_BUFFER).batch(
      BATCH_SIZE).map(batch_format_fn).prefetch(PREFETCH_BUFFER)

preprocessed_sample_dataset = preprocess(sample_dataset)


def make_federated_data(client_data, client_ids):
    return [preprocess(client_data.create_tf_dataset_for_client(x)) for x in client_ids]

federated_train_data = make_federated_data(train_dataset, train_dataset.client_ids)

print('Number of client datasets: {l}'.format(l=len(federated_train_data)))
print('First dataset: {d}'.format(d=federated_train_data[0]))

Dataset format:

Number of client datasets: 4
First dataset: <PrefetchDataset shapes: OrderedDict([(x, (None,)), (y, (None, 1))]), types: OrderedDict([(x, tf.string), (y, tf.int64)])>

Code where the function is called:

def model_fn():

  keras_model = getTextModel() #create_keras_model()
  input_spec_aux = preprocessed_sample_dataset.element_spec
  return tff.learning.from_keras_model(
      keras_model,
      input_spec= input_spec_aux,
      loss=tf.keras.losses.SparseCategoricalCrossentropy(),
      metrics=[tf.keras.metrics.SparseCategoricalAccuracy()])

#Error occurs in iterative_process
iterative_process = tff.learning.build_federated_averaging_process(
    model_fn,
    client_optimizer_fn=lambda: tf.keras.optimizers.Adam(learning_rate=client_lr),
    server_optimizer_fn=lambda: tf.keras.optimizers.SGD(learning_rate=server_lr))

print(str(iterative_process.initialize.type_signature))

state = iterative_process.initialize()

The dataset format says the shape of the input x is (None,) (ndim/rank = 1) with dtype tf.string. The None comes from the fact that the dataset may yield batches that aren't "full", so in practice the first dimension is in the range [1, BATCH_SIZE]. This shape means we have a batch of single scalar strings. This may be where the problem lies: typically for LSTMs we want batches of sequences, e.g. a shape like (None, SEQUENCE_LENGTH).
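To make the mismatch concrete, here is a small standalone sketch (the example strings and the sequence length of 400 are illustrative assumptions, not taken from the actual dataset) contrasting what the dataset currently yields with what an Embedding -> LSTM stack expects:

import tensorflow as tf

# What the pipeline currently yields: a batch of scalar strings, shape (batch,), rank 1.
scalar_batch = tf.constant(["some article text", "another article"])
print(scalar_batch.shape)  # (2,)

# What an Embedding -> LSTM stack expects: a batch of token-id sequences,
# shape (batch, SEQUENCE_LENGTH), rank 2 -- here SEQUENCE_LENGTH = 400.
sequence_batch = tf.zeros([2, 400], dtype=tf.int32)
print(sequence_batch.shape)  # (2, 400)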

The embedding layer will project the last dimension into the embedding dimension z, e.g. take a shape (x, y) and produce a shape (x, y, z). So here the output of the embedding layer will be (None, 50) (ndim/rank = 2). Recall that the LSTM wants sequences and Keras wants batches, so the error message is saying the desired shape was (None, SEQUENCE_LENGTH, 50) (ndim/rank = 3).
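A quick way to see this rank bookkeeping in isolation (a sketch outside the federated pipeline; the layer sizes mirror the model above):

import tensorflow as tf

embedding = tf.keras.layers.Embedding(input_dim=2000, output_dim=50)

# Rank-1 input (batch of scalar ids) -> rank-2 output: this is what the LSTM rejects.
print(embedding(tf.zeros([32], dtype=tf.int32)).shape)       # (32, 50)

# Rank-2 input (batch of sequences) -> rank-3 output: what the LSTM expects.
print(embedding(tf.zeros([32, 400], dtype=tf.int32)).shape)  # (32, 400, 50)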

I would suggest going back to the dataset and determining what the format of element['features'] is. It seems like in this case it might be a full sentence that needs to be tokenized into a sequence of words (e.g. for English, split on spaces).
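As a rough sketch of that tokenization step (the whitespace split is an assumption about the text), tf.strings.split turns a batch of raw strings into a ragged batch of word tokens that can then be padded to a fixed length:

import tensorflow as tf

articles = tf.constant(["breaking news story", "short one"])
tokens = tf.strings.split(articles)         # RaggedTensor with shape (2, None)
dense = tokens.to_tensor(default_value="")  # pad to a dense (2, longest_article) batch
print(dense.shape)                          # (2, 3)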

A word of warning though: even after fixing the shapes, I suspect Keras will next complain that the dtype tf.string cannot be used in the Embedding layer. The sequences will first need to be converted to integer ids, likely using something from tf.lookup or from tf_text.
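One way to handle the tokenization and the string-to-id conversion in a single step (a sketch, assuming TF 2.x; in recent releases the layer is exposed directly as tf.keras.layers.TextVectorization) is a TextVectorization layer, which maps raw strings to integer ids padded/truncated to a fixed sequence length:

import tensorflow as tf

max_words = 2000
max_len = 400

# In recent TF releases this is exposed as tf.keras.layers.TextVectorization.
vectorize = tf.keras.layers.experimental.preprocessing.TextVectorization(
    max_tokens=max_words, output_sequence_length=max_len)

# Build the vocabulary from (a sample of) the training texts.
vectorize.adapt(tf.constant(["sample fake news article", "sample real news article"]))

ids = vectorize(tf.constant(["sample article"]))
print(ids.shape, ids.dtype)  # (1, 400) int64 -- rank 2 integers, ready for Embedding -> LSTM

Applied inside batch_format_fn via dataset.map after adapting the vocabulary, this would make each client dataset yield (None, max_len) integer batches that match the model's expected input.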

