
Dimensionality of Keras Dense layer

I've got a Keras model with layer output shapes as follows:

________________________________________________________________________________
Layer (type)              Output Shape              Param #
================================================================================
stft (InputLayer)         (None, 1, 16384)          0
________________________________________________________________________________
static_stft (Spectrogram) (None, 1, 65, 256)        16640
________________________________________________________________________________
conv2d_1 (Conv2D)         (None, 38, 5, 9)          12882
________________________________________________________________________________
dense_1 (Dense)           (None, 38, 5, 512)        5120
________________________________________________________________________________
predictions (Dense)       (None, 38, 5, 368)        188784
================================================================================

I'm confused about the dimensionality of the Dense layers at the end. I was expecting (None, 512) and (None, 368) respectively, as suggested by answers such as: Keras lstm and dense layer

The final Dense layers are created as follows:

x = keras.layers.Dense(512)(x)
outputs = keras.layers.Dense(
        368, activation='sigmoid', name='predictions')(x)

So why do they have more than 512 outputs? And how can I change this?

Depending on your application, you could flatten after the Conv2D layer:

from tensorflow.keras.layers import Input, Reshape, Flatten, Dense

input_layer = Input((1, 1710))
x = Reshape((38, 5, 9))(input_layer)  # stand-in for the Conv2D output shape
x = Flatten()(x)                      # (None, 38, 5, 9) -> (None, 1710)
x = Dense(512)(x)                     # (None, 512)
x = Dense(368)(x)                     # (None, 368)

Layer (type)                 Output Shape              Param #   
_________________________________________________________________
input_1 (InputLayer)         [(None, 1, 1710)]         0         
_________________________________________________________________
reshape (Reshape)            (None, 38, 5, 9)          0         
_________________________________________________________________
flatten (Flatten)            (None, 1710)              0         
_________________________________________________________________
dense (Dense)                (None, 512)               876032    
_________________________________________________________________
dense_1 (Dense)              (None, 368)               188784    

It's the Conv2D layer. The convolutional layer produces a 38x5 grid of length-9 vectors, and a Dense layer operates only on the last axis: it maps each of the 38x5 length-9 vectors to a length-512 vector, which is why the output is (None, 38, 5, 512) rather than (None, 512). You can confirm this from the parameter count in your summary: 9 x 512 + 512 = 5120.
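
A minimal sketch illustrating this behavior, assuming TensorFlow's Keras (the (38, 5, 9) input here is a stand-in for your Conv2D output, not your actual model):

import tensorflow as tf
from tensorflow.keras.layers import Dense

x = tf.keras.Input(shape=(38, 5, 9))  # stand-in for the Conv2D output
y = Dense(512)(x)                     # Dense transforms only the last axis
print(y.shape)                        # (None, 38, 5, 512), not (None, 512)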

To get rid of the spatial dependence, you'll want to use something like a pooling layer, possibly a GlobalMaxPool2D. This collapses the two spatial dimensions, keeping only the channel dimension, and produces a (None, 9) shaped output, which will lead to your expected shapes from the Dense layers.
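
A minimal sketch of that fix, assuming TensorFlow's Keras and the default channels_last data format (again using a (38, 5, 9) input as a stand-in for the Conv2D output):

import tensorflow as tf
from tensorflow.keras.layers import GlobalMaxPool2D, Dense

inputs = tf.keras.Input(shape=(38, 5, 9))  # stand-in for the Conv2D output
x = GlobalMaxPool2D()(inputs)              # max over the 38x5 grid -> (None, 9)
x = Dense(512)(x)                          # (None, 512)
outputs = Dense(368, activation='sigmoid', name='predictions')(x)  # (None, 368)
tf.keras.Model(inputs, outputs).summary()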
