
Accuracy too low with very simple CNN model using Keras in TensorFlow

I'm new to TensorFlow, so help is appreciated. The target output of my model is the same as its input, just reshaped, so I expect an accuracy of 1 but am instead getting 0.0062.

Inputs

Each input of my dataset has the shape (19, 19, 1). In each input, a single random position is set to 1 while the rest are 0. Example, but with a (4, 4, 1) shape:

# [[0, 0, 0, 0],
#  [0, 1, 0, 0],
#  [0, 0, 0, 0],
#  [0, 0, 0, 0]]

Outputs

Each output has the shape (361,) and is essentially the flattened version of its input, so reaching an accuracy of 1 shouldn't be a problem in theory. Example (again for the (4, 4, 1) case):

# [0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

My dataset consists of 2404 of these samples.
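The Dataset class used below isn't shown in the question; here is a minimal sketch of how such a dataset could be generated (the class name, constructor arguments, and fields are assumptions, chosen to match the shapes described above):

import numpy as np

class Dataset:
    """Hypothetical stand-in for the Dataset class used in the question."""
    def __init__(self, n_samples=2404, size=19, seed=0):
        rng = np.random.default_rng(seed)
        # One randomly placed 1 per sample
        hot = rng.integers(0, size * size, n_samples)
        flat = np.zeros((n_samples, size * size), dtype=np.float32)
        flat[np.arange(n_samples), hot] = 1.0
        self.input = flat.reshape(n_samples, size, size, 1)  # (2404, 19, 19, 1)
        self.output = flat                                   # (2404, 361)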

Here's my code. Note that I've tried a combination of different loss functions and optimizers:

import tensorflow as tf
from tensorflow.keras import layers, models

model = models.Sequential()
# 1x1 convolution with a single filter: scales each pixel by one learned weight
model.add(layers.Conv2D(1, (1, 1), activation='relu', padding='same', input_shape=(19, 19, 1)))
model.add(layers.Flatten())
# 361-way softmax over the flattened 19x19 grid
model.add(layers.Dense(19 * 19, activation='softmax'))
model.compile(
    optimizer='adam',
    loss=tf.keras.losses.Huber(),
    metrics=['accuracy']
)

dataset = Dataset()
inputs = dataset.input
outputs = dataset.output

model.fit(
    inputs, # (2404, 19, 19, 1)
    outputs, # (2404, 361)
    epochs=1000,
    shuffle=True,
    verbose=1
)

Result

The accuracy quickly reaches 0.0062 and remains there.

Epoch 10/1000
76/76 [==============================] - 0s 2ms/step - loss: 0.0014 - accuracy: 0.0062

Update 1 - Slightly Better

Thanks for the help. After removing the uses of random in my code and disabling shuffling, the model started reaching an accuracy of 1.00 in about 50% of runs; the other 50% peaked at an accuracy of 0.0046. When I initialized the weights and biases to 0, it peaked at 0.0046 in 100% of runs. Updating all of my TF packages almost fixed the problem: it now succeeds about 90% of the time.
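As an aside, a minimal sketch of how to make runs reproducible while chasing this kind of initialization-dependent behavior (not part of the original code; the seed value is arbitrary):

import random

import numpy as np
import tensorflow as tf

# Seed every RNG that Keras touches so each run starts from the same weights
random.seed(0)
np.random.seed(0)
tf.random.set_seed(0)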

Coming out of comments to an answer. You're convolving a 1x1 kernel, then passing the result to a dense layer. The ideal parameters you want the network to learn are for the relevant weights in the dense layer to be the inverse of the kernel value. What matters most here, though, is that you're mostly passing zeroes: any dense-layer weight applied to a zero produces another zero, so the zeroes cause your gradient to vanish.

When you initialize your weights as zeros, the convolution turns your input into all zeroes, and an all-zero activation ends learning: nothing can backpropagate. When you don't touch the initialization, TF draws the kernel from a zero-centered distribution (Glorot uniform by default), so half the time the initialized kernel value is negative. After convolving, you have all zeroes and one negative number; after the ReLU you have all zeroes. So half the time the network can learn, because it happened to initialize with a positive kernel value, and half the time it can't.
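A minimal sketch of that failure mode, using the question's small (4, 4, 1) example and a deliberately negative kernel value (the -0.5 is an arbitrary stand-in for an unlucky initialization):

import tensorflow as tf

# One-hot 4x4 input with a single 1, as in the question's small example
x = tf.reshape(tf.one_hot(5, 16), (1, 4, 4, 1))

# A 1x1 conv whose kernel "happened" to initialize negative
conv = tf.keras.layers.Conv2D(
    1, (1, 1), activation='relu',
    kernel_initializer=tf.keras.initializers.Constant(-0.5),
    bias_initializer='zeros',
)

with tf.GradientTape() as tape:
    y = conv(x)          # ReLU(-0.5 * one_hot) == all zeros
    loss = tf.reduce_sum(y)

grads = tape.gradient(loss, conv.trainable_variables)
print(y.numpy().sum())   # 0.0 -- the one-hot spike was wiped out
print(grads)             # all-zero gradients: nothing to learn from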

Try this:

initializer = tf.keras.initializers.Ones()
model.add(layers.Conv2D(1, (1, 1), kernel_initializer=initializer, activation='relu', padding='same', input_shape=(19, 19, 1)))
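With a constant positive kernel, the ReLU passes the one-hot spike through unchanged, so the dense layer can learn the mapping regardless of the random seed.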
