
How do I structure a Keras model for a custom image regression problem?

I'm trying to build a regression model with TensorFlow 2 and the Keras API, using a custom dataset of PNG images. However, I'm not sure which layers I should be using, or how. I put together what I thought was a very simple model as a starting point, but when I train it the loss and accuracy values printed out are consistently 0. This leads me to believe my loss calculation is not working, but I have no idea why. Below is a snippet of my source code, the full project for which can be found here:

import tensorflow as tf
import os
import random
import pathlib

AUTOTUNE = tf.data.experimental.AUTOTUNE
TRAINING_DATA_DIR = r'specgrams'

def gen_model():
    model = tf.keras.models.Sequential([
      tf.keras.layers.Flatten(input_shape=(256, 128, 3)),
      tf.keras.layers.Dense(64, activation='relu'),
      tf.keras.layers.Dense(1)
    ])

    model.compile(optimizer=tf.keras.optimizers.Adam(),
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])

    return model


def fetch_batch(batch_size=1000):
    all_image_paths = []
    all_image_labels = []

    data_root = pathlib.Path(TRAINING_DATA_DIR)
    files = data_root.iterdir()

    for file in files:
        file = str(file)
        all_image_paths.append(os.path.abspath(file))
        label = file[:-4].split('-')[2:3]
        label = float(label[0]) / 200
        all_image_labels.append(label)

    def preprocess_image(path):
        img_raw = tf.io.read_file(path)
        image = tf.image.decode_png(img_raw, channels=3)
        image = tf.image.resize(image, [256, 128])
        image /= 255.0
        return image

    def preprocess(path, label):
        return preprocess_image(path), label

    path_ds = tf.data.Dataset.from_tensor_slices(all_image_paths)
    image_ds = path_ds.map(preprocess_image, num_parallel_calls=AUTOTUNE)
    label_ds = tf.data.Dataset.from_tensor_slices(all_image_labels)
    ds = tf.data.Dataset.zip((image_ds, label_ds))
    ds = ds.shuffle(buffer_size=len(os.listdir(TRAINING_DATA_DIR)))
    ds = ds.repeat()
    ds = ds.batch(batch_size)
    ds = ds.prefetch(buffer_size=AUTOTUNE)

    return ds

ds = fetch_batch()
model = gen_model()
model.fit(ds, epochs=1, steps_per_epoch=10)

The code above is supposed to read in some spectrograms stored as 256 x 128 px PNG files, convert them to tensors, and fit a regression model that predicts a value (in this case the BPM of the music used to generate the spectrogram). The image file names contain the BPM, which is divided by 200 to produce a label between 0 and 1.
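To make the labelling rule concrete, here is a minimal sketch of the file name parsing on its own. The file name `track-001-120.png` is a made-up example (the real naming scheme is only implied by the `split('-')[2:3]` call in the code above), and `label_from_path` is a hypothetical helper, not part of the original project.

```python
import pathlib

BPM_SCALE = 200.0  # same divisor as in fetch_batch above

def label_from_path(path):
    """Return the scaled BPM label encoded in a spectrogram file name.

    Assumes the BPM is the third dash-separated field of the file name,
    mirroring file[:-4].split('-')[2:3] in the code above.
    """
    stem = pathlib.PurePath(path).stem   # drop the directory and ".png"
    bpm = float(stem.split('-')[2])      # third field holds the BPM
    return bpm / BPM_SCALE

# Hypothetical file name, used for illustration only.
print(label_from_path('specgrams/track-001-120.png'))
```

Parsing the stem rather than the full path string means a dash in a directory name cannot shift the field positions.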

As stated before, this code does run successfully but after each training step the loss and accuracy values printed out are always exactly 0.00000 and do not change.

It's also worth noting that I actually want my model to predict multiple values, not just a single BPM value, but that is a separate issue, and I have posted a separate question for it here.

A regression model needs a regression loss function, such as 'mean_squared_error', 'mean_absolute_error', 'mean_absolute_percentage_error' or 'mean_squared_logarithmic_error', rather than 'sparse_categorical_crossentropy', which is meant for classification.
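One likely reason the original setup reports a loss of exactly 0: categorical cross-entropy treats the model output as a probability distribution over classes, and with a single output unit there is only one class, whose normalised probability is always 1. The pure-Python sketch below reproduces that arithmetic. It illustrates the idea and is not the actual Keras implementation; `softmax` and `sparse_cross_entropy` are helper names chosen here.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sparse_cross_entropy(logits, class_index):
    """Cross-entropy of one example given an integer class index."""
    return -math.log(softmax(logits)[class_index])

# With one output unit, the only class always has probability 1,
# so the loss is zero whatever value the network produces.
single_unit_losses = [sparse_cross_entropy([z], 0) for z in (-3.2, 0.0, 7.5)]
print(single_unit_losses)

# With two or more units the loss depends on the outputs, as expected.
two_unit_loss = sparse_cross_entropy([0.5, 1.5], 0)
print(two_unit_loss)
```

Because the single-unit loss is constant, its gradient is zero and training cannot make progress, which matches the unchanging values in the question.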

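import tensorflow as tf

def gen_model():
    # Flatten each 256 x 128 RGB spectrogram and regress a single value from it.
    model = tf.keras.models.Sequential([
      tf.keras.layers.Flatten(input_shape=(256, 128, 3)),
      tf.keras.layers.Dense(512, activation='relu'),
      tf.keras.layers.Dense(512, activation='relu'),
      tf.keras.layers.Dense(1)  # linear output for the scaled BPM
    ])

    # Use a regression loss. 'accuracy' is a classification metric, so track
    # mean absolute error instead.
    model.compile(optimizer=tf.keras.optimizers.Adam(),
                  loss='mean_squared_error',
                  metrics=['mean_absolute_error'])

    return model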
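
To make the regression losses concrete, here is a small worked example in plain Python that computes mean squared error and mean absolute error for a few scaled BPM labels. The label and prediction values are invented for illustration, and the two helper functions are hand-written stand-ins rather than the Keras implementations.

```python
BPM_SCALE = 200.0  # labels in the question are BPM / 200

def mean_squared_error(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mean_absolute_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Made-up targets (120, 90 and 150 BPM) and made-up model outputs.
y_true = [120 / BPM_SCALE, 90 / BPM_SCALE, 150 / BPM_SCALE]
y_pred = [0.55, 0.50, 0.80]

mse = mean_squared_error(y_true, y_pred)
mae = mean_absolute_error(y_true, y_pred)

# Multiplying the MAE by the scale expresses the error in BPM.
print(f"MSE: {mse:.4f}")
print(f"MAE: {mae:.4f} ({mae * BPM_SCALE:.1f} BPM)")
```

A continuous prediction almost never equals its label exactly, so accuracy is rarely informative for regression; an error expressed in BPM is usually easier to interpret.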
