
Using NumPy Arrays that Represent Images to Create a TensorFlow Machine Learning Model

I am currently working on face-detection software as part of the development of a facial-recognition project.

I have hit an issue that I do not know how to resolve. Essentially, I am converting images to 250x250 resolution and then converting each image into a flattened NumPy array.

The arrays are exported to CSV files:

import csv
import numpy as np
import PIL.Image

# Open the image and convert it to grayscale
img = PIL.Image.open('tmp/images/train/cropped/image (' + str(convert_count) + ').jpg').convert('L')
width, height = img.size

# Resize to 25x25
img_size = 25, 25
img = img.resize(img_size)
imgarr = np.array(img)

# Rebuild the pixel data as a list of rows, then flatten it
pixels = list(img.getdata())
width, height = img.size
pixels = [pixels[i * width:(i + 1) * width] for i in range(height)]

pixels = np.concatenate(pixels).ravel().tolist()

with open('tmp/csv/train/train (' + str(convert_count) +').csv', 'w') as csvfile:
    fieldnames = ['array']
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)

    writer.writeheader()
    writer.writerow({'array': pixels})
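As a quick sanity check on the flattening step (a minimal sketch that uses a zero-filled array in place of a real image), a 25x25 grayscale image always flattens to exactly 625 values, whatever its pixel contents:

```python
import numpy as np

# Stand-in for np.array(img) after resizing to 25x25
img_size = (25, 25)
imgarr = np.zeros(img_size, dtype=np.uint8)

# Flatten row by row, as in the code above
flat = imgarr.ravel().tolist()
print(len(flat))  # 625
```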

I would have assumed that the arrays would all have the same number of elements, since they are all converted from 25x25 images. However, this is not the case: my first two arrays (images) contain 74898 and 73682 elements respectively.

Why is this happening? TensorFlow will not let me train a model when the input sizes differ. Code below:

import numpy as np
import tensorflow as tf
from tensorflow import keras
import csv


count = 1
remaining_images = 3
number_images = 3
image_array = {}
image_array[1] = {}
image_array[2] = {}

while remaining_images > count:
    with open('tmp/csv/train/train (' + str(count) + ').csv', 'r') as csvfile:
        reader = csv.reader(csvfile)
        row = [r for r in reader]
    image_array[count] = row[2]
    #print(image_array[count])
    count = count + 1

image_array[1] = str(image_array[1])
image_array[2] = str(image_array[2])

features = np.array([image_array[1], image_array[2]])


labels = np.array([1, 0])

#Example of the number of Elements in Arrays
array_size = len(features[0])
print(array_size)
array_size = len(features[1])
print(array_size)

batch_size = 2

dataset = tf.data.Dataset.from_tensor_slices((features, labels)).batch(batch_size)

model = keras.Sequential([
    keras.layers.Dense(5, activation=tf.nn.relu, input_shape=((array_size),)),
    keras.layers.Dense(3, activation=tf.nn.softmax)
])
model.compile(
    optimizer=keras.optimizers.Adam(),
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)

model.fit(dataset, epochs=100, batch_size=batch_size, verbose=1)

I'm curious whether the problem comes from how you save to and read back from the CSV.
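A minimal round-trip (using an in-memory buffer instead of a file) shows what the CSV layer does: a list written as a single field is serialized as its string representation, so reading it back yields one long string, and `len()` then counts characters rather than pixels. Since each pixel value has a different number of digits, that would explain why the reported sizes differ from image to image:

```python
import csv
import io

pixels = [255, 0, 128]  # stand-in for the flattened pixel list

# Write the list as a single CSV field, as in the question
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=['array'])
writer.writeheader()
writer.writerow({'array': pixels})

# Read it back: the field is now a string, not a list
buf.seek(0)
rows = [r for r in csv.reader(buf)]
field = rows[1][0]
print(repr(field))  # '[255, 0, 128]' -- a string
print(len(field))   # 13: character count, not element count
```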

In your while loop (2nd code block), if you directly open the image file with PIL, resize, and then use the resulting image array (as you do in the 1st code block), does that resolve the size issue?

Also, since you resize to 25 x 25 = 625 pixels, I think you should just have 625 elements in each image array (rather than 74898 and 73682 elements).
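Following that suggestion, here is a minimal sketch of loading images directly instead of going through CSV (the helper name and the throwaway demo file are illustrative, not from the original code); every image then comes back as exactly 25 x 25 = 625 values:

```python
import os
import tempfile

import numpy as np
from PIL import Image

def load_image_features(path, size=(25, 25)):
    # Open, grayscale, resize, flatten: always size[0] * size[1] values
    img = Image.open(path).convert('L')
    img = img.resize(size)
    return np.array(img, dtype=np.float32).ravel()

# Self-contained demo: write a throwaway grayscale image, then load it
demo_path = os.path.join(tempfile.gettempdir(), 'demo_face.jpg')
Image.new('L', (250, 250)).save(demo_path)

features = np.stack([load_image_features(demo_path)])
print(features.shape)  # (1, 625)
```

Stacking these fixed-size arrays gives `features` a consistent shape, which is what `tf.data.Dataset.from_tensor_slices` and the `Dense` layer's `input_shape` require.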
