I have a training set of 89 images of 6 different domino tiles plus one "control" group of a baby, divided over 7 groups in total. The output y thus has 7 classes. Each image is 100x100 and black and white, so each row of X has 10,000 features.
I am using the one-hidden-layer neural network code from Andrew Ng's Coursera course, written in Octave. It has been slightly modified.
I first tried this with 3 different groups (two domino tiles, one baby) and it reached near 100% accuracy. I have now increased it to 7 different image groups. The accuracy has dropped WAY down, and it gets hardly anything right except the baby photos (which differ highly from the domino tiles).
I have tried 10 different lambda values, 10 different hidden-layer sizes between 5 and 20, and different numbers of iterations, plotting each against cost and accuracy in order to find the best fit.
I also tried feature normalization (commented out in the code below) but it didn't help.
This is the code I am using:
% Initialization
clear ; close all; clc; more off;
pkg load image;
fprintf('Running Domino Identifier ... \n');
%iteration_vector = [100, 300, 1000, 3000, 10000, 30000];
%accuracies = [];
%costs = [];
%for iterations_i = 1:length(iteration_vector)
# INPUTS
input_layer_size = 10000; % 100x100 input images (10,000 pixels each)
hidden_layer_size = 50; % Hidden units
num_labels = 7; % Number of different outputs
iterations = 100000; % Number of iterations during training
lambda = 0.13;
%hidden_layer_size = hidden_layers(hidden_layers_i);
%lambda = lambdas(lambda_i)
%iterations = iteration_vector(iterations_i);
[X,y] = loadTrainingData(num_labels);
%[X_norm, mu, sigma] = featureNormalize(X);
%X = X_norm;
initial_Theta1 = randInitializeWeights(input_layer_size, hidden_layer_size);
initial_Theta2 = randInitializeWeights(hidden_layer_size, num_labels);
initial_nn_params = [initial_Theta1(:) ; initial_Theta2(:)];
[J grad] = nnCostFunction(initial_nn_params, input_layer_size, hidden_layer_size, num_labels, X, y, lambda);
fprintf('\nTraining Neural Network... \n')
% After you have completed the assignment, change the MaxIter to a larger
% value to see how more training helps.
options = optimset('MaxIter', iterations);
% Create "short hand" for the cost function to be minimized
costFunction = @(p) nnCostFunction(p, input_layer_size, hidden_layer_size, num_labels, X, y, lambda);
% Now, costFunction is a function that takes in only one argument (the
% neural network parameters)
[nn_params, cost] = fmincg(costFunction, initial_nn_params, options);
% Obtain Theta1 and Theta2 back from nn_params
Theta1 = reshape(nn_params(1:hidden_layer_size * (input_layer_size + 1)), ...
hidden_layer_size, (input_layer_size + 1));
Theta2 = reshape(nn_params((1 + (hidden_layer_size * (input_layer_size + 1))):end), ...
num_labels, (hidden_layer_size + 1));
displayData(Theta1(:, 2:end));
[predictionData, images] = loadTrainingData(num_labels);
[h2_training, pred_training] = predict(Theta1, Theta2, predictionData);
fprintf('\nTraining Accuracy: %f\n', mean(double(pred_training' == y)) * 100);
%if length(accuracies) > 0
% accuracies = [accuracies; mean(double(pred_training' == y))];
%else
% accuracies = [mean(double(pred_training' == y))];
%end
%last_cost = cost(length(cost));
%if length(costs) > 0
% costs = [costs; last_cost];
%else
% costs = [last_cost];
%end
%endfor
% Testing samples
fprintf('Loading prediction images');
[predictionData, images] = loadPredictionData();
[h2, pred] = predict(Theta1, Theta2, predictionData)
for i = 1:length(pred)
figure;
displayData(predictionData(i, :));
title (strcat(translateIndexToTile(pred(i)), " Certainty:", num2str(max(h2(i, :))*100)));
pause;
endfor
%y = provideAnswers(im_vector);
My questions are now:
Are my numbers "off" in terms of a great difference between X and the rest?
What should I do to improve this Neural Network?
If I do feature normalization, do I need to multiply the numbers back to the 0-255 range again somewhere?
What should I do to improve this Neural Network?
Use a Convolutional Neural Network (CNN) with multiple layers (e.g., 5 layers). For vision problems, CNNs typically outperform MLPs by wide margins. Here, you are using an MLP with a single hidden layer, and it is plausible that this network will not perform well on an image problem with 7 classes. Another concern is the amount of training data: generally, we want at least hundreds of samples per class, whereas 89 images across 7 classes is only about 13 per class.
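To see how thin the data is per class, it may help to count the samples behind each label. The snippet below is only a sketch: the label vector y is a made-up stand-in for whatever loadTrainingData returns, and it assumes labels are integers from 1 to num_labels.

```octave
% Sketch: count training samples per class.
% y is a hypothetical toy label vector; in the question it would come
% from loadTrainingData (assumed to return integer labels 1..num_labels).
num_labels = 7;
y = [1 1 2 3 3 3 4 5 6 7 7]';

% counts(k) = number of samples carrying label k
counts = accumarray(y, 1, [num_labels 1]);

for k = 1:num_labels
  fprintf('class %d: %d samples\n', k, counts(k));
end

% With the 89 images from the question, the average per class is small
fprintf('average per class for 89 images: %.1f\n', 89 / num_labels);
```

If some classes turn out to have far fewer images than others, that imbalance is worth fixing before tuning lambda or the hidden layer size.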
If I do feature normalization, do I need to multiply the numbers back to the 0-255 range again somewhere?
Generally, not for classification. Normalization can be viewed as a preprocessing step. However, if you are working on a problem like image reconstruction, then you would need to convert back to the original domain at the end.
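As a rough illustration of normalization as a one-way preprocessing step, here is a minimal sketch using toy matrices (X_train and X_new are invented for the example). It assumes z-score normalization per pixel. The question does not show whether the course's featureNormalize guards against pixels with zero variance, so that guard is added here as a precaution: with image data, a pixel that is constant across the whole training set would otherwise cause a division by zero.

```octave
% Sketch: z-score normalization fitted on the training set only.
% X_train and X_new are toy stand-ins for the real image matrices
% (rows = images, columns = pixels).
X_train = [0 128 255; 0 64 255; 0 192 255];
X_new   = [0 128 255];          % an unseen image to classify

mu    = mean(X_train, 1);       % per-pixel mean over the training images
sigma = std(X_train, 0, 1);     % per-pixel standard deviation
sigma(sigma == 0) = 1;          % constant pixels: avoid division by zero

% Both lines rely on Octave's automatic broadcasting
X_train_norm = (X_train - mu) ./ sigma;
X_new_norm   = (X_new   - mu) ./ sigma;   % reuse the training mu and sigma

disp(X_train_norm);
disp(X_new_norm);
```

The network is trained and queried on the normalized values only, so nothing is ever scaled back to 0-255 for classification. The one thing to keep is mu and sigma, since the prediction images need the same transformation as the training images.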