
Bounding box prediction with a CNN for multi-class image classification in Python

I have a training set and a test set containing 4 types of specific objects. I also have the bounding box (area of interest) coordinates (x, y, w, h) in CSV format. The main aim of the project is to predict the class of each test image, draw a bounding box around the area of interest, and print the class name on the image.

I have built a CNN model with the Keras library that classifies the images in the test set. What should I change in order to also predict the bounding box coordinates of a given test image?

        from keras.models import Sequential
        from keras.layers import Convolution2D
        from keras.layers import MaxPooling2D
        from keras.layers import Flatten
        from keras.layers import Dense

        #CNN initializing
        classifier= Sequential()

        #convolutional layer
        classifier.add(Convolution2D(filters = 32, kernel_size=(3,3), data_format= "channels_last", input_shape=(64, 64, 3), activation="relu"))

        #Pooling
        classifier.add(MaxPooling2D(pool_size=(2,2)))

        #addition of second convolutional layer
        classifier.add(Convolution2D(filters = 32, kernel_size=(3,3), data_format= "channels_last", activation="relu"))
        classifier.add(MaxPooling2D(pool_size=(2,2)))

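        #step 3 - Flattening
        classifier.add(Flatten())

        #step 4 - Full connection layer
        #(input size is inferred from the Flatten layer, so no input_dim is needed)
        classifier.add(Dense(128, activation = 'relu'))
        #output layer: softmax matches categorical_crossentropy for 4 mutually exclusive classes
        classifier.add(Dense(units = 4, activation = 'softmax'))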

        #compiling the CNN
        classifier.compile(optimizer='adam',loss="categorical_crossentropy",metrics =["accuracy"])

        #part 2 -Fitting the CNN to the images


        from keras.preprocessing.image import ImageDataGenerator

        train_datagen = ImageDataGenerator(rescale = 1./255,
                                           shear_range = 0.2,
                                           zoom_range = 0.2,
                                           horizontal_flip = True)

        test_datagen = ImageDataGenerator(rescale = 1./255)

        training_set = train_datagen.flow_from_directory('dataset/Train',
                                                         target_size = (64, 64),
                                                         batch_size = 32,
                                                         class_mode = 'categorical')

        test_set = test_datagen.flow_from_directory('dataset/Test',
                                                    target_size = (64, 64),
                                                    batch_size = 32,
                                                    class_mode = 'categorical')

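        #steps must be whole numbers; len() of a Keras iterator is its number of batches
        #(newer Keras versions deprecate fit_generator in favour of fit)
        classifier.fit_generator(training_set,
                                 steps_per_epoch = len(training_set),
                                 epochs = 25,
                                 validation_data = test_set,
                                 validation_steps = len(test_set))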
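Because the generators above resize every image to 64x64, pixel coordinates from the CSV no longer match the resized images. One common way around this is to store each box as fractions of the original image size. The sketch below assumes that (x, y) is the top-left corner in pixels, that w and h are the box width and height in pixels, and that the original image size is known; the helper names `normalize_box` and `denormalize_box` are hypothetical, not part of Keras.

```python
def normalize_box(box, image_size):
    """Convert a pixel-space (x, y, w, h) box to fractions in [0, 1].

    Assumes (x, y) is the top-left corner and image_size is (width, height).
    """
    x, y, w, h = box
    img_w, img_h = image_size
    return (x / img_w, y / img_h, w / img_w, h / img_h)


def denormalize_box(box, image_size):
    """Convert a normalized (x, y, w, h) box back to pixels for a given image size."""
    x, y, w, h = box
    img_w, img_h = image_size
    return (x * img_w, y * img_h, w * img_w, h * img_h)


# A box on a 400x200 image, expressed as fractions of the image size.
normalized = normalize_box((50, 100, 200, 100), (400, 200))

# The same box drawn on the 64x64 resized image.
resized = denormalize_box(normalized, (64, 64))
```

Normalized boxes stay valid for any resize, so the same targets can be reused if `target_size` changes. Note that augmentations such as `horizontal_flip`, `shear_range`, and `zoom_range` move the object in the image, so the box targets would need the same transformation or the augmentation would have to be disabled.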

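The task you describe is object detection, which usually requires a more complex model than a plain classification CNN, because the network has to output box coordinates in addition to class scores. See https://github.com/fizyr/keras-retinanet for a Keras implementation of RetinaNet, one of the well-known detection architectures.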
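If every image contains exactly one object of interest (an assumption, since the question does not say so), a lighter alternative to a full detector is to give the existing CNN two outputs: one for the class and one for the four box values. The following is a minimal sketch of that idea using the Keras functional API, not a replacement for RetinaNet. The function name `build_localizer`, the layer names, and the choice of mean squared error for the box loss are illustrative assumptions.

```python
import numpy as np
from keras.models import Model
from keras.layers import Input, Conv2D, MaxPooling2D, Flatten, Dense


def build_localizer(num_classes=4, input_shape=(64, 64, 3)):
    """Two-headed CNN: class probabilities plus a normalized (x, y, w, h) box."""
    inputs = Input(shape=input_shape)

    # Shared convolutional trunk, mirroring the layers in the question.
    x = Conv2D(32, (3, 3), activation="relu")(inputs)
    x = MaxPooling2D(pool_size=(2, 2))(x)
    x = Conv2D(32, (3, 3), activation="relu")(x)
    x = MaxPooling2D(pool_size=(2, 2))(x)
    x = Flatten()(x)
    x = Dense(128, activation="relu")(x)

    # Head 1: one probability per class.
    class_out = Dense(num_classes, activation="softmax", name="class_output")(x)
    # Head 2: four box values; sigmoid assumes targets are scaled to [0, 1].
    box_out = Dense(4, activation="sigmoid", name="box_output")(x)

    model = Model(inputs=inputs, outputs=[class_out, box_out])
    model.compile(
        optimizer="adam",
        loss={"class_output": "categorical_crossentropy", "box_output": "mse"},
        metrics={"class_output": ["accuracy"], "box_output": ["mae"]},
    )
    return model


model = build_localizer()

# Predicting on a dummy batch returns one array per head.
dummy_batch = np.zeros((2, 64, 64, 3), dtype="float32")
class_probs, boxes = model.predict(dummy_batch, verbose=0)
```

Training such a model needs both targets for every image, for example `model.fit(images, {"class_output": one_hot_labels, "box_output": normalized_boxes})`, where the array names are placeholders. `flow_from_directory` only yields class labels, so the box targets from the CSV would have to be supplied through a custom generator or in-memory arrays. The predicted box can then be scaled back to pixels and drawn on the image together with the predicted class name.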

