
Convolutional neural network?

I am working on a project for "Mood Detection".

As the first step toward making a complete product, we have started with image processing. I have learned from the Internet that a convolutional neural network (CNN) is the best approach.

import cv2
import numpy as np

def sum_cnn(image, x, y):
    x1, y1 = np.shape(image)
    temp = image.copy()  # draw on a copy so the input stays untouched

    for i in range(0, x1 - x):
        for j in range(0, y1 - y):
            # Sum over the x-by-y window anchored at row i, column j
            window_sum = np.sum(image[i:i + x, j:j + y])
            if 850000 <= window_sum < 1100000:
                # cv2.rectangle expects (x, y) points, i.e. (column, row)
                cv2.rectangle(temp, (j, i), (j + 20, i + 20), (0, 255, 0), 2)

    return temp


image = cv2.imread('test.jpg')
image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
image = np.asarray(image)

temp = sum_cnn(image, 95, 95)

cv2.imshow('Processed Image', temp)
cv2.waitKey(0)  # imshow displays nothing without a waitKey call
cv2.imwrite('1.jpg', temp)

print("Done")

Above is the code I have written. The range (850000, 1100000) is the range of sums of a convolved window of the matrix (sorry the code is very crude, I started writing it this morning).
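For comparison with the window-sum code above, here is a minimal sketch of what a real convolution does: instead of summing the raw window, it takes a weighted sum against a kernel of weights. The kernel here is a hand-picked Sobel-like edge detector, purely for illustration; in a trained CNN these weights would be learned.

```python
import numpy as np

def convolve2d(image, kernel):
    # Valid-mode 2D cross-correlation: slide the kernel over the
    # image and take the weighted sum at each position.
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Hand-picked vertical-edge kernel (a trained CNN learns these weights)
kernel = np.array([[-1, 0, 1],
                   [-2, 0, 2],
                   [-1, 0, 1]], dtype=float)

image = np.zeros((5, 5))
image[:, 2:] = 1.0          # left half dark, right half bright
response = convolve2d(image, kernel)
print(response)             # large values where the dark-to-bright edge is
```

The response map peaks exactly where the feature the kernel encodes (here, a vertical edge) occurs, which is what "extracting features" means in practice.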

Well, the question I am asking is how to design or obtain the matrix that is going to store the weights.

1) Can I make do with just one matrix for every kind of object detection (i.e. a crude single-layered CNN that gives a different value for a "Car" and a "Face"), or should I have a separate matrix for a "Car" and for a "Face"?

2) How do I deal with different sizes of the same object? One solution I saw on the Internet is to keep resizing the original image. Is there a faster approach?
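The repeated-resizing idea mentioned in question 2 is usually called an image pyramid: a fixed-size detector is run on successively halved copies of the image, so it effectively sees objects at 1x, 2x, 4x... scale. A minimal numpy sketch (the `min_size=16` cutoff is an arbitrary assumption):

```python
import numpy as np

def downsample2x(image):
    # Halve each dimension by averaging non-overlapping 2x2 blocks.
    h, w = image.shape
    h, w = h - h % 2, w - w % 2  # crop to even size
    return image[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def pyramid(image, min_size=16):
    # Build successively halved copies until the image gets too small.
    levels = [image]
    while min(levels[-1].shape) // 2 >= min_size:
        levels.append(downsample2x(levels[-1]))
    return levels

levels = pyramid(np.zeros((128, 128)))
print([l.shape for l in levels])  # → [(128, 128), (64, 64), (32, 32), (16, 16)]
```

Each level costs a quarter of the previous one, so the whole pyramid is only about a third more work than the original image.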

3) In back-propagation, what conditions do we impose to obtain an optimized weight matrix?
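For context on question 3: "optimizing the weights" usually means minimizing a loss function by stepping each weight against its gradient, stopping after a fixed number of steps or when the loss stops improving. A toy one-weight sketch (the data and learning rate are made up for illustration, not part of the mood-detection task):

```python
import numpy as np

# Fit a single "weight" w so that w * x approximates y by gradient
# descent on the mean squared error.
x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])   # true weight is 2
w = 0.0
lr = 0.01                        # learning rate (hand-picked)
for step in range(1000):
    pred = w * x
    grad = 2 * np.sum((pred - y) * x) / len(x)  # d(MSE)/dw
    w -= lr * grad
print(round(w, 3))  # → 2.0
```

Back-propagation is the same idea applied layer by layer: the chain rule supplies the gradient of the loss with respect to every weight in every filter map.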

First of all, if you seriously plan on using neural networks I would advise you to start with the basics. That page is really good for getting started with neural networks, I would say. Once you have a basic understanding of neural networks, you could try a framework such as Theano to build a CNN.

To be honest I'm not quite sure what your code is about, but I'll try to answer your questions as best as I can.

  1. I guess you are talking about the weight matrices here. What a CNN does in its convolutional layers is "extract features". That's what everybody calls it, but to be honest it's not an easily quantifiable thing. So how many filter maps (weight matrices are called filter maps in CNNs) you need depends on your use case. Therefore you will probably have to stick to trial and error with test and validation sets to tune the number of filter maps, or rather the hyperparameters in general.
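To make the filter-map idea concrete: a convolutional layer applies a whole bank of small weight matrices to the same input, producing one response map per filter. A minimal numpy sketch with two hand-picked edge kernels standing in for learned weights:

```python
import numpy as np

def conv_layer(image, filters):
    # Apply a bank of filter maps to one image; returns an array of
    # shape (n_filters, out_h, out_w), one response map per filter.
    n, kh, kw = filters.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((n, oh, ow))
    for f in range(n):
        for i in range(oh):
            for j in range(ow):
                out[f, i, j] = np.sum(image[i:i + kh, j:j + kw] * filters[f])
    return np.maximum(out, 0)  # ReLU non-linearity

# Two hand-picked filter maps; a trained CNN would learn these weights.
filters = np.array([
    [[-1, -1, -1], [0, 0, 0], [1, 1, 1]],   # horizontal edges
    [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]],   # vertical edges
], dtype=float)

maps = conv_layer(np.random.default_rng(1).random((8, 8)), filters)
print(maps.shape)  # → (2, 6, 6), one 6x6 response map per filter
```

The number of filters (here 2) is exactly the hyperparameter you would tune by trial and error on a validation set.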

  2. Here is a cool paper by Krizhevsky, Sutskever and Hinton where they do some crazy stuff with GPUs, and they do fix their images to a certain size. If you find a good way to get around this restriction, please tell me.

  3. What you usually want to do is prevent overfitting. There are multiple methods researchers have come up with to make this happen: dropout, keeping the weights small, or pre-training, just to name a few.
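Dropout is simple enough to sketch in a few lines: during training, each activation is zeroed with some probability, and the survivors are rescaled so the expected output is unchanged (so-called "inverted" dropout; the 0.5 drop rate below is just a common default):

```python
import numpy as np

def dropout(activations, p_drop, rng):
    # Inverted dropout: zero each unit with probability p_drop during
    # training, rescale survivors so the expected activation is
    # unchanged. At test time the layer is simply skipped.
    mask = rng.random(activations.shape) >= p_drop
    return activations * mask / (1.0 - p_drop)

rng = np.random.default_rng(0)
acts = np.ones(10000)
dropped = dropout(acts, 0.5, rng)
print(dropped.mean())  # close to 1.0: the expectation is preserved
```

Because a different random subset of units is active on every training step, no single unit can be relied on exclusively, which is what curbs overfitting.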

Again, I would advise you to start with the given sources. It'll make things way easier to understand than listening to me ramble on about technical terms that I may not understand thoroughly myself.

Kind regards,

PS: Feel free to correct me
