
How to extract feature vectors of the same size from a dataset using SIFT?

I have started studying machine learning and computer vision, and at the moment I have some doubts.

I have a dataset of 1000 images of different sizes and I want to create a feature matrix using SIFT and OpenCV (I'm working with Python). The problem is that SIFT extracts a different number of keypoints for every image, so I obtain feature vectors of different sizes (I wrote this simple code to check it):

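import cv2

# On OpenCV 4.4 and later, SIFT is also available as cv2.SIFT_create()
sift = cv2.xfeatures2d.SIFT_create()
for file in listing:  # listing and iDir are defined elsewhere (file names and their directory)
    img = cv2.imread(iDir + file)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    kp = sift.detect(gray, None)         # find the keypoints
    kp2, des = sift.compute(gray, kp)    # one descriptor per keypoint
    print(len(kp2))                      # differs from image to image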

Now my question is: do I have to normalize the number of features after extracting them with SIFT (and if so, how can I choose the best number of features?), or do I have to use particular parameters? Thank you for your help.

The number of keypoints (or SIFT points) will change per image. The idea of SIFT is to find keypoints in an image that are identifiable under many conditions. In fact, these keypoints should be identifiable despite scaling, rotation, and lighting differences. Naturally, some images will have more of these keypoints and some will have fewer.

Think of it like any other feature. Say your features were red circles. Some images would have lots, and some would have few or none at all. It doesn't mean anything is wrong; it just means the images have different properties.

When you wish to find similarities between images, you do a pairwise comparison of the keypoint descriptors. Each keypoint is assigned a descriptor, which is a 128-element vector. If two keypoints have matching descriptors (within some tolerance), they are said to match.
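As a rough illustration of "matching with some tolerance", the sketch below compares every descriptor of one image with every descriptor of another and keeps the pairs whose Euclidean distance is within a threshold. The function name `match_descriptors`, the `max_dist` value, and the toy 4-value vectors are hypothetical and chosen only for illustration. In practice OpenCV's `cv2.BFMatcher` does this kind of comparison on the real descriptor arrays.

```python
import math

def match_descriptors(des_a, des_b, max_dist):
    """Return (i, j, distance) for every descriptor pair within max_dist.

    des_a and des_b are lists of equal-length vectors (128 values each for
    SIFT). The two lists may hold different numbers of descriptors.
    """
    matches = []
    for i, da in enumerate(des_a):
        for j, db in enumerate(des_b):
            d = math.dist(da, db)  # Euclidean distance between the two vectors
            if d <= max_dist:
                matches.append((i, j, d))
    return matches

# Toy data: 4-value vectors stand in for 128-value SIFT descriptors.
image_a = [[0, 0, 0, 0], [10, 10, 10, 10]]                      # 2 keypoints
image_b = [[0, 1, 0, 0], [50, 50, 50, 50], [10, 10, 9, 10]]     # 3 keypoints
matches = match_descriptors(image_a, image_b, max_dist=2.0)
print(matches)
```

Note that the two images contribute different numbers of descriptors and the comparison still works, because only the length of each individual vector has to agree.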

Let's go back to our red circle example. Our descriptor of a circle could be its radius. If we want to compare two images, we have to look at the descriptors (the radii), not the number of features found. If image A has 2 circles and image B has 3, we compare them pairwise to find matches.

A1 radius 1; B1 radius 3, not the same

A1 radius 1; B2 radius 2, not the same

A1 radius 1; B3 radius 1.5, possibly the same

A2 radius 2; B1 radius 3, not the same

A2 radius 2; B2 radius 2, THE SAME

A2 radius 2; B3 radius 1.5, possibly the same

As you can see, it is fine that the number of features differs between images. The important part is that the descriptors are of the same kind, so they can be compared directly (and for SIFT this is true).
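The circle comparison above can be written out as a small Python sketch. The function name `compare_radii` and the tolerance of 0.5 are assumptions made for illustration; the tolerance is simply the value that reproduces the verdicts listed above.

```python
def compare_radii(r_a, r_b, tolerance=0.5):
    """Classify two circle descriptors (radii) as in the worked example."""
    diff = abs(r_a - r_b)
    if diff == 0:
        return "THE SAME"
    if diff <= tolerance:  # tolerance is an assumed value, not a standard one
        return "possibly the same"
    return "not the same"

image_a = {"A1": 1, "A2": 2}               # image A has 2 circles
image_b = {"B1": 3, "B2": 2, "B3": 1.5}    # image B has 3 circles

results = {}
for name_a, r_a in image_a.items():
    for name_b, r_b in image_b.items():
        verdict = compare_radii(r_a, r_b)
        results[(name_a, name_b)] = verdict
        print(f"{name_a} radius {r_a}; {name_b} radius {r_b}, {verdict}")
```

The loop runs over 2 x 3 = 6 pairs, so the differing feature counts never get in the way of the comparison.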

You do not have to alter the number of keypoints, except possibly to decrease storage and reduce search time. If you want to reduce the number of keypoints per image, one possibility is to threshold on keypoint size. OpenCV gives access to the size and angle of each keypoint, so you could keep only the "large" keypoints and thus reduce the total number of keypoints per image.
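A minimal sketch of such a size threshold follows. It only relies on each keypoint having a `size` attribute, which `cv2.KeyPoint` objects provide. The helper name `keep_large_keypoints`, the `min_size` value, and the stand-in keypoint objects are hypothetical, so that the example runs without OpenCV installed.

```python
from types import SimpleNamespace

def keep_large_keypoints(keypoints, descriptors, min_size):
    """Keep keypoints whose .size is at least min_size, with their descriptors."""
    keep = [i for i, kp in enumerate(keypoints) if kp.size >= min_size]
    kept_kp = [keypoints[i] for i in keep]
    kept_des = [descriptors[i] for i in keep]  # row i belongs to keypoint i
    return kept_kp, kept_des

# Stand-ins for cv2.KeyPoint objects and their descriptor rows
demo_kp = [SimpleNamespace(size=2.1), SimpleNamespace(size=9.4), SimpleNamespace(size=5.0)]
demo_des = [[1, 1], [2, 2], [3, 3]]
large_kp, large_des = keep_large_keypoints(demo_kp, demo_des, min_size=5.0)
print(len(large_kp), large_des)
```

With real data, the same call could be applied to `kp2` and `des` from the loop in the question. A suitable `min_size` depends on the images and has to be chosen by experiment.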
