camera = webcam; % Connect to the camera
nnet = alexnet; % Load the neural net
while true
    picture = camera.snapshot; % Take a picture
    picture = imresize(picture,[227,227]); % Resize the picture
    label = classify(nnet, picture); % Classify the picture
    image(picture); % Show the picture
    title(char(label)); % Show the label
    drawnow;
end
I found this MATLAB code on the internet. It displays a window with the picture from a webcam and very quickly names the things in the picture ("keyboard", "bottle", "pencil", "clock", ...). I want to do that in Python. So far I have this:
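import cv2

# Load the Haar cascade for frontal faces (the XML file must be in the working directory)
faceCascade = cv2.CascadeClassifier("haarcascade_frontalface_default.xml")
video_capture = cv2.VideoCapture(0)

while True:
    # Grab one frame; stop if the camera returned nothing
    ret, frame = video_capture.read()
    if not ret:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = faceCascade.detectMultiScale(
        gray,
        scaleFactor=1.1,
        minNeighbors=5,
        minSize=(30, 30),
        flags=cv2.CASCADE_SCALE_IMAGE  # OpenCV 3+; was cv2.cv.CV_HAAR_SCALE_IMAGE in OpenCV 2.x
    )
    # Draw a rectangle around the faces
    for (x, y, w, h) in faces:
        cv2.rectangle(frame, (x, y), (x+w, y+h), (0, 255, 0), 2)
    cv2.imshow('Video', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Release the camera and close the window
video_capture.release()
cv2.destroyAllWindows()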
This is already very similar, but it only detects faces. The MATLAB code uses alexnet. I guess this is a pre-trained network based on ImageNet data ( http://www.image-net.org/ ). But it is no longer available. How would I do this in Python?
(There has been a similar question here, but it is 4 years old and I think there are newer techniques now.)
With the "tensorflow" package and the pre-trained network "vgg16", the solution is quite easy. See https://github.com/machrisaa/tensorflow-vgg/blob/master/test_vgg16.py
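As a rough illustration of that idea, here is a minimal sketch that mirrors the MATLAB loop: grab a frame, resize it, classify it, and show the label. It is not the code from the linked repository. It assumes the VGG16 model bundled with tf.keras (`tensorflow.keras.applications.vgg16`) with its ImageNet weights, plus the opencv-python and numpy packages. The names `format_caption` and `run_webcam_classifier` are hypothetical helpers made up for this example.

```python
def format_caption(decoded):
    """Turn one image's decoded predictions into a short caption.

    `decoded` is a list of (class_id, class_name, score) tuples, the shape
    that Keras' decode_predictions returns for a single image.
    """
    _, name, score = decoded[0]
    return "{} ({:.0%})".format(name.replace("_", " "), float(score))


def run_webcam_classifier(camera_index=0):
    """Show the webcam feed with the top ImageNet label drawn on each frame."""
    # Third-party imports are kept local so the helper above works without them.
    # Assumption: opencv-python, numpy and tensorflow are installed.
    import cv2
    import numpy as np
    from tensorflow.keras.applications.vgg16 import (
        VGG16, preprocess_input, decode_predictions)

    model = VGG16(weights="imagenet")  # downloads the weights on first use
    capture = cv2.VideoCapture(camera_index)
    try:
        while True:
            ok, frame = capture.read()
            if not ok:
                break
            # VGG16 expects 224x224 RGB input; OpenCV delivers BGR frames
            rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
            rgb = cv2.resize(rgb, (224, 224)).astype(np.float32)
            batch = preprocess_input(np.expand_dims(rgb, axis=0))
            predictions = model.predict(batch, verbose=0)
            caption = format_caption(decode_predictions(predictions, top=1)[0])
            # Draw the label on the original frame and display it
            cv2.putText(frame, caption, (10, 30), cv2.FONT_HERSHEY_SIMPLEX,
                        1.0, (0, 255, 0), 2)
            cv2.imshow("Video", frame)
            if cv2.waitKey(1) & 0xFF == ord("q"):
                break
    finally:
        capture.release()
        cv2.destroyAllWindows()
```

Calling `run_webcam_classifier()` starts the loop; pressing `q` in the video window stops it. VGG16 is a large network, so on a CPU the labels will probably update noticeably more slowly than the camera's frame rate.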