Classify New Objects in Images Live

I have a webcam, a microphone and a Python GUI. The user shows the camera an object and asks by voice command, "What is this object?". The webcam captures the current frame and pushes it to a Flask endpoint. The Flask app hosts a VGG16 model, which responds with an object class based on the image from the camera.

What I want to do now is trigger a learning cycle whenever the object is not recognised. In this cycle I tell the model what the object is (voice to text), and that becomes the label for the object. I have that part working.

What is not working is the next step: after the learning is complete, when I show the object to the camera again, the model should be able to tell me what it is.

Could someone please advise me on the following:

  1. Is VGG16 (trained initially on 2 objects using transfer learning) the best model for this type of task? Currently, it classifies unseen objects as one of the two classes.

  2. How would you go about implementing this solution on the cloud (AWS, Azure etc.)?

Thank you.

  1. Is VGG16 (trained initially on 2 objects using transfer learning) the best model for this type of task? Currently, it classifies unseen objects as one of the two classes.

First, the reason it is classifying unseen objects as one of the two classes is simply that you only allow it to choose between those two classes. Even if you show it an unknown object, it has to fit that object to one of the classes and report its best prediction.

One option is to train it on 3 distinct classes {object1, object2, unknownObject}, so that it can predict unknown objects explicitly. This may be a bit problematic, and it requires retraining the model.

The other option is to set a threshold. When you give the model an unseen object, it might return prediction confidences such as {75%, 25%} or {51%, 49%}. Unless the model is more than, say, 90-95% sure of its prediction, treat the object as unknown. I am in no way saying this should be your threshold value: it could be .9 or .95 as mentioned above, or it could even be .75. It is a hyperparameter, and you should employ some method to figure out a suitable value.
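The thresholding idea can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: `probs` stands for the per-class softmax scores the model returns for one image, and the names `classify_with_threshold`, `pick_threshold` and `UNKNOWN` are hypothetical helpers, not part of Keras or Flask. The second function shows one simple way to tune the threshold, assuming you have held-out top scores for known objects and for objects outside the training classes.

```python
UNKNOWN = "unknown"  # hypothetical label returned when the model is not confident


def classify_with_threshold(probs, labels, threshold=0.9):
    """Return the top label, or UNKNOWN if its score is below the threshold.

    probs: per-class scores for one image (assumed to be softmax output).
    labels: class names in the same order as probs.
    """
    best = max(range(len(probs)), key=lambda i: probs[i])
    if probs[best] < threshold:
        return UNKNOWN
    return labels[best]


def pick_threshold(known_conf, unknown_conf, candidates):
    """Pick the candidate threshold with the best accuracy on held-out scores.

    known_conf: top scores for images of objects the model was trained on.
    unknown_conf: top scores for images of objects it was not trained on.
    """
    total = len(known_conf) + len(unknown_conf)
    best_t, best_acc = None, -1.0
    for t in candidates:
        accepted = sum(1 for c in known_conf if c >= t)    # known objects kept
        rejected = sum(1 for c in unknown_conf if c < t)   # unknown objects flagged
        acc = (accepted + rejected) / total
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t, best_acc


labels = ["object1", "object2"]
print(classify_with_threshold([0.75, 0.25], labels))  # below 0.9, treated as unknown
print(classify_with_threshold([0.97, 0.03], labels))  # confident, returns the class
```

An `UNKNOWN` result would be the natural point at which to trigger the learning cycle described in the question. Note that softmax scores can be high even for unfamiliar inputs, so the threshold should be checked against real unseen objects rather than assumed.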

  2. How would you go about implementing this solution on the cloud (AWS, Azure etc.)?

You already have a Flask server, so just deploy it on an AWS machine and make it directly accessible via a static public IP. You can then send POST/GET requests to that IP and run your model from any interface.
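From the client side, talking to the deployed server could look roughly like the sketch below, which uses only the Python standard library. Everything specific here is an assumption for illustration: `203.0.113.10` is a reserved documentation address standing in for your static IP, and the port `5000`, the `/predict` route and the helper names are hypothetical, so substitute whatever your Flask app actually exposes.

```python
import urllib.request

# Hypothetical values: replace with your machine's static IP, port and route.
SERVER_URL = "http://203.0.113.10:5000/predict"


def build_predict_request(image_bytes, url=SERVER_URL):
    """Build (but do not send) a POST request carrying one JPEG frame."""
    return urllib.request.Request(
        url,
        data=image_bytes,
        headers={"Content-Type": "image/jpeg"},
        method="POST",
    )


def send_frame(image_bytes, url=SERVER_URL, timeout=10):
    """Send the frame to the server and return the raw response body."""
    request = build_predict_request(image_bytes, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

For this to work, the Flask app has to listen on an externally reachable interface (for example `app.run(host="0.0.0.0")` during testing) and the machine's firewall or security group has to allow inbound traffic on the chosen port.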
