
Artificial neural network trained with images

Original post: 2014-02-15 05:09:11. Tags: image, artificial-intelligence, kinect, software-design

I am interested in creating software that detects an object, such as a pen, using Microsoft Kinect. I have collected 100 positive images and 200 negative images to feed to an artificial neural network. My question is: how can I convert these images into input for the ANN? I guess that the last layer has one neuron, because there is one output (pen or not a pen), and I guess that the input layer has one neuron too. I want to use 3 layers in total. But I don't know whether I should convert the positive and negative images into matrices, or what else I can do.

1 answer

First of all, welcome to Stack Overflow!

I've never personally used the Kinect for image recognition, but if it's possible, you should scale each image down to a reasonable size, such as 100x100, so that it stays manageable.

You should also try converting the images to grayscale. This reduces the computational cost and the development time, and it is much easier to start with than RGB.
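The two preprocessing steps above could look roughly like the sketch below. It uses plain nested lists so that it runs without any libraries; in practice an image library would do this for you. The function names, the luminance weights (the common ITU-R BT.601 ones), the nearest-neighbour resizing, and the tiny test image are all assumptions made for illustration, not something the Kinect SDK provides.

```python
# Sketch: convert an RGB image to grayscale, then downscale it.
# Assumption: an image is a list of rows, each row a list of (r, g, b)
# tuples with channel values in 0..255.


def to_grayscale(rgb_image):
    """Return a 2D list of luminance values (ITU-R BT.601 weights)."""
    return [
        [0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
        for row in rgb_image
    ]


def resize_nearest(gray_image, new_height, new_width):
    """Downscale a 2D list by nearest-neighbour sampling."""
    old_height = len(gray_image)
    old_width = len(gray_image[0])
    return [
        [
            gray_image[y * old_height // new_height][x * old_width // new_width]
            for x in range(new_width)
        ]
        for y in range(new_height)
    ]


# Hypothetical 4x4 test image: left half white, right half black.
image = [[(255, 255, 255)] * 2 + [(0, 0, 0)] * 2 for _ in range(4)]
small = resize_nearest(to_grayscale(image), 2, 2)
```

For the real data you would call `resize_nearest(to_grayscale(frame), 100, 100)` on each of the 300 images, where `frame` stands for whatever RGB array your capture code produces.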

The input layer will not have 1 neuron, that's a given. For a 100x100 image, the total number of inputs should be 10000, one for each pixel. Remember, you're trying to break the data up as finely as you can so that the ANN can detect patterns in it.
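As a rough illustration of "one input per pixel", a 100x100 grayscale image can be flattened row by row into a list of 10000 numbers. The function name and the uniform gray test image are made up for this sketch. Scaling the values from 0..255 down to 0.0..1.0 is an extra step that the answer does not mention, but it is common practice for neural network inputs.

```python
# Sketch: turn a 100x100 grayscale image into a 10000-element input vector.


def image_to_input_vector(gray_image):
    """Flatten a 2D image row by row and scale 0..255 to 0.0..1.0."""
    return [pixel / 255.0 for row in gray_image for pixel in row]


# Hypothetical stand-in for one preprocessed 100x100 grayscale image.
gray = [[128] * 100 for _ in range(100)]
inputs = image_to_input_vector(gray)

print(len(inputs))  # number of input neurons the network needs
```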

The output layer should actually have 2 neurons, and for a good reason. Remember, each output neuron measures the likelihood that the input belongs to its respective class. With 2 neurons, one can represent the positive class (yes, this is a pen) and the other the negative class (no, this is not a pen). You then get a probability for each class, and you can choose the class with the highest value as your answer.
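Picking the class with the highest value can be sketched as follows. The answer does not say how the two outputs become probabilities; softmax is one common choice and is used here as an assumption. The class ordering, the one-hot training targets, and the example scores are likewise invented for illustration.

```python
import math

# Assumed ordering of the two output neurons.
CLASS_NAMES = ["pen", "not pen"]

# One-hot training targets that match that ordering.
TARGET_PEN = [1.0, 0.0]
TARGET_NOT_PEN = [0.0, 1.0]


def softmax(scores):
    """Convert raw output scores into probabilities that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]  # shift for stability
    total = sum(exps)
    return [e / total for e in exps]


def predict(scores):
    """Return the most likely class name and the class probabilities."""
    probabilities = softmax(scores)
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return CLASS_NAMES[best], probabilities


# Hypothetical raw outputs of the two neurons for one image.
label, probs = predict([2.0, 0.5])
print(label, probs)
```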

Three layers in total should be sufficient; you will probably never need more than that. There are some very good articles that can help you decide how many layers to use, such as this one. I hope this helps! Let me know if you have any further questions.
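To make the three-layer shape concrete (10000 inputs, one hidden layer, 2 outputs), here is a minimal forward-pass sketch in plain Python. The hidden layer size of 16, the sigmoid activation, the weight range, and all names are arbitrary assumptions. The weights are random, so the outputs mean nothing until the network has been trained, for example with backpropagation on the 300 labelled images.

```python
import math
import random


def make_layer(n_inputs, n_neurons, rng):
    """Create a fully connected layer as (weights, biases)."""
    weights = [
        [rng.uniform(-0.05, 0.05) for _ in range(n_inputs)]
        for _ in range(n_neurons)
    ]
    biases = [0.0] * n_neurons
    return weights, biases


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(layer, inputs):
    """Compute the sigmoid activations of one layer for the given inputs."""
    weights, biases = layer
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)) + b)
        for neuron_weights, b in zip(weights, biases)
    ]


rng = random.Random(0)
N_INPUTS, N_HIDDEN, N_OUTPUTS = 10000, 16, 2  # N_HIDDEN is an assumption

hidden_layer = make_layer(N_INPUTS, N_HIDDEN, rng)
output_layer = make_layer(N_HIDDEN, N_OUTPUTS, rng)

inputs = [0.5] * N_INPUTS  # hypothetical flattened, scaled image
outputs = forward(output_layer, forward(hidden_layer, inputs))
print(outputs)  # two values, one per class
```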