
Can a CNN recognize the difference in size if the images are the same?

Original post: 2019-10-21 19:27:59. Tags: image, size, conv-neural-network, convolution

Could a CNN tell the difference between different size ranges of the same organism? For example, a puppy vs. an adult dog, or a child vs. an adult? Or, more to the point, a large fly vs. a small fly, where the two look identical but one is simply larger than the other?

2 Answers

This is a tricky question to answer, but in principle a CNN is able to do it. It depends mainly on the training data itself. In the case of a child vs. an adult, you can gather a dataset that includes a lot of variation in size and age, so that you end up with a CNN model that is able to find patterns and generalize. In the end, the CNN will learn many other features that make the classification scale- or size-invariant (independent of size), such as shapes, colors, clothes, facial features, etc. Such intra-class classification problems are not easily tackled with traditional supervised learning, and therefore some researchers are applying an approach called "Deep Metric Learning".

Metric learning is the task of learning a distance function over objects. A metric or distance function has to obey four axioms: non-negativity, identity of indiscernibles, symmetry and subadditivity (or the triangle inequality). In practice, metric learning algorithms ignore the condition of identity of indiscernibles and learn a pseudo-metric. (Wiki Definition)
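To make the idea of a learned distance a bit more concrete, here is a minimal sketch of a triplet loss, which is one common training objective in deep metric learning (the answer above does not say which objective those researchers use). The embeddings are made-up 2-D vectors standing in for the output of a CNN, and the names `small_fly_a`, `small_fly_b` and `large_fly` are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss.

    It is zero once the negative is at least `margin` farther from the
    anchor than the positive is, and grows otherwise.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Hypothetical embeddings (assumed values, not produced by a real network).
small_fly_a = [0.0, 0.0]
small_fly_b = [0.0, 1.0]
large_fly = [3.0, 4.0]

# Same-class pair is close and the other class is far: no penalty.
well_separated = triplet_loss(small_fly_a, small_fly_b, large_fly)

# Roles swapped, so the "positive" is far and the "negative" is close: penalized.
poorly_separated = triplet_loss(small_fly_a, large_fly, small_fly_b)

print(well_separated, poorly_separated)
```

Training a network with a loss of this kind pushes images of the same class together in embedding space and images of different classes apart, which is why it is often tried when the classes look very similar.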

It would be better to separate the two properties that you mention in the question. First of all, recognizing age and recognizing size are different tasks.

About age: yes, it is doable. For a deep learning-based approach, you will need appropriate data. For a non-training-based approach (old-school image processing), you would need to define some age-related metrics for each kind of object (for humans, counting wrinkles, white hairs, etc.).

About size: unfortunately, this is still under research, and it is not clear whether it can be done reliably. Whenever we talk about recognizing object size from a single image, there are more things to consider. The first is perspective. If the object appears large in image coordinates, is it a tiny object that is close to the camera and therefore looks large, or is it a really huge object that is far away from the camera? Such a problem may be overcome by knowing the object geometry in advance and developing an algorithm based on that geometry along with deep learning. However, current deep learning technology is not yet accurate enough to determine the dimensions and location, and hence the object geometry, precisely. Another alternative is to control the environment. For example, if you know that both objects lie on the same plane in the real world (i.e. on a table, next to each other), the rest is a trivial problem to resolve.
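The perspective ambiguity can be illustrated with the ideal pinhole camera model, in which the size of an object in the image is proportional to its real size divided by its distance from the camera. The sketch below is only an illustration under that assumption: the focal length, sizes and distances are made-up numbers, and `projected_size_px` and `size_from_reference` are hypothetical helper names. The second helper additionally assumes that both objects are at roughly the same distance from the camera, as in the table example.

```python
def projected_size_px(real_size_m, distance_m, focal_length_px):
    """Apparent size in pixels under an ideal pinhole camera model."""
    return focal_length_px * real_size_m / distance_m


def size_from_reference(pixel_size, ref_pixel_size, ref_real_size_m):
    """Real size of an object, given a reference object of known size.

    Assumes both objects are at (roughly) the same distance from the camera,
    so the ratio of pixel sizes equals the ratio of real sizes.
    """
    return pixel_size * ref_real_size_m / ref_pixel_size


FOCAL_PX = 1000.0  # assumed focal length in pixels

# A small object close to the camera and a large one far away
# can occupy the same number of pixels.
near_small = projected_size_px(real_size_m=0.01, distance_m=0.5, focal_length_px=FOCAL_PX)
far_large = projected_size_px(real_size_m=1.0, distance_m=50.0, focal_length_px=FOCAL_PX)

# Controlled environment: a reference object of known size (1 cm, 20 px wide)
# next to an unknown object that is 60 px wide in the same image.
estimated_m = size_from_reference(pixel_size=60.0, ref_pixel_size=20.0, ref_real_size_m=0.01)

print(near_small, far_large, estimated_m)
```

Without the distance (or a reference object), the pixel size alone cannot tell the two cases apart, which is the ambiguity described above. With a reference in the same plane, the estimate reduces to a simple ratio.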