
Object detection with R-CNN?

What does R-CNN actually do? Is it like using features extracted by a CNN to detect classes in a specified window area? Is there a TensorFlow implementation of this?

R-CNN is the daddy algorithm of this whole family: it paved the way for researchers to build more complex and more accurate detectors on top of it. I'll try to explain R-CNN and its variants.

R-CNN, or Region-based Convolutional Neural Network

R-CNN consists of 3 simple steps:

  • Scan the input image for possible objects using an algorithm called Selective Search, generating ~2000 region proposals
  • Run a convolutional neural net (CNN) on each of these region proposals
  • Take the output of each CNN and feed it into a) an SVM to classify the region and b) a linear regressor to tighten the bounding box of the object, if such an object exists.
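
The "tighten the bounding box" part of the last step can be sketched as below. This is a minimal NumPy illustration, not the original implementation: it assumes the regressor outputs four offsets (dx, dy, dw, dh) in the centre/size parameterisation commonly used in the R-CNN family, and the function name `apply_box_deltas` and the sample numbers are made up for the example.

```python
import numpy as np

def apply_box_deltas(boxes, deltas):
    """Refine (x1, y1, x2, y2) boxes with (dx, dy, dw, dh) regression outputs."""
    boxes = np.asarray(boxes, dtype=float)
    deltas = np.asarray(deltas, dtype=float)

    # Convert corner format to centre/size format.
    w = boxes[:, 2] - boxes[:, 0]
    h = boxes[:, 3] - boxes[:, 1]
    cx = boxes[:, 0] + 0.5 * w
    cy = boxes[:, 1] + 0.5 * h

    # Shift the centre proportionally to the box size; rescale the size in log space.
    new_cx = cx + deltas[:, 0] * w
    new_cy = cy + deltas[:, 1] * h
    new_w = w * np.exp(deltas[:, 2])
    new_h = h * np.exp(deltas[:, 3])

    # Convert back to corner format.
    return np.stack(
        [new_cx - 0.5 * new_w, new_cy - 0.5 * new_h,
         new_cx + 0.5 * new_w, new_cy + 0.5 * new_h],
        axis=1,
    )

# Hypothetical proposal and regressor output: shift right by 10% of the width.
proposal = [[10.0, 20.0, 50.0, 80.0]]
refined = apply_box_deltas(proposal, [[0.1, 0.0, 0.0, 0.0]])
unchanged = apply_box_deltas(proposal, [[0.0, 0.0, 0.0, 0.0]])
```

All-zero offsets leave the proposal untouched, so the regressor only has to learn a small correction for each region.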

[Figure: pictorial description of R-CNN]

Fast R-CNN:

Fast R-CNN followed soon after R-CNN. It is faster and more accurate thanks to the following changes:

  • Performing feature extraction over the whole image first and then reading each proposed region off the shared feature map, thus running only one CNN over the entire image instead of 2000 CNNs over 2000 overlapping regions
  • Replacing the SVM with a softmax layer, thus extending the neural network to make the predictions instead of training a separate model.

[Figure: pictorial description of Fast R-CNN]

Intuitively, it makes a lot of sense to drop the 2000 separate CNN passes, run the convolution once, and make the boxes on top of the shared feature map.
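
Reading a region off the shared feature map is what the RoI pooling layer does. The sketch below is a simplified NumPy version under stated assumptions: the RoI is already given in integer feature-map coordinates with exclusive end indices, and `roi_max_pool` and the toy feature map are names and data invented for the example, not any library's API.

```python
import numpy as np

def roi_max_pool(feature_map, roi, output_size=(2, 2)):
    """Max-pool one region of an (H, W, C) feature map to a fixed grid.

    roi is (x1, y1, x2, y2) in feature-map cells, with x2 and y2 exclusive.
    """
    x1, y1, x2, y2 = roi
    region = feature_map[y1:y2, x1:x2]
    out_h, out_w = output_size

    # Bin edges that split the region into an out_h x out_w grid.
    ys = np.linspace(0, region.shape[0], out_h + 1).astype(int)
    xs = np.linspace(0, region.shape[1], out_w + 1).astype(int)

    pooled = np.empty((out_h, out_w, feature_map.shape[2]), dtype=feature_map.dtype)
    for i in range(out_h):
        for j in range(out_w):
            # Make sure every bin covers at least one cell.
            y_end = max(ys[i + 1], ys[i] + 1)
            x_end = max(xs[j + 1], xs[j] + 1)
            pooled[i, j] = region[ys[i]:y_end, xs[j]:x_end].max(axis=(0, 1))
    return pooled

# Toy 4x4 feature map with one channel, values 0..15.
feature_map = np.arange(16).reshape(4, 4, 1)
full = roi_max_pool(feature_map, (0, 0, 4, 4))
partial = roi_max_pool(feature_map, (1, 1, 4, 3))
```

Both regions come out as 2x2 grids even though they differ in size, which is what lets the same fully-connected layers score every proposal.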

Faster R-CNN:

One of the drawbacks of Fast R-CNN was the slow Selective Search algorithm, so Faster R-CNN replaced it with something called a Region Proposal Network (RPN).

Here is how the RPN works:

At the last layer of an initial CNN, a 3x3 sliding window moves across the feature map and maps it to a lower dimension (e.g. 256-d). For each sliding-window location, it generates multiple possible regions based on k fixed-ratio anchor boxes (default bounding boxes).

Each region proposal consists of:

  • An "objectness" score for that region and
  • 4 coordinates representing the bounding box of the region

In other words, we look at each location in our last feature map and consider k different boxes centered around it: a tall box, a wide box, a large box, etc.

For each of those boxes, we output whether or not we think it contains an object, and what the coordinates for that box are. This is what it looks like at one sliding window location:

[Figure: Region Proposal Network]
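
The k anchor boxes considered at a single location can be sketched as below. The scales and aspect ratios are illustrative values chosen for the example, and `make_anchors` is a hypothetical helper, not taken from any of the implementations linked in this answer.

```python
import numpy as np

def make_anchors(center_x, center_y, scales=(64, 128), ratios=(0.5, 1.0, 2.0)):
    """Return k = len(scales) * len(ratios) boxes (x1, y1, x2, y2) centred on one location."""
    anchors = []
    for scale in scales:
        for ratio in ratios:  # ratio = height / width
            # Keep the area at scale**2 while changing the shape.
            w = scale / np.sqrt(ratio)
            h = scale * np.sqrt(ratio)
            anchors.append([center_x - w / 2, center_y - h / 2,
                            center_x + w / 2, center_y + h / 2])
    return np.array(anchors)

# k = 6 anchors (wide, square and tall boxes at two sizes) around pixel (100, 100).
anchors = make_anchors(100, 100)
widths = anchors[:, 2] - anchors[:, 0]
heights = anchors[:, 3] - anchors[:, 1]
```

Repeating this at every sliding-window position gives the full set of candidate boxes that the RPN scores and adjusts.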

The 2k scores represent the softmax probability of each of the k bounding boxes being an object or not (two scores per box). Notice that although the RPN outputs bounding box coordinates, it does not try to classify any potential objects: its sole job is still proposing object regions. If an anchor box has an "objectness" score above a certain threshold, that box's coordinates get passed forward as a region proposal.
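
A minimal sketch of that scoring-and-thresholding step, assuming the RPN emits two raw scores per anchor ordered as [background, object]. The 0.7 threshold, the function name `select_proposals`, and the sample numbers are assumptions made for the example.

```python
import numpy as np

def select_proposals(scores, boxes, threshold=0.7):
    """Keep the boxes whose softmax "object" probability exceeds the threshold.

    scores: (k, 2) raw [background, object] scores; boxes: (k, 4) coordinates.
    """
    scores = np.asarray(scores, dtype=float)
    boxes = np.asarray(boxes, dtype=float)

    # Numerically stable softmax over the two scores of each anchor.
    shifted = scores - scores.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    probs = exp / exp.sum(axis=1, keepdims=True)

    objectness = probs[:, 1]
    keep = objectness > threshold
    return boxes[keep], objectness[keep]

# Three anchors at one location: likely background, likely object, undecided.
raw_scores = [[2.0, 0.0], [0.0, 3.0], [0.0, 0.0]]
anchor_boxes = [[0, 0, 10, 10], [5, 5, 40, 60], [20, 20, 30, 30]]
kept_boxes, kept_scores = select_proposals(raw_scores, anchor_boxes)
```

Only the second anchor survives here; nothing in this step says which class the object belongs to.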

Once we have our region proposals, we feed them straight into what is essentially a Fast R-CNN. We add a pooling layer, some fully-connected layers, and finally a softmax classification layer and bounding box regressor. In a sense, Faster R-CNN = RPN + Fast R-CNN.

[Figure: Faster R-CNN]

Here are some TensorFlow implementations:

https://github.com/smallcorgi/Faster-RCNN_TF

https://github.com/CharlesShang/FastMaskRCNN

You can find many more implementations on GitHub.

P.S. I borrowed a lot of material from Joyce Xu's Medium blog.

R-CNN uses the following algorithm:

  1. Get region proposals for object detection (using selective search).
  2. For each region, crop the area from the image and run it through a CNN which classifies the object.
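
Step 2 can be sketched roughly as below. Everything here is a stand-in: the proposals are hard-coded rather than produced by selective search, the resize is plain nearest-neighbour sampling, and `classifier` is a placeholder for whatever CNN you would actually run.

```python
import numpy as np

def crop_and_warp(image, box, size=(8, 8)):
    """Crop box (x1, y1, x2, y2) from an (H, W, C) image and resize it to `size`."""
    x1, y1, x2, y2 = box
    crop = image[y1:y2, x1:x2]
    # Nearest-neighbour resize: pick source rows/columns for each output pixel.
    rows = np.arange(size[0]) * crop.shape[0] // size[0]
    cols = np.arange(size[1]) * crop.shape[1] // size[1]
    return crop[rows][:, cols]

def classify_regions(image, proposals, classifier, size=(8, 8)):
    """Run the classifier once per proposal, as plain R-CNN does."""
    return [classifier(crop_and_warp(image, box, size)) for box in proposals]

# Placeholder inputs: a blank image, two made-up proposals, and a dummy "CNN"
# that just reports the shape of the patch it receives.
image = np.zeros((20, 30, 3))
proposals = [(0, 0, 10, 10), (5, 2, 25, 18)]
patch_shapes = classify_regions(image, proposals, classifier=lambda patch: patch.shape)
```

Every proposal becomes a patch of the same size, but the network runs once per region, which is the cost the later variants remove.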

There are more advanced algorithms built on top of this, such as Fast R-CNN and Faster R-CNN.

Fast R-CNN:

  1. Run the entire image through the CNN.
  2. For each region from the region proposals, extract the area using an "RoI pooling" layer and then classify the object.

Faster R-CNN:

  1. Run the entire image through the CNN.
  2. Using the features computed by the CNN, find region proposals with a region proposal network.
  3. For each region proposal, extract the area using an "RoI pooling" layer and then classify the object.

There are a lot of implementations in TensorFlow, especially for Faster R-CNN, which is the most recent of these variants; just google "Faster R-CNN TensorFlow".

Good luck


