Incorrect frame of boundingBox with VNRecognizedObjectObservation

I'm having an issue with displaying a bounding box around a recognized object using Core ML & Vision.

Horizontal detection seems to work correctly. Vertically, however, the box is too tall, extends past the top edge of the video, doesn't reach all the way to the bottom, and doesn't follow the camera's motion correctly. You can see the issue here: https://imgur.com/Sppww8T

This is how the video data output is initialized:

let videoDataOutput = AVCaptureVideoDataOutput()
videoDataOutput.alwaysDiscardsLateVideoFrames = true
videoDataOutput.videoSettings = [kCVPixelBufferPixelFormatTypeKey as String: Int(kCVPixelFormatType_420YpCbCr8BiPlanarFullRange)]
videoDataOutput.setSampleBufferDelegate(self, queue: dataOutputQueue!)
self.videoDataOutput = videoDataOutput
session.addOutput(videoDataOutput)
// Rotate the connection so sample buffers are delivered in portrait
let c = videoDataOutput.connection(with: .video)
c?.videoOrientation = .portrait

I've also tried other video orientations, without much success.

Performing the Vision request:

let handler = VNImageRequestHandler(cvPixelBuffer: image, options: [:])
try? handler.perform(vnRequests)
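
One thing that might be worth checking (my own note, not part of the original question): VNImageRequestHandler can also be told the buffer's orientation explicitly, and an orientation mismatch between the capture connection and Vision can produce exactly this kind of vertically misplaced box. A minimal sketch, assuming the connection above already delivers portrait buffers so that .up is appropriate:

let handler = VNImageRequestHandler(cvPixelBuffer: image,
                                    orientation: .up, // the actual CGImagePropertyOrientation of the buffer
                                    options: [:])
try? handler.perform(vnRequests)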

And finally, once the request has been processed, viewRect is set to the size of the video view: 812x375 (I know the video layer itself is a bit shorter, but that's not the issue here):

let observationRect = VNImageRectForNormalizedRect(observation.boundingBox, Int(viewRect.width), Int(viewRect.height))

I've also tried doing something like this (with more issues):

var observationRect = observation.boundingBox
observationRect.origin.y = 1.0 - observationRect.origin.y
observationRect = videoPreviewLayer.layerRectConverted(fromMetadataOutputRect: observationRect)
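
A side note on that attempt (my own sketch, not from the original post): Vision's boundingBox has its origin at the bottom-left, while the metadata-output space has it at the top-left, so the flip also needs to subtract the rect's height, roughly like this:

var observationRect = observation.boundingBox
// Flip Y and account for the rect's own height when changing the origin corner
observationRect.origin.y = 1.0 - observationRect.origin.y - observationRect.size.height
observationRect = videoPreviewLayer.layerRectConverted(fromMetadataOutputRect: observationRect)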

I've tried to cut out as much of what I deemed to be irrelevant code as possible.

I've actually come across a similar issue with Apple's sample code, where the bounding box wouldn't vertically surround objects as expected: https://developer.apple.com/documentation/vision/recognizing_objects_in_live_capture Maybe that means there is some issue with the API?

I use something like this:

let width = view.bounds.width
let height = width * 16 / 9
let offsetY = (view.bounds.height - height) / 2
// Scale the normalized box up to view size, then flip the Y axis
// (Vision's origin is bottom-left, UIKit's is top-left)
let scale = CGAffineTransform.identity.scaledBy(x: width, y: height)
let transform = CGAffineTransform(scaleX: 1, y: -1).translatedBy(x: 0, y: -height - offsetY)
let rect = prediction.boundingBox.applying(scale).applying(transform)
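
To show how the resulting rect might be used (a sketch of mine; overlayLayer is a hypothetical CALayer sitting above the preview layer, not part of the original answer):

// Draw the converted rect as a simple border overlay
let boxLayer = CALayer()
boxLayer.frame = rect
boxLayer.borderColor = UIColor.red.cgColor
boxLayer.borderWidth = 2
overlayLayer.addSublayer(boxLayer)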

This assumes portrait orientation and a 16:9 aspect ratio. It also assumes .imageCropAndScaleOption = .scaleFill.
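
For reference, that option is set on the Core ML request itself. A minimal sketch, assuming model is your wrapped VNCoreMLModel:

let request = VNCoreMLRequest(model: model) { request, error in
    // results arrive as [VNRecognizedObjectObservation]
}
// Must match the scaling math above
request.imageCropAndScaleOption = .scaleFill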

Credits: The transform code was taken from this repo: https://github.com/Willjay90/AppleFaceDetection
