Converting an MLMultiArray to an image or an OpenGL / Metal texture

I'm trying to do background segmentation of a live video using CoreML. I used DeepLabV3 as provided by Apple. The model works OK, even though it already takes 100 ms to process a 513x513 image. I then want to display the output, which is a 513x513 array of int32. Converting it to an image as done in CoreMLHelpers takes 300 ms, and I'm looking for a much faster way to display the results. I was thinking that maybe it would be faster to somehow dump this to an OpenGL or Metal texture.

What is the best way to handle MLMultiArray for live inputs?

My answer is based on processing the MLMultiArray in Metal.

Create an MTLBuffer:

let device = MTLCreateSystemDefaultDevice()!
// One Int32 class label per pixel of the segmentation output.
let segmentationMaskBuffer: MTLBuffer = device.makeBuffer(length: segmentationHeight * segmentationWidth * MemoryLayout<Int32>.stride)!

Copy the MLMultiArray to the MTLBuffer:

memcpy(segmentationMaskBuffer.contents(), mlOutput.semanticPredictions.dataPointer, segmentationMaskBuffer.length)

Set up the Metal-related variables:

let commandQueue = device.makeCommandQueue()!
let library = device.makeDefaultLibrary()!
let function = library.makeFunction(name: "binaryMask")!
let computePipeline = try! device.makeComputePipelineState(function: function)

Create a struct for the segmentation size:

let segmentationWidth = 513
let segmentationHeight = 513

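// Must match the memory layout of MixParams in Shaders.metal (two 32-bit ints).
struct MixParams {
    var width: Int32
    var height: Int32
}

// Declared with var because it is passed to setBytes as &params.
var params = MixParams(width: Int32(segmentationWidth), height: Int32(segmentationHeight))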

Create an output texture:

// width and height are the output size in pixels; choose the size you want to display.
let textureDescriptor = MTLTextureDescriptor.texture2DDescriptor(pixelFormat: .bgra8Unorm, width: width, height: height, mipmapped: false)
textureDescriptor.usage = [.shaderRead, .shaderWrite]
let outputTexture = device.makeTexture(descriptor: textureDescriptor)!

Pass the MTLBuffer and the output texture to the kernel function:

let buffer = commandQueue.makeCommandBuffer()!
let maskCommandEncoder = buffer.makeComputeCommandEncoder()!
maskCommandEncoder.setComputePipelineState(computePipeline)
maskCommandEncoder.setTexture(outputTexture, index: 1)
maskCommandEncoder.setBuffer(segmentationMaskBuffer, offset: 0, index: 0)
maskCommandEncoder.setBytes(&params, length: MemoryLayout<MixParams>.stride, index: 1)
// Threadgroup size: one SIMD group wide, as tall as the pipeline allows.
let w = computePipeline.threadExecutionWidth
let h = computePipeline.maxTotalThreadsPerThreadgroup / w
let threadGroupSize = MTLSizeMake(w, h, 1)
// Round up so that the grid covers the whole output texture.
let threadGroups = MTLSizeMake(
          (outputTexture.width  + threadGroupSize.width  - 1) / threadGroupSize.width,
          (outputTexture.height + threadGroupSize.height - 1) / threadGroupSize.height, 1)
maskCommandEncoder.dispatchThreadgroups(threadGroups, threadsPerThreadgroup: threadGroupSize)
maskCommandEncoder.endEncoding()
// Submit the work and wait for the GPU before reading outputTexture.
buffer.commit()
buffer.waitUntilCompleted()
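
To make the threadgroup arithmetic above concrete, here is a small sketch in plain Swift. The helper name threadgroupCount and the example values (an execution width of 32 and a maximum of 1024 threads per threadgroup) are assumptions for illustration only; the real values come from the compute pipeline state on your device.

```swift
/// Number of threadgroups needed to cover `extent` threads when each group
/// holds `groupSize` threads along that axis (integer ceiling division).
func threadgroupCount(covering extent: Int, groupSize: Int) -> Int {
    precondition(extent >= 0 && groupSize > 0)
    return (extent + groupSize - 1) / groupSize
}

// Assumed example values; query the compute pipeline state for the real ones.
let exampleExecutionWidth = 32   // stands in for threadExecutionWidth
let exampleMaxThreads = 1024     // stands in for maxTotalThreadsPerThreadgroup
let groupWidth = exampleExecutionWidth
let groupHeight = exampleMaxThreads / groupWidth

// Cover a 513x513 grid.
let groupsX = threadgroupCount(covering: 513, groupSize: groupWidth)
let groupsY = threadgroupCount(covering: 513, groupSize: groupHeight)
print(groupsX, groupsY, groupsX * groupWidth)
```

With these numbers each axis gets 17 groups of 32 threads, i.e. 544 threads for 513 pixels. The surplus threads are the reason the kernel has to return early when gid falls outside the texture.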

Write your kernel function in the Shaders.metal file:

#include <metal_stdlib>
using namespace metal;
#include <CoreImage/CoreImage.h>

struct MixParams {
    int segmentationWidth;
    int segmentationHeight;
};

static inline int get_class(float2 pos, int width, int height, device int* mask) {
    const int x = int(pos.x * width);
    const int y = int(pos.y * height);
    return mask[y*width + x];
}

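// Returns 1.0 if the pixel is labelled 15, the "person" class in the PASCAL VOC label set, and 0.0 otherwise.
static float get_person_probability(float2 pos, int width, int height, device int* mask) {
    return get_class(pos, width, height, mask) == 15;
}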

kernel void binaryMask(
                      texture2d<float, access::write> outputTexture [[texture(1)]],
                      device int* segmentationMask [[buffer(0)]],
                      constant MixParams& params [[buffer(1)]],
                      uint2 gid [[thread_position_in_grid]])
{
    float width = outputTexture.get_width();
    float height = outputTexture.get_height();
    
    if (gid.x >= width ||
        gid.y >= height) return;
    
    
    const float2 pos = float2(float(gid.x) / width,
                              float(gid.y) / height);
    
    const float is_person = get_person_probability(pos, params.segmentationWidth,
                                                   params.segmentationHeight,
                                                   segmentationMask);
    
    float4 outPixel;
    
    if (is_person < 0.5f) {
        outPixel = float4(0.0,0.0,0.0,0.0);
    } else {
        outPixel = float4(1.0,1.0,1.0,1.0);
    }
    
    outputTexture.write(outPixel, gid);
}
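
When debugging a kernel like this, it can help to have a CPU version of the same lookup to compare against. The following is only a sketch in plain Swift: the function name referenceBinaryMask and its parameters are made up for this example, it writes one byte per pixel (255 or 0) instead of a BGRA pixel, and it clamps the index defensively, which the shader does not do.

```swift
/// CPU reference for the binaryMask kernel: maps each output pixel to a
/// segmentation cell via normalised coordinates and writes 255 where the
/// cell holds `targetClass`, 0 elsewhere. One byte per pixel, row-major.
func referenceBinaryMask(classMap: [Int32],
                         mapWidth: Int, mapHeight: Int,
                         outWidth: Int, outHeight: Int,
                         targetClass: Int32 = 15) -> [UInt8] {
    precondition(classMap.count == mapWidth * mapHeight)
    var out = [UInt8](repeating: 0, count: outWidth * outHeight)
    for oy in 0..<outHeight {
        for ox in 0..<outWidth {
            // Normalised position in [0, 1), as in the shader.
            let px = Float(ox) / Float(outWidth)
            let py = Float(oy) / Float(outHeight)
            // Clamp defensively against floating-point rounding.
            let x = min(Int(px * Float(mapWidth)), mapWidth - 1)
            let y = min(Int(py * Float(mapHeight)), mapHeight - 1)
            if classMap[y * mapWidth + x] == targetClass {
                out[oy * outWidth + ox] = 255
            }
        }
    }
    return out
}

// A 2x2 class map scaled up to 4x4: only the top-left cell is class 15.
let tinyMap: [Int32] = [15, 0,
                        0,  0]
let mask = referenceBinaryMask(classMap: tinyMap, mapWidth: 2, mapHeight: 2,
                               outWidth: 4, outHeight: 4)
print(mask)
```

This is far too slow for live video, but on a small test input it gives you known-good values to check the texture contents against.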

Finally, get the CIImage from the output texture once the command buffer has finished executing:

let kciOptions: [CIImageOption: Any] = [CIImageOption.colorSpace: CGColorSpaceCreateDeviceRGB()]
let maskImage = CIImage(mtlTexture: outputTexture, options: kciOptions)!.oriented(.downMirrored)

Instead of outputting an MLMultiArray, you can change the model to make it output an image of type CVPixelBuffer. Then you can use CVMetalTextureCacheCreateTextureFromImage to turn the pixel buffer into an MTLTexture. (I think this works but I don't recall if I ever tried it. Not all pixel buffer objects can be turned into textures, and I'm not sure if Core ML outputs a CVPixelBuffer object with the "Metal compatibility flag" turned on.)
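
For reference, the pixel-buffer route could look roughly like the sketch below. The class name PixelBufferTextureSource is made up for this example, and the default .bgra8Unorm format assumes a 32BGRA pixel buffer; a one-channel 8-bit buffer would need a format such as .r8Unorm. Whether the buffer Core ML returns is actually Metal compatible is not confirmed here, which is why every failure path returns nil.

```swift
import CoreVideo
import Metal

/// Hypothetical helper that wraps Metal-compatible pixel buffers in textures
/// without copying. The cache is created once and reused for every frame.
final class PixelBufferTextureSource {
    private let cache: CVMetalTextureCache

    init?(device: MTLDevice) {
        var newCache: CVMetalTextureCache?
        guard CVMetalTextureCacheCreate(kCFAllocatorDefault, nil, device, nil, &newCache) == kCVReturnSuccess,
              let created = newCache else { return nil }
        cache = created
    }

    /// Returns the texture together with its backing CVMetalTexture, or nil if
    /// the pixel buffer cannot be wrapped. Keep `backing` alive for as long as
    /// the GPU is still using `texture`.
    func texture(for pixelBuffer: CVPixelBuffer,
                 pixelFormat: MTLPixelFormat = .bgra8Unorm) -> (texture: MTLTexture, backing: CVMetalTexture)? {
        var cvTexture: CVMetalTexture?
        let status = CVMetalTextureCacheCreateTextureFromImage(
            kCFAllocatorDefault, cache, pixelBuffer, nil,
            pixelFormat,
            CVPixelBufferGetWidth(pixelBuffer),
            CVPixelBufferGetHeight(pixelBuffer),
            0,  // plane index; 0 for non-planar buffers
            &cvTexture)
        guard status == kCVReturnSuccess,
              let backing = cvTexture,
              let texture = CVMetalTextureGetTexture(backing) else { return nil }
        return (texture, backing)
    }
}
```

If the call returns nil for the model's output, the buffer is most likely not Metal compatible, and the compute-kernel approach is the fallback.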

Alternatively, you can write a compute kernel that takes in the MLMultiArray and converts it to a texture, which then gets drawn into a Metal view. This has the advantage that you can apply all kinds of effects to the segmentation map in the compute kernel at the same time.
