How can I read the CVPixelBuffer as 4 channel float format from a CIImage?

I'm currently trying to do some calculations on a CIImage. We run a custom Core ML model on video frames and use CIFilters on the GPU to convert the frames to the required formats.

For one step, I need to take two of the outputs generated by the model and compute the per-channel mean and standard deviation of their pixel data.

For testing and the tech preview, I was able to create a UIImage, read the pixel data, convert it, and do the calculations on the CPU. But while trying to move this to the GPU I hit a wall.

The process is simple:

  • Convert the CIImage from BGRA to LAB. We do not need the alpha channel, but it is kept as is, giving LAB-A.
  • Do calculations on the pixel data.
  • Convert back from LAB to BGRA, copying the alpha channel as is.
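
The calculation step in the list above (per-channel mean and standard deviation) can be sketched on the CPU as follows. This is a minimal sketch, not the original implementation: `ChannelStats` and `channelStats(of:channels:)` are hypothetical names, and the input is assumed to be a tightly packed array of 32-bit floats with four interleaved channels per pixel (L, a, b, alpha).

```swift
/// Per-channel statistics for interleaved pixel data (hypothetical helper type).
struct ChannelStats {
    var mean: [Float]
    var stdDev: [Float]
}

/// Computes the mean and population standard deviation of each channel.
/// `pixels` is assumed to be tightly packed: [c0, c1, c2, c3, c0, c1, ...].
func channelStats(of pixels: [Float], channels: Int = 4) -> ChannelStats {
    precondition(channels > 0 && pixels.count % channels == 0, "incomplete pixel")
    let count = pixels.count / channels
    guard count > 0 else {
        let zeros = [Float](repeating: 0, count: channels)
        return ChannelStats(mean: zeros, stdDev: zeros)
    }

    // Accumulate in Double to limit rounding error on large frames.
    var sum = [Double](repeating: 0, count: channels)
    var sumOfSquares = [Double](repeating: 0, count: channels)
    for (index, value) in pixels.enumerated() {
        let channel = index % channels
        let v = Double(value)
        sum[channel] += v
        sumOfSquares[channel] += v * v
    }

    var mean = [Float]()
    var stdDev = [Float]()
    for channel in 0..<channels {
        let m = sum[channel] / Double(count)
        // Population variance E[x^2] - E[x]^2, clamped against negative rounding.
        let variance = max(0, sumOfSquares[channel] / Double(count) - m * m)
        mean.append(Float(m))
        stdDev.append(Float(variance.squareRoot()))
    }
    return ChannelStats(mean: mean, stdDev: stdDev)
}
```

For example, `channelStats(of: [1, 10, 0, 1, 3, 20, 0, 1])` treats the input as two pixels and reports the statistics of each of the four channels separately.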

Currently, I am using a custom CIFilter with a Metal kernel to convert the CIImage from RGB to LAB (and back to RGB). Without calculations in between, the RGB > LAB > RGB round trip works as expected and returns the same image without any distortion. This tells me that float precision is not lost.

But when I try to read the pixel data in between, I cannot get the float values I am looking for. The CVPixelBuffer created from the LAB-formatted CIImage always gives me zeros. I tried a few different OSType formats such as kCVPixelFormatType_64RGBAHalf, kCVPixelFormatType_128RGBAFloat, and kCVPixelFormatType_32ARGB; none of them return the float values. If I read data from another image, however, I always get the UInt8 values as expected.

So my question is, as the title suggests: "How can I read the CVPixelBuffer as a 4 channel float format from a CIImage?"

Simplified Swift and Metal code for the process is as follows.

let ciRgbToLab = CIConvertRGBToLAB() // CIFilter using metal for kernel
let ciLabToRgb = CIConvertLABToRGB() // CIFilter using metal for kernel

ciRgbToLab.inputImage = source // "source" is a CIImage
guard let sourceLab = ciRgbToLab.outputImage else { throw ... }

ciRgbToLab.inputImage = target // "target" is a CIImage
guard let targetLab = ciRgbToLab.outputImage else { throw ... }

// Get the CVPixelBuffer and lock the data.
guard let sourceBuffer = sourceLab.cvPixelBuffer else { throw ... }
CVPixelBufferLockBaseAddress(sourceBuffer, CVPixelBufferLockFlags(rawValue: 0))
defer {
  CVPixelBufferUnlockBaseAddress(sourceBuffer, CVPixelBufferLockFlags(rawValue: 0))
}

// Access to the data
guard let sourceAddress = CVPixelBufferGetBaseAddress(sourceBuffer) else { throw ... }
let sourceDataSize = CVPixelBufferGetDataSize(sourceBuffer)
let sourceData = sourceAddress.bindMemory(to: CGFloat.self, capacity: sourceDataSize)
// ... do calculations
// ... generates a new CIImage named "targetTransfered"

ciLabToRgb.inputImage = targetTransfered //*
guard let rgbFinal = ciLabToRgb.outputImage else  { throw ... }

//* If "targetTransfered" is replaced with "targetLab", we get the exact image as "target".

// Metal kernels used by the two filters (separate .metal file)
#include <metal_stdlib>
using namespace metal;

#include <CoreImage/CoreImage.h>

extern "C" {
  namespace coreimage {
    float4 xyzToLabConversion(float4 pixel) {
      ...
      return float4(l, a, b, pixel.a);
    }
    
    float4 rgbToXyzConversion(float4 pixel) {
      ...
      return float4(x, y, z, pixel.a);
    }
    
    float4 rgbToLab(sample_t s) {
      float4 xyz = rgbToXyzConversion(s);
      float4 lab = xyzToLabConversion(xyz);
      return lab;
    }
    
    float4 xyzToRgbConversion(float4 pixel) {
      ...
      return float4(R, G, B, pixel.a);
    }
    
    float4 labToXyzConversion(float4 pixel) {
      ...
      return float4(X, Y, Z, pixel.a);
    }
    
    float4 labtoRgb(sample_t s) {
      float4 xyz = labToXyzConversion(s);
      float4 rgb = xyzToRgbConversion(xyz);
      return rgb;
    }
  }
}

This is the extension I'm using to convert CIImage to CVPixelBuffer. As the image is created on device by the same source, it is always in BGRA format. I have no idea how to convert this to get float values...

extension CIImage {
  var cvPixelBuffer: CVPixelBuffer? {
    let attrs = [
      kCVPixelBufferCGImageCompatibilityKey: kCFBooleanTrue,
      kCVPixelBufferCGBitmapContextCompatibilityKey: kCFBooleanTrue,
      kCVPixelBufferMetalCompatibilityKey: kCFBooleanTrue
    ] as CFDictionary

    var pixelBuffer: CVPixelBuffer?
    let status = CVPixelBufferCreate(kCFAllocatorDefault,
                                     Int(self.extent.width),
                                     Int(self.extent.height),
                                     kCVPixelFormatType_32BGRA,
                                     attrs,
                                     &pixelBuffer)

    guard status == kCVReturnSuccess else { return nil }
    guard let buffer = pixelBuffer else { return nil }

    CVPixelBufferLockBaseAddress(buffer, CVPixelBufferLockFlags(rawValue: 0))

    let context = CIContext()
    context.render(self, to: buffer)

    CVPixelBufferUnlockBaseAddress(buffer, CVPixelBufferLockFlags(rawValue: 0))
    return pixelBuffer
  }
}

PS: I removed the bodies of the Metal kernels to keep this short. If you need an RGB > LAB > RGB conversion, send me a message; I'm happy to share the filter.

It's very strange that you get all zeros, especially when you set the format to kCVPixelFormatType_128RGBAFloat...
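
One thing worth checking: the extension in the question always creates a kCVPixelFormatType_32BGRA buffer, and CGFloat is 64 bits wide on current devices, so binding the buffer to CGFloat would not yield 32-bit float channels in any case. As a hedged alternative, here is a minimal sketch that skips the CVPixelBuffer and renders straight into a Float array with `render(_:toBitmap:rowBytes:bounds:format:colorSpace:)`. The names `floatPixels(using:)` and `labContext` are hypothetical, and the context options (float working format, no working color space) are assumptions about what a LAB pipeline needs; I have not verified them against your filters.

```swift
import CoreImage
import Foundation

extension CIImage {
    /// Renders the image into a tightly packed array of 32-bit floats,
    /// four values per pixel, in the order the last kernel wrote them.
    /// Returns an empty array for empty or infinite extents.
    func floatPixels(using context: CIContext) -> [Float] {
        guard !extent.isInfinite else { return [] }
        let width = Int(extent.width)
        let height = Int(extent.height)
        guard width > 0, height > 0 else { return [] }

        let channels = 4
        let rowBytes = width * channels * MemoryLayout<Float>.stride
        var pixels = [Float](repeating: 0, count: width * height * channels)
        pixels.withUnsafeMutableBytes { buffer in
            guard let base = buffer.baseAddress else { return }
            // colorSpace: nil asks Core Image not to color-match the output,
            // which matters here because the channels hold LAB, not RGB.
            context.render(self,
                           toBitmap: base,
                           rowBytes: rowBytes,
                           bounds: extent,
                           format: .RGBAf,
                           colorSpace: nil)
        }
        return pixels
    }
}

// Assumption: a float working format and a disabled working color space keep
// the intermediate LAB values from being quantized or color-managed.
let labContext = CIContext(options: [
    .workingFormat: NSNumber(value: CIFormat.RGBAf.rawValue),
    .workingColorSpace: NSNull()
])
```

With this in place, `sourceLab.floatPixels(using: labContext)` would return the L, a, b, and alpha values of every pixel. Creating a CIContext is expensive, so the same context should be reused across frames rather than created per render.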

However, I highly recommend you check out CIImageProcessorKernel; it's made for this very use case: adding custom (potentially CPU-based) processing steps to a Core Image pipeline. In the process function you get access to the input and output buffers either as MTLTexture, CVPixelBuffer, or even direct access to the baseAddress.

Here is an example kernel I wrote for computing the mean and variance of the input image using Metal Performance Shaders and returning them in a 2x1 pixel CIImage:

import CoreImage
import MetalPerformanceShaders


/// Processing kernel that computes the mean and the variance of a given image and stores
/// those values in a 2x1 pixel return image.
class MeanVarianceKernel: CIImageProcessorKernel {

    override class func roi(forInput input: Int32, arguments: [String : Any]?, outputRect: CGRect) -> CGRect {
        // we need to read the full extent of the input
        return arguments?["inputExtent"] as? CGRect ?? outputRect
    }

    override class var outputFormat: CIFormat {
        return .RGBAf
    }

    override class var synchronizeInputs: Bool {
        // no need to wait for CPU synchronization since the processing is also happening on the GPU
        return false
    }

    /// Convenience method for calling the `apply` method from outside.
    class func apply(to input: CIImage) -> CIImage {
        // pass the extent of the input as argument since we need to know the full extent in the ROI callback above
        return try! self.apply(withExtent: CGRect(x: 0, y: 0, width: 2, height: 1), inputs: [input], arguments: ["inputExtent": input.extent])
    }

    override class func process(with inputs: [CIImageProcessorInput]?, arguments: [String : Any]?, output: CIImageProcessorOutput) throws {
        guard
            let commandBuffer = output.metalCommandBuffer,
            let input = inputs?.first,
            let sourceTexture = input.metalTexture,
            let destinationTexture = output.metalTexture
        else {
            return
        }

        let meanVarianceShader = MPSImageStatisticsMeanAndVariance(device: commandBuffer.device)
        meanVarianceShader.encode(commandBuffer: commandBuffer, sourceTexture: sourceTexture, destinationTexture: destinationTexture)
    }

}

It can easily be added to a filter pipeline like this:

let meanVariance: CIImage = MeanVarianceKernel.apply(to: inputImage)
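
If the numbers are needed on the CPU, the 2x1 result can be read back as eight floats. The following is a minimal sketch: `readMeanAndStdDev(from:using:)` is a hypothetical helper, and it assumes (as I understand the MPS documentation) that MPSImageStatisticsMeanAndVariance writes the per-channel mean to pixel (0, 0) and the per-channel variance to pixel (1, 0).

```swift
import CoreImage
import Foundation

/// Reads a 2x1 RGBAf statistics image and returns the per-channel mean
/// and standard deviation (square root of the variance).
func readMeanAndStdDev(from stats: CIImage,
                       using context: CIContext) -> (mean: [Float], stdDev: [Float]) {
    // 2 pixels x 4 channels, tightly packed.
    var values = [Float](repeating: 0, count: 8)
    values.withUnsafeMutableBytes { buffer in
        guard let base = buffer.baseAddress else { return }
        context.render(stats,
                       toBitmap: base,
                       rowBytes: 8 * MemoryLayout<Float>.stride,
                       bounds: CGRect(x: 0, y: 0, width: 2, height: 1),
                       format: .RGBAf,
                       colorSpace: nil) // nil: no color matching on the way out
    }
    let mean = Array(values[0..<4])
    // Clamp tiny negative values caused by rounding before taking the root.
    let stdDev = values[4..<8].map { max(0, $0).squareRoot() }
    return (mean, stdDev)
}
```

Reading back forces the GPU work to finish, so when the statistics only feed another kernel it is probably cheaper to pass the 2x1 CIImage along as an input and keep everything on the GPU.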
