How to pass texture buffer data to a shader with Metal?

I would like to work with texture data as a 1D array in a compute shader. I read that the best way is to pass it as a buffer instead of a 1D texture.

I am loading the texture with:

let textureLoader = MTKTextureLoader(device: device)

do {
    if let image = UIImage(named: "testImage") {
        let options = [ MTKTextureLoaderOptionSRGB : NSNumber(value: false) ]
        try kernelSourceTexture = textureLoader.newTexture(with: image.cgImage!, options: options)
        kernelDestTexture = device.makeTexture(descriptor: kernelSourceTexture!.matchingDescriptor())
    } else {
        print("Failed to load texture image from main bundle")
    }
}
catch let error {
    print("Failed to create texture from image, error \(error)")
}

And I am creating the buffer with (not sure if this is correct):

var textureBuffer: MTLBuffer! = nil
var currentVertPtr = kernelSourceTexture!.buffer!.contents()
textureBuffer = device.makeBuffer(bytes: &currentVertPtr, length: kernelSourceTexture!.buffer!.length, options: [])
uniformBuffer.label = "textureData"

How do I pass the buffer to a compute shader? Do I pass it as an argument or as a uniform? What would the buffer's data type be?

Sorry if these are dumb questions, I am just getting started with Metal and I can't find much to read. I bought and read "Metal by Example: High-performance graphics and data-parallel programming for iOS". Side question: can anyone recommend more books on Metal?

Whether you should pass the data as a buffer or as a texture depends somewhat on what you want to do with it in your kernel function. If you use a buffer, you lose several of the benefits of textures: defined behavior when sampling out of bounds, interpolation, and automatic conversion of components from the source pixel format to the component type requested in the shader.

But since you asked about buffers, let's talk about how to create a buffer that contains image data and how to pass it to a kernel.

I'll assume for the sake of discussion that we want our data in the equivalent of .rgba8unorm format, where each component is a single byte.
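To make the "1D array" view of that layout concrete, here is a minimal sketch of the index arithmetic. It assumes tightly packed rows (no padding between rows) and 4 bytes per pixel; pixelIndex and byteOffset are hypothetical helper names used only for illustration.

```swift
/// Index of pixel (x, y) when the image is viewed as a 1D array of pixels.
/// Assumes rows are tightly packed, i.e. no padding at the end of each row.
func pixelIndex(x: Int, y: Int, width: Int) -> Int {
    return y * width + x
}

/// Byte offset of the same pixel in an RGBA8 buffer (4 bytes per pixel).
func byteOffset(x: Int, y: Int, width: Int) -> Int {
    return pixelIndex(x: x, y: y, width: width) * 4
}

// Pixel (2, 1) in a 4-pixel-wide image:
print(pixelIndex(x: 2, y: 1, width: 4))  // 6
print(byteOffset(x: 2, y: 1, width: 4))  // 24
```

A kernel that takes the buffer as a pointer to 4-byte pixels would use the pixel index; one that takes it as raw bytes would use the byte offset.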

Creating a texture just for the sake of doing this conversion is wasteful (and as Ken noted in the comments, textures aren't backed by a buffer by default, which complicates how we get their data), so let's set MTKTextureLoader aside and do it ourselves.

Suppose we have an image in our bundle for which we have a URL. Then we can use a method like the following to load it, ensure it's in the desired format, and wrap the data in an MTLBuffer with a minimal number of copies:

func bufferWithImageData(at url: URL, resourceOptions: MTLResourceOptions, device: MTLDevice) -> MTLBuffer? {
    guard let imageSource = CGImageSourceCreateWithURL(url as CFURL, nil) else { return nil }
    if CGImageSourceGetCount(imageSource) != 1 { return nil }
    guard let image = CGImageSourceCreateImageAtIndex(imageSource, 0, nil) else { return nil }
    guard let colorspace = CGColorSpace(name: CGColorSpace.genericRGBLinear) else { return nil }

    let bitsPerComponent = UInt32(8)
    let bytesPerComponent = bitsPerComponent / 8
    let componentCount = UInt32(4)
    let bytesPerPixel = bytesPerComponent * componentCount
    let rowBytes = UInt32(image.width) * bytesPerPixel
    let imageSizeBytes = rowBytes * UInt32(image.height)

    let pageSize = UInt32(getpagesize())
    let allocSizeBytes = (imageSizeBytes + pageSize - 1) & (~(pageSize - 1))

    var dataBuffer: UnsafeMutableRawPointer? = nil
    let allocResult = posix_memalign(&dataBuffer, Int(pageSize), Int(allocSizeBytes))
    if allocResult != noErr { return nil }

    var targetFormat = vImage_CGImageFormat()
    targetFormat.bitsPerComponent = bitsPerComponent
    targetFormat.bitsPerPixel = bytesPerPixel * 8
    targetFormat.colorSpace = Unmanaged.passUnretained(colorspace)
    targetFormat.bitmapInfo = CGBitmapInfo(rawValue: CGImageAlphaInfo.premultipliedLast.rawValue)

    var imageBuffer = vImage_Buffer(data: dataBuffer, height: UInt(image.height), width: UInt(image.width), rowBytes: Int(rowBytes))
    let status = vImageBuffer_InitWithCGImage(&imageBuffer, &targetFormat, nil, image, vImage_Flags(kvImageNoAllocate))
    if status != kvImageNoError {
        free(dataBuffer)
        return nil
    }

    return device.makeBuffer(bytesNoCopy: imageBuffer.data, length: Int(allocSizeBytes), options: resourceOptions, deallocator: { (memory, size) in
        free(memory)
    })
}

(Note that you'll need to import Accelerate in order to use vImage functions.)
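The page-size rounding in the function above is there because, as far as I know, makeBuffer(bytesNoCopy:length:options:deallocator:) expects a page-aligned pointer and a length that is a multiple of the page size. Here is a minimal sketch of that arithmetic on its own; roundUpToPage is a hypothetical helper name, and the page sizes are example values rather than something queried from the system.

```swift
/// Rounds `size` up to the next multiple of `pageSize`.
/// The bit trick only works when `pageSize` is a power of two.
func roundUpToPage(_ size: Int, pageSize: Int) -> Int {
    precondition(pageSize > 0 && pageSize & (pageSize - 1) == 0,
                 "pageSize must be a power of two")
    return (size + pageSize - 1) & ~(pageSize - 1)
}

// A 100 x 100 RGBA8 image occupies 100 * 100 * 4 = 40000 bytes.
let imageBytes = 100 * 100 * 4
print(roundUpToPage(imageBytes, pageSize: 4096))   // 40960 (10 pages)
print(roundUpToPage(imageBytes, pageSize: 16384))  // 49152 (3 pages)
```

One consequence is that the buffer's length can be larger than the image data, so it is safer to derive the pixel count from the image's width and height than from the buffer length.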

Here's an example of how to call this method:

let resourceOptions: MTLResourceOptions = [ .storageModeShared ]
let imageURL = Bundle.main.url(forResource: "my_image", withExtension: "png")!
let inputBuffer = bufferWithImageData(at: imageURL, resourceOptions: resourceOptions, device: device)

This may seem unnecessarily complex, but the beauty of this is that for a huge variety of input formats, we can use vImage to efficiently convert to our desired layout and color space. By changing only a couple of lines, we could go from RGBA8888 to BGRAFFFF, or many other formats.
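As a rough illustration of which lines would change, here is a sketch of a target format for 32-bit float RGBA (RGBAFFFF, not the BGRAFFFF mentioned above). The bitmapInfo flags follow my reading of the format table in the vImage headers, so treat them as an assumption to verify; rgbaFloatFormat is a hypothetical helper name.

```swift
import Accelerate
import CoreGraphics

/// Hypothetical helper: describes premultiplied RGBA with 32-bit float components.
/// The caller must keep `colorSpace` alive while the format is in use,
/// because it is passed unretained (as in the loader function above).
func rgbaFloatFormat(colorSpace: CGColorSpace) -> vImage_CGImageFormat {
    var format = vImage_CGImageFormat()
    format.bitsPerComponent = 32          // one Float per component
    format.bitsPerPixel = 32 * 4          // four components per pixel
    format.colorSpace = Unmanaged.passUnretained(colorSpace)
    // Assumption: alpha last + float components + 32-bit little-endian order.
    format.bitmapInfo = CGBitmapInfo(rawValue:
        CGImageAlphaInfo.premultipliedLast.rawValue |
        CGBitmapInfo.floatComponents.rawValue |
        CGBitmapInfo.byteOrder32Little.rawValue)
    return format
}
```

With a format like this, bitsPerComponent in the loader would become 32 (so each pixel takes 16 bytes), and the kernel would take the buffer as float4 rather than uchar4.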

Create your compute pipeline state and any other resources you want to work with in the usual way. You can pass the buffer you just created by assigning it to any buffer argument slot:

computeCommandEncoder.setBuffer(inputBuffer, offset: 0, at: 0)

Dispatch your compute grid, also in the usual way.
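For readers who haven't seen "the usual way" yet, here is a minimal sketch of encoding and dispatching a 1D grid with one thread per pixel. Several things here are assumptions rather than part of the answer: the helper names runKernel and threadgroupWidth, an output buffer of one byte per pixel bound at index 1 (which is what the kernel further down expects), and the use of dispatchThreads, which needs a GPU that supports non-uniform threadgroup sizes. Newer SDKs also spell the setBuffer label as index: rather than at:.

```swift
import Metal

/// Hypothetical helper: threads per threadgroup for a 1D grid.
func threadgroupWidth(maxThreads: Int, pixelCount: Int) -> Int {
    return max(1, min(maxThreads, pixelCount))
}

/// Hypothetical helper: runs `pipeline` over `pixelCount` pixels and
/// returns the output bytes, or nil if a Metal object could not be created.
func runKernel(device: MTLDevice,
               queue: MTLCommandQueue,
               pipeline: MTLComputePipelineState,
               inputBuffer: MTLBuffer,
               pixelCount: Int) -> [UInt8]? {
    // The input must hold at least 4 bytes per pixel (RGBA8).
    guard pixelCount > 0, inputBuffer.length >= pixelCount * 4,
          let outputBuffer = device.makeBuffer(length: pixelCount,
                                               options: [.storageModeShared]),
          let commandBuffer = queue.makeCommandBuffer(),
          let encoder = commandBuffer.makeComputeCommandEncoder()
    else { return nil }

    encoder.setComputePipelineState(pipeline)
    encoder.setBuffer(inputBuffer, offset: 0, index: 0)   // [[buffer(0)]]
    encoder.setBuffer(outputBuffer, offset: 0, index: 1)  // [[buffer(1)]]

    // Exactly one thread per pixel, so no thread indexes past the buffers.
    let width = threadgroupWidth(maxThreads: pipeline.maxTotalThreadsPerThreadgroup,
                                 pixelCount: pixelCount)
    encoder.dispatchThreads(MTLSize(width: pixelCount, height: 1, depth: 1),
                            threadsPerThreadgroup: MTLSize(width: width, height: 1, depth: 1))
    encoder.endEncoding()

    commandBuffer.commit()
    commandBuffer.waitUntilCompleted()   // blocking wait, for simplicity only

    // Shared storage lets the CPU read the results directly.
    let bytes = outputBuffer.contents().bindMemory(to: UInt8.self, capacity: pixelCount)
    return Array(UnsafeBufferPointer(start: bytes, count: pixelCount))
}
```

If you dispatch whole threadgroups instead (dispatchThreadgroups), the grid is rounded up to a multiple of the threadgroup size, and the kernel then needs a bounds check against the pixel count.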

For completeness, here's a kernel function that operates on our buffer. It's by no means the most efficient way to compute this result, but this is just for illustration:

kernel void threshold(constant uchar4 *imageBuffer [[buffer(0)]],
                      device uchar *outputBuffer [[buffer(1)]],
                      uint gid [[thread_position_in_grid]])
{
    float3 p = float3(imageBuffer[gid].rgb);
    float3 k = float3(0.299, 0.587, 0.114);
    float luma = dot(p, k);
    outputBuffer[gid] = (luma > 127) ? 255 : 0;
}

Note:

  1. We take the buffer as a uchar4, since each sequence of 4 bytes represents one pixel.
  2. We index into the buffer using a parameter attributed with thread_position_in_grid, which indicates the global index into the grid we dispatched with our compute command encoder. Since our "image" is 1D, this position is also one-dimensional.
  3. In general, integer arithmetic operations are very expensive on GPUs. It's possible that the time spent on the integer-to-float conversions in this function outweighs the cost of the extra bandwidth needed to operate on a buffer containing floats, at least on some processors.
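When checking the kernel's output, a CPU reference can be handy. The following is a minimal sketch of the same threshold in plain Swift, assuming RGBA8 input with 4 bytes per pixel; thresholdReference is a hypothetical helper name, and results for pixels whose luma sits right at the threshold may differ slightly from the GPU because of floating-point rounding.

```swift
/// CPU reference for the threshold kernel: one output byte per RGBA8 pixel.
func thresholdReference(_ rgba: [UInt8]) -> [UInt8] {
    precondition(rgba.count % 4 == 0, "expected 4 bytes per pixel")
    return stride(from: 0, to: rgba.count, by: 4).map { i -> UInt8 in
        // Same luma weights as the kernel; the alpha byte is ignored.
        let luma = 0.299 * Float(rgba[i])
                 + 0.587 * Float(rgba[i + 1])
                 + 0.114 * Float(rgba[i + 2])
        return luma > 127 ? 255 : 0
    }
}

let pixels: [UInt8] = [
    255, 255, 255, 255,   // white: luma about 255
      0,   0,   0, 255,   // black: luma 0
    255,   0,   0, 255,   // red:   luma about 76.2
      0, 255,   0, 255,   // green: luma about 149.7
]
print(thresholdReference(pixels))  // [255, 0, 0, 255]
```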

Hope that helps. If you tell us more about what you're trying to do, we can make better suggestions about how to load and process your image data.
