Access pixels beyond grid position in Metal compute kernel?

I have a Metal kernel function. Usually you access pixels like this:

kernel void edgeDetect(texture2d<half, access::sample> inTexture [[ texture(0) ]],
                    texture2d<half, access::write> outTexture [[ texture(1) ]],
                    device const uint *roi [[ buffer(0) ]],
                    uint2 grid [[ thread_position_in_grid ]]) {

  if (grid.x >= outTexture.get_width() || grid.y >= outTexture.get_height()) {
      return;
  }

  half c[9];
  for (int i=0; i < 3; ++i) {
    for (int j=0; j < 3; ++j) {
      c[3*i+j] = inTexture.read(grid + uint2(i-1,j-1)).x;
    }
  }

  half3 Lx = 2.0*(c[7]-c[1]) + c[6] + c[8] - c[2] - c[0];
  half3 Ly = 2.0*(c[3]-c[5]) + c[6] + c[0] - c[2] - c[8];
  half3 G = sqrt(Lx*Lx+Ly*Ly);

  outTexture.write(half4(G, 0.0), grid);
}

Now I need to access pixels in the neighbourhood of the current grid position like this:

      half4 inColor = inTexture.read(grid - uint2(-1,-1));

Basically this works, but on the thread boundaries I have "discontinuities" as shown in this image (the brick wall pattern).

[Image: thread boundaries caused by addressing beyond the "grid"]

This seems clear, since each thread is passed only its sub-texture to process. So beyond thread boundaries I can't access pixels.

My question is: what is the concept when I need to address pixels beyond the current position in a compute kernel? Is this possible with compute kernels at all?

I have found the issue:

The line

c[3*i+j] = inTexture.read(grid + uint2(i-1,j-1)).x;

must be changed to:

c[3*i+j] = inTexture.read(grid + uint2(i,j)).x;

Obviously, the position offsets of -1 into the texture failed and produced the brick-wall-like artefacts shown in the image above.

Posting this as an answer so that the question has one: there is no restriction on which pixels you can access in a compute shader. Your grid size affects scheduling only.

Your error is instantiating an unsigned uint2 with negative numbers. On the first iteration of your loop you will attempt to construct uint2(-1, -1), which is the same as uint2(4294967295, 4294967295) and therefore way out of bounds.

You can use int2, or, as per your self-answer, just avoid negative numbers.
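
As a rough illustration of the signed-coordinate approach, here is a minimal sketch in plain C++ rather than Metal, so it can be compiled and checked on the host. The `Coord` struct, `to_unsigned`, and `clamped_neighbour` are hypothetical names standing in for `int2` arithmetic inside the kernel. Clamping to the texture edge is an assumption about how border pixels should be handled, not something stated in the question.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical stand-in for Metal's int2: a pair of signed 32-bit coordinates.
struct Coord {
    std::int32_t x;
    std::int32_t y;
};

// The value an unsigned 32-bit component holds after being built from a
// signed one. The conversion is modulo 2^32, so -1 becomes 4294967295.
constexpr std::uint32_t to_unsigned(std::int32_t v) {
    return static_cast<std::uint32_t>(v);
}

// Clamp a single signed coordinate into [0, size - 1].
inline std::int32_t clamp_to_size(std::int32_t v, std::int32_t size) {
    return std::max<std::int32_t>(0, std::min<std::int32_t>(v, size - 1));
}

// Neighbour of `pos` at the signed offset (dx, dy). The arithmetic stays
// signed, and the result is clamped to the texture bounds (an assumed edge
// policy) so border pixels never produce a negative or too-large coordinate.
inline Coord clamped_neighbour(Coord pos, std::int32_t dx, std::int32_t dy,
                               std::int32_t width, std::int32_t height) {
    return Coord{clamp_to_size(pos.x + dx, width),
                 clamp_to_size(pos.y + dy, height)};
}
```

In the kernel itself, the same idea would presumably amount to computing `int2(grid) + int2(i - 1, j - 1)`, clamping it, and converting the result back to `uint2` before passing it to `read`.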
