
How to transpose a 4D tensor in C++?

I need to pre-process the input of an ML model into the correct shape. To do that, I need to transpose a tensor from ncnn in C++. The API does not offer a transpose, so I am trying to implement my own transpose function.

The input tensor has the shape (1, 640, 640, 3) (for batch, x, y and color) and I need to rearrange it into the shape (1, 3, 640, 640).

How do I properly and efficiently transpose the tensor?

ncnn::Mat preprocess(const cv::Mat& rgba) {
    int width = rgba.cols;
    int height = rgba.rows;

    // Build a tensor from the image input
    ncnn::Mat in = ncnn::Mat::from_pixels(rgba.data, ncnn::Mat::PIXEL_RGBA2RGB, width, height);

    // Set the current shape of the tensor
    in = in.reshape(1, 640, 640, 3);

    // Normalize
    const float norm_vals[3] = {1 / 255.f, 1 / 255.f, 1 / 255.f};
    in.substract_mean_normalize(0, norm_vals);

    // Prepare the transposed matrix
    ncnn::Mat transposed(in.w, in.c, in.h, in.d, sizeof(float));
    ncnn::Mat shape = transposed.shape();

    // Transpose
    
    for (int i = 0; i < in.w; i++) {
        for (int j = 0; j < in.h; j++) {
            for (int k = 0; k < in.d; k++) {
                for (int l = 0; l < in.c; l++) {
                    int fromIndex = ???;
                    int toIndex = ???;
                    transposed[toIndex] = in[fromIndex];
                }
            }
        }
    }

    return transposed; 
}

I'm only talking about index calculations, not the ncnn API, which I'm not familiar with.

You set

fromIndex = i*A + j*B + k*C + l*D;
  toIndex = i*E + j*F + k*G + l*H; 

where you compute A, B, C, D, E, F, G and H based on the source and target layouts. How?

Let's look at a simple 2D transposition first. Transpose an hw layout matrix to a wh layout matrix (slowest-changing dimension first):

  for (int i = 0; i < h; ++i) {
      for (int j = 0; j < w; ++j) {
          int fromIndex = i * w + j * 1;
          //              ^       ^
          //              |       |
          //             i<h     j<w        <---- hw layout

          int   toIndex = j * h + i * 1;
          //              ^       ^
          //              |       |
          //             j<w     i<h        <---- wh layout
      }      
  }      
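As a minimal sketch, the loop above can be wrapped into a complete function. It assumes the matrix lives in a plain contiguous std::vector<float> rather than an ncnn::Mat, and the name transpose2d is made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Transpose a row-major h x w matrix (hw layout) into a w x h matrix (wh layout).
// Assumes src is a contiguous buffer holding exactly h * w elements.
std::vector<float> transpose2d(const std::vector<float>& src,
                               std::size_t h, std::size_t w) {
    assert(src.size() == h * w);
    std::vector<float> dst(src.size());
    for (std::size_t i = 0; i < h; ++i) {
        for (std::size_t j = 0; j < w; ++j) {
            std::size_t fromIndex = i * w + j;  // hw layout: i has coefficient w
            std::size_t toIndex   = j * h + i;  // wh layout: j has coefficient h
            dst[toIndex] = src[fromIndex];
        }
    }
    return dst;
}
```

For a 2 x 3 input holding 1..6 row by row, the result holds the rows (1, 4), (2, 5), (3, 6).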

So when computing fromIndex, you start with the source layout (hw). You remove the first letter (h), and what remains (w) is the coefficient that goes with i. Then you remove the next letter (w), and what remains (1) is the coefficient that goes with j. It is not hard to see that the same pattern works in any number of dimensions. For example, if your source layout is dchw, then you have

fromIndex = i * (c*h*w) + j * (h*w) + k * (w) + l * (1);
//          ^             ^           ^         ^
//          |             |           |         |
//         i<d           j<c         k<h       l<w   <---- dchw
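The "remove a letter, multiply what remains" rule is the usual row-major stride computation. Here is a sketch of it for any number of dimensions, assuming the extents are listed from slowest-changing to fastest-changing; rowMajorStrides and flatIndex are hypothetical helper names, not part of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// The coefficient of each index is the product of all extents to its right
// (the "remaining letters"); the fastest-changing index has coefficient 1.
std::vector<std::size_t> rowMajorStrides(const std::vector<std::size_t>& extents) {
    std::vector<std::size_t> strides(extents.size(), 1);
    for (std::size_t n = extents.size(); n > 1; --n) {
        strides[n - 2] = strides[n - 1] * extents[n - 1];
    }
    return strides;
}

// Flat index = sum of (index * coefficient) over all dimensions.
std::size_t flatIndex(const std::vector<std::size_t>& index,
                      const std::vector<std::size_t>& strides) {
    assert(index.size() == strides.size());
    std::size_t flat = 0;
    for (std::size_t n = 0; n < index.size(); ++n) {
        flat += index[n] * strides[n];
    }
    return flat;
}
```

With dchw extents (2, 3, 4, 5) the strides come out as (60, 20, 5, 1), that is (c*h*w, h*w, w, 1), matching the coefficients in the formula above.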

What about toIndex? Same thing, but rearrange the letters from the slowest-changing to the fastest-changing in the target layout. For example, if your target layout is hwcd, then the order will be klji (because i is the index that ranges over [0..d) in both the source and target layouts, and so on). So

  toIndex = k * (w*c*d) + l * (c*d) + j * (d) + i * (1);
  //        ^             ^           ^         ^
  //        |             |           |         |
  //       k<h           l<w         j<c       i<d   <---- hwcd
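Putting the two formulas together gives a complete dchw to hwcd transposition. This is only a sketch on a plain contiguous std::vector<float>; it does not use ncnn::Mat, and the function name dchwToHwcd is invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Rearrange a contiguous dchw buffer into hwcd order.
std::vector<float> dchwToHwcd(const std::vector<float>& src, std::size_t d,
                              std::size_t c, std::size_t h, std::size_t w) {
    assert(src.size() == d * c * h * w);
    std::vector<float> dst(src.size());
    for (std::size_t i = 0; i < d; ++i) {
        for (std::size_t j = 0; j < c; ++j) {
            for (std::size_t k = 0; k < h; ++k) {
                for (std::size_t l = 0; l < w; ++l) {
                    // Source layout dchw: coefficients (c*h*w, h*w, w, 1) for i, j, k, l.
                    std::size_t fromIndex = i * (c * h * w) + j * (h * w) + k * w + l;
                    // Target layout hwcd: coefficients (w*c*d, c*d, d, 1) for k, l, j, i.
                    std::size_t toIndex = k * (w * c * d) + l * (c * d) + j * d + i;
                    dst[toIndex] = src[fromIndex];
                }
            }
        }
    }
    return dst;
}
```

The innermost loop reads the source sequentially and scatters writes into the target. Whether looping in target order instead would be faster depends on the sizes and the hardware, so that is left as something to measure.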

I deliberately did not use your layouts. Do your own calculations a couple of times. You want to develop some intuition about this thing.
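One way to check hand-derived coefficients is to compare them against a generic routine that builds both sets of coefficients from a permutation of the axes. The sketch below assumes a plain contiguous float buffer (not an ncnn::Mat) and that perm really is a permutation of 0..3; permute4d is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// perm[n] names the source axis that becomes axis n of the target,
// e.g. source dchw -> target hwcd is perm = {2, 3, 1, 0}.
std::vector<float> permute4d(const std::vector<float>& src,
                             const std::vector<std::size_t>& srcExtents,
                             const std::vector<std::size_t>& perm) {
    assert(srcExtents.size() == 4 && perm.size() == 4);
    std::size_t srcStride[4], dstExtent[4], dstStride[4];
    std::size_t toCoeff[4] = {0, 0, 0, 0};

    // Row-major coefficients of the source layout.
    srcStride[3] = 1;
    for (int n = 2; n >= 0; --n) srcStride[n] = srcStride[n + 1] * srcExtents[n + 1];

    // Extents and row-major coefficients of the target layout.
    for (int n = 0; n < 4; ++n) {
        assert(perm[n] < 4);
        dstExtent[n] = srcExtents[perm[n]];
    }
    dstStride[3] = 1;
    for (int n = 2; n >= 0; --n) dstStride[n] = dstStride[n + 1] * dstExtent[n + 1];

    // Coefficient that each *source* index gets in toIndex.
    for (int n = 0; n < 4; ++n) toCoeff[perm[n]] = dstStride[n];

    assert(src.size() == srcStride[0] * srcExtents[0]);
    std::vector<float> dst(src.size());
    std::size_t x[4];
    for (x[0] = 0; x[0] < srcExtents[0]; ++x[0])
        for (x[1] = 0; x[1] < srcExtents[1]; ++x[1])
            for (x[2] = 0; x[2] < srcExtents[2]; ++x[2])
                for (x[3] = 0; x[3] < srcExtents[3]; ++x[3]) {
                    std::size_t fromIndex = 0, toIndex = 0;
                    for (int n = 0; n < 4; ++n) {
                        fromIndex += x[n] * srcStride[n];
                        toIndex += x[n] * toCoeff[n];
                    }
                    dst[toIndex] = src[fromIndex];
                }
    return dst;
}
```

For the dchw to hwcd example, perm = {2, 3, 1, 0} yields toIndex coefficients (1, d, w*c*d, c*d) for i, j, k, l, the same as the hand-derived formula. Recomputing the indices in the innermost loop keeps the sketch short at the cost of speed.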
