Fast copy of a 1D array to a 3D array in C++

Question

Is it possible to copy from a 1D array to a 3D array with a function such as memcpy?

Right now I am using a slow method:

for(int loop1 = 0; loop1 < numberAgents; loop1++)
    for(int loop2 = 0; loop2 < fieldWidth; loop2++)
        for(int loop3 = 0; loop3 < fieldWidth; loop3++)
            potentialField[loop1][loop2][loop3] = cpuPotentialField[loop1 * fieldWidth * fieldWidth + loop2 * fieldWidth + loop3];

This doesn't work:

memPotentialField = numberAgents * fieldWidth * fieldWidth * sizeof(float);
memcpy(potentialField, cpuPotentialField, memPotentialField);

Answer 1

Multi-dimensional arrays are stored row-wise (§ 8.3.4/9 of the C++ standard), so essentially your approach with memcpy is fine (because floats are PODs).

// memcpy expects a byte count, so pass the size of the whole array.
memcpy(&potentialField[0][0][0], cpuPotentialField,
       sizeof(potentialField));

Using std::copy is better, since it works for non-PODs too. So I would write

std::copy(cpuPotentialField,
          cpuPotentialField + sizeof(potentialField)/sizeof(potentialField[0][0][0]),
          &potentialField[0][0][0]);
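
As a self-contained sketch of the two calls above: this assumes potentialField is a genuine built-in array of the form float[N][W][W], not a float*** built from separately allocated rows, for which a single bulk copy is not valid. The dimensions kAgents and kWidth, the alias Field3D, and both function names are hypothetical and do not come from the original post.

```cpp
#include <algorithm>  // std::copy
#include <cassert>
#include <cstddef>    // std::size_t
#include <cstring>    // std::memcpy

// Hypothetical compile-time dimensions; the question does not state them.
constexpr std::size_t kAgents = 2;
constexpr std::size_t kWidth  = 3;
constexpr std::size_t kCount  = kAgents * kWidth * kWidth;

using Field3D = float[kAgents][kWidth][kWidth];

// A built-in 3D array occupies exactly kCount contiguous floats.
static_assert(sizeof(Field3D) == kCount * sizeof(float),
              "3D array is expected to be contiguous");

// memcpy version: the third argument is a size in BYTES.
void copyWithMemcpy(Field3D& dst, const float* src)
{
    assert(src != nullptr);
    std::memcpy(dst, src, sizeof(dst));  // sizeof(dst) is the whole array
}

// std::copy version: the range is given in ELEMENTS, source first.
// Like the answer's code, this relies on the contiguous row-wise layout.
void copyWithStdCopy(Field3D& dst, const float* src)
{
    assert(src != nullptr);
    std::copy(src, src + kCount, &dst[0][0][0]);
}
```

Both functions expect src to point at kCount or more floats laid out as agent, then row, then column, which matches the index expression used in the question.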

Answer 2

Unless you have a particularly bad compiler or you've forgotten to turn on optimisation (e.g. -O3), the first method should be fine performance-wise. However, you may be able to optimise it a little by hoisting some of the multiplications out of the inner loops:

for (int loop1 = 0; loop1 < numberAgents; loop1++)
{
    const int index1 = loop1 * fieldWidth * fieldWidth;

    for (int loop2 = 0; loop2 < fieldWidth; loop2++)
    {
        const int index2 = index1 + loop2 * fieldWidth;

        for (int loop3 = 0; loop3 < fieldWidth; loop3++)
        {
            potentialField[loop1][loop2][loop3] = cpuPotentialField[index2 + loop3];
        }
    }
}
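
Taking the hoisting one step further, the source index can be replaced by a pointer that simply advances by one element per copy, since the loops visit the 1D buffer in order. The following is a minimal sketch of that idea rather than part of the original answer; kFieldWidth and copyWithRunningPointer are hypothetical names, and the destination is assumed to be a built-in array with a compile-time width.

```cpp
#include <cassert>
#include <cstddef>  // std::size_t

// Hypothetical compile-time field width; not given in the original post.
constexpr std::size_t kFieldWidth = 4;

// Copies numberAgents * kFieldWidth * kFieldWidth floats from src into dst.
// dst points at the first [kFieldWidth][kFieldWidth] plane of a 3D array.
void copyWithRunningPointer(float (*dst)[kFieldWidth][kFieldWidth],
                            const float* src,
                            std::size_t numberAgents)
{
    assert(dst != nullptr && src != nullptr);
    const float* in = src;  // advanced once per element, no multiplications
    for (std::size_t agent = 0; agent < numberAgents; ++agent)
        for (std::size_t row = 0; row < kFieldWidth; ++row)
            for (std::size_t col = 0; col < kFieldWidth; ++col)
                dst[agent][row][col] = *in++;
}
```

An optimising compiler may well produce the same machine code for this and for the indexed version, so it is worth measuring before preferring one over the other.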

Answer 3

You may be able to get some performance by unrolling the loop. In some processors, branch or jump instructions cause the instruction pipeline to be reloaded, wasting time.

//...
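unsigned int items_remaining = fieldWidth;
// loop3 is advanced inside the switch, so the for statement has no increment.
for (unsigned int loop3 = 0; loop3 < fieldWidth; )
{
    // Copy the remainder (1 to 3 items) first, then full groups of 4.
    unsigned int copy_count = items_remaining % 4;
    if (copy_count == 0) copy_count = 4;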
    switch (copy_count)
    {
        // The fall-through of these cases is intentional.
        case 4:
          potentialField[loop1][loop2][loop3] = cpuPotentialField[loop1 * fieldWidth * fieldWidth + loop2 * fieldWidth + loop3];
          ++loop3;
          --items_remaining;
        case 3:
          potentialField[loop1][loop2][loop3] = cpuPotentialField[loop1 * fieldWidth * fieldWidth + loop2 * fieldWidth + loop3];
          ++loop3;
          --items_remaining;
        case 2:
          potentialField[loop1][loop2][loop3] = cpuPotentialField[loop1 * fieldWidth * fieldWidth + loop2 * fieldWidth + loop3];
          ++loop3;
          --items_remaining;
        case 1:
          potentialField[loop1][loop2][loop3] = cpuPotentialField[loop1 * fieldWidth * fieldWidth + loop2 * fieldWidth + loop3];
          ++loop3;
          --items_remaining;
    } // End: switch
} // End: for

This is only unrolled for 4 items. The more items copied per iteration, the fewer loop branches are executed. As Paul R said, precomputing some of the indices would also help.

Some processors may have specialized copy instructions that the compiler can take advantage of, depending on the compiler.
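
One way to give the compiler and standard library a chance to use such bulk-copy facilities is to copy a whole innermost row per call instead of one element at a time. The sketch below assumes a nested std::vector as the 3D container, which is not stated in the question, and the names NestedField and copyRowWise are hypothetical. Only each innermost row needs to be contiguous, so this form also suits layouts where the rows are allocated separately.

```cpp
#include <algorithm>  // std::copy
#include <cassert>
#include <cstddef>    // std::size_t
#include <vector>

// Hypothetical 3D container: each innermost row is contiguous,
// but different rows may live in unrelated allocations.
using NestedField = std::vector<std::vector<std::vector<float>>>;

// Copies the flat buffer into dst one innermost row at a time.
void copyRowWise(NestedField& dst, const std::vector<float>& src)
{
    std::size_t offset = 0;  // position of the next unread element in src
    for (auto& plane : dst)
    {
        for (auto& row : plane)
        {
            assert(offset + row.size() <= src.size());
            const float* first = src.data() + offset;
            // For float, implementations typically turn this into a bulk move.
            std::copy(first, first + row.size(), row.begin());
            offset += row.size();
        }
    }
}
```

The destination must already be sized before the call, for example with 2 planes of 3 rows of 3 floats each for an 18-element source buffer.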

3 answers

Answer 1
3 2012-11-25 22:27:29

Answer 2
1 2012-11-25 22:14:29

Answer 3
0 2012-11-25 22:47:10
