
Ranges of nested for-loops when locality is improved (C++)

I have the following nested for loop:

int n = 8;
int counter = 0;

for (int i = 0; i < n; i++)
{
    for (int j = i + 1; j < n; j++)
    {
        printf("(%d, %d)\n", i, j);
        counter++;
    }
}

This prints (0, 1) through (6, 7) as expected, and the printf() statement is run 28 times, as indicated by counter.

I have been set the task of improving the efficiency of this code by improving its locality. This is test code: in the actual program the value of n is much larger, and i and j are used to index into two 1D arrays. I have employed what I believe to be a fairly standard technique:

int chunk = 4;

for(int i = 0; i < n; i+=chunk)
    for(int j = 0; j < n; j+=chunk)
        for (int i_chunk = 0; i_chunk < chunk; i_chunk++)
            for (int j_chunk = i_chunk + 1; j_chunk < chunk; j_chunk++)
            {
                printf("(%d, %d)\n", i+i_chunk, j+j_chunk);
                counter++;
            }

However, here printf() is only run 24 times. Because j_chunk starts at i_chunk + 1 in every chunk, the row that the original j loop printed as (0,1) to (0,7) is now covered by the two iterations of the j_chunk loop where i+i_chunk == 0, and these print (0,1) to (0,3) and (0,5) to (0,7), missing (0,4).

I understand why it is doing this but I can't for the life of me come up with a solution; any help would be appreciated.

First you need to make sure that j is never in a lower chunk than i , so your outer loops should be:

for(int i = 0; i < n; i+=chunk)
   for(int j = i; j < n; j+=chunk)
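As a rough illustration of what this change does, the sketch below lists the (i, j) chunk origins that these two outer loops visit. The helper name tile_origins is hypothetical and not part of the original code. With n = 8 and chunk = 4 it yields (0,0), (0,4) and (4,4); the (4,0) chunk, which in the question's version printed out-of-order pairs such as (4,1), is no longer visited.

```cpp
#include <utility>
#include <vector>

// Hypothetical helper for illustration only: returns the (i, j) origins of
// the chunks visited when the j loop starts at i instead of 0.
std::vector<std::pair<int, int>> tile_origins(int n, int chunk)
{
    std::vector<std::pair<int, int>> origins;
    for (int i = 0; i < n; i += chunk)
        for (int j = i; j < n; j += chunk)   // j never starts in a lower chunk than i
            origins.emplace_back(i, j);
    return origins;
}
```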

Then you need different behaviour depending on whether i and j are in the same chunk. If they are, j_chunk always needs to be larger than i_chunk; otherwise you need to go through all possible combinations:

if(i==j)
{
    for (int i_chunk = 0; i_chunk < chunk; i_chunk++)
    {
        for (int j_chunk = i_chunk + 1; j_chunk < chunk; j_chunk++)
        {
            printf("(%d, %d)\n", i+i_chunk, j+j_chunk);
            counter++;
        }
    }
}
else
{
    for (int i_chunk = 0; i_chunk < chunk; i_chunk++)
    {
        for (int j_chunk = 0; j_chunk < chunk; j_chunk++)
        {
            printf("(%d, %d)\n", i+i_chunk, j+j_chunk);
            counter++;
        }
    }
}
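Putting the pieces together, here is a minimal sketch of the whole loop nest. It is one possible arrangement rather than the only way to write it. The function name blocked_pairs is hypothetical, the pairs are collected into a vector instead of being printed so the result can be checked, and the two branches are merged by choosing the start of the inner loop. The std::min clamping is an added assumption, not part of the code above: it keeps the indices below n when n is not a multiple of chunk.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper (not from the original post): collects every pair
// (a, b) with 0 <= a < b < n, visiting them one chunk-sized tile at a time.
std::vector<std::pair<int, int>> blocked_pairs(int n, int chunk)
{
    assert(chunk > 0);
    std::vector<std::pair<int, int>> pairs;

    for (int i = 0; i < n; i += chunk)
    {
        // Assumption: clamp the tile edge so n need not be a multiple of chunk.
        const int i_end = std::min(i + chunk, n);

        for (int j = i; j < n; j += chunk)   // j never starts in a lower chunk than i
        {
            const int j_end = std::min(j + chunk, n);

            for (int a = i; a < i_end; a++)
            {
                // Same chunk (i == j): start just past a so that a < b holds.
                // Different chunks: every b in the tile is already larger than a.
                const int b_begin = (i == j) ? a + 1 : j;

                for (int b = b_begin; b < j_end; b++)
                    pairs.emplace_back(a, b);
            }
        }
    }
    return pairs;
}
```

Calling blocked_pairs(8, 4) returns 28 pairs, matching the counter of the original double loop, and the result includes the previously missing (0, 4). The pairs come out in tile order rather than in the row-by-row order of the original loop.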
