
What is the most efficient way to loop over an array? (C++)

This is sort of a silly question, but it's been bothering me and I couldn't google-fu my way over it.

Consider the following array:

#include <cstdint>  // for uint64_t

struct SomeDataStruct
{
    uint64_t ValueOne;
    uint64_t ValueTwo;
    uint64_t ValueThree;
};

SomeDataStruct _veryLargeArray[1024];

Now, which of these approaches is faster for looping over every element and doing something with each one?

Approach 1:

for (int i = 0; i < 1024; ++i)
{
    _veryLargeArray[i].ValueOne += 1;
    _veryLargeArray[i].ValueTwo += 1;
    _veryLargeArray[i].ValueThree = _veryLargeArray[i].ValueOne + _veryLargeArray[i].ValueTwo;
}

Approach 2:

// One past the last element, formed by pointer arithmetic so that nothing is dereferenced.
SomeDataStruct * pEndOfStruct = _veryLargeArray + 1024;

for (SomeDataStruct * ptr = _veryLargeArray; ptr != pEndOfStruct; ptr += 1)
{
    ptr->ValueOne += 1;
    ptr->ValueTwo += 1;
    ptr->ValueThree = ptr->ValueOne + ptr->ValueTwo;
}

I know the question seems really stupid on its surface, but what I'm wondering is: does the compiler do anything smart/special with each way of implementing the for loop? In the first case, it could be really memory intensive if the compiler actually looked up BaseArrayPointer + Offset every time, but if the compiler is smart enough, it will fill the L2 cache with the entire array and treat the code between the { }'s correctly.

The second way avoids the problem if the compiler is resolving the pointer every time, but it probably makes it really hard for the compiler to figure out that it could copy the entire array into the L2 cache and walk it there.

Sorry for such a silly question. I'm having a lot of fun learning C++ and have started doing that thing where you overthink everything. I'm just curious whether anyone knows of a "definitive" answer.

Unless you want to look at the intermediate assembly language output and analyze the caching behaviour of the CPU, the only way you'll be able to answer this question is to profile the code. Run it hundreds or thousands of times and see how long it takes.
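As a rough illustration of what "run it thousands of times and measure" could look like, here is a minimal timing sketch. The names g_array, kCount, indexLoop, pointerLoop and averageNanoseconds are made up for this example and are not from the question. It is only a sketch: a serious measurement would also need warm-up runs, optimized builds, and a check that the optimizer has not merged or removed the repeated work.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>

struct SomeDataStruct
{
    std::uint64_t ValueOne;
    std::uint64_t ValueTwo;
    std::uint64_t ValueThree;
};

constexpr std::size_t kCount = 1024;  // same element count as in the question
SomeDataStruct g_array[kCount];       // hypothetical name; zero-initialized global

// Approach 1 from the question: index-based loop.
void indexLoop()
{
    for (std::size_t i = 0; i < kCount; ++i)
    {
        g_array[i].ValueOne += 1;
        g_array[i].ValueTwo += 1;
        g_array[i].ValueThree = g_array[i].ValueOne + g_array[i].ValueTwo;
    }
}

// Approach 2 from the question: pointer-based loop.
void pointerLoop()
{
    SomeDataStruct* end = g_array + kCount;  // one past the last element
    for (SomeDataStruct* ptr = g_array; ptr != end; ++ptr)
    {
        ptr->ValueOne += 1;
        ptr->ValueTwo += 1;
        ptr->ValueThree = ptr->ValueOne + ptr->ValueTwo;
    }
}

// Calls fn `repeats` times and returns the average wall-clock time
// per call in nanoseconds, measured with a monotonic clock.
template <typename Fn>
double averageNanoseconds(Fn fn, int repeats)
{
    using Clock = std::chrono::steady_clock;
    const auto start = Clock::now();
    for (int r = 0; r < repeats; ++r)
    {
        fn();
    }
    const auto stop = Clock::now();
    return std::chrono::duration<double, std::nano>(stop - start).count() / repeats;
}
```

Calling averageNanoseconds(indexLoop, 10000) and averageNanoseconds(pointerLoop, 10000) from your own main and printing the two numbers gives a first impression. With optimizations enabled, it would not be surprising if the two came out about the same, but only the measurement on your compiler and machine can tell you.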

If you want the fastest code, write the simplest, most obvious version and leave it to the optimizing compiler. If you try to get fancy with a loop like this, you risk confusing the compiler so that it can't optimize things.
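One candidate for "the simplest, most obvious version" in C++11 and later is a range-based for loop. This is my own illustration rather than something from the question, and updateAll is a hypothetical helper name. It makes no claim about speed; it just has no index or end pointer to get wrong.

```cpp
#include <cstddef>
#include <cstdint>

struct SomeDataStruct
{
    std::uint64_t ValueOne;
    std::uint64_t ValueTwo;
    std::uint64_t ValueThree;
};

// Takes the array by reference so its length N is deduced from the type
// and the loop bounds never have to be written by hand.
template <std::size_t N>
void updateAll(SomeDataStruct (&items)[N])
{
    for (SomeDataStruct& item : items)
    {
        item.ValueOne += 1;
        item.ValueTwo += 1;
        item.ValueThree = item.ValueOne + item.ValueTwo;
    }
}
```

With the array from the question, the call would simply be updateAll(_veryLargeArray).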

I've seen a simple C loop compile to be faster than hand-coded assembly, and a hand-optimized C version that ended up slower than the hand-coded assembly.

On the other hand, it can pay to know a bit about caching and what is going on under the hood. But usually that happens after you've discovered that your code isn't fast enough. Doing otherwise risks premature optimization, which is the root of all evil.
