
Why is direct indexing of an array significantly faster than iteration?

Just some Python code for an example:

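from timeit import default_timer as timer  # assumed: the question does not show where timer comes from

nums = [1,2,3]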
start = timer()
for i in range(len(nums)):
  print(nums[i])
end = timer()

print((end-start)) #computed to 0.0697546862831


start = timer()
print(nums[0])
print(nums[1])
print(nums[2])
end = timer()

print((end-start)) #computed to 0.0167170338524

I can grasp that some extra time will be taken in the loop because the value of i must be incremented a few times, but the difference between the running times of these two methods seems a lot bigger than I expected. Is there something else happening under the hood that I'm not considering?

Short answer: it isn't, unless the loop is very small. The for loop has a small overhead, but the way you're doing it is inefficient. By using range(len(nums)) you're effectively creating another list (in Python 2, range() builds the whole list up front) and iterating through that, then doing the same index lookups anyway. Try this:

for i in nums:
    print(i)

Results for me were as expected:

>>> import timeit
>>> timeit.timeit('nums[0];nums[1];nums[2]', setup='nums = [1,2,3]')
0.10711812973022461
>>> timeit.timeit('for i in nums:pass', setup='nums = [1,2,3]')
0.13474011421203613
>>> timeit.timeit('for i in range(len(nums)):pass', setup='nums = [1,2,3]')
0.42371487617492676

With a bigger list the advantage of the loop becomes apparent, because the incremental cost of accessing an element by index outweighs the one-off cost of the loop:

>>> timeit.timeit('for i in nums:pass', setup='nums = range(0,100)')
1.541944980621338
>>> timeit.timeit(';'.join('nums[%s]' % i for i in range(0,100)), setup='nums = range(0,100)')
2.5244338512420654

In Python 3, which puts a greater emphasis on iterators over indexable lists, the difference is even greater:

>>> timeit.timeit('for i in nums:pass', setup='nums = range(0,100)')
1.6542046590038808
>>> timeit.timeit(';'.join('nums[%s]' % i for i in range(0,100)), setup='nums = range(0,100)')
10.331634456000756
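
One caveat about the Python 3 numbers above: there, range(0, 100) is a lazy range object rather than a list, so nums[i] indexes the range object and not a list. A minimal sketch of a like-for-like comparison, assuming Python 3 and wrapping the range in list(). The helper name compare and the repeat count are made up for illustration, and no particular timings are implied:

```python
import timeit


def compare(setup, number=10_000):
    """Time iteration vs. unrolled indexing for a given setup string."""
    # Same statements as in the answer: a plain loop and 100 index lookups.
    loop_stmt = 'for i in nums: pass'
    index_stmt = ';'.join('nums[%s]' % i for i in range(0, 100))
    return (
        timeit.timeit(loop_stmt, setup=setup, number=number),
        timeit.timeit(index_stmt, setup=setup, number=number),
    )


if __name__ == '__main__':
    # range object (what a Python 3 run of the snippet above measures)
    print('range:', compare('nums = range(0, 100)'))
    # real list (comparable to the Python 2 run)
    print('list: ', compare('nums = list(range(0, 100))'))
```

If the gap shrinks for the list setup, part of the Python 3 difference comes from indexing a range object rather than from indexing as such.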

With such a small array you're probably measuring noise first, and then the overhead of calling range(). Note that range not only has to increment a variable a few times, it also creates objects that hold the iteration state (the current value): in Python 3 a range object and its iterator, in Python 2 a full list. The function call and object creation are two things you don't pay for in the second example, and for very short iterations they will probably dwarf three array accesses.
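
A small sketch of the objects involved, assuming Python 3. There, range() returns a lazy sequence (strictly speaking not a generator), and the for loop obtains a separate iterator from it that tracks the current position. The variable names are illustrative only:

```python
import types
from collections.abc import Sequence

nums = [1, 2, 3]

r = range(len(nums))  # one call to len(), one to range(): a new range object
it = iter(r)          # the for loop does this implicitly: a second new object

# range is a lazy sequence, not a generator.
is_generator = isinstance(r, types.GeneratorType)
is_sequence = isinstance(r, Sequence)

# The iterator is what carries the "current value" state between steps.
first = next(it)
second = next(it)
remaining = list(it)

print(is_generator, is_sequence, first, second, remaining)
```

None of these calls or objects exist in the version that writes out nums[0], nums[1], nums[2] by hand.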

Essentially your second snippet does loop unrolling, which is a viable and frequently used technique for speeding up performance-critical code.
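
A rough way to see what unrolling removes is to compare the bytecode of the two forms. The sketch below assumes CPython (opcode names are an implementation detail and can differ between versions); looped, unrolled and opnames are hypothetical helper names:

```python
import dis

NUMS = [1, 2, 3]


def looped():
    total = 0
    for i in range(len(NUMS)):
        total += NUMS[i]
    return total


def unrolled():
    # Same work with the loop written out by hand.
    total = 0
    total += NUMS[0]
    total += NUMS[1]
    total += NUMS[2]
    return total


def opnames(func):
    """Return the list of opcode names in a function's bytecode."""
    return [ins.opname for ins in dis.get_instructions(func)]


if __name__ == '__main__':
    print('FOR_ITER in looped:  ', 'FOR_ITER' in opnames(looped))
    print('FOR_ITER in unrolled:', 'FOR_ITER' in opnames(unrolled))
```

The looped version contains the iteration opcode (FOR_ITER) plus the calls to len() and range(), while the unrolled one is straight-line subscript operations with no iterator bookkeeping.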

A for loop has a cost in any case, and the one you wrote is especially costly. Here are four versions, timed with timeit:

from timeit import timeit

NUMS = [1, 2, 3]


def one():
    for i in range(len(NUMS)):
        NUMS[i]


def one_no_access():
    for i in range(len(NUMS)):
        i


def two():
    NUMS[0]
    NUMS[1]
    NUMS[2]


def three():
    for i in NUMS:
        i


for func in (one, one_no_access, two, three):
    print(func.__name__ + ':', timeit(func))

Here are the measured times:

one: 1.0467438200000743
one_no_access: 0.8853238560000136
two: 0.3143197629999577
three: 0.3478466749998006

one_no_access shows the cost of the expression range(len(NUMS)). Because Python lists are stored contiguously in memory, random access to an element is O(1), which explains why two is the quickest.
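
The O(1) claim can be checked informally by timing lookups at both ends of a large list; if access cost grew with position, the last index would be far slower than the first. This is a sketch only, with an arbitrary list size and a hypothetical helper name time_index, and no particular timings are implied:

```python
from timeit import timeit

BIG = list(range(1_000_000))


def time_index(position, number=1_000_000):
    """Return the seconds taken for `number` lookups of BIG[position]."""
    return timeit(
        'BIG[position]',
        globals={'BIG': BIG, 'position': position},
        number=number,
    )


if __name__ == '__main__':
    print('first:', time_index(0))
    print('last: ', time_index(len(BIG) - 1))
```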
