
Two very different but very consistent results from Python timeit

In a slightly contrived experiment, I wanted to compare some of Python's built-in functions to those of NumPy. When I started timing these, though, I found something bizarre.

When I wrote the following:

import timeit
timeit.timeit('import math; math.e**2', number=1000000)

I got two distinct results that alternated almost at random, and the gap between them was far too large and too consistent to be noise.

The timing alternated between 2 seconds and 0.5 seconds.
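One way to check whether the timings really fall into two clusters is to collect several runs in one go with timeit.repeat and look at the spread. This is only a sketch: it is written for Python 3 (the statistics module is not in 2.7), and the repeat count and the smaller number value are my own choices to keep it quick.

```python
import statistics
import timeit

# Collect several independent timings of the same statement.
# number is smaller than the 1000000 used above so the sketch runs quickly.
samples = timeit.repeat('import math; math.e**2', repeat=7, number=100000)

print('min    %.4f s' % min(samples))
print('median %.4f s' % statistics.median(samples))
print('max    %.4f s' % max(samples))
```

If the minimum and maximum differ by a factor of several while the values bunch around two levels, that matches the alternation described here.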

This confused me, so I ran some experiments to figure out what was going on, and they only confused me more. First I tried:

[timeit.timeit('import math; math.e**2', number=1000000) for i in xrange(100)]

which produced only the 0.5 second result. I then tried the same thing with a generator expression:

test = (timeit.timeit('import math; math.e**2', number=1000000) for i in xrange(100))
[item for item in test]

which produced a list containing only the 2.0 second result.
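For anyone who wants to repeat the comparison on a current interpreter, here is a rough Python 3 version of the two experiments (range replaces xrange). The names STMT, N, from_list and from_gen are mine, and the run count and number are reduced so it finishes quickly. I make no claim about which numbers it will produce on another machine.

```python
import timeit

STMT = 'import math; math.e**2'
N = 100000  # smaller than the 1000000 used in the question

# Eager: every timing is taken while the list is being built.
from_list = [timeit.timeit(STMT, number=N) for _ in range(5)]

# Lazy: nothing is timed until the generator is consumed.
gen = (timeit.timeit(STMT, number=N) for _ in range(5))
from_gen = list(gen)

print('list comprehension:', ['%.4f' % t for t in from_list])
print('generator:         ', ['%.4f' % t for t in from_gen])
```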

On the suggestion of alecxe I changed my timeit statement to:

timeit.timeit('math.e**2', 'import math', number=1000000)

which similarly alternated between about 0.1 and 0.4 seconds. When I reran the experiment comparing generator expressions and list comprehensions, though, the results were flipped: the generator expression regularly produced the 0.1 second result, while the list comprehension returned a list containing only the 0.4 second result.

Direct console output:

>>> test = (timeit.timeit('math.e**2', 'import math', number=1000000) for i in xrange(100))
>>> test.next()
0.15114784240722656

>>> timeit.timeit('math.e**2', 'import math', number=1000000)
0.44176197052001953
>>> 

Edit: I'm using Ubuntu 12.04 running dwm, and I've seen these results in both xterm and gnome-terminal. I'm using Python 2.7.3.

Does anybody know what's going on here? This seems really bizarre to me.

It turns out there were a couple of things happening here. Some of these quirks may be specific to my machine, but I figure it's worth posting them in case someone else is puzzled by the same thing.

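Firstly, there's a difference between the two timeit calls. In this form:

timeit.timeit('math.e**2', 'import math', number=1000000)

the import statement is passed as the setup argument, which timeit runs once per call rather than on every loop iteration, so its cost stays out of the measurement. This becomes obvious if you try the following experiment: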

timeit.timeit('1+1', 'import math', number=1000000)

versus:

timeit.timeit('1+1', number=1000000)
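A quick way to see how often timeit executes each argument is to pass callables instead of strings and count the calls. This is a minimal sketch: the calls dictionary and the two helper functions are hypothetical names I made up for the illustration.

```python
import timeit

# Hypothetical counters, used only to see how often each callable runs.
calls = {'setup': 0, 'stmt': 0}

def setup_step():
    calls['setup'] += 1

def timed_step():
    calls['stmt'] += 1

# timeit accepts callables as well as strings for stmt and setup.
timeit.timeit(stmt=timed_step, setup=setup_step, number=1000)

print(calls)
```

This prints {'setup': 1, 'stmt': 1000}: the setup runs once per timeit call, while the statement runs number times. Anything placed in the statement string, including an import, is therefore repeated on every iteration.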

So when it was run directly in the list comprehension, it looks like the import statement was being executed for every entry. (The exact reasons for this are probably related to my configuration.)

Beyond that, going back to the original question, it looks like 3/4 of the time was actually spent on import math. My guess is that when the timings came from the generator expression nothing was cached between iterations, while the import was cached within the list comprehension. (Again, the exact reason for this is probably configuration-specific.)
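To estimate how much of the original measurement goes to the import statement, the three pieces can be timed separately. This is a rough Python 3 sketch under my own assumptions: the variable names are mine, number is reduced, and the share it reports is only an approximation, since the three runs are independent measurements.

```python
import timeit

N = 100000  # smaller than the 1000000 used in the question

# The import statement on its own, repeated N times.
import_only = timeit.timeit('import math', number=N)

# The expression on its own, with the import done once in setup.
expr_only = timeit.timeit('math.e**2', 'import math', number=N)

# Both together, as in the original question.
combined = timeit.timeit('import math; math.e**2', number=N)

share = import_only / combined
print('import only: %.4f s' % import_only)
print('expression:  %.4f s' % expr_only)
print('combined:    %.4f s' % combined)
print('approximate import share: %.0f%%' % (100 * share))
```

On a different machine or Python version the share may look nothing like the 3/4 I saw, so treat the output as a local measurement only.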
