
strange result from timeit

I tried to replicate the functionality of IPython's %timeit, but for some strange reason the results for one function are horrific.

IPython:

In [11]: from random import shuffle
   ....: import numpy as np
   ....: def numpy_seq_el_rank(seq, el):
   ....:     return sum(seq < el)
   ....:
   ....: seq = np.array(xrange(10000))
   ....: shuffle(seq)
   ....:

In [12]: %timeit numpy_seq_el_rank(seq, 10000//2)
10000 loops, best of 3: 46.1 µs per loop

Python:

from timeit import timeit, repeat

def my_timeit(code, setup, rep, loops):
    result = repeat(code, setup=setup, repeat=rep, number=loops)
    return '%d loops, best of %d: %0.9f sec per loop'%(loops, rep, min(result))

np_setup = '''
from random import shuffle
import numpy as np
def numpy_seq_el_rank(seq, el):
    return sum(seq < el)

seq = np.array(xrange(10000))
shuffle(seq)
'''
np_code = 'numpy_seq_el_rank(seq, 10000//2)'

print 'Numpy seq_el_rank:\n\t%s'%my_timeit(code=np_code, setup=np_setup, rep=3, loops=100)

And its output:

Numpy seq_el_rank:
    100 loops, best of 3: 1.655324947 sec per loop

As you can see, in Python I ran 100 loops instead of the 10,000 in IPython, because it takes a really long time, and the result still came out some 35,000 times slower. Can anybody explain why the Python result is so slow?

UPD: Here is the output of cProfile.run('my_timeit(code=np_code, setup=np_setup, rep=3, loops=10000)'):

         30650 function calls in 4.987 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    4.987    4.987 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 <timeit-src>:2(<module>)
        3    0.001    0.000    4.985    1.662 <timeit-src>:2(inner)
      300    0.006    0.000    4.961    0.017 <timeit-src>:7(numpy_seq_el_rank)
        1    0.000    0.000    4.987    4.987 Lab10.py:47(my_timeit)
        3    0.019    0.006    0.021    0.007 random.py:277(shuffle)
        1    0.000    0.000    0.002    0.002 timeit.py:121(__init__)
        3    0.000    0.000    4.985    1.662 timeit.py:185(timeit)
        1    0.000    0.000    4.985    4.985 timeit.py:208(repeat)
        1    0.000    0.000    4.987    4.987 timeit.py:239(repeat)
        2    0.000    0.000    0.000    0.000 timeit.py:90(reindent)
        3    0.002    0.001    0.002    0.001 {compile}
        3    0.000    0.000    0.000    0.000 {gc.disable}
        3    0.000    0.000    0.000    0.000 {gc.enable}
        3    0.000    0.000    0.000    0.000 {gc.isenabled}
        1    0.000    0.000    0.000    0.000 {globals}
        3    0.000    0.000    0.000    0.000 {isinstance}
        3    0.000    0.000    0.000    0.000 {len}
        3    0.000    0.000    0.000    0.000 {method 'append' of 'list' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
    29997    0.001    0.000    0.001    0.000 {method 'random' of '_random.Random' objects}
        2    0.000    0.000    0.000    0.000 {method 'replace' of 'str' objects}
        1    0.000    0.000    0.000    0.000 {min}
        3    0.003    0.001    0.003    0.001 {numpy.core.multiarray.array}
        1    0.000    0.000    0.000    0.000 {range}
      300    4.955    0.017    4.955    0.017 {sum}
        6    0.000    0.000    0.000    0.000 {time.clock}

Well, one issue is that you're misreading the results. IPython is telling you how long each single iteration took, for the round of 10,000 iterations with the lowest total time. timeit.repeat is reporting how long the whole round of 100 iterations took (again, for the shortest of three rounds). So the real discrepancy is 46.1 µs per loop (IPython) vs. 16.5 ms per loop (Python): still a factor of ~350x, but not 35,000x.
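To make the Python-side report comparable to %timeit, divide the total by the loop count. A minimal sketch of the corrected helper (written as Python 3; the original question used Python 2):

```python
from timeit import repeat

def my_timeit(code, setup, rep, loops):
    # repeat() returns the TOTAL time for each round of `loops` executions,
    # so divide the best round by `loops` to get a per-loop figure,
    # which is what IPython's %timeit reports.
    result = repeat(code, setup=setup, repeat=rep, number=loops)
    best = min(result) / loops
    return '%d loops, best of %d: %.9f sec per loop' % (loops, rep, best)

print(my_timeit('x * 2', setup='x = 21', rep=3, loops=100000))
```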

You didn't show profiling results for IPython. Is it possible that in your IPython session you did either from numpy import sum or from numpy import *? If so, you'd have been timing numpy.sum (which is optimized for numpy arrays and runs several orders of magnitude faster), while your Python code (which isolates the globals in a way that IPython does not) ran the builtin sum (which has to convert each value to a Python int and add them one by one).
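That gap is easy to reproduce. A short sketch comparing the builtin sum against numpy.sum on the same boolean array (the timings it prints are illustrative, not the figures from the original post):

```python
import timeit
import numpy as np

seq = np.arange(10000)   # the original used np.array(xrange(10000))
mask = seq < 5000        # boolean array; summing it counts the True entries

# Builtin sum iterates element by element, boxing each value as a Python object.
t_builtin = timeit.timeit(lambda: sum(mask), number=100)
# numpy.sum reduces the whole array in C, far faster on large inputs.
t_numpy = timeit.timeit(lambda: np.sum(mask), number=100)

print('builtin sum: %.4fs, numpy.sum: %.4fs (100 calls each)'
      % (t_builtin, t_numpy))
```

Both calls return the same count; only the per-element boxing differs.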

If you check your profiling output, virtually all of the work is being done in sum; if that part of your code were sped up by several orders of magnitude, the total time would shrink accordingly. That would explain the "real" discrepancy; in the test case linked above it was a 40x difference, and that was for a smaller array (the smaller the array, the less numpy can "show off") with more complex values (vs. summing 0s and 1s here, I believe).

The remainder (if any) is probably a matter of the code being evaled slightly differently, or possibly weirdness with the random shuffle (for consistent tests, you'd want to seed random with a fixed seed so the "randomness" is repeatable), but I doubt that accounts for more than a few percent.
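Seeding would go in the setup string; a minimal sketch of how reseeding makes the shuffle repeatable:

```python
import random

random.seed(42)            # fix the seed so the "randomness" repeats
seq = list(range(10))
random.shuffle(seq)
first_run = list(seq)

random.seed(42)            # reseed: the same shuffle order comes back
seq = list(range(10))
random.shuffle(seq)
assert seq == first_run    # identical ordering on every run
print(seq)
```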

There could be any number of reasons this code runs slower in one Python environment than another. One may be optimized differently; one may pre-compile certain parts while the other is fully interpreted. The only way to figure out why is to profile your code.

https://docs.python.org/2/library/profile.html

import cProfile

cProfile.run('repeat(code, setup=setup, repeat=rep, number=loops)')

This will give a result similar to:

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     1    0.000    0.000    0.000    0.000 <stdin>:1(testing)
     1    0.000    0.000    0.000    0.000 <string>:1(<module>)
     1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
     1    0.000    0.000    0.000    0.000 {method 'upper' of 'str' objects}

This shows you which function calls were made, how many times they were made, and how long they took.
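The default listing (as above) is ordered by name; with the standard-library pstats module you can sort by cumulative time to surface the hot spots first. A short sketch (the work function is just a placeholder workload):

```python
import cProfile
import io
import pstats

def work():
    # placeholder workload for the profiler to measure
    return sum(i * i for i in range(100000))

profiler = cProfile.Profile()
profiler.enable()
work()
profiler.disable()

# Render the top 5 entries sorted by cumulative time instead of by name.
buf = io.StringIO()
stats = pstats.Stats(profiler, stream=buf)
stats.sort_stats('cumulative').print_stats(5)
print(buf.getvalue())
```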
