
Benchmarking in Python: Why does my code run slower with repetition?

I have a simple Sieve of Eratosthenes implementation as follows:

from math import sqrt

# Generate all primes less than k
def sieve(k):
    s = [True] * k
    s[0] = s[1] = False
    # Mark every even number above 2 as composite
    for i in range(4, k, 2):
        s[i] = False

    # For each odd prime i, mark its odd multiples starting at i ** 2
    for i in range(3, int(sqrt(k)) + 2, 2):
        if s[i]:
            for j in range(i ** 2, k, i * 2):
                s[j] = False

    return [2] + [ i for i in range(3, k, 2) if s[i] ]

I am benchmarking this code by repeatedly generating primes under 10M:

from time import time

st = time()
for x in range(1000):
    rt = time()
    sieve(10000000)
    print "%3d %.2f %.2f" % (x, time() - rt, (time() - st) / (x + 1))

I am confused, as the time taken per test run (in seconds) increases markedly:

run   t  avg
  0 1.49 1.49
  1 1.79 1.66
  2 2.23 1.85
  3 2.72 2.07
  4 2.67 2.20
  5 2.87 2.31
  6 3.05 2.42
  7 3.57 2.56
  8 3.38 2.65
  9 3.48 2.74
 10 3.81 2.84
 11 3.75 2.92
 12 3.85 2.99
 13 4.14 3.07
 14 4.02 3.14
 15 4.05 3.20
 16 4.48 3.28
 17 4.41 3.34
 18 4.19 3.39
 19 4.22 3.43
 20 4.65 3.49

However, changing every instance of range to xrange eliminates the issue:

run   t  avg
  0 1.26 1.26
  1 1.23 1.28
  2 1.24 1.27
  3 1.25 1.26
  4 1.23 1.26
  5 1.23 1.25
  6 1.25 1.25
  7 1.25 1.25
  8 1.23 1.25
  9 1.25 1.25
 10 1.24 1.25
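For reference, here is that variant spelled out. In Python 2, xrange yields its indices lazily instead of materializing a list per loop (the name sieve_x is mine, just to distinguish the two versions):

from math import sqrt

# Same sieve, with every range swapped for xrange (Python 2)
def sieve_x(k):
    s = [True] * k
    s[0] = s[1] = False
    for i in xrange(4, k, 2):
        s[i] = False

    for i in xrange(3, int(sqrt(k)) + 2, 2):
        if s[i]:
            for j in xrange(i ** 2, k, i * 2):
                s[j] = False

    return [2] + [ i for i in xrange(3, k, 2) if s[i] ]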

Why is this the case? Is it really all GC overhead? A 3x slowdown after 20 runs seems like a lot...
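One quick way to test the GC theory is to run the same benchmark with the cyclic collector disabled; if the slowdown persists, the collector is not the culprit (a sketch using the stdlib gc module, with sieve as defined above):

import gc
from time import time

gc.disable()  # turn off the cyclic collector; refcounting still frees objects

st = time()
for x in range(20):
    rt = time()
    sieve(10000000)
    print "%3d %.2f %.2f" % (x, time() - rt, (time() - st) / (x + 1))

gc.enable()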

This is not (yet) an answer, but just a collection of organized experiments.

This is fascinating, really. It seems that there's some very dubious thing going on with Python's memory allocator.

Here's my attempt to reduce the test case:

from math import sqrt
from time import time

def sieve(k):
    s = [True] * k

    for i in xrange(3, int(sqrt(k)) + 2, 2):
        for j in range(i ** 2, k, i * 2):
            s[j] = False

    return [ i for i in range(3, k, 2) if s[i] ]

st = time()
for x in range(1000):
    rt = time()
    sieve(10000000)
    print "%3d %.2f %.2f" % (x, time() - rt, (time() - st) / (x + 1))

Note that if I remove the if s[i] check, make the inner range an xrange, make the return value a generator, or put a pass in the inner for loop (or make its body s[j] = True), the behaviour disappears and the times are flat.
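For concreteness, this is the inner-xrange variant from that list (only the inner loop changes; this is one of the versions with flat times):

def sieve(k):
    s = [True] * k

    for i in xrange(3, int(sqrt(k)) + 2, 2):
        for j in xrange(i ** 2, k, i * 2):  # range -> xrange here
            s[j] = False

    return [ i for i in range(3, k, 2) if s[i] ]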

The memory usage of Python increases steadily as the function runs, eventually reaching a plateau (at which point the running times start to plateau too, at about 250% of their initial values).
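To watch this yourself, one option is to log the process's peak RSS alongside the timings (a sketch using the stdlib resource module, Unix-only; ru_maxrss is reported in kilobytes on Linux and bytes on macOS):

import resource
from time import time

def peak_rss():
    # Peak resident set size of this process so far
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

st = time()
for x in range(100):
    rt = time()
    sieve(10000000)  # sieve as defined above
    print "%3d %.2f %10d" % (x, time() - rt, peak_rss())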

My hypothesis is that the large number of inner range lists (of decreasing size), plus the final array, cause some sort of worst-case heap fragmentation that makes it very hard to keep allocating objects.
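A back-of-the-envelope count supports the "large number of lists" part (my own calculation, not from the original post; len(xrange(...)) computes the length without building the list):

from math import sqrt

k = 10000000
sizes = [len(xrange(i ** 2, k, i * 2))
         for i in xrange(3, int(sqrt(k)) + 2, 2)]
print len(sizes)  # ~1580 inner range lists per call
print max(sizes)  # the largest (i = 3) holds ~1.67M ints
print sum(sizes)  # total ints allocated across all of them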

My recommendation would be to make a reduced test case and file it as a bug with the Python developers (bugs.python.org).
