Improving heapq efficiency

Question

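So I found myself using heapq for some calculations. However, for the problem I'm working on it runs slowly, because the heap gets very large.

I thought I had a way to speed it up. Rather than building one giant heap, make it a heap of heaps. But, to my surprise, the "more efficient" code is significantly slower. The heap-of-heaps code has a bit more overhead, but I really thought it would win by a lot. Having stripped the problem down, I have two functions that perform the same net calculation. f1 is the "naive" (and faster) version. f2 is the "improved" (but slower) version. Both generate random numbers, but I use the same seed, so they really do the same thing.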

import random
import heapq

def f1():
    # Naive version: one flat heap holding every value.
    random.seed(1)
    Q = [0]
    while Q:
        value = heapq.heappop(Q)
        # print(value)
        if value < 0.5:
            for counter in range(16):
                heapq.heappush(Q, value + 0.1 + (random.random() / 1000))
    print(value)

def f2():
    # "Improved" version: a heap whose items are themselves small heaps.
    random.seed(1)
    Q = [[0]]
    while Q:
        subQ = heapq.heappop(Q)       # sub-heap holding the smallest value
        value = heapq.heappop(subQ)
        # print(value)
        if subQ:
            heapq.heappush(Q, subQ)   # put the remainder back
        newQ = []
        if value < 0.5:
            for counter in range(16):
                newQ.append(value + 0.1 + (random.random() / 1000))
            heapq.heapify(newQ)
            heapq.heappush(Q, newQ)
    print(value)

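Why does the heap of heaps (f2) run significantly slower? It should call heappush the same number of times and heappop twice as often. But the individual heaps should be much smaller, so I expected it to run faster.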
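As a side note on why f2 is a valid priority queue at all: Python compares lists element by element, so a heap of heapified lists is ordered first by each sub-heap's minimum. The sketch below checks that on a small input. The helper names and the `chunk` parameter are made up for illustration and are not part of the code in the question.

```python
import heapq
import random

def pop_all_flat(values):
    """Pop everything from a single flat heap."""
    heap = list(values)
    heapq.heapify(heap)
    return [heapq.heappop(heap) for _ in range(len(heap))]

def pop_all_nested(values, chunk=4):
    """Pop everything from a heap whose items are small heaps (chunk is hypothetical)."""
    values = list(values)
    outer = []
    for i in range(0, len(values), chunk):
        sub = values[i:i + chunk]
        heapq.heapify(sub)            # sub[0] is now the minimum of this sub-heap
        outer.append(sub)
    heapq.heapify(outer)              # lists compare element by element, sub[0] first
    result = []
    while outer:
        sub = heapq.heappop(outer)    # the sub-heap holding the global minimum
        result.append(heapq.heappop(sub))
        if sub:                       # push the remainder back if anything is left
            heapq.heappush(outer, sub)
    return result

random.seed(1)
data = [random.random() for _ in range(50)]
print(pop_all_flat(data) == pop_all_nested(data))
```

Both functions should return the values in ascending order. The nested version does this with two pops and up to one push on the outer heap per value, which is the extra overhead the question mentions.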

Answer 1

So I just wasn't pushing the code hard enough. Here is some modified code. The benefit I was after shows up once subQ gets very large.

def f1(m, n):
    # m outer iterations; each pops one value and pushes n new ones.
    random.seed(1)
    Q = [0]
    for counter in range(m):
        value = heapq.heappop(Q)
        # print(value)
        for newcounter in range(n):
            heapq.heappush(Q, random.random())
    print(value)  # should be the same for both methods, so this is just a test

def f2(m, n):
    random.seed(1)
    Q = [[0]]
    for counter in range(m):  # was hard-coded to 1000000, ignoring m
        subQ = heapq.heappop(Q)
        value = heapq.heappop(subQ)
        # print(value)
        if subQ:
            heapq.heappush(Q, subQ)
        newQ = []
        for newcounter in range(n):
            newQ.append(random.random())
        heapq.heapify(newQ)
        heapq.heappush(Q, newQ)
    print(value)  # should be the same for both methods, so this is just a test

When I profile f1(1000000,10) and f2(1000000,10), I get run times of 10.7 seconds and 14.8 seconds. The relevant details are:

f1:

ncalls    tottime  percall  cumtime  percall  filename:lineno(function)
1000000   1.793    0.000    1.793    0.000    {_heapq.heappop}
10000000  3.856    0.000    3.856    0.000    {_heapq.heappush}

f2:

1000000   1.095    0.000    1.095    0.000    {_heapq.heapify}
2000000   2.628    0.000    2.628    0.000    {_heapq.heappop}
1999999   2.245    0.000    2.245    0.000    {_heapq.heappush}
10000000  1.114    0.000    1.114    0.000    {method 'append' of 'list' objects}

So on net f2 loses because of the extra heappop calls plus heapify and append. It does better on heappush.
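For anyone who wants to produce a table like this themselves, here is a minimal sketch using the standard cProfile and pstats modules. The answer does not say exactly how the profile was taken, so treat this as one possible way; `workload` and `profile_report` are hypothetical names, with `workload` standing in for f1(m, n).

```python
import cProfile
import heapq
import io
import pstats
import random

def workload(m, n):
    """Hypothetical stand-in for f1(m, n): m times, pop once and push n random values."""
    random.seed(1)
    Q = [0]
    value = None
    for _ in range(m):
        value = heapq.heappop(Q)
        for _ in range(n):
            heapq.heappush(Q, random.random())
    return value

def profile_report(func, *args):
    """Run func under cProfile and return the stats table as a string."""
    profiler = cProfile.Profile()
    profiler.enable()
    func(*args)
    profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("tottime").print_stats()   # most expensive entries first
    return buffer.getvalue()

report = profile_report(workload, 1000, 10)
print(report)
```

The sizes here are kept small so the sketch finishes quickly. The absolute times will differ from the ones quoted in this answer.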

But when I challenge it with a larger inner loop and run f1(1000,100000) and f2(1000,100000), we get

f1:

1000       0.015   0.000   0.015   0.000   {_heapq.heappop}
100000000  28.730  0.000   28.730  0.000   {_heapq.heappush}

f2:

1000       19.952  0.020   19.952  0.020   {_heapq.heapify}
2000       0.011   0.000   0.011   0.000   {_heapq.heappop}
1999       0.006   0.000   0.006   0.000   {_heapq.heappush}
100000000  6.977   0.000   6.977   0.000   {method 'append' of 'list' objects}

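So now we do much better on heappush, and that is enough for f2 to run faster (69 seconds versus 75 seconds).

So it turns out I just hadn't pushed my code hard enough. I needed things to get large enough that many calls to heappush became slower than many calls to heapify.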
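The trade-off can be looked at in isolation with a small timing sketch. heapify builds a heap in one linear-time pass, while each heappush may sift an item up through as many as O(log n) levels in the worst case. The function names and the input size below are illustrative assumptions, and which approach wins on a given machine is left for the measurement to show.

```python
import heapq
import random
import timeit

def build_with_push(values):
    """Build a heap one heappush at a time, as f1 does."""
    heap = []
    for v in values:
        heapq.heappush(heap, v)   # each push may sift up through several levels
    return heap

def build_with_heapify(values):
    """Collect the values first, then heapify once, as f2 does for each newQ."""
    heap = list(values)           # plain copy, standing in for the append loop
    heapq.heapify(heap)           # single linear-time pass
    return heap

random.seed(1)
data = [random.random() for _ in range(100000)]

push_time = timeit.timeit(lambda: build_with_push(data), number=5)
heapify_time = timeit.timeit(lambda: build_with_heapify(data), number=5)
print("heappush loop: %.3f s" % push_time)
print("heapify:       %.3f s" % heapify_time)
```

Note that in f1 the pushes go onto one shared heap that keeps growing, so this sketch only isolates the cost of building a single batch.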

Answer 1
0 2015-10-03 02:29:05
