
Why is my code slower in PyPy than the default Python interpreter?

Given a number of players n, I need to find H, the list of all tuples where each tuple is a combination of coalitions that respects this rule: each player appears exactly once (i.e. in exactly one coalition). A coalition is a tuple of players, e.g. (1,2,3) is the coalition of players 1, 2 and 3, and ((1,2,3),(4,5),(6,)) is a combination of coalitions. P.S. Each combination of coalitions is called a layout in the code.

At the beginning I wrote a snippet in which I computed all combinations of all coalitions, and for each combination I checked the rule. The problem is that for 5-6 players the number of combinations of coalitions was already so big that my computer went phut. In order to avoid a big part of the computation (all possible combinations, the loop and the ifs) I wrote the following (which I tested, and it is equivalent to the previous snippet):

from itertools import combinations, combinations_with_replacement, product, permutations

n = 3  # number of players (n is given; 3 matches the worked example below)
players = range(1, n+1)
# All possible coalitions, grouped by size: coalitions[k-1] holds the k-player coalitions.
coalitions = [list(combinations(players, length)) for length in players]

# Seed H with the two obvious layouts: all singletons, and the grand coalition.
H = [tuple(coalitions[0]), (coalitions[-1][0],)]
# All layout "shapes": non-decreasing tuples of coalition sizes that sum to n.
combs = [comb for length in xrange(2, n) for comb in combinations_with_replacement(players, length) if sum(comb) == n]
perms = list(permutations(players))
# Cut each permutation into consecutive blocks of the sizes given by comb;
# nested frozensets make order irrelevant, so the outer set removes duplicates.
layouts = set(frozenset(frozenset(perm[i:i+x]) for (i, x) in zip([0] + [sum(comb[:y]) for y in xrange(1, len(comb))], comb)) for comb in combs for perm in perms)
H.extend(tuple(tuple(coal) for coal in layout) for layout in layouts)
print H

EXPLANATION: say n = 3

First I create all possible coalitions:

coalitions = [[(1,),(2,),(3,)],[(1,2),(1,3),(2,3)],[(1,2,3)]]

Then I initialize H with the obvious combinations: each player in his own coalition, and every player in the biggest coalition.

H = [((1,),(2,),(3,)),((1,2,3),)]

Then I compute all the possible forms of the layouts:

combs = [(1,2)]   #(1,2) represents a layout in which there is 
                  #one 1-player coalition and one 2-player coalition.

I compute the permutations (perms). Finally, for each perm and each comb I calculate the different possible layouts. I turn the result (layouts) into a set in order to delete duplicates, and add it to H (a small sketch of the slicing step follows the result below):

H = [((1,),(2,),(3,)),((1,2,3),),((1,2),(3,)),((1,3),(2,)),((2,3),(1,))]
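To make the slicing step concrete, here is a minimal sketch (my own illustration, not part of the script above) of how one (perm, comb) pair becomes a layout: the running sums of comb give the offsets at which the permutation is cut into consecutive blocks.

perm = (3, 1, 2)
comb = (1, 2)  # one 1-player coalition followed by one 2-player coalition

# Block start offsets: [0, 1] for comb = (1, 2).
offsets = [0] + [sum(comb[:y]) for y in xrange(1, len(comb))]
layout = frozenset(frozenset(perm[i:i+x]) for (i, x) in zip(offsets, comb))
print layout  # a frozenset holding frozenset([3]) and frozenset([1, 2])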

Here's the comparison:

python script.py

  • 4: 0.000520944595337 seconds
  • 5: 0.0038321018219 seconds
  • 6: 0.0408189296722 seconds
  • 7: 0.431486845016 seconds
  • 8: 6.05224680901 seconds
  • 9: 76.4520540237 seconds

pypy script.py

  • 4: 0.00342392921448 seconds
  • 5: 0.0668039321899 seconds
  • 6: 0.311077833176 seconds
  • 7: 1.13124799728 seconds
  • 8: 11.5973010063 seconds
  • 9: went phut

Why is PyPy so much slower? What should I change?

First, I want to point out that you are studying the Bell numbers, which might ease the next part of your work after you're done generating all the subsets. For example, it's easy to know how large each Bell set will be; OEIS has the sequence of Bell numbers already.
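For instance, here is a minimal sketch (my own, using the standard Bell-triangle recurrence, not code from this answer) that computes those sizes up front, so you know how many partitions bell(n) will produce before generating any of them:

def bell_number(n):
    # Bell triangle: each row starts with the last entry of the previous
    # row, and each subsequent entry adds the entry above-left.
    row = [1]
    for _ in range(n - 1):
        new_row = [row[-1]]
        for entry in row:
            new_row.append(new_row[-1] + entry)
        row = new_row
    return row[-1]  # B(n), the number of partitions of an n-element set

print [bell_number(i) for i in range(1, 10)]
# [1, 2, 5, 15, 52, 203, 877, 4140, 21147]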

I hand-wrote the loops to generate the Bell sets; here is my code:

# cache maps x to the tuple of all set-partitions of {1, ..., x},
# each partition stored as a tuple of sets.
cache = {0: (), 1: ((set([1]),),)}

def bell(x):
    # Change these lines to alter memoization.
    if x in cache:
        return cache[x]
    previous = bell(x - 1)
    new = []
    for sets in previous:
        # Add the new element x to each existing block in turn...
        r = []
        for mark in range(len(sets)):
            l = [s | set([x]) if i == mark else s for i, s in enumerate(sets)]
            r.append(tuple(l))
        new.extend(r)
        # ...and also as a singleton block of its own.
        new.append(sets + (set([x]),))
    cache[x] = tuple(new)
    return new
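As a quick sanity check (output hand-traced by me, not taken from the answer), bell(3) yields the five partitions of {1, 2, 3}, matching the Bell number B(3) = 5:

for layout in bell(3):
    print layout
# (set([1, 2, 3]),)
# (set([1, 2]), set([3]))
# (set([1, 3]), set([2]))
# (set([1]), set([2, 3]))
# (set([1]), set([2]), set([3]))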

I included some memoization here for practical purposes. However, by commenting out some code and writing some other code, you can obtain the following un-memoized version, which I used for benchmarks:

def bell(x):
    if x == 0:
        return ()
    if x == 1:
        return ((set([1]),),)
    previous = bell(x - 1)
    new = []
    for sets in previous:
        r = []
        for mark in range(len(sets)):
            l = [s | set([x]) if i == mark else s for i, s in enumerate(sets)]
            r.append(tuple(l))
        new.extend(r)
        new.append(sets + (set([x]),))
    return new  # no cache write: this version recomputes everything

My numbers are based on a several-year-old ThinkPad that I do most of my work on. Most of the smaller cases are way too fast to measure reliably (not even a single millisecond per trial for the first few), so my benchmarks test bell(9) through bell(11).

Benchmarks for CPython 2.7.11, using the standard timeit module:

$ python -mtimeit -s 'from derp import bell' 'bell(9)'
10 loops, best of 3: 31.5 msec per loop
$ python -mtimeit -s 'from derp import bell' 'bell(10)'
10 loops, best of 3: 176 msec per loop
$ python -mtimeit -s 'from derp import bell' 'bell(11)'
10 loops, best of 3: 1.07 sec per loop

And on PyPy 4.0.1, also using timeit:

$ pypy -mtimeit -s 'from derp import bell' 'bell(9)'
100 loops, best of 3: 14.3 msec per loop
$ pypy -mtimeit -s 'from derp import bell' 'bell(10)'
10 loops, best of 3: 90.8 msec per loop
$ pypy -mtimeit -s 'from derp import bell' 'bell(11)'
10 loops, best of 3: 675 msec per loop

So, the conclusion that I've come to is that itertools is not very fast when you try to use it outside of its intended idioms. Bell numbers are interesting combinatorially, but they do not naturally arise from any simple composition of itertools widgets that I can find.

In response to your original query of what to do to make it faster: just open-code it. Hope this helps!

~ C.

Here's a PyPy issue on itertools.product:

https://bitbucket.org/pypy/pypy/issues/1677/itertoolsproduct-slower-than-nested-fors

Note that our goal is to ensure that itertools is not massively slower than plain Python, but we don't really care about making it exactly as fast (or faster) as plain Python. As long as it's not massively slower, it's fine. (At least I don't agree with you about whether a) or b) is easier to read :-)

Without studying your code in detail, it looks like it makes heavy use of the itertools combinations, permutations and product functions. In regular CPython those are written in compiled C code, with the intention of making them fast. PyPy does not implement the C code, so it shouldn't be surprising that these functions are slower.
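As a minimal sketch of what "open-code it" means in practice (the function names and this toy example are mine, not from the issue or the answers), replacing an itertools.product call with plain nested loops gives PyPy's tracing JIT ordinary Python loops that it optimizes well:

from itertools import product

def pairs_itertools(xs, ys):
    # Backed by compiled C code in CPython; PyPy ships its own,
    # historically slower, implementation of itertools.
    return [(x, y) for (x, y) in product(xs, ys)]

def pairs_open_coded(xs, ys):
    # The same result with open-coded nested loops, which PyPy's
    # JIT can trace and compile directly.
    result = []
    for x in xs:
        for y in ys:
            result.append((x, y))
    return result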
