Why is the equivalent Python code so much slower?

Question

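Can somebody explain why the following trivial code (an implementation of Euclid's algorithm for finding the greatest common divisor) is about 3 times slower than the equivalent code in Ruby?

Contents of iter_gcd.py: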

from sys import argv,stderr

def gcd(m, n):
    if n > m:
        m, n = n, m
    while n != 0:
        rem = m % n
        m = n
        n = rem
    return m

# In the Python 3 version, xrange is replaced with range
def main(a1, a2):
    comp = 0
    for j in xrange(a1, 1, -1):
        for i in xrange(1, a2):
            comp += gcd(i,j)

    print(comp)

if __name__ == '__main__':
    if len(argv) != 3:
        stderr.write('usage: {0:s} num1 num2\n'.format(argv[0]))
        exit(1)
    else:
        main(int(argv[1]), int(argv[2]))

Contents of iter_gcd.rb:

def gcd(m, n)
    while n != 0
        rem = m % n
        m = n
        n = rem
    end
    return m
end

def main(a1, a2)
    comp = 0
    a1.downto 2 do
        |j|
        1.upto (a2 - 1) do
            |i|
            comp += gcd(i,j)
        end
    end
    puts comp
end

if __FILE__ == $0
    if ARGV.length != 2
        $stderr.puts('usage: %s num1 num2' % $0)
        exit(1)
    else
        main(ARGV[0].to_i, ARGV[1].to_i)
    end
end

Execution time measurements:

$ time python iter_gcd.py 4000 3000
61356305

real    0m22.890s
user    0m22.867s
sys     0m0.006s

$ python -V
Python 2.6.4


$ time python3 iter_gcd.py 4000 3000
61356305

real    0m18.634s
user    0m18.615s
sys     0m0.009s

$ python3 -V
Python 3.1.2


$ time ruby iter_gcd.rb 4000 3000
61356305

real    0m7.619s
user    0m7.616s
sys     0m0.003s

$ ruby -v
ruby 1.9.2p0 (2010-08-18 revision 29036) [x86_64-linux]

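I'm just curious why I get such results. I considered CPython to be faster in most cases than MRI, and even than the new Ruby 1.9 on YARV, but this "microbenchmark" really surprised me.

By the way, I know I can use a specialized library function like fractions.gcd, but I'd like to compare implementations of such basic and trivial language constructs.

Did I miss something, or is the implementation of the next Ruby generation really so much improved in terms of sheer speed?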
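A side note on the library routine mentioned above: fractions.gcd was the standard-library option in the Python 2.6/3.1 era. In current Python 3 the equivalent is math.gcd (available since 3.5), and fractions.gcd was removed in 3.9. The following is a minimal sketch, assuming Python 3.5 or later, that checks the hand-written gcd against math.gcd over a small range. The helper name total_gcd and the small arguments are illustrative and not part of the original post.

```python
import math


def gcd(m, n):
    """Iterative Euclid's algorithm, same logic as in iter_gcd.py."""
    if n > m:
        m, n = n, m
    while n != 0:
        m, n = n, m % n
    return m


def total_gcd(a1, a2, gcd_func):
    """Sum gcd_func(i, j) over the same ranges the benchmark iterates."""
    comp = 0
    for j in range(a1, 1, -1):
        for i in range(1, a2):
            comp += gcd_func(i, j)
    return comp


# Small arguments keep the check quick; the benchmark itself uses 4000 and 3000.
assert total_gcd(40, 30, gcd) == total_gcd(40, 30, math.gcd)
```

Passing the gcd implementation as a parameter makes it easy to swap in the library routine without touching the loops being compared.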

Answer 1

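Summary

"Because the function call overhead in Python is much larger than in Ruby."

Details

Since this is a microbenchmark, it doesn't say much about how either language performs in proper use. You would likely want to rewrite the program to take advantage of the strengths of Python and Ruby, but it does illustrate one of Python's current weak points. The root cause of the speed difference is function call overhead. I ran a few tests to illustrate this; see below for the code and more details. For the Python tests, I used 2000 for both command-line arguments.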

Interpreter: Python 2.6.6
Program type: gcd using function call
Total CPU time: 29.336 seconds

Interpreter: Python 2.6.6
Program type: gcd using inline code
Total CPU time: 13.194 seconds

Interpreter: Python 2.6.6
Program type: gcd using inline code, with dummy function call
Total CPU time: 30.672 seconds

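This tells us that the calculation done by the gcd function is not the biggest contributor to the time difference; the function call itself is. Moving the calculation inline saves about 16 of the 29 seconds, and adding a dummy function call back brings the total to roughly the original figure. With Python 3.1, the difference is similar: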

Interpreter: Python 3.1.3rc1
Program type: gcd using function call
Total CPU time: 30.920 seconds

Interpreter: Python 3.1.3rc1
Program type: gcd using inline code
Total CPU time: 15.185 seconds

Interpreter: Python 3.1.3rc1
Program type: gcd using inline code, with dummy function call
Total CPU time: 33.739 seconds

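Again, the actual calculation is not the biggest contributor; the function call itself is. In Ruby, the function call overhead is much smaller. (Note: I had to use smaller arguments (200) for the Ruby versions of the programs because the Ruby profiler slows down real-time performance considerably. That doesn't affect the CPU time comparison, though.)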

Interpreter: ruby 1.9.2p0 (2010-08-18 revision 29036) [i486-linux]
Program type: gcd using function call
Total CPU time: 21.66 seconds

Interpreter: ruby 1.9.2p0 (2010-08-18 revision 29036) [i486-linux]
Program type: gcd using inline code
Total CPU time: 21.31 seconds

Interpreter: ruby 1.8.7 (2010-08-16 patchlevel 302) [i486-linux]
Program type: gcd using function call
Total CPU time: 27.00 seconds

Interpreter: ruby 1.8.7 (2010-08-16 patchlevel 302) [i486-linux]
Program type: gcd using inline code
Total CPU time: 24.83 seconds

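Notice that neither Ruby 1.8 nor 1.9 suffers much from the gcd function call: the function-call and inline versions take more or less the same time. Ruby 1.9 seems to do slightly better, with a smaller difference between the two versions.

So the answer to the question is: "because the function call overhead in Python is much larger than in Ruby".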
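To reproduce the function-call versus inline comparison without a profiler, here is a minimal sketch using the standard timeit module. It assumes Python 3 (range instead of xrange); the function names with_call and inline and the small arguments are illustrative and not from the original measurements.

```python
import timeit


def gcd(m, n):
    if n > m:
        m, n = n, m
    while n != 0:
        rem = m % n
        m = n
        n = rem
    return m


def with_call(a1, a2):
    """Benchmark loop that calls gcd() once per pair."""
    comp = 0
    for j in range(a1, 1, -1):
        for i in range(1, a2):
            comp += gcd(i, j)
    return comp


def inline(a1, a2):
    """Same loop with the gcd calculation written inline."""
    comp = 0
    for j in range(a1, 1, -1):
        for i in range(1, a2):
            if i < j:
                m, n = j, i
            else:
                m, n = i, j
            while n != 0:
                rem = m % n
                m = n
                n = rem
            comp += m
    return comp


if __name__ == "__main__":
    args = (200, 200)  # small, illustrative arguments
    assert with_call(*args) == inline(*args)
    for func in (with_call, inline):
        # Take the best of 3 runs to reduce timing noise.
        seconds = min(timeit.repeat(lambda: func(*args), number=1, repeat=3))
        print("{0:<10s} {1:.3f} s".format(func.__name__, seconds))
```

The absolute numbers and the ratio between the two versions depend on the interpreter version and the machine, so treat the output as indicative only.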

Code

# iter_gcd -- Python 2.x version, with gcd function call
#             Python 3.x version uses range instead of xrange
from sys import argv,stderr

def gcd(m, n):
    if n > m:
        m, n = n, m
    while n != 0:
        rem = m % n
        m = n
        n = rem
    return m

def main(a1, a2):
    comp = 0
    for j in xrange(a1, 1, -1):
        for i in xrange(1, a2):
            comp += gcd(i,j)
    print(comp)

if __name__ == '__main__':
    if len(argv) != 3:
        stderr.write('usage: {0:s} num1 num2\n'.format(argv[0]))
        exit(1)
    else:
        main(int(argv[1]), int(argv[2]))

# iter_gcd -- Python 2.x version, inline calculation
#             Python 3.x version uses range instead of xrange
from sys import argv,stderr

def main(a1, a2):
    comp = 0
    for j in xrange(a1, 1, -1):
        for i in xrange(1, a2):
            if i < j:
                m, n = j, i
            else:
                m, n = i, j
            while n != 0:
                rem = m % n
                m = n
                n = rem
            comp += m
    print(comp)

if __name__ == '__main__':
    if len(argv) != 3:
        stderr.write('usage: {0:s} num1 num2\n'.format(argv[0]))
        exit(1)
    else:
        main(int(argv[1]), int(argv[2]))

# iter_gcd -- Python 2.x version, inline calculation, dummy function call
#             Python 3.x version uses range instead of xrange
from sys import argv,stderr

def dummyfunc(n, m):
    a = n + m

def main(a1, a2):
    comp = 0
    for j in xrange(a1, 1, -1):
        for i in xrange(1, a2):
            if i < j:
                m, n = j, i
            else:
                m, n = i, j
            while n != 0:
                rem = m % n
                m = n
                n = rem
            comp += m
            dummyfunc(i, j)
    print(comp)

if __name__ == '__main__':
    if len(argv) != 3:
        stderr.write('usage: {0:s} num1 num2\n'.format(argv[0]))
        exit(1)
    else:
        main(int(argv[1]), int(argv[2]))

# iter_gcd -- Ruby version, with gcd function call

def gcd(m, n)
    if n > m
        m, n = n, m
    end
    while n != 0
        rem = m % n
        m = n
        n = rem
    end
    return m
end

def main(a1, a2)
    comp = 0
    a1.downto 2 do
        |j|
        1.upto a2-1 do
            |i|
            comp += gcd(i,j)
        end
    end
    puts comp
end

if __FILE__ == $0
    if ARGV.length != 2
        $stderr.puts('usage: %s num1 num2' % $0)
        exit(1)
    else
        main(ARGV[0].to_i, ARGV[1].to_i)
    end
end

# iter_gcd -- Ruby version, with inline gcd

def main(a1, a2)
    comp = 0
    a1.downto 2 do |j|
        1.upto a2-1 do |i|
            m, n = i, j
            if n > m
                m, n = n, m
            end
            while n != 0
                rem = m % n
                m = n
                n = rem
            end
            comp += m
        end
    end
    puts comp
end

if __FILE__ == $0
    if ARGV.length != 2
        $stderr.puts('usage: %s num1 num2' % $0)
        exit(1)
    else
        main(ARGV[0].to_i, ARGV[1].to_i)
    end
end

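Test runs

Finally, the commands used to run Python and Ruby with profiling to get the numbers for comparison were pythonX.X -m cProfile iter_gcdX.py 2000 2000 for Python and rubyX.X -rprofile iter_gcdX.rb 200 200 for Ruby. The arguments differ because the Ruby profiler adds a lot of overhead. The results are still valid because I'm comparing function calls against inline code within each language, not Python against Ruby.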
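The same kind of profile can also be collected from inside a script instead of with -m cProfile on the command line. The following is a minimal sketch, assuming Python 3 and only the standard cProfile, pstats and io modules; the small arguments and the variable names are illustrative.

```python
import cProfile
import io
import pstats


def gcd(m, n):
    if n > m:
        m, n = n, m
    while n != 0:
        rem = m % n
        m = n
        n = rem
    return m


def main(a1, a2):
    comp = 0
    for j in range(a1, 1, -1):
        for i in range(1, a2):
            comp += gcd(i, j)
    return comp


# Profile one run with small, illustrative arguments.
profiler = cProfile.Profile()
profiler.enable()
result = main(100, 100)
profiler.disable()

# Render the five entries with the largest own time into a string.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("tottime").print_stats(5)
report = stream.getvalue()
print(report)
```

In the report, the ncalls and tottime columns for gcd show how many calls were made and how much time was spent inside the function body itself.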

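See also

Why is python slower than Ruby even with this very simple "test"?

Is there something wrong with this python code, why does it run so slow compared to ruby?

The Computer Language Benchmarks Game

Google Search: ruby python function call faster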

Answer 2

I can confirm that ruby1.9 is faster than CPython for this "microbenchmark" on my machine:

| Interpreter                     | Time, s | Ratio |
|---------------------------------+---------+-------|
| python-2.6 (cython_gcd.gcd_int) |     2.8 |  0.33 |
| pypy-1.4                        |     3.5 |  0.41 |
| jython-2.5 (java "1.6.0_20")    |     4.7 |  0.55 |
| python-2.6 (cython_gcd.gcd)     |     5.6 |  0.65 |
| ruby-1.9                        |     8.6 |  1.00 |
| jython-2.5                      |     8.9 |  1.03 |
| python-3.2                      |    11.0 |  1.28 |
| python-2.6                      |    15.9 |  1.85 |
| ruby-1.8                        |    42.6 |  4.95 |
#+TBLFM: $3=$2/@6$2;%.2f

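The profiler (python -mcProfile iter_gcd.py 4000 3000) shows that 80% of the time is spent calling the gcd() function, so the difference does indeed lie in the gcd() function.

I wrote a cython_gcd extension for Python using Cython. Here is cython_gcd.pyx: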

def gcd(m, n):
    while n:
        n, m = m % n, n
    return m

def gcd_int(int m, int n):
    while n:
        n, m = m % n, n
    return m

It is used in iter_gcd.py via from cython_gcd import gcd, gcd_int.

To try the extension, run python setup.py build_ext --inplace, where setup.py is:

from distutils.core import setup
from distutils.extension import Extension
from Cython.Distutils import build_ext

ext_modules = [Extension("cython_gcd", ["cython_gcd.pyx"])]

setup(
  name = 'Hello world app',
  cmdclass = {'build_ext': build_ext},
  ext_modules = ext_modules
)

To install the extension globally, run python setup.py install.

Answer 3

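I seem to remember that Ruby handles integers differently from Python, so my guess is that Python spends a lot of time allocating memory while Ruby just mutates the integers in place.

For what it's worth, using PyPy 1.4 reduces the runtime of the Python version on my system from about 15 seconds to under 3 seconds.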

Answer 4

I can't replicate your result. The Python code appears to be about 4 times faster than the Ruby code:

2010-12-07 13:49:55:~/tmp$ time python  iter_gcd.py 4000 3000
61356305

real    0m14.655s
user    0m14.633s
sys     0m0.012s

2010-12-07 13:43:26:~/tmp$ time ruby iter_gcd.rb 4000 3000
iter_gcd.rb:14: warning: don't put space before argument parentheses
61356305

real    0m54.298s
user    0m53.955s
sys     0m0.028s

Versions:

2010-12-07 13:50:12:~/tmp$ ruby --version
ruby 1.8.7 (2010-06-23 patchlevel 299) [i686-linux]
2010-12-07 13:51:52:~/tmp$ python --version
Python 2.6.6

Furthermore, the Python code can be made 8% faster:

def gcd(m, n):
    if n > m:
        m, n = n, m
    while n:
        n, m = m % n, n
    return m

def main(a1, a2):
    print sum(
        gcd(i,j)
        for j in xrange(a1, 1, -1)
        for i in xrange(1, a2)
    )

if __name__ == '__main__':
    from sys import argv
    main(int(argv[1]), int(argv[2]))
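The listing above is Python 2 (print statement, xrange). Here is a minimal sketch of a Python 3 port, with an explicit-loop variant next to it so the speed claim can be checked locally. The names main_sum and main_loop and the small arguments are illustrative, and no particular speedup is implied for current interpreters.

```python
import timeit


def gcd(m, n):
    if n > m:
        m, n = n, m
    while n:
        # The right-hand side is evaluated first, so m receives the old n.
        n, m = m % n, n
    return m


def main_sum(a1, a2):
    """Python 3 port of the generator-expression version."""
    return sum(gcd(i, j)
               for j in range(a1, 1, -1)
               for i in range(1, a2))


def main_loop(a1, a2):
    """Explicit accumulator loop, as in the question's main()."""
    comp = 0
    for j in range(a1, 1, -1):
        for i in range(1, a2):
            comp += gcd(i, j)
    return comp


if __name__ == "__main__":
    assert main_sum(200, 200) == main_loop(200, 200)
    for func in (main_sum, main_loop):
        seconds = timeit.timeit(lambda: func(200, 200), number=5)
        print("{0:<10s} {1:.3f} s for 5 runs".format(func.__name__, seconds))
```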

Later: when I installed and used ruby 1.9.1, the Ruby code was much faster:

2010-12-07 14:01:08:~/tmp$ ruby1.9.1 --version
ruby 1.9.2p0 (2010-08-18 revision 29036) [i686-linux]
2010-12-07 14:01:30:~/tmp$ time ruby1.9.1 iter_gcd.rb 4000 3000
61356305

real    0m12.137s
user    0m12.037s
sys     0m0.020s

I think your question is really, "Why is ruby 1.9.x so much faster than ruby 1.8.x?"

4 answers

Answer 1: 36, accepted, 2010-12-07 21:07:19
Answer 2: 22, 2010-11-29 18:11:53
Answer 3: 10, 2010-11-29 16:20:49
Answer 4: 0, 2010-12-07 18:59:09
