Why is creating a class in Python so much slower than instantiating a class?

I found that creating a class is far slower than instantiating one.

>>> from timeit import Timer as T
>>> def calc(n):
...     return T("class Haha(object): pass").timeit(n)

<<After several of these 'calc' calls, at least one of them with a big number, e.g. 100000>>

>>> calc(9000)
15.947055101394653
>>> calc(9000)
17.39099097251892
>>> calc(9000)
18.824054956436157
>>> calc(9000)
20.33335590362549

Yes, creating 9000 classes took 16 seconds, and it gets even slower in subsequent calls.

And this:

>>> T("type('Haha', b, d)", "b = (object, ); d = {}").timeit(9000)

gives similar results.

But instantiation doesn't suffer:

>>> T("Haha()", "class Haha(object): pass").timeit(5000000)
0.8786070346832275

5,000,000 instances in less than one second.

What makes class creation this expensive?

And why does the creation process become slower over time?

EDIT:

How to reproduce:

Start a fresh Python process. The first several calc(10000) calls take about 0.5 seconds each on my machine. Then try a bigger value, calc(100000): it doesn't finish even after 10 seconds. Interrupt it, run calc(10000) again, and it now takes 15 seconds.

EDIT:

Additional fact:

If you call gc.collect() after 'calc' becomes slow, you get the 'normal' speed back at first, but the timings increase again in subsequent calls:

>>> from a import calc
>>> calc(10000)
0.4673938751220703
>>> calc(10000)
0.4300072193145752
>>> calc(10000)
0.4270968437194824
>>> calc(10000)
0.42754602432250977
>>> calc(10000)
0.4344758987426758
>>> calc(100000)
^CTraceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "a.py", line 3, in calc
    return T("class Haha(object): pass").timeit(n)
  File "/usr/lib/python2.7/timeit.py", line 194, in timeit
    timing = self.inner(it, self.timer)
  File "<timeit-src>", line 6, in inner
KeyboardInterrupt
>>> import gc
>>> gc.collect()
234204
>>> calc(10000)
0.4237039089202881
>>> calc(10000)
1.5998330116271973
>>> calc(10000)
4.136359930038452
>>> calc(10000)
6.625348806381226

This might give you the intuition:

>>> import sys
>>> class Haha(object): pass
...
>>> sys.getsizeof(Haha)
904
>>> sys.getsizeof(Haha())
64

A class object is a much more complex and expensive structure than an instance of that class.

Ahahaha! Gotcha!

Was this perchance done on a Python version without this patch? (HINT: IT WAS)

Check the line numbers if you want proof.

Marcin was right: when the results look screwy, you've probably got a screwy benchmark. Run gc.disable() and the results reproduce themselves. It just shows that when you disable garbage collection you get garbage results!
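Why disabling the collector matters so much for this particular benchmark can be sketched as follows. This is a minimal illustration, assuming CPython, where a class object refers back to itself (for example through its `__mro__`), so dropping its name alone does not free it. The helper name `class_needs_gc` is made up for this example.

```python
import gc
import weakref


def class_needs_gc():
    """Return (alive_after_del, alive_after_collect) for a throwaway class."""
    was_enabled = gc.isenabled()
    gc.disable()  # mimic what timeit does while it is timing
    try:
        class Haha(object):
            pass

        ref = weakref.ref(Haha)  # lets us observe the class without keeping it alive
        del Haha                 # drop the only name bound to the class
        alive_after_del = ref() is not None

        gc.collect()             # the cycle collector reclaims the class
        alive_after_collect = ref() is not None
        return alive_after_del, alive_after_collect
    finally:
        if was_enabled:
            gc.enable()          # put the collector back the way we found it


print(class_needs_gc())
```

With the collector off, every `class Haha(object): pass` in the timed loop leaves another dead class behind, which fits the large number returned by gc.collect() in the question's session.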


To be more clear, the reason running the long benchmark broke things is that:

  • timeit disables garbage collection, so overly large benchmarks take much (exponentially) longer

  • timeit wasn't restoring garbage collection on exceptions

  • You quit the long-running process with an asynchronous exception, which left garbage collection turned off
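The three points above can be checked on any interpreter with a small probe. This is a sketch, not the patch itself: the function name is hypothetical, and an ordinary `RuntimeError` stands in for the Ctrl-C, on the assumption that any exception escaping the timed statement takes the same path out of `timeit`.

```python
import gc
from timeit import Timer


def gc_survives_failed_timing():
    """Return True if timeit leaves the collector enabled after an exception."""
    gc.enable()
    try:
        # The timed statement fails immediately, like an interrupted benchmark.
        Timer("raise RuntimeError('simulated interrupt')").timeit(1)
    except RuntimeError:
        pass
    still_enabled = gc.isenabled()
    gc.enable()  # repair the state in case this interpreter does leak
    return still_enabled


print(gc_survives_failed_timing())
```

On an interpreter that restores the collector it should report True. On an affected one it reports False, and calling gc.enable() by hand after an interrupted timing is the workaround.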

A quick dis of the following functions:

def a():
    class Haha(object):
        pass

def b():
    Haha()

gives:

  2           0 LOAD_CONST               1 ('Haha')
              3 LOAD_GLOBAL              0 (object)
              6 BUILD_TUPLE              1
              9 LOAD_CONST               2 (<code object Haha at 0x7ff3e468bab0, file "<stdin>", line 2>)
             12 MAKE_FUNCTION            0
             15 CALL_FUNCTION            0
             18 BUILD_CLASS
             19 STORE_FAST               0 (Haha)
             22 LOAD_CONST               0 (None)
             25 RETURN_VALUE

and

  2           0 LOAD_GLOBAL              0 (Haha)
              3 CALL_FUNCTION            0
              6 POP_TOP
              7 LOAD_CONST               0 (None)
             10 RETURN_VALUE

respectively.

By the looks of it, it simply does more work when creating a class. It has to initialize the class, add it to dicts, and so on, whereas Haha() just calls a function.

As you noticed, running garbage collection when it gets too slow speeds things up again, so Marcin is right in saying that it's probably a memory fragmentation issue.
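Some of that extra bookkeeping is visible from Python itself. The snippet below is only an illustration (the helper `describe_class` is invented here, and the exact namespace keys vary between Python versions). It shows that a freshly created class already has a populated namespace, a computed method resolution order, and an entry in each base class's subclass list.

```python
class Haha(object):
    pass


def describe_class(cls):
    """Collect a few things the interpreter set up when it built `cls`."""
    return {
        # Names stored in the class namespace at creation time.
        "namespace_keys": sorted(cls.__dict__),
        # Method resolution order, computed once when the class is created.
        "mro": cls.__mro__,
        # Each base class keeps track of its direct subclasses.
        "registered_with_bases": all(
            cls in base.__subclasses__() for base in cls.__bases__
        ),
    }


for key, value in describe_class(Haha).items():
    print(key, value)
```

An instance needs none of this; it mostly just points back at its class.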

It isn't: only your contrived tests show slow class creation. In fact, as @Veedrac shows in his answer, this result is an artifact of timeit disabling garbage collection.

Downvoters: show me a non-contrived example where class creation is slow.

In any case, your timings are affected by the load on your system at the time. They are really only useful for comparisons performed at pretty much the same time. I get about 0.5s for 9000 class creations. In fact, it's about 0.3s on ideone, even when performed repeatedly: http://ideone.com/Du859. There isn't even an upward trend.

So, in summary, it is much slower on your computer than on others, and repeated tests on other computers show none of the upward trend you originally claimed. Testing massive numbers of instantiations does show a slowdown, presumably because the process consumes a lot of memory. You have shown that allocating a huge amount of memory slows a process down. Well done.

That ideone code in full:
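To time class creation with the collector running, the timeit documentation suggests re-enabling it as the first statement of the setup string. A minimal sketch, where `calc_with_gc` is a hypothetical variant of the question's `calc`:

```python
from timeit import Timer


def calc_with_gc(n):
    """Time n class creations with garbage collection left on."""
    # The setup string runs once before timing starts and switches the
    # collector back on after timeit has disabled it.
    setup = "import gc; gc.enable()"
    return Timer("class Haha(object): pass", setup).timeit(n)


print(calc_with_gc(9000))
```

The absolute number depends on the machine, so it is best compared only against the plain calc(n) on the same system and at about the same time.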

from timeit import Timer as T
def calc(n):
    return T("class Haha(object): pass").timeit(n)

for i in xrange(30):
    print calc(9000)
