
What is the runtime complexity of Python's deepcopy()?

I'm trying to improve the speed of an algorithm and, after looking at which operations are being called, I'm having difficulty pinning down exactly what's slowing things down. I'm wondering if Python's deepcopy() could possibly be the culprit, or if I should look a little further into my own code.

Looking at the code (you can too), it goes through every object in the tree of referenced objects (e.g. a dict's keys and values, object member variables, ...) and does two things for each of them:

  1. see if it has already been copied, by looking it up in an id-indexed memo dict
  2. copy the object if not
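The two steps above can be sketched roughly as follows (a simplified illustration handling only dicts and lists, not the actual `copy` module implementation):

```python
# Sketch of the memo mechanism deepcopy uses: each object is looked up
# by id() before being copied, so shared and circular references are
# copied exactly once. Simplified; handles only dicts and lists.
def sketch_deepcopy(obj, memo=None):
    if memo is None:
        memo = {}
    obj_id = id(obj)
    if obj_id in memo:            # step 1: already copied?
        return memo[obj_id]
    if isinstance(obj, dict):     # step 2: copy composite objects recursively
        new = {}
        memo[obj_id] = new        # register before recursing (handles cycles)
        for k, v in obj.items():
            new[sketch_deepcopy(k, memo)] = sketch_deepcopy(v, memo)
        return new
    if isinstance(obj, list):
        new = []
        memo[obj_id] = new
        for item in obj:
            new.append(sketch_deepcopy(item, memo))
        return new
    return obj                    # simple immutable objects: no copy needed

shared = [1, 2]
data = {"a": shared, "b": shared}
dup = sketch_deepcopy(data)
assert dup["a"] is dup["b"]       # sharing preserved, like copy.deepcopy
assert dup["a"] is not shared     # but it is a new list
```

The memo registration before recursing is what lets the real `copy.deepcopy` survive circular references.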

The second step is O(1) for simple objects. For composite objects, the same routine handles them recursively, so over all n objects in the tree, that's O(n). The first step, looking an object up in a dict, is O(1) on average, but O(n) amortized in the worst case.

So at best, on average, deepcopy is linear. The keys used in the memo are id() values, i.e. memory locations, so they are not randomly distributed over the key space (the "average" part above) and it may behave worse, up to the O(n^2) worst case. I did observe some performance degradation in real use, but for the most part it behaved as linear.
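You can get a rough feel for the linear behavior yourself; this sketch times deepcopy over trees of increasing size (absolute numbers are machine-dependent):

```python
import copy
import timeit

# Rough, machine-dependent check that deepcopy time grows about
# linearly with the number of objects in the tree.
times = []
for n in (1_000, 10_000, 100_000):
    data = [[i] for i in range(n)]   # n small lists -> ~2n objects
    t = timeit.timeit(lambda: copy.deepcopy(data), number=3)
    times.append(t)
    print(f"n={n:>7}: {t:.4f}s")
```

Each tenfold increase in n should cost roughly ten times as much, modulo allocator and cache effects.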

That's the complexity part, but the constant factor is large: deepcopy is anything but cheap and could very well be causing your problems. The only sure way to know is to use a profiler -- do it. FWIW, I'm currently rewriting terribly slow code that spends 98% of its execution time in deepcopy.
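Profiling that kind of hotspot takes only a few lines with the standard-library cProfile module (the `slow_step` function here is a made-up stand-in for your own hot path):

```python
import copy
import cProfile
import io
import pstats

def slow_step(data):
    # hypothetical hot path: a deepcopy inside a loop
    return [copy.deepcopy(data) for _ in range(50)]

data = {"values": list(range(1000))}

profiler = cProfile.Profile()
profiler.enable()
slow_step(data)
profiler.disable()

# Print the five most expensive entries by cumulative time;
# if deepcopy is your problem, it will dominate this list.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
output = stream.getvalue()
print(output)
```

`python -m cProfile -s cumulative your_script.py` gives the same information without modifying the code.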

What are you using deepcopy for? As the name suggests, deepcopy copies the object and all subobjects recursively, so it is going to take an amount of time proportional to the size of the object you are copying (with a bit of overhead to deal with circular references).

There isn't really any way to speed it up: if you are going to copy everything, you need to copy everything.

One question to ask is whether you need to copy everything, or whether you can copy just part of the structure.
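When only part of a structure is ever mutated, copying just that part and sharing the rest can be dramatically cheaper. A sketch (the `config` layout and key names are purely illustrative):

```python
import copy

config = {
    "static_tables": {"codes": list(range(10_000))},  # never mutated
    "state": {"visited": [1, 2, 3]},                  # mutated every step
}

# Full deep copy: pays for the large static part on every call.
full = copy.deepcopy(config)

# Selective copy: share the static part, deep-copy only the mutable part.
partial = {
    "static_tables": config["static_tables"],   # shared reference, no copy
    "state": copy.deepcopy(config["state"]),    # independent copy
}

partial["state"]["visited"].append(4)
assert config["state"]["visited"] == [1, 2, 3]              # original untouched
assert partial["static_tables"] is config["static_tables"]  # shared, not copied
```

This is only safe if the shared part really is immutable from the copy's point of view; any mutation through either reference is visible to both.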

The complexity of deepcopy() is dependent upon the size (number of elements/children) of the object being copied.

If your algorithm's inputs do not affect the size of the object(s) being copied, then you should consider the call to deepcopy() to be O(1) for the purposes of determining complexity, since each invocation's execution time is relatively static.

(If your algorithm's inputs do have an effect on the size of the object(s) being copied, you'll have to elaborate how. Then the complexity of the algorithm can be evaluated.)

