How is the Python statement x=x+1 implemented?

In C, the statement x=x+1 changes the contents of the same memory that is allocated for x. In Python, however, a variable can hold values of different types, so the x on the left and right sides of = may be of different types, which means they may refer to different pieces of memory. If so, once x changes its reference from the old memory to the new memory, the old memory can be reclaimed by the garbage collection mechanism. If that is the case, the following code may trigger garbage collection many times and would therefore be very inefficient:

for i in range(1000000000):
    i=i+1

Is my guess correct?

Update:

I need to correct the typo in the code to make the question clearer:

x=0
for i in range(1000000000):
    x=x+1

@SvenMarnach, do you mean that the integers 0,1,2,...,999999999 (which the label x once referred to) all remain in memory if garbage collection is not activated?

id can be used to track the 'allocation' of memory to objects. It should be used with caution, but here I think it's illuminating. id is a bit like a C pointer: it is related to 'where' the object is located in memory (in CPython it is the object's address).

In [18]: for i in range(0,1000,100): 
    ...:     print(i,id(i)) 
    ...:     i = i+1 
    ...:     print(i,id(i)) 
    ...:                                                                        
0 10914464
1 10914496
100 10917664
101 10917696
200 10920864
201 10920896
300 140186080959760
301 140185597404720
400 140186080959760
401 140185597404720
...
900 140186080959760
901 140185597404720
In [19]: id(1)                                                                  
Out[19]: 10914496

Small integers (from -5 through 256 in CPython) are cached; that is, an integer such as 1, once created, is 'reused'.

In [20]: id(202)                                                                
Out[20]: 10920928     # 32 bytes past id(201) from the loop: still in the small-integer cache
In [21]: id(302)                                                                
Out[21]: 140185451618128   # different id
In [22]: id(901)                                                                
Out[22]: 140185597404208
In [23]: id(i)                                                                  
Out[23]: 140185597404720   #  = 901, but different id 
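
The caching boundary seen in the session above can also be checked without comparing raw id values. The following is a minimal sketch that relies on a CPython implementation detail (the -5 through 256 cache); other implementations may behave differently. The helper name same_object is made up for this illustration.

```python
# CPython implementation detail: integers from -5 through 256 are preallocated.
# int("...") is used so the values are built at run time rather than being
# merged into shared constants by the compiler.

def same_object(text):
    """Return True if two separately built ints with this value are one object."""
    first = int(text)
    second = int(text)
    return first is second

cached = same_object("256")       # inside the cache range
not_cached = same_object("257")   # just outside the cache range

print(cached, not_cached)
```

Comparing integers with `is` is only useful for experiments like this one; ordinary code should compare them with ==.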

In this loop, the first few iterations create or reuse small integers. But it appears that when creating larger integers, it is 'reusing' memory. It may not be full-blown garbage collection, but the code is somehow optimized to avoid unnecessary memory use.

Generally, Python programmers don't need to focus on these details. Write clean, reliable Python code. In this example, modifying an iteration variable inside the loop is poor practice (even if it is just an example).

You are mostly correct, though I think a few clarifications may help.

First, the concept of a variable is rather different in C and in Python. In C, a variable generally references a fixed location in memory, as you stated yourself. In Python, a variable is just a label that can be attached to any object. An object can have multiple such labels, or none at all, and labels can be freely moved between objects. An assignment in C copies a new value to a memory location, while an assignment in Python attaches a label to an object.
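
The label model can be illustrated with a short sketch. The variable names here are arbitrary and chosen only for this example.

```python
# Two labels attached to one list object.
a = [1, 2, 3]
b = a
b.append(4)            # mutates the shared object, visible through both labels
shared = a is b        # both names refer to the same object

# Rebinding moves the label; it does not alter the object it used to point at.
x = 10 ** 20
old = x                # a second label on the same integer object
x = x + 1              # x now labels a newly created integer
moved = x is not old   # old still labels the original value

print(a, shared, old, x, moved)
```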

Integers are also very different in the two languages. In C, an integer has a fixed size and stores its value in a format native to the hardware. In Python, integers have arbitrary precision. They are stored as an array of "digits" (usually 30-bit integers in CPython) together with a Python object header that stores type information. Bigger integers occupy more memory than smaller ones.
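
The growth in size can be observed with sys.getsizeof. This is only a sketch: the exact byte counts depend on the CPython version and platform, so none are quoted here.

```python
import sys

# Object sizes in bytes for increasingly large integers.
# The exact numbers vary between CPython versions and platforms.
sizes = {exp: sys.getsizeof(2 ** exp) for exp in (0, 30, 60, 1000)}

for exp, size in sizes.items():
    print(f"2**{exp}: {size} bytes")
```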

Moreover, integer objects in Python are immutable: they can't be changed once created. This means an arithmetic operation never modifies an existing integer; it produces a new integer object (or, for small values in CPython, hands back a cached one). So the loop in your code does indeed create a new integer object in essentially every iteration.
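
A small experiment makes this visible. In this sketch every intermediate object is kept alive in a list (the name seen is arbitrary), so that the ids cannot be recycled, and the values are chosen above CPython's small-integer cache.

```python
x = 1000
seen = [x]                     # keep every object alive so ids are not reused
for _ in range(5):
    x = x + 1                  # builds a new int object and rebinds x
    seen.append(x)

distinct = len({id(obj) for obj in seen})   # one id per object created

# Augmented assignment does not mutate an int either; it rebinds the name.
y = x
y += 1

print(seen, distinct, x, y)
```

Here distinct equals the number of values in seen, and x keeps its value after y += 1, because the addition produced a separate object for y.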

However, this isn't the only overhead. The loop also creates a new integer object for i in each iteration, which is dropped at the end of the loop body. And the arithmetic operation is dynamic: Python needs to look up the type of x and its __add__() method in each iteration to figure out how to add objects of this type. And function call overhead in Python is rather high.
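
The dynamic dispatch can be made visible with a user-defined type. The class Meters below is hypothetical and exists only for this illustration. For the built-in int, CPython resolves the operation through a C-level slot rather than a Python-level method call, but the per-operation lookup on the operand's type still takes place.

```python
class Meters:
    """Hypothetical value type used only to make the dispatch visible."""
    calls = 0

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        Meters.calls += 1                  # count how often the hook runs
        return Meters(self.value + other)  # immutable style: return a new object

x = Meters(0)
for i in range(3):
    x = x + 1                              # resolved via type(x).__add__ every time

explicit = type(x).__add__(x, 1)           # roughly what the + operator does

print(x.value, explicit.value, Meters.calls)
```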

Garbage collection and memory allocation, on the other hand, are rather fast in CPython. Garbage collection for integers relies completely on reference counting (no reference cycles are possible here), which is fast. And for allocation, CPython uses an arena allocator for small objects that can quickly reuse memory slots without calling the system allocator.
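
That reference counting alone reclaims an object the moment its last label is moved away can be sketched as follows. This assumes CPython. Because integers cannot be weakly referenced, a hypothetical stand-in class Payload is used, and the cyclic collector is switched off to show that it is not involved.

```python
import gc
import weakref

class Payload:
    """Hypothetical stand-in object; ints cannot be weakly referenced."""

events = []

gc.disable()                       # rule out the cyclic garbage collector
try:
    x = Payload()
    # Register a callback that runs when the first object is destroyed.
    weakref.finalize(x, events.append, "old object freed")
    x = Payload()                  # rebinding drops the last reference
    freed_immediately = events == ["old object freed"]
finally:
    gc.enable()

print(freed_immediately)
```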

So in summary: yes, compared to the same code in C, this code will run awfully slowly in Python. A modern C compiler would simply compute the result of this loop at compile time and load the result into a register, so it would finish basically immediately. If raw speed for integer arithmetic is what you want, don't write that code in Python.
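
To get a feel for the cost on a particular machine, the loop can be timed on a much smaller range and extrapolated. This is a rough sketch, not a benchmark: the value of N is an arbitrary choice for this example, and the timings will vary with hardware and Python version.

```python
import time

N = 200_000   # far smaller than the question's 1000000000, to keep the run short

start = time.perf_counter()
x = 0
for i in range(N):
    x = x + 1
loop_seconds = time.perf_counter() - start

# Linear extrapolation to the iteration count used in the question.
estimated_seconds = loop_seconds / N * 1_000_000_000

print(f"x = {x}, measured {loop_seconds:.4f} s for {N} iterations")
print(f"rough estimate for 1000000000 iterations: {estimated_seconds:.0f} s")
```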
