Why does Python memory allocation behave this way?

Question

Consider the following code, which finds all valid arrangements of n pairs of parentheses.

def paren(n):
    ans = []
    def helper(string, left, right, n):
        # left / right: how many '(' and ')' are still available
        if len(string) == 2*n:
            ans.append(string)
        if left > 0:
            helper(string+'(', left-1, right, n)
        if right > left:
            helper(string+')', left, right-1, n)
    helper('', n, n, n)
    return ans

If we add a list argument foo (which has no practical use in the function), we get

def paren(n):
    ans = []
    def helper(string, left, right,n,foo):
        print(hex(id(foo)), foo)
        if len(string)==2*n:
            ans.append(string)
        if left > 0:
            helper(string+'(', left-1,right,n,[])
        if right > left:
            helper(string+')', left,right-1,n,[])
    helper('',n,n,n,[])
    
paren(2)

OUTPUT:

0x2e5e2446288 []
0x2e5e28e3508 []
0x2e5e28e3688 []
0x2e5e26036c8 []
0x2e5e27bafc8 []
0x2e5e28e3688 []
0x2e5e26036c8 []
0x2e5e27bafc8 []

Whereas if we explicitly pass foo along each time, we get

def paren(n):
    ans = []
    def helper(string, left, right,n,foo):
        print(hex(id(foo)), foo)
        if len(string)==2*n:
            ans.append(string)
        if left > 0:
            helper(string+'(', left-1,right,n,foo)
        if right > left:
            helper(string+')', left,right-1,n,foo)
    helper('',n,n,n,[])
    
paren(2)

OUTPUT:

0x1c2cfec6288 []
0x1c2cfec6288 []
0x1c2cfec6288 []
0x1c2cfec6288 []
0x1c2cfec6288 []
0x1c2cfec6288 []
0x1c2cfec6288 []
0x1c2cfec6288 []

In the first case we get a different object in memory on each call, whereas in the second case we don't. Why is this? I think it has to do with the fact that we create a new list rather than passing the function argument along.

However, when we add something to foo we get the same behaviour as in the first case:

def paren(n):
    ans = []
    def helper(string, left, right,n,foo):
        print(hex(id(foo)), foo)
        if len(string)==2*n:
            ans.append(string)
        if left > 0:
            helper(string+'(', left-1,right,n,foo+['bar'])
        if right > left:
            helper(string+')', left,right-1,n,foo+['bar'])
    helper('',n,n,n,[])
    
paren(2)

OUTPUT:

0x269572e6288 []
0x26959283548 ['bar']
0x26957363688 ['bar', 'bar']
0x2695925ae88 ['bar', 'bar', 'bar']
0x26957363408 ['bar', 'bar', 'bar', 'bar']
0x2695925ae88 ['bar', 'bar']
0x26957363408 ['bar', 'bar', 'bar']
0x269592833c8 ['bar', 'bar', 'bar', 'bar']

But strangely, if we pass an int for foo (I will take 5 for demonstration), we get:

def paren(n):
    ans = []
    def helper(string, left, right,n,foo):
        print(hex(id(foo)), foo)
        if len(string)==2*n:
            ans.append(string)
        if left > 0:
            helper(string+'(', left-1,right,n,5)
        if right > left:
            helper(string+')', left,right-1,n,5)
    helper('',n,n,n,5)
    
paren(2)

OUTPUT:

0x7ffef47293c0 5
0x7ffef47293c0 5
0x7ffef47293c0 5
0x7ffef47293c0 5
0x7ffef47293c0 5
0x7ffef47293c0 5
0x7ffef47293c0 5
0x7ffef47293c0 5

i.e. the same point in memory.

However, if I replace the 5 in the above code with a larger int, for instance 2550, I get the following:

0x2519f6d4790 2550
0x2519f9ec6f0 2550
0x2519f9ec6f0 2550
0x2519f9ec6f0 2550
0x2519f9ec6f0 2550
0x2519f9ec6f0 2550
0x2519f9ec6f0 2550
0x2519f9ec6f0 2550

So the first call sees a different memory address, but each subsequent call sees the same address. Why is this different from the case foo=5? What is going on here?

Also, in the examples where the memory address changes between calls, I do see the same memory address being used on more than one occasion, for example:

...
0x2695925ae88 ['bar', 'bar', 'bar']
...
0x2695925ae88 ['bar', 'bar']
...

Why is this the case? Is Python reusing previously used memory addresses to store new objects once the old ones are no longer on the recursion call stack?

My mind is really fuzzy on these behaviours, so if anyone could help me out that would be great!

I have heard of things like pass by reference and pass by value, but I am not too sure what they mean or whether they relate to this Python example.

Thank you.

Answer 1

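First code (a fresh [] on every call): the call stack keeps a reference to each foo, so you have several lists in memory at once, and lists that are alive at the same time each have a unique ID. Once a call returns, its list can be freed, and a later list may be allocated at the same address, which is why some IDs repeat.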
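As a rough illustration of this point (a sketch, not part of the original answer; `ids_of_live_lists` is a hypothetical helper name), lists that are alive at the same time always have distinct ids, while an id may be reused once its object has been freed. Whether reuse actually happens depends on the allocator, so the second comparison below promises no particular result.

```python
def ids_of_live_lists(count):
    """Create `count` empty lists, keep them all alive, and return their ids."""
    live = [[] for _ in range(count)]
    return [id(item) for item in live], live

# keep_alive holds the lists so their lifetimes overlap.
ids, keep_alive = ids_of_live_lists(5)

# Objects with overlapping lifetimes never share an id.
all_distinct = len(set(ids)) == len(ids)
print(all_distinct)

# Temporaries that are freed right away may or may not land on the
# same address, so this can print either True or False.
first = id([])
second = id([])
print(first == second)
```

In the recursion from the question, a list stays alive only while its call is on the stack, which would explain why addresses from finished branches show up again in later branches.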

Second code: you pass the same list (that was initially empty) to each recursive call, so every call prints the same ID.
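A minimal sketch of the difference between passing the argument along and building a new list (the function name `show_identity` is made up for this example): passing `value` unchanged hands every level the same object, whereas `foo + ['bar']` constructs a new list.

```python
def show_identity(value, depth):
    """Return the ids seen at each recursion level when `value` is passed along unchanged."""
    seen = [id(value)]
    if depth > 0:
        seen += show_identity(value, depth - 1)
    return seen

foo = []
seen_ids = show_identity(foo, 3)

# Every level received the very same list object.
same_object_everywhere = all(i == id(foo) for i in seen_ids)
print(same_object_everywhere)

# List concatenation builds a new list, so the result is a different object.
extended = foo + ['bar']
print(extended is foo)
```

This matches the question's outputs: one ID throughout when foo is passed along, changing IDs when foo+['bar'] is passed.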

Third code (foo is 5): CPython, as an implementation-specific optimization, caches small int objects for reuse, so every 5 is the same object.

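Fourth code (foo is 2550): CPython does not cache larger ints (i.e., greater than 256), so 2550 is not one shared object. The 2550 in paren and the 2550 in helper are constants of two different code objects, which is consistent with the first address differing from the rest. Inside helper the constant is created once when the function is compiled, not on each call, so every recursive call passes the same int object.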
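The small-int cache can be observed with a short experiment. This is a sketch under the assumption that it runs on CPython; other implementations may print different identity results, and the variable names are illustrative only. `int("...")` is used so the values are built at run time rather than stored as compiled constants.

```python
import platform

# Build the ints at run time so the compiler cannot merge equal literals.
small_a, small_b = int("5"), int("5")
large_a, large_b = int("2550"), int("2550")

# Identity: are the two names bound to the very same object?
small_shared = small_a is small_b
large_shared = large_a is large_b
print(platform.python_implementation(), small_shared, large_shared)

# Equality is what code should rely on; int identity is an implementation detail.
values_equal = (small_a == small_b) and (large_a == large_b)
print(values_equal)
```

On CPython the small values share one cached object while the two 2550 objects are distinct, so comparisons between ints should use == rather than is.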

Answer 2

I will try to explain this behavior with a general explanation.

Python does not distinguish between reference data types and primitive data types the way some other languages do. Every value, including an int or a float, is an object, and a variable or function argument holds a reference to that object rather than the value itself.

An expression such as [] creates a new list every time it is evaluated. So calling some_function([]) creates a new list and passes it to the function, and calling some_function([]) a second time creates another one. However, if you create a list with foo = [], then foo refers to that one list, and calling some_function(foo) and then some_function(foo) again runs some_function on the same list both times.

Values such as float, int, and bool are passed in exactly the same way. They can appear to behave differently only because they are immutable and the interpreter is free to reuse an existing object for an equal value, as CPython does for small ints.
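The calls described above can be sketched as follows. The body of `some_function` and the helper `rebind` are assumptions made for illustration; the original answer does not define them.

```python
def some_function(items):
    """Append to the list that was passed in and return its id."""
    items.append("seen")
    return id(items)

# A literal [] at each call site creates a fresh list per call.
# Their ids are not compared here, because each temporary list
# may be freed as soon as its call returns.
some_function([])
some_function([])

# A name bound once refers to one list, so both calls mutate the same object.
foo = []
first_id = some_function(foo)
second_id = some_function(foo)
print(first_id == second_id == id(foo))
print(foo)

def rebind(number):
    """Rebinding the parameter inside the function leaves the caller's variable alone."""
    number += 1
    return number

count = 5
rebind(count)
print(count)
```

The function receives a reference to the caller's object: mutating it in place is visible to the caller, while rebinding the parameter name is not. This is often described as "pass by object reference", and it is neither classic pass by value nor pass by reference.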

For follow-up research, the following keywords are useful: reference data type, value data type, built-in types.

I hope my answer was helpful in understanding the stated problem.

2 answers

Answer 1
1 2020-07-06 18:10:55

Answer 2
0 2020-07-06 18:17:24
