How does PHP's array memory usage management work?

I'm trying to figure out how PHP loads arrays into memory and when passing an array consumes memory.

So I've got this little bit of code running (note that the input array itself is not important in this example):

<?php

echo $this->getMemoryUsage();
$arr = $query->result_array(); // array of arrays from codeigniter
echo $this->getMemoryUsage();

This consumes exactly 250 kB of memory, which means the array is roughly (and only roughly) 250 kB in size.

So I ran the following code:

<?php

echo $this->getMemoryUsage();
$arr = $query->result_array(); // array of arrays from codeigniter

$arr[0]['id'] = 'changing this value';

$foo = $arr;
$foo[2]['id'] = 'changing this value again';

$bar = $foo;
$bar[4]['id'] = 'changing this value again and again';

$far = $bar;
$far[5]['id'] = 'changing this value again and again and again';

echo $this->getMemoryUsage();

According to what I read and was told, PHP doesn't actually copy the array; it only references the original array. But once a change is made, PHP has to copy the entire array.

Imagine my surprise when the above code consumed exactly 500 kB of RAM.

Can anyone explain what's going on here?

Just to be clear, all these indices (0–5 and id) already exist in the original array; I'm just modifying the values. The original value is some integer.

EDIT

Just to rule out the involvement of $this->result(), here's another test I've conducted:

    echo $this->getMemoryUsage();
    $arr = $query->result_array(); // array of arrays from codeigniter
    //$arr[0]['id'] = 'changing this value';

    $foo = $arr;
    $foo[2]['id'] = 'changing this value again';

    //$bar = $foo;
    //$bar[4]['id'] = 'changing this value again and again';
    //
    //$far = $bar;
    //$far[4]['id'] = 'changing this value again and again and again';

    echo $this->getMemoryUsage();

This time the output is exactly 250 kB, just like the original trial without any changes.

EDIT #2

As requested, I've run the code from here on my setup to make sure the results are consistent: http://pastebin.com/cYNg4cg7

These are the results:

DECLARATION: 4608 kB
FINAL: 8904 kB
DIFF TO DECLARATION: 4296 kB

So even though the declaration was 4608 kB and the array was passed and changed 4 times, the memory footprint still less than doubled.

EDIT #3

I've measured the memory change after each allocation:

DECLARATION: 5144 kB
allocating A0 added: 144 kB
allocating A1 added: 1768 kB
allocating A2 added: 1768 kB
allocating A3 added: 1768 kB
FINAL: 10744 kB
DIFF TO DECLARATION: 5600 kB

Each operation after the first costs exactly the same, which seems to indicate that exactly the same size is being copied each time. This seems to support Austin's answer. The only thing that doesn't add up now is the size that's allocated, but that's a different question.

Seems like Austin's on the ball; I'll accept his answer if no other answer comes by.

Here's what I think is going on:

PHP arrays are copy-on-write, as you say, but each level of a multi-dimensional array is separately copy-on-write. PHP is very smart about reusing parts of a multi-dimensional array rather than copying the whole thing. (This is similar to some file systems that support snapshots, like ZFS.)

Example: say we have this array:

$x = array('foo' => array(1, 2, 3), 'bar' => array(4, 5, 6));

This is stored in memory not as a single chunk, but as separate chunks, here labeled A, B, C, and $x:

array(1, 2, 3) //A
array(4, 5, 6) //B
array('foo' => {pointer to A}, 'bar' => {pointer to B}) //C
{pointer to C} //$x

Now let's make a copy of $x:

$y = $x;

This uses very little extra memory, because all it has to do is create another pointer to C:

array(1, 2, 3) //A
array(4, 5, 6) //B
array('foo' => {pointer to A}, 'bar' => {pointer to B}) //C
{pointer to C} //$x
{pointer to C} //$y
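
If you want to check this yourself, here is a minimal sketch. It assumes memory_get_usage() is a good enough proxy for what is going on, and the array sizes and variable names are made up for illustration. The sub-arrays are built with range() so that they are large enough for the numbers to be visible.

```php
<?php
// Hypothetical sizes: two sub-arrays of 50000 integers each.
$base = memory_get_usage();
$x = array('foo' => range(1, 50000), 'bar' => range(1, 50000));
$sizeOfX = memory_get_usage() - $base;

$before = memory_get_usage();
$y = $x;                      // no write yet, so nothing should be duplicated
$costOfAssign = memory_get_usage() - $before;

echo "x: {$sizeOfX} bytes, assignment: {$costOfAssign} bytes" . PHP_EOL;
```

The exact byte counts depend on the PHP version, but the assignment cost should be tiny compared to the size of $x.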

Now let's change $y:

$y['foo'][0] = 10;

Here's what DOESN'T happen:

array(1, 2, 3) //A
array(10, 2, 3) //A2
array(4, 5, 6) //B
array(4, 5, 6) //B2
array('foo' => {pointer to A}, 'bar' => {pointer to B}) //C
array('foo' => {pointer to A2}, 'bar' => {pointer to B2}) //C2
{pointer to C} //$x
{pointer to C2} //$y

Notice that B and B2 are identical. There's no need to keep the same thing twice, so what actually happens is this:

array(1, 2, 3) //A
array(10, 2, 3) //A2
array(4, 5, 6) //B
array('foo' => {pointer to A}, 'bar' => {pointer to B}) //C
array('foo' => {pointer to A2}, 'bar' => {pointer to B}) //C2
{pointer to C} //$x
{pointer to C2} //$y
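
The same picture can be sketched in code. This is only a rough check under the assumption that memory_get_usage() reflects the copies being made, and the sizes are again hypothetical. If only the outer array and 'foo' are duplicated, the write should cost clearly less than the whole of $x.

```php
<?php
// Hypothetical sizes: two sub-arrays of 50000 integers each.
$base = memory_get_usage();
$x = array('foo' => range(1, 50000), 'bar' => range(1, 50000));
$sizeOfX = memory_get_usage() - $base;

$y = $x;
$before = memory_get_usage();
$y['foo'][0] = 10;            // should separate the outer array and 'foo' only
$costOfWrite = memory_get_usage() - $before;

echo "x: {$sizeOfX} bytes, write: {$costOfWrite} bytes" . PHP_EOL;
var_dump($x['foo'][0]);             // $x still sees the original value
var_dump($y['bar'] === $x['bar']);  // 'bar' holds the same values in both
```

Here $x['foo'][0] stays 1 while $y['foo'][0] becomes 10, and the cost of the write should land somewhere around the size of one sub-array rather than two.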

In this simple case, the benefit is pretty small, but imagine that instead of three numbers, the 'bar' array contained thousands of numbers. You end up saving huge amounts of memory.

Relating this to your original code, try printing out the memory usage not only at the start and the end, but after every new array assignment. You'll see that after each step the memory usage increases by only a fraction of what the original array takes up. This is because only part of the array is being copied, not the whole thing. Specifically, the first-level array and the specific sub-array you change get copied, but the other sub-arrays do not.
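
Something along these lines would do it. Note that this is a sketch, not your actual code: the loop is a hypothetical stand-in for $query->result_array(), the column names are invented, and memory_get_usage() replaces your getMemoryUsage() helper.

```php
<?php
// Hypothetical stand-in for $query->result_array(): 1000 rows, three columns.
$base = memory_get_usage();
$arr = array();
for ($i = 0; $i < 1000; $i++) {
    $arr[] = array('id' => $i, 'name' => 'row ' . $i, 'score' => $i * 2);
}
$sizeOfArr = memory_get_usage() - $base;

$steps = array();             // bytes added by each assign-and-modify step

$mark = memory_get_usage();
$foo = $arr;
$foo[2]['id'] = 'changing this value again';
$steps['foo'] = memory_get_usage() - $mark;

$mark = memory_get_usage();
$bar = $foo;
$bar[4]['id'] = 'changing this value again and again';
$steps['bar'] = memory_get_usage() - $mark;

$mark = memory_get_usage();
$far = $bar;
$far[5]['id'] = 'changing this value again and again and again';
$steps['far'] = memory_get_usage() - $mark;

echo "original: {$sizeOfArr} bytes" . PHP_EOL;
foreach ($steps as $name => $bytes) {
    echo "{$name}: {$bytes} bytes" . PHP_EOL;
}
```

Each step should come out at roughly the same size (the first-level array plus one row), and well below the size of the original array.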

The fact that the final amount of memory used is twice the starting amount seems to be a coincidence, due to the particular setup of your code and the number of copies of the array you make.

(In reality, PHP can do even better than what I describe here: it will probably keep only one copy of 'foo' and 'bar', etc. But for the most part it boils down to the same sort of trick.)

If you want a more dramatic demonstration of this, do something like this:

$base = memory_get_usage();
$x = array('small' => array('this is small'), 'big' => array());
for ($i = 0; $i < 1000000; $i++) {
    $x['big'][] = $i;
}
echo (memory_get_usage() - $base).PHP_EOL; //a lot of memory
$y = $x;
$y['small'][0] = 'now a bit bigger';
echo (memory_get_usage() - $base).PHP_EOL; //a bit more memory
$z = $x;
$z['big'][0] = 2;
echo (memory_get_usage() - $base).PHP_EOL; //a LOT more memory
