Common Perl memory/reference leak patterns?

I'm chasing a couple of potential memory leaks in a Perl code base, and I'd like to know about common pitfalls with regard to memory (mis-)management in Perl.

What are common leak patterns you have observed in Perl code?

Circular references are the canonical cause of leaks.

sub leak {
    my ($foo, $bar);
    $foo = \$bar;
    $bar = \$foo;
}

Perl uses reference-counting garbage collection. This means that perl keeps a count of how many references to each variable exist at any given time. When the variable goes out of scope and the count drops to 0, the variable is freed.
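The counting can be observed with the core B module. This is only a rough sketch: refcount() is a hypothetical helper introduced for illustration, and the absolute numbers are an implementation detail, so only the change between measurements is meaningful.

```perl
use strict;
use warnings;
use B ();

# Hypothetical helper: report the reference count of whatever $_[0] points at.
# $_[0] is used directly because copying the reference into a lexical
# would itself add one to the count being measured.
sub refcount {
    return B::svref_2object($_[0])->REFCNT;
}

my $data   = { name => 'example' };
my $before = refcount($data);

my $alias  = $data;              # a second reference to the same hash
my $during = refcount($data);

undef $alias;                    # drop the second reference again
my $after  = refcount($data);

printf "before=%d during=%d after=%d\n", $before, $during, $after;
```

The count should rise by one while $alias exists and fall back once it is dropped.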

In the example code above, $foo and $bar are never collected, and a copy will persist after every invocation of leak(), because each variable is still referenced by the other and so keeps a reference count of 1.

The easiest way to prevent this issue is to use weak references. Weak references are references that you follow to access data, but that do not count for garbage collection.

use Scalar::Util qw(weaken);

sub dont_leak {
    my ($foo, $bar);
    $foo = \$bar;
    $bar = \$foo;
    weaken $bar;
}

In dont_leak(), $foo has a reference count of 0 and $bar has a ref count of 1. When we leave the scope of the subroutine, $foo is returned to the pool, and its reference to $bar is cleared. This drops the ref count on $bar to 0, which means that $bar can also return to the pool.
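A way to check this from inside a program is to count live objects with a DESTROY method. The sketch below is not from the original answer: Counted and link_pair() are hypothetical names, and it uses hash-based objects instead of plain scalars so that destruction is directly observable.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

{
    package Counted;             # hypothetical class, for illustration only
    my $live = 0;                # objects created but not yet destroyed
    sub new     { $live++; return bless {}, shift }
    sub DESTROY { $live-- }
    sub live    { return $live }
}

# Build a two-object cycle; weaken one link when $weak is true.
sub link_pair {
    my ($weak) = @_;
    my $foo = Counted->new;
    my $bar = Counted->new;
    $foo->{peer} = $bar;
    $bar->{peer} = $foo;
    weaken $bar->{peer} if $weak;   # back-link no longer keeps $foo alive
    return;
}

link_pair(1);
my $after_weak = Counted->live;     # both objects were destroyed

link_pair(0);
my $after_strong = Counted->live;   # both objects are still alive

print "with weaken: $after_weak live, without: $after_strong live\n";
# prints "with weaken: 0 live, without: 2 live"
```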

Update: brian d foy asked if I have any data to back up my assertion that circular references are common. No, I don't have any statistics to show that circular references are common. They are the most commonly talked about and best documented form of perl memory leaks.

My experience is that they do happen. Here's a quick rundown of the memory leaks I have seen over a decade of working with Perl.

I've had problems with pTk apps developing leaks. Some leaks I was able to prove were due to circular references that cropped up when Tk passes window references around. I've also seen pTk leaks whose cause I could never track down.

I've seen people misunderstand weaken and wind up with circular references by accident.
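One such misunderstanding, offered here as an illustration rather than as the specific case the author saw: weakness belongs to one particular stored reference, and copying that reference produces an ordinary strong one. The variable names below are made up for the example.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken isweak);

my $parent = { name => 'Dad' };
my $child  = { name => 'Son', parent => $parent };
weaken $child->{parent};         # only this hash slot is weak

my $copy  = $child->{parent};    # plain assignment yields a strong copy
my %clone = %$child;             # copying the whole hash does the same

printf "slot weak: %d, copy weak: %d, clone weak: %d\n",
    isweak($child->{parent}) ? 1 : 0,
    isweak($copy)            ? 1 : 0,
    isweak($clone{parent})   ? 1 : 0;
# prints "slot weak: 1, copy weak: 0, clone weak: 0"
```

Storing such a copy back into the structure it came from quietly recreates the cycle.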

I've seen unintentional cycles crop up when too many poorly thought out objects get thrown together in a hurry.

On one occasion I found memory leaks that came from an XS module that was creating large, deep data structures. I was never able to get a reproducible test case that was smaller than the whole program. But when I replaced the module with another serializer, the leaks went away. So I know those leaks came from the XS.

So, in my experience cycles are a major source of leaks.

Fortunately, there is a module to help track them down.
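The idea behind such a tool can be sketched with nothing but core Scalar::Util. has_strong_cycle() below is a hypothetical helper, not that module's API; it only walks hashes, arrays and scalar references, and CPAN modules such as Devel::Cycle are far more thorough.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr reftype isweak weaken);

# Hypothetical helper: true if following strong references from $ref
# leads back to a container that is already on the current path.
sub has_strong_cycle {
    my ($ref, $on_path) = @_;
    $on_path ||= {};
    return 0 unless ref $ref;

    my $addr = refaddr($ref);
    return 1 if $on_path->{$addr};
    local $on_path->{$addr} = 1;       # removed again when this call returns

    my $type = reftype($ref);
    if ($type eq 'HASH') {
        for my $key (keys %$ref) {
            next if isweak($ref->{$key});      # weak links cannot leak
            return 1 if has_strong_cycle($ref->{$key}, $on_path);
        }
    }
    elsif ($type eq 'ARRAY') {
        for my $i (0 .. $#$ref) {
            next if isweak($ref->[$i]);
            return 1 if has_strong_cycle($ref->[$i], $on_path);
        }
    }
    elsif ($type eq 'SCALAR' || $type eq 'REF') {
        return 0 if isweak($$ref);
        return 1 if has_strong_cycle($$ref, $on_path);
    }
    return 0;
}

my $x = {};
my $y = { peer => $x };
$x->{peer} = $y;
print "before weaken: ", has_strong_cycle($x), "\n";   # 1

weaken $x->{peer};
print "after weaken: ", has_strong_cycle($x), "\n";    # 0
```

Tracking only the current path, instead of every node ever visited, means a value shared by two branches is not mistaken for a cycle.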

As to whether big global structures that never get cleaned up constitute "leaks", I agree with brian. They quack like leaks (we have ever-growing process memory usage due to a bug), so they are leaks. Even so, I don't recall ever seeing this particular problem in the wild.
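A typical shape for this kind of leak is a file-scoped cache that only ever grows. The sketch below is an assumed example, not code from the discussion: lookup(), expensive(), cache_size() and the limit of 1000 are all made up, and clearing the whole cache is just the crudest possible bound.

```perl
use strict;
use warnings;

my %cache;                       # file-scoped: lives as long as the process
my $MAX_ENTRIES = 1000;          # assumed limit, tune for the application

sub expensive {                  # stand-in for real work
    my ($key) = @_;
    return length $key;
}

sub lookup {
    my ($key) = @_;
    # Without this line %cache grows by one entry per distinct key, forever.
    %cache = () if keys(%cache) >= $MAX_ENTRIES && !exists $cache{$key};
    return $cache{$key} //= expensive($key);
}

sub cache_size { return scalar keys %cache }

lookup("request-$_") for 1 .. 5000;
print "entries held: ", cache_size(), " (never more than $MAX_ENTRIES)\n";
```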

Based on what I see on Stonehenge's site, I guess brian sees a lot of sick code from people he is training or performing curative miracles for. So his sample set is easily much bigger and more varied than mine, but it has its own selection bias.

Which cause of leaks is most common? I don't think we'll ever really know. But we can all agree that circular references and global data junkyards are anti-patterns that need to be eliminated where possible, and handled with care and caution in the few cases where they make sense.

If the problem is in the Perl code, you might have a reference that points to itself, or to a parent node.

Usually it comes in the form of an object that references a parent object.

{ package parent;
  sub new{ bless { 'name' => $_[1] }, $_[0] }
  sub add_child{
    my($self,$child_name) = @_;
    my $child = child->new($child_name,$self);
    $self->{$child_name} = $child;   # saves a reference to the child
    return $child;
  }
}
{ package child;
  sub new{
    my($class,$name,$parent) = @_;
    my $self = bless {
      'name' => $name,
      'parent' => $parent # saves a reference to the parent
    }, $class;
    return $self;
  }
}
{
  my $parent = parent->new('Dad');
  my $child  = $parent->add_child('Son');

  # At this point both of these are true
  # $parent->{Son}{parent} == $parent
  # $child->{parent}{Son}  == $child

  # Both of the objects **would** be destroyed upon leaving
  # the current scope, except that the object is self-referential
}

# Both objects still exist here, but there is no way to access either of them.

The best way to fix this is to use Scalar::Util::weaken.

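{ package child;
  use Scalar::Util qw'weaken';   # import here, into the package that calls weaken
  sub new{
    my($class,$name,$parent) = @_;
    my $self = bless {
      'name' => $name,
      'parent' => $parent
    }, $class;

    # weaken the stored reference itself; do not dereference it
    weaken $self->{parent};

    return $self;
  }
}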
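One consequence of a weak back-reference is worth seeing in action: once nothing else holds the parent, the child's parent slot becomes undef. The following is a self-contained sketch with hypothetical Parent and Child classes (capitalised stand-ins for the ones above) and a made-up parent_name() accessor.

```perl
use strict;
use warnings;

{
    package Parent;              # hypothetical stand-in for the parent class
    sub new {
        my ($class, $name) = @_;
        return bless { name => $name, children => {} }, $class;
    }
    sub add_child {
        my ($self, $child_name) = @_;
        my $child = Child->new($child_name, $self);
        $self->{children}{$child_name} = $child;   # strong: parent owns child
        return $child;
    }
}
{
    package Child;               # hypothetical stand-in for the child class
    use Scalar::Util qw(weaken);
    sub new {
        my ($class, $name, $parent) = @_;
        my $self = bless { name => $name, parent => $parent }, $class;
        weaken $self->{parent};                    # weak: child does not own parent
        return $self;
    }
    sub parent_name {
        my ($self) = @_;
        return defined $self->{parent} ? $self->{parent}{name} : undef;
    }
}

my $child;
{
    my $parent = Parent->new('Dad');
    $child = $parent->add_child('Son');
    print "inside scope: ", $child->parent_name, "\n";   # Dad
}
# The parent was freed at the end of the block, so the weak slot is now undef.
print "outside scope: ",
    defined $child->parent_name ? 'still set' : 'undef', "\n";
```

Code that follows a weak reference therefore has to be prepared to find undef there.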

I would recommend dropping the reference to the parent object from the child, if at all possible.

I've had problems with XS in the past, both my own hand-rolled stuff and CPAN modules, where memory is leaked from within the C code if it's not properly managed. I never managed to track the leaks down; the project was on a tight deadline and had a fixed operational lifetime, so I papered over the issue with a daily cron reboot. cron is truly wonderful.

Some modules from CPAN use circular references to do their work, e.g. HTML::TreeBuilder (which represents an HTML tree). They will require you to run some destroying method/routine at the end. Just read the docs :)
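The general pattern looks roughly like the sketch below. TreeNode is a hypothetical class written for this illustration, not HTML::TreeBuilder's code; its delete method plays the role of the explicit cleanup call that such modules document.

```perl
use strict;
use warnings;

{
    package TreeNode;            # hypothetical class, for illustration only
    my $live = 0;                # nodes created but not yet destroyed
    sub new {
        my ($class, $tag) = @_;
        $live++;
        return bless { tag => $tag, parent => undef, children => [] }, $class;
    }
    sub add_child {
        my ($self, $tag) = @_;
        my $child = TreeNode->new($tag);
        $child->{parent} = $self;              # strong back-reference: a cycle
        push @{ $self->{children} }, $child;
        return $child;
    }
    # Break every cycle below this node so reference counting can free it.
    sub delete {
        my ($self) = @_;
        $_->delete for @{ $self->{children} };
        $self->{children} = [];
        $self->{parent}   = undef;
        return;
    }
    sub DESTROY { $live-- }
    sub live    { return $live }
}

{
    my $tree = TreeNode->new('html');
    $tree->add_child('body')->add_child('p');
    $tree->delete;               # without this call all three nodes would leak
}
print "live nodes: ", TreeNode->live, "\n";    # prints "live nodes: 0"
```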
